Commits · 9f459fadbb38abe68aa342f533ca17d8d90d6f2e · openeuler / raspberrypi-kernel

20 Aug, 2009 1 commit

xen: rearrange things to fix stackprotector · ce2eef33

Authored by Jeremy Fitzhardinge on Aug 17, 2009

Make sure the stack-protector segment registers are properly set up
before calling any functions which may have stack-protection compiled
into them.

[ Impact: prevent Xen early-boot crash when stack-protector is enabled ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

ce2eef33

16 May, 2009 1 commit

x86: Fix performance regression caused by paravirt_ops on native kernels · b4ecc126

Authored by Jeremy Fitzhardinge on May 13, 2009

Xiaohui Xin and some other folks at Intel have been looking into what's
behind the performance hit of paravirt_ops when running native.

It appears that the hit is entirely due to the paravirtualized
spinlocks introduced by:

 | commit 8efcbab6
 | Date:   Mon Jul 7 12:07:51 2008 -0700
 |
 |     paravirt: introduce a "lock-byte" spinlock implementation

The extra call/return in the spinlock path is somehow
causing an increase in the cycles/instruction of somewhere around 2-7%
(seems to vary quite a lot from test to test).  The working theory is
that the CPU's pipeline is getting upset about the
call->call->locked-op->return->return, and seems to be failing to
speculate (though I haven't seen anything definitive about the precise
reasons).  This doesn't entirely make sense, because the performance
hit is also visible on unlock and other operations which don't involve
locked instructions.  But spinlock operations clearly swamp all the
other pvops operations, even though I can't imagine that they're
nearly as common (there's only a .05% increase in instructions
executed).

If I disable just the pv-spinlock calls, my tests show that pvops is
identical to non-pvops performance on native (my measurements show that
it is actually about .1% faster, but Xiaohui shows a .05% slowdown).

Summary of results, averaging 10 runs of the "mmperf" test, using a
no-pvops build as baseline:

		nopv		Pv-nospin	Pv-spin
CPU cycles	100.00%		99.89%		102.18%
instructions	100.00%		100.10%		100.15%
CPI		100.00%		99.79%		102.03%
cache ref	100.00%		100.84%		100.28%
cache miss	100.00%		90.47%		88.56%
cache miss rate	100.00%		89.72%		88.31%
branches	100.00%		99.93%		100.04%
branch miss	100.00%		103.66%		107.72%
branch miss rt	100.00%		103.73%		107.67%
wallclock	100.00%		99.90%		102.20%

The clear effect here is that the 2% increase in CPI is
directly reflected in the final wallclock time.
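
The CPI row can be cross-checked from the two rows above it: cycles per instruction is cycles divided by instructions, so the relative CPI is the ratio of the two relative figures. A minimal sketch of that arithmetic, where `relative_pct` and `print_relative_cpi` are names made up for illustration and the inputs are the table's own values:

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of two baseline-relative percentages, itself as a percentage. */
double relative_pct(double numerator_pct, double denominator_pct)
{
	assert(denominator_pct != 0.0);
	return numerator_pct / denominator_pct * 100.0;
}

/* Print the CPI implied by the "CPU cycles" and "instructions" rows. */
void print_relative_cpi(void)
{
	printf("Pv-nospin CPI: %.2f%%\n", relative_pct(99.89, 100.10));
	printf("Pv-spin   CPI: %.2f%%\n", relative_pct(102.18, 100.15));
}
```

Both results round to the CPI values listed in the table (99.79% and 102.03%).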

(The other interesting effect is that the more ops are
out of line calls via pvops, the lower the cache access
and miss rates.  Not too surprising, but it suggests that
the non-pvops kernel is over-inlined.  On the flipside,
the branch misses go up correspondingly...)

So, what's the fix?

Paravirt patching turns all the pvops calls into direct calls, so
_spin_lock etc do end up having direct calls.  For example, the compiler
generated code for paravirtualized _spin_lock is:

<_spin_lock+0>:		mov    %gs:0xb4c8,%rax
<_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
<_spin_lock+15>:	callq  *0xffffffff805a5b30
<_spin_lock+22>:	retq

The indirect call will get patched to:
<_spin_lock+0>:		mov    %gs:0xb4c8,%rax
<_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
<_spin_lock+15>:	callq <__ticket_spin_lock>
<_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
<_spin_lock+22>:	retq

One possibility is to inline _spin_lock, etc, when building an
optimised kernel (ie, when there's no spinlock/preempt
instrumentation/debugging enabled).  That will remove the outer
call/return pair, returning the instruction stream to a single
call/return, which will presumably execute the same as the non-pvops
case.  The downsides are: 1) it will replicate the
preempt_disable/enable code at each lock/unlock callsite; this code is
fairly small, but not nothing; and 2) the spinlock definitions are
already a very heavily tangled mass of #ifdefs and other preprocessor
magic, and making any changes will be non-trivial.
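
The nested call chain discussed here can be sketched in plain C. This is a user-space illustration only, not the kernel's code: `struct lock_ops_sketch`, `sketch_spin_lock` and the simplified ticket lock are all assumptions made for the example. The point is the shape of the path: caller, then an out-of-line wrapper, then a call through an ops table into the real lock routine.

```c
#include <assert.h>
#include <stdatomic.h>

/* Simplified ticket lock: take a ticket, wait until it is being served. */
struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently being served */
};

void ticket_spin_lock(struct ticket_lock *lock)
{
	unsigned int me = atomic_fetch_add(&lock->next, 1);

	while (atomic_load(&lock->owner) != me)
		;	/* spin until our ticket comes up */
}

void ticket_spin_unlock(struct ticket_lock *lock)
{
	atomic_fetch_add(&lock->owner, 1);
}

/* Hypothetical ops table standing in for the pvops spinlock hooks. */
struct lock_ops_sketch {
	void (*spin_lock)(struct ticket_lock *lock);
	void (*spin_unlock)(struct ticket_lock *lock);
};

struct lock_ops_sketch lock_ops = {
	.spin_lock = ticket_spin_lock,
	.spin_unlock = ticket_spin_unlock,
};

/* Outer call: caller -> sketch_spin_lock -> (via table) ticket_spin_lock. */
void sketch_spin_lock(struct ticket_lock *lock)
{
	lock_ops.spin_lock(lock);
}

void sketch_spin_unlock(struct ticket_lock *lock)
{
	lock_ops.spin_unlock(lock);
}

/* Single-threaded walk through one lock/unlock pair. */
void lock_ops_demo(void)
{
	struct ticket_lock lock = { 0 };

	sketch_spin_lock(&lock);
	assert(atomic_load(&lock.next) == 1 && atomic_load(&lock.owner) == 0);
	sketch_spin_unlock(&lock);
	assert(atomic_load(&lock.owner) == 1);
}
```

Inlining the wrapper, as proposed above, would leave only the inner call and return on the path.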

The other obvious answer is to disable pv-spinlocks.  Making them a
separate config option is fairly easy, and it would be trivial to
enable them only when Xen is enabled (as the only non-default user).
But it doesn't really address the common case of a distro build which
is going to have Xen support enabled, and leaves the open question of
whether the native performance cost of pv-spinlocks is worth the
performance improvement on a loaded Xen system (10% saving of overall
system CPU when guests block rather than spin).  Still it is a
reasonable short-term workaround.

[ Impact: fix pvops performance regression when running native ]
Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com>
Analysed-by: "Li Xin" <xin.li@intel.com>
Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4A0B62F7.5030802@goop.org>
[ fixed the help text ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b4ecc126

13 May, 2009 1 commit

xen: use header for EXPORT_SYMBOL_GPL · 44408ad7

Authored by Randy Dunlap on May 12, 2009

mmu.c needs to #include module.h to prevent these warnings:

 arch/x86/xen/mmu.c:239: warning: data definition has no type or storage class
 arch/x86/xen/mmu.c:239: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
 arch/x86/xen/mmu.c:239: warning: parameter names (without types) in function declaration

[ Impact: cleanup ]
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

44408ad7

09 May, 2009 3 commits

xen: cache cr0 value to avoid trap'n'emulate for read_cr0 · a789ed5f

Authored by Jeremy Fitzhardinge on Apr 24, 2009

stts() is implemented in terms of read_cr0/write_cr0 to update the
state of the TS bit.  This happens during context switch, and so
is fairly performance critical.  Rather than falling back to
a trap-and-emulate native read_cr0, implement our own by caching
the last-written value from write_cr0 (the TS bit is the only one
we really care about).

Impact: optimise Xen context switches
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
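
The caching idea in this message can be sketched as follows. This is a user-space model under stated assumptions, not the Xen implementation: `hypervisor_set_ts` is a hypothetical stand-in for the expensive trap, and the `sketch_` functions are invented names. Only the bit position of TS in cr0 (bit 3) is taken as given.

```c
#include <assert.h>

#define X86_CR0_TS 0x00000008UL	/* Task Switched bit in cr0 */

static unsigned long cached_cr0;	/* last value given to sketch_write_cr0() */
static unsigned int trap_count;		/* how often the slow path was taken */

/* Stand-in for the expensive path: a trap into the hypervisor. */
static void hypervisor_set_ts(int ts)
{
	(void)ts;
	trap_count++;
}

/* Writes still reach the hypervisor, but the value is remembered. */
void sketch_write_cr0(unsigned long cr0)
{
	cached_cr0 = cr0;
	hypervisor_set_ts((cr0 & X86_CR0_TS) != 0);
}

/* Reads are answered from the cache, so they never trap. */
unsigned long sketch_read_cr0(void)
{
	return cached_cr0;
}

/* stts() as described: a read-modify-write that sets the TS bit. */
void sketch_stts(void)
{
	sketch_write_cr0(sketch_read_cr0() | X86_CR0_TS);
}

/* One stts() costs one trap (the write) instead of two. */
void cr0_cache_demo(void)
{
	unsigned int before = trap_count;

	sketch_stts();
	assert(sketch_read_cr0() & X86_CR0_TS);
	assert(trap_count == before + 1);
}
```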

a789ed5f

xen/x86-64: clean up warnings about IST-using traps · b80119bb

Authored by Jeremy Fitzhardinge on Apr 24, 2009

Ignore known IST-using traps. Aside from the debugger traps, they're
low-level faults which Xen will handle for us, so the kernel needn't
worry about them. Keep the warning in case an unknown trap starts using IST.

Impact: suppress spurious warnings
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

b80119bb

xen/x86-64: fix breakpoints and hardware watchpoints · 6cac5a92

Authored by Jeremy Fitzhardinge on Mar 29, 2009

Native x86-64 uses the IST mechanism to run int3 and debug traps on
an alternative stack.  Xen does not do this, and so the frames were
being misinterpreted by the ptrace code.  This change special-cases
these two exceptions by using Xen variants which run on the normal
kernel stack properly.

Impact: avoid crash or bad data when IST trap is invoked under Xen
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

6cac5a92

08 May, 2009 2 commits

xen: reserve Xen start_info rather than e820 reserving · 6b2e8523

Authored by Jeremy Fitzhardinge on May 07, 2009

Use reserve_early rather than e820 reservations for Xen start info and mfn->pfn
table, so that the memory use is a bit more self-documenting.

[ Impact: cleanup ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4A032EF1.6070708@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6b2e8523

x86: xen, i386: reserve Xen pagetables · 33df4db0

Authored by Jeremy Fitzhardinge on May 07, 2009

The Xen pagetables are no longer implicitly reserved as part of the other
i386_start_kernel reservations, so make sure we explicitly reserve them.
This prevents them from being released into the general kernel free page
pool and reused.

[ Impact: fix Xen guest crash ]
Also-Bisected-by: Bryan Donlan <bdonlan@gmail.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4A032EEC.30509@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

33df4db0

22 Apr, 2009 1 commit

clocksource: pass clocksource to read() callback · 8e19608e

Authored by Magnus Damm on Apr 21, 2009

Pass clocksource pointer to the read() callback for clocksources.  This
allows us to share the callback between multiple instances.

[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
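
Why passing the pointer allows sharing can be shown with a reduced model. The `struct clocksource` below keeps only the two fields needed for the illustration, and `struct demo_timer` with its helpers is hypothetical driver state invented for this sketch; none of it is the kernel's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cycle_t;

/* Reduced stand-in for the kernel's struct clocksource. */
struct clocksource {
	const char *name;
	cycle_t (*read)(struct clocksource *cs);
};

/* Hypothetical driver state: one per timer instance. */
struct demo_timer {
	cycle_t counter;		/* stands in for a hardware register */
	struct clocksource cs;
};

/* Recover the enclosing demo_timer from its embedded clocksource. */
static struct demo_timer *to_demo_timer(struct clocksource *cs)
{
	return (struct demo_timer *)((char *)cs - offsetof(struct demo_timer, cs));
}

/* One callback serves every instance, because it is told which one to read. */
cycle_t demo_timer_read(struct clocksource *cs)
{
	return to_demo_timer(cs)->counter;
}

/* Two instances, one shared read() implementation. */
void clocksource_demo(void)
{
	struct demo_timer a = { .counter = 100, .cs = { "timer-a", demo_timer_read } };
	struct demo_timer b = { .counter = 200, .cs = { "timer-b", demo_timer_read } };

	assert(a.cs.read(&a.cs) == 100);
	assert(b.cs.read(&b.cs) == 200);
}
```

Without the argument, each instance would need its own callback that knows where its counter lives.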

8e19608e

11 Apr, 2009 1 commit

x86: fix set_fixmap to use phys_addr_t · 9b987aeb

Authored by Masami Hiramatsu on Apr 09, 2009

Impact: fix kprobes crash on 32-bit with RAM above 4G

Use phys_addr_t for receiving a physical address argument
instead of unsigned long. This allows fixmap to handle
pages higher than 4GB on x86-32.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: systemtap-ml <systemtap@sources.redhat.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <49DE3695.6040800@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
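
The truncation this fixes is easy to reproduce outside the kernel. The sketch below assumes a 32-bit `unsigned long` (as on x86-32) and a 64-bit physical address type (as with PAE); the typedefs and function names are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ulong32_t;	/* models unsigned long on x86-32 */
typedef uint64_t phys_addr64_t;	/* models a 64-bit phys_addr_t */

#define SKETCH_PAGE_SHIFT 12	/* 4 KiB pages */

/* Old-style signature: the physical address is squeezed into 32 bits. */
uint64_t pfn_from_ulong(ulong32_t phys)
{
	return (uint64_t)phys >> SKETCH_PAGE_SHIFT;
}

/* New-style signature: the full physical address survives. */
uint64_t pfn_from_phys_addr(phys_addr64_t phys)
{
	return phys >> SKETCH_PAGE_SHIFT;
}

/* A page just above 4 GiB maps to the wrong frame via the old signature. */
void fixmap_truncation_demo(void)
{
	phys_addr64_t phys = 0x123456000ULL;

	assert(pfn_from_phys_addr(phys) == 0x123456);
	assert(pfn_from_ulong((ulong32_t)phys) == 0x23456);
}
```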

9b987aeb

10 Apr, 2009 2 commits

x86: fix set_fixmap to use phys_addr_t · 3b3809ac

Authored by Masami Hiramatsu on Apr 09, 2009

Use phys_addr_t for receiving a physical address argument instead of
unsigned long.  This allows fixmap to handle pages higher than 4GB on
x86-32.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3b3809ac

xen: add FIX_TEXT_POKE to fixmap · e7c06488

Authored by Jeremy Fitzhardinge on Mar 07, 2009

FIX_TEXT_POKE[01] are used to map kernel addresses, so they're mapping
pfns, not mfns.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

e7c06488

09 Apr, 2009 13 commits

xen: add FIX_TEXT_POKE to fixmap · 3ecb1b7d

Authored by Jeremy Fitzhardinge on Mar 07, 2009

FIX_TEXT_POKE[01] are used to map kernel addresses, so they're mapping
pfns, not mfns.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

3ecb1b7d

xen: clean up gate trap/interrupt constants · 2b2a7334

Authored by Jeremy Fitzhardinge on Mar 29, 2009

Use GATE_INTERRUPT/TRAP rather than 0xe/f.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

2b2a7334

xen: set _PAGE_NX in __supported_pte_mask before pagetable construction · bc6081ff

Authored by Jeremy Fitzhardinge on Mar 27, 2009

Some 64-bit machines don't support the NX flag in ptes.
Check for NX before constructing the kernel pagetables.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

bc6081ff

xen/mmu: weaken flush_tlb_other test · e3f8a74e

Authored by Jeremy Fitzhardinge on Mar 04, 2009

Impact: fixes crashing bug

There's no particular problem with getting an empty cpu mask,
so just shortcut-return if we get one.

Avoids crash reported by Christophe Saout <christophe@saout.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
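
The shortcut return described above might look roughly like this. It is a simplified model, not the function from arch/x86/xen/mmu.c: the CPU mask is reduced to a 64-bit bitmap, and `send_flush_request` is a hypothetical stand-in for whatever actually issues the remote flush.

```c
#include <assert.h>
#include <stdint.h>

static unsigned int flush_requests;	/* counts requests "sent" onwards */

/* Hypothetical stand-in for issuing the remote-flush request. */
static void send_flush_request(uint64_t cpu_mask)
{
	(void)cpu_mask;
	flush_requests++;
}

/* cpu_mask: one bit per CPU that should flush (simplified cpumask). */
void sketch_flush_tlb_others(uint64_t cpu_mask)
{
	if (cpu_mask == 0)
		return;		/* nobody to flush: not an error, just done */

	send_flush_request(cpu_mask);
}

/* An empty mask is a no-op; a non-empty one issues a single request. */
void flush_tlb_demo(void)
{
	unsigned int before = flush_requests;

	sketch_flush_tlb_others(0);
	assert(flush_requests == before);
	sketch_flush_tlb_others(0x6);	/* CPUs 1 and 2 */
	assert(flush_requests == before + 1);
}
```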

e3f8a74e

xen/mmu: some early pagetable cleanups · b96229b5

Authored by Jeremy Fitzhardinge on Mar 17, 2009

1. make sure early-allocated ptes are pinned, so they can be later
   unpinned
2. don't pin pmd+pud, just make them RO
3. scatter some __inits around
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

b96229b5

xen: mask XSAVE from cpuid · 191216b9

Authored by Jeremy Fitzhardinge on Mar 07, 2009

Xen leaves XSAVE set in cpuid, but doesn't allow cr4.OSXSAVE
to be set.  This confuses the kernel and it ends up crashing on
an xsetbv instruction.

At boot time, try to set cr4.OSXSAVE, and mask XSAVE out of
cpuid if we can't.  This will produce a spurious error from Xen,
but allows us to support XSAVE if/when Xen does.

This also factors out the cpuid mask decisions to boot time.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
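
The probe-then-mask logic can be modelled in user space as below. The bit positions (CPUID.1:ECX bit 26 for XSAVE, cr4 bit 18 for OSXSAVE) are architectural; everything else, including `try_set_osxsave` and the flag that simulates the hypervisor's answer, is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_1_ECX_XSAVE	(1u << 26)	/* CPUID.1:ECX.XSAVE */
#define CR4_OSXSAVE		(1ul << 18)	/* cr4.OSXSAVE */

static uint32_t cpuid_ecx_mask = ~0u;	/* bits allowed through to the kernel */

/*
 * Hypothetical probe: try to set cr4.OSXSAVE and report whether it stuck.
 * The hypervisor's behaviour is simulated by the first argument.
 */
static bool try_set_osxsave(bool hypervisor_allows, unsigned long *cr4)
{
	if (hypervisor_allows)
		*cr4 |= CR4_OSXSAVE;
	return (*cr4 & CR4_OSXSAVE) != 0;
}

/* Decide once, at boot, whether XSAVE has to be hidden. */
void sketch_init_cpuid_mask(bool hypervisor_allows)
{
	unsigned long cr4 = 0;

	cpuid_ecx_mask = ~0u;
	if (!try_set_osxsave(hypervisor_allows, &cr4))
		cpuid_ecx_mask &= ~CPUID_1_ECX_XSAVE;
}

/* Filter a raw CPUID.1:ECX value through the boot-time mask. */
uint32_t sketch_filter_cpuid_ecx(uint32_t raw_ecx)
{
	return raw_ecx & cpuid_ecx_mask;
}

/* XSAVE is hidden only when the probe fails. */
void xsave_mask_demo(void)
{
	sketch_init_cpuid_mask(false);
	assert((sketch_filter_cpuid_ecx(~0u) & CPUID_1_ECX_XSAVE) == 0);
	sketch_init_cpuid_mask(true);
	assert((sketch_filter_cpuid_ecx(~0u) & CPUID_1_ECX_XSAVE) != 0);
}
```

Because the decision is made once at boot, later cpuid calls only need to apply the mask.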

191216b9

NULL noise: arch/x86/xen/smp.c · 1207cf8e

Authored by Hannes Eder on Mar 05, 2009

Fix these sparse warnings:
  arch/x86/xen/smp.c:316:52: warning: Using plain integer as NULL pointer
  arch/x86/xen/smp.c:421:60: warning: Using plain integer as NULL pointer
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

1207cf8e

xen: remove xen_load_gdt debug · c667d5d6

Authored by Jeremy Fitzhardinge on Mar 04, 2009

Don't need the noise.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

c667d5d6

xen: make xen_load_gdt simpler · a957fac5

Authored by Jeremy Fitzhardinge on Mar 04, 2009

Remove use of multicall machinery which is unused (gdt loading
is never performance critical). This removes the implicit use
of percpu variables, which simplifies understanding how
the percpu code's use of load_gdt interacts with this code.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

a957fac5

xen: clean up xen_load_gdt · c7da8c82

Authored by Jeremy Fitzhardinge on Mar 04, 2009

Makes the logic a bit clearer.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

c7da8c82

xen: split construction of p2m mfn tables from registration · cdaead6b

Authored by Jeremy Fitzhardinge on Feb 27, 2009

Build the p2m_mfn_list_list early with the rest of the p2m table, but
register it later when the real shared_info structure is in place.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

cdaead6b

xen: separate p2m allocation from setting · e791ca0f

Authored by Jeremy Fitzhardinge on Feb 26, 2009

When doing very early p2m setting, we need to separate setting
from allocation, so split things up accordingly.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

e791ca0f

xen: disable preempt for leave_lazy_mmu · d6382bf7

Authored by Jeremy Fitzhardinge on Feb 20, 2009

xen_mc_flush() requires preemption to be disabled for its own sanity,
so disable it while we're flushing.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

d6382bf7

31 Mar, 2009 10 commits

xen: clean up gate trap/interrupt constants · 6d02c426

Authored by Jeremy Fitzhardinge on Mar 29, 2009

Use GATE_INTERRUPT/TRAP rather than 0xe/f.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

6d02c426

xen: set _PAGE_NX in __supported_pte_mask before pagetable construction · 707ebbc8

Authored by Jeremy Fitzhardinge on Mar 27, 2009

Some 64-bit machines don't support the NX flag in ptes.
Check for NX before constructing the kernel pagetables.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

707ebbc8

xen/mmu: weaken flush_tlb_other test · 8de07bbd

Authored by Jeremy Fitzhardinge on Mar 04, 2009

Impact: fixes crashing bug

There's no particular problem with getting an empty cpu mask,
so just shortcut-return if we get one.

Avoids crash reported by Christophe Saout <christophe@saout.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

8de07bbd

xen/mmu: some early pagetable cleanups · 4185f354

Authored by Jeremy Fitzhardinge on Mar 17, 2009

1. make sure early-allocated ptes are pinned, so they can be later
   unpinned
2. don't pin pmd+pud, just make them RO
3. scatter some __inits around
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

4185f354

xen: mask XSAVE from cpuid · e826fe1b

Authored by Jeremy Fitzhardinge on Mar 07, 2009

Xen leaves XSAVE set in cpuid, but doesn't allow cr4.OSXSAVE
to be set.  This confuses the kernel and it ends up crashing on
an xsetbv instruction.

At boot time, try to set cr4.OSXSAVE, and mask XSAVE out of
cpuid if we can't.  This will produce a spurious error from Xen,
but allows us to support XSAVE if/when Xen does.

This also factors out the cpuid mask decisions to boot time.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

e826fe1b

NULL noise: arch/x86/xen/smp.c · e9e2d1ff

Authored by Hannes Eder on Mar 05, 2009

Fix these sparse warnings:
  arch/x86/xen/smp.c:316:52: warning: Using plain integer as NULL pointer
  arch/x86/xen/smp.c:421:60: warning: Using plain integer as NULL pointer
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

e9e2d1ff

xen: remove xen_load_gdt debug · b4b7e585

Authored by Jeremy Fitzhardinge on Mar 04, 2009

Don't need the noise.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

b4b7e585

xen: make xen_load_gdt simpler · 3ce5fa7e

Authored by Jeremy Fitzhardinge on Mar 04, 2009

3ce5fa7e

xen: clean up xen_load_gdt · 6ed6bf42

Authored by Jeremy Fitzhardinge on Mar 04, 2009

Makes the logic a bit clearer.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

6ed6bf42

xen: split construction of p2m mfn tables from registration · 7571a604

Authored by Jeremy Fitzhardinge on Feb 27, 2009

7571a604

30 Mar, 2009 5 commits

xen: separate p2m allocation from setting · 59d71871

Authored by Jeremy Fitzhardinge on Feb 26, 2009

When doing very early p2m setting, we need to separate setting
from allocation, so split things up accordingly.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

59d71871

xen: disable preempt for leave_lazy_mmu · 5caecb94

Authored by Jeremy Fitzhardinge on Feb 20, 2009

xen_mc_flush() requires preemption to be disabled for its own sanity,
so disable it while we're flushing.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

5caecb94

x86/paravirt: allow preemption with lazy mmu mode · 2829b449

Authored by Jeremy Fitzhardinge on Feb 17, 2009

Impact: remove obsolete checks, simplification

Lift restrictions on preemption with lazy mmu mode, as it is now allowed.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

2829b449

x86/paravirt: finish change from lazy cpu to context switch start/end · 224101ed

Authored by Jeremy Fitzhardinge on Feb 18, 2009

Impact: fix lazy context switch API

Pass the previous and next tasks into the context switch start/end
calls, so that the called functions can properly access the
task state (esp in end_context_switch, in which the next task
is not yet completely current).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

224101ed

x86/paravirt: flush pending mmu updates on context switch · b407fc57

Authored by Jeremy Fitzhardinge on Feb 17, 2009

Impact: allow preemption during lazy mmu updates

If we're in lazy mmu mode when context switching, leave
lazy mmu mode, but remember the task's state in
TIF_LAZY_MMU_UPDATES.  When we resume the task, check this
flag and re-enter lazy mmu mode if it's set.

This sets things up for allowing lazy mmu mode while preemptible,
though that won't actually be active until the next change.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
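
The save-and-restore behaviour described in this message can be modelled as below. This is a user-space sketch only: the boolean field stands in for the TIF_LAZY_MMU_UPDATES thread flag, a single global stands in for per-CPU state, and all `sketch_` names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced task: only the flag this change cares about. */
struct task_sketch {
	bool lazy_mmu_updates;	/* stands in for TIF_LAZY_MMU_UPDATES */
};

static bool cpu_in_lazy_mmu;	/* is this CPU currently batching MMU updates? */

void sketch_enter_lazy_mmu(void)
{
	cpu_in_lazy_mmu = true;
}

void sketch_leave_lazy_mmu(void)
{
	cpu_in_lazy_mmu = false;	/* pending updates would be flushed here */
}

/* Switching away from prev: leave lazy mode, remember prev was batching. */
void sketch_start_context_switch(struct task_sketch *prev)
{
	if (cpu_in_lazy_mmu) {
		sketch_leave_lazy_mmu();
		prev->lazy_mmu_updates = true;
	}
}

/* Switching to next: re-enter lazy mode if next was interrupted mid-batch. */
void sketch_end_context_switch(struct task_sketch *next)
{
	if (next->lazy_mmu_updates) {
		next->lazy_mmu_updates = false;
		sketch_enter_lazy_mmu();
	}
}

/* Task a is preempted mid-batch by b, then resumed. */
void lazy_mmu_demo(void)
{
	struct task_sketch a = { false }, b = { false };

	sketch_enter_lazy_mmu();		/* a starts batching */
	sketch_start_context_switch(&a);	/* a -> b */
	sketch_end_context_switch(&b);
	assert(!cpu_in_lazy_mmu && a.lazy_mmu_updates);

	sketch_start_context_switch(&b);	/* b -> a */
	sketch_end_context_switch(&a);
	assert(cpu_in_lazy_mmu && !a.lazy_mmu_updates);

	sketch_leave_lazy_mmu();
}
```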

b407fc57