提交 · 791bad9d28d405d9397ea0c370ffb7c7bdd2aa6e · openanolis / cloud-kernel

31 1月, 2009 6 次提交

x86/paravirt: implement PVOP_CALL macros for callee-save functions · 791bad9d

由 Jeremy Fitzhardinge 提交于 1月 28, 2009

Impact: Optimization

Functions with the callee save calling convention clobber many fewer
registers than the normal C calling convention.  Implement variants of
PVOP_V?CALL* accordingly.  This only bothers with functions up to 3
args, since functions with more args may as well use the normal
calling convention.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

791bad9d

x86/paravirt: add register-saving thunks to reduce caller register pressure · ecb93d1c

由 Jeremy Fitzhardinge 提交于 1月 28, 2009

Impact: Optimization

One of the problems with inserting a pile of C calls where previously
there were none is that the register pressure is greatly increased.
The C calling convention says that the caller must expect a certain
set of registers may be trashed by the callee, and that the callee can
use those registers without restriction.  This includes the function
argument registers, and several others.

This patch seeks to alleviate this pressure by introducing wrapper
thunks that will do the register saving/restoring, so that the
callsite doesn't need to worry about it, but the callee function can
be conventional compiler-generated code.  In many cases (particularly
performance-sensitive cases) the callee will be in assembler anyway,
and need not use the compiler's calling convention.

Standard calling convention is:
	 arguments	    return	scratch
x86-32	 eax edx ecx	    eax		?
x86-64	 rdi rsi rdx rcx    rax		r8 r9 r10 r11

The thunk preserves all argument and scratch registers.  The return
register is not preserved, and is available as a scratch register for
unwrapped callee code (and of course the return value).

Wrapped function pointers are themselves wrapped in a struct
paravirt_callee_save structure, in order to get some warning from the
compiler when functions with mismatched calling conventions are used.

The most common paravirt ops, both statically and dynamically, are
interrupt enable/disable/save/restore, so handle them first.  This is
particularly easy since their calls are handled specially anyway.

XXX Deal with VMI.  What's their calling convention?
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

ecb93d1c

x86/paravirt: selectively save/restore regs around pvops calls · 9104a18d

由 Jeremy Fitzhardinge 提交于 1月 28, 2009

Impact: Optimization

Each asm paravirt-ops call says what registers are available for
clobbering.  This patch makes use of this to selectively save/restore
registers around each pvops call.  In many cases this significantly
shrinks code size.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

9104a18d

x86: fix paravirt clobber in entry_64.S · b8aa287f

由 Jeremy Fitzhardinge 提交于 1月 28, 2009

Impact: Fix latent bug

The clobber is trying to say that anything except RDI is available for
clobbering, but actually clobbers everything.  This hasn't mattered
because the clobbers were basically ignored, but subsequent patches
will rely on them.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

b8aa287f

x86/pvops: add a paravirt_ident functions to allow special patching · 41edafdb

由 Jeremy Fitzhardinge 提交于 1月 28, 2009

Impact: Optimization

Several paravirt ops implementations simply return their arguments,
the most obvious being the make_pte/pte_val class of operations on
native.

On 32-bit, the identity function is literally a no-op, as the calling
convention uses the same registers for the first argument and return.
On 64-bit, it can be implemented with a single "mov".

This patch adds special identity functions for 32 and 64 bit argument,
and machinery to recognize them and replace them with either nops or a
mov as appropriate.

At the moment, the only users for the identity functions are the
pagetable entry conversion functions.

The result is a measureable improvement on pagetable-heavy benchmarks
(2-3%, reducing the pvops overhead from 5 to 2%).
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

41edafdb

xen: move remaining mmu-related stuff into mmu.c · 319f3ba5

由 Jeremy Fitzhardinge 提交于 1月 28, 2009

Impact: Cleanup

Move remaining mmu-related stuff into mmu.c.
A general cleanup, and lay the groundwork for later patches.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

319f3ba5

30 1月, 2009 2 次提交

Documentation: move DMA-mapping.txt to Doc/PCI/ · 5872fb94

由 Randy Dunlap 提交于 1月 29, 2009

Move DMA-mapping.txt to Documentation/PCI/.

DMA-mapping.txt was supposed to be moved from Documentation/ to
Documentation/PCI/.  The 00-INDEX files in those two directories
were updated, along with a few other text files, but the file
itself somehow escaped being moved, so move it and update more
text files and source files with its new location.
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Acked-by: NGreg Kroah-Hartman <gregkh@suse.de>
cc:	Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5872fb94

lguest: typos fix · 72410af9

由 Atsushi SAKAI 提交于 1月 16, 2009

3 points

lguest_asm.S => i386_head.S
LHCALL_BREAK => LHREQ_BREAK
perferred    => preferred
Signed-off-by: NAtsushi SAKAI <sakaia@jp.fujitsu.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

72410af9

27 1月, 2009 15 次提交

T
x86: clean up indentation in setup_per_cpu_areas() · cf3997f5
由 Tejun Heo 提交于 1月 27, 2009
```
Impact: cosmetic cleanup
Signed-off-by: NTejun Heo <tj@kernel.org>
```
cf3997f5

x86: fix build breakage on voyage · 22f25138

由 James Bottomley 提交于 1月 27, 2009

Impact: build fix

x86_cpu_to_apicid and x86_bios_cpu_apicid aren't defined for voyage.
Earlier patch forgot to conditionalize early percpu clearing.  Fix it.
Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

22f25138

x86: load new GDT after setting up boot cpu per-cpu area · 2697fbd5

由 Brian Gerst 提交于 1月 27, 2009

Impact: sync 32 and 64-bit code

Merge load_gs_base() into switch_to_new_gdt().  Load the GDT and
per-cpu state for the boot cpu when its new area is set up.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

2697fbd5

x86: remove extra barriers from load_gs_base() · 1825b8ed

由 Brian Gerst 提交于 1月 27, 2009

Impact: optimization

mb() generates an mfence instruction, which is not needed here.  Only
a compiler barrier is needed, and that is handled by the memory clobber
in the wrmsrl function.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

1825b8ed

x86: initialize per-cpu GDT segment in per-cpu setup · b2d2f431

由 Brian Gerst 提交于 1月 27, 2009

Impact: cleanup

Rename init_gdt() to setup_percpu_segment(), and move it to
setup_percpu.c.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

b2d2f431

x86: make Voyager use x86 per-cpu setup. · 89c9c4c5

由 Brian Gerst 提交于 1月 27, 2009

Impact: standardize all x86 platforms on same setup code

With the preceding changes, Voyager can use the same per-cpu setup
code as all the other x86 platforms.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

89c9c4c5

x86: don't assume boot cpu is #0 · 34019be1

由 Brian Gerst 提交于 1月 27, 2009

Impact: minor cleanup
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

34019be1

x86: move this_cpu_offset · 1688401a

由 Brian Gerst 提交于 1月 27, 2009

Impact: Small cleanup

Define BOOT_PERCPU_OFFSET and use it for this_cpu_offset and
__per_cpu_offset initializers.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

1688401a

x86: only compile setup_percpu.o on SMP · 996db817

由 Brian Gerst 提交于 1月 27, 2009

Impact: Minor build optimization
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

996db817

x86: move apic variables to apic.c · ec70de8b

由 Brian Gerst 提交于 1月 27, 2009

Impact: Code movement

Move the variable definitions to apic.c.  Ifdef the copying of
the two early per-cpu variables, since Voyager doesn't use them.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

ec70de8b

x86: always page-align per-cpu area start and size · 74631a24

由 Brian Gerst 提交于 1月 27, 2009

Impact: cleanup

The way the code is written, align is always PAGE_SIZE.  Simplify
the code by removing the align variable.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

74631a24

x86: move setup_cpu_local_masks() · 2f2f52ba

由 Brian Gerst 提交于 1月 27, 2009

Impact: Code movement, no functional change.

Move setup_cpu_local_masks() to kernel/cpu/common.c, where the
masks are defined.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

2f2f52ba

x86: move 64-bit NUMA code · 6470aff6

由 Brian Gerst 提交于 1月 27, 2009

Impact: Code movement, no functional change.

Move the 64-bit NUMA code from setup_percpu.c to numa_64.c
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

6470aff6

x86: merge setup_per_cpu_maps() into setup_per_cpu_areas() · 0d77e7f0

由 Brian Gerst 提交于 1月 27, 2009

Impact: minor optimization

Eliminates the need for two loops over possible cpus.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

0d77e7f0

eeprom: More consistent symbol names · dd7f8dbe

由 Jean Delvare 提交于 1月 26, 2009

Now that all EEPROM drivers live in the same place, let's harmonize
their symbol names.

Also fix eeprom's dependencies, it definitely needs sysfs, and is no
longer experimental after many years in the kernel tree.
Signed-off-by: NJean Delvare <khali@linux-fr.org>
Acked-by: NWolfram Sang <w.sang@pengutronix.de>
Cc: David Brownell <dbrownell@users.sourceforge.net>

dd7f8dbe

26 1月, 2009 3 次提交

x86: fix section mismatch warning · 659d2618

由 Rakib Mullick 提交于 1月 24, 2009

Here function vmi_activate calls a init function activate_vmi , which
causes the following section mismatch warnings:

  LD      arch/x86/kernel/built-in.o
WARNING: arch/x86/kernel/built-in.o(.text+0x13ba9): Section mismatch
in reference from the function vmi_activate() to the function
.init.text:vmi_time_init()
The function vmi_activate() references
the function __init vmi_time_init().
This is often because vmi_activate lacks a __init
annotation or the annotation of vmi_time_init is wrong.

WARNING: arch/x86/kernel/built-in.o(.text+0x13bd1): Section mismatch
in reference from the function vmi_activate() to the function
.devinit.text:vmi_time_bsp_init()
The function vmi_activate() references
the function __devinit vmi_time_bsp_init().
This is often because vmi_activate lacks a __devinit
annotation or the annotation of vmi_time_bsp_init is wrong.

WARNING: arch/x86/kernel/built-in.o(.text+0x13bdb): Section mismatch
in reference from the function vmi_activate() to the function
.devinit.text:vmi_time_ap_init()
The function vmi_activate() references
the function __devinit vmi_time_ap_init().
This is often because vmi_activate lacks a __devinit
annotation or the annotation of vmi_time_ap_init is wrong.

Fix it by marking vmi_activate() as __init too.
Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

659d2618

x86: unmask CPUID levels on Intel CPUs, fix · 99fb4d34

由 Ingo Molnar 提交于 1月 26, 2009

Impact: fix boot hang on pre-model-15 Intel CPUs

rdmsrl_safe() does not work in very early bootup code yet, because we
dont have the pagefault handler installed yet so exception section
does not get parsed. rdmsr_safe() will just crash and hang the bootup.

So limit the MSR_IA32_MISC_ENABLE MSR read to those CPU types that
support it.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

99fb4d34

x86: work around PAGE_KERNEL_WC not getting WC in iomap_atomic_prot_pfn. · ef5fa0ab

由 Eric Anholt 提交于 1月 23, 2009

In the absence of PAT, PAGE_KERNEL_WC ends up mapping to a memory type that
gets UC behavior even in the presence of a WC MTRR covering the area in
question.  By swapping to PAGE_KERNEL_UC_MINUS, we can get the actual
behavior the caller wanted (WC if you can manage it, UC otherwise).

This recovers the 40% performance improvement of using WC in the DRM
to upload vertex data.
Signed-off-by: NEric Anholt <eric@anholt.net>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

ef5fa0ab

25 1月, 2009 1 次提交

x86: use standard PIT frequency · e1b4d114

由 Ingo Molnar 提交于 1月 25, 2009

the RDC and ELAN platforms use slighly different PIT clocks, resulting in
a timex.h hack that changes PIT_TICK_RATE during build time. But if a
tester enables any of these platform support .config options, the PIT
will be miscalibrated on standard PC platforms.

So use one frequency - in a subsequent patch we'll add a quirk to allow
x86 platforms to define different PIT frequencies.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e1b4d114

24 1月, 2009 1 次提交

x86, mm: fix pte_free() · 42ef73fe

由 Peter Zijlstra 提交于 1月 23, 2009

On -rt we were seeing spurious bad page states like:

Bad page state in process 'firefox'
page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
[<c043d0f3>] ? printk+0x14/0x19
[<c0272d4e>] bad_page+0x4e/0x79
[<c0273831>] free_hot_cold_page+0x5b/0x1d3
[<c02739f6>] free_hot_page+0xf/0x11
[<c0273a18>] __free_pages+0x20/0x2b
[<c027d170>] __pte_alloc+0x87/0x91
[<c027d25e>] handle_mm_fault+0xe4/0x733
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c0218875>] do_page_fault+0x36f/0x88a

This is the case where a concurrent fault already installed the PTE and
we get to free the newly allocated one.

This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
which is overlaid with the {private, mapping} struct.

union {
    struct {
        unsigned long private;
        struct address_space *mapping;
    };
    spinlock_t ptl;
    struct kmem_cache *slab;
    struct page *first_page;
};

Normally the spinlock is small enough to not stomp on page->mapping, but
PREEMPT_RT=y has huge 'spin'locks.

But lockdep kernels should also be able to trigger this splat, as the
lock tracking code grows the spinlock to cover page->mapping.

The obvious fix is calling pgtable_page_dtor() like the regular pte free
path __pte_free_tlb() does.

It seems all architectures except x86 and nm10300 already do this, and
nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
doesn't do SMP or simply doesnt do MMU at all or something.
Signed-off-by: NPeter Zijlstra <a.p.zijlsta@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>

42ef73fe

23 1月, 2009 8 次提交

x86, xen: fix hardirq.h merge fallout · 99d0000f

由 Ingo Molnar 提交于 1月 23, 2009

Impact: build fix

This build error:

 arch/x86/xen/suspend.c:22: error: implicit declaration of function 'fix_to_virt'
 arch/x86/xen/suspend.c:22: error: 'FIX_PARAVIRT_BOOTMAP' undeclared (first use in this function)
 arch/x86/xen/suspend.c:22: error: (Each undeclared identifier is reported only once
 arch/x86/xen/suspend.c:22: error: for each function it appears in.)

triggers because the hardirq.h unification removed an implicit fixmap.h
include - on which arch/x86/xen/suspend.c depended. Add the fixmap.h
include explicitly.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

99d0000f

x86: make irq_cpustat_t fields conditional · 2de3a5f7

由 Brian Gerst 提交于 1月 23, 2009

Impact: shrink size of irq_cpustat_t when possible
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

2de3a5f7

x86: merge hardirq_{32,64}.h into hardirq.h · 22da7b3d

由 Brian Gerst 提交于 1月 23, 2009

Impact: cleanup
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

22da7b3d

x86: sync hardirq_{32,64}.h · 658a9a2c

由 Brian Gerst 提交于 1月 23, 2009

Impact: better code generation and removal of unused field for 32bit

In general, use the 64-bit version.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

658a9a2c

x86: remove include of apic.h from hardirq_64.h · 3819cd48

由 Brian Gerst 提交于 1月 23, 2009

Impact: cleanup

APIC definitions aren't needed here.  Remove the include and fix
up the fallout.

tj: added include to mce_intel_64.c.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

3819cd48

x86: remove idle_timestamp from 32bit irq_cpustat_t · 03d2989d

由 Brian Gerst 提交于 1月 23, 2009

Impact: bogus irq_cpustat field removed

idle_timestamp is left over from the removed irqbalance code.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

03d2989d

x86: add pte_set_flags/clear_flags for pte flag manipulation · 6522869c

由 Jeremy Fitzhardinge 提交于 1月 22, 2009

It's not necessary to deconstruct and reconstruct a pte every time its
flags are being updated. Introduce pte_set_flags and pte_clear_flags
to set and clear flags in a pte. This allows the flag manipulation
code to be inlined, and avoids calls via paravirt-ops.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

6522869c

x86/pvops: remove pte_flags pvop · ab897d20

由 Jeremy Fitzhardinge 提交于 1月 22, 2009

pte_flags() was introduced as a new pvop in order to extract just the
flags portion of a pte, which is a potentially cheaper operation than
extracting the page number as well.  It turns out this operation is
not needed, because simply using a mask to extract the flags from a
pte is sufficient for all current users.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ab897d20

22 1月, 2009 4 次提交

debugobjects: add and use INIT_WORK_ON_STACK · 336f6c32

由 Thomas Gleixner 提交于 1月 22, 2009

Impact: Fix debugobjects warning

debugobject enabled kernels spit out a warning in hpet code due to a
workqueue which is initialized on stack.

Add INIT_WORK_ON_STACK() which calls init_timer_on_stack() and use it
in hpet.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

336f6c32

x86: unmask CPUID levels on Intel CPUs · 066941bd

由 H. Peter Anvin 提交于 1月 21, 2009

Impact: Fixes crashes with misconfigured BIOSes on XSAVE hardware

Avuton Olrich reported early boot crashes with v2.6.28 and
bisected it down to dc1e35c6
("x86, xsave: enable xsave/xrstor on cpus with xsave support").

If the CPUID limit bit in MSR_IA32_MISC_ENABLE is set, clear it to
make all CPUID information available.  This is required for some
features to work, in particular XSAVE.
Reported-and-bisected-by: NAvuton Olrich <avuton@gmail.com>
Tested-by: NAvuton Olrich <avuton@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

066941bd

x86: add MSR_IA32_MISC_ENABLE bits to <asm/msr-index.h> · bdf21a49

由 H. Peter Anvin 提交于 1月 21, 2009

Impact: None (new bit definitions currently unused)

Add bit definitions for the MSR_IA32_MISC_ENABLE MSRs to
<asm/msr-index.h>.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

bdf21a49

x86: fix PTE corruption issue while mapping RAM using /dev/mem · 95971342

由 Suresh Siddha 提交于 1月 13, 2009

Beschorner Daniel reported:
> hwinfo problem since 2.6.28, showing this in the oops:
>	Corrupted page table at address 7fd04de3ec00

Also, PaX Team reported a regression with this commit:

>	commit 9542ada8
>	Author: Suresh Siddha <suresh.b.siddha@intel.com>
>	Date:   Wed Sep 24 08:53:33 2008 -0700
>
>	    x86: track memtype for RAM in page struct

This commit breaks mapping any RAM page through /dev/mem, as the
reserve_memtype() was not initializing the return attribute type and as such
corrupting the PTE entry that was setup with the return attribute type.

Because of this bug, application mapping this RAM page through /dev/mem
will die with "Corrupted page table at address xxxx" message in the kernel
log and also the kernel identity mapping which maps the underlying RAM
page gets converted to UC.

Fix this by initializing the return attribute type before calling
reserve_ram_pages_type()
Reported-by: NPaX Team <pageexec@freemail.hu>
Reported-and-tested-by: NBeschorner Daniel <Daniel.Beschorner@facton.com>
Tested-and-Acked-by: NPaX Team <pageexec@freemail.hu>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

95971342

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功