Commits · 9af45651f1f7c89942e016a1a00a7ebddfa727f8 · gsplhtlxg / clone-Linux

18 Jan, 2009 · 6 commits

x86-64: Move kernelstack from PDA to per-cpu. · 9af45651

Committed by Brian Gerst on Jan 19, 2009

Also clean up PER_CPU_VAR usage in xen-asm_64.S

tj: * remove now unused stack_thread_info()
    * s/kernelstack/kernel_stack/
    * added FIXME comment in xen-asm_64.S

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit. · c6f5e0ac

Committed by Brian Gerst on Jan 19, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit. · ea927906

Committed by Brian Gerst on Jan 19, 2009

tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

x86-64: Convert irqstacks to per-cpu · 26f80bd6

Committed by Brian Gerst on Jan 19, 2009

Move the irqstackptr variable from the PDA to per-cpu.  Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a separate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.

tj: * sprinkle some underbars around.

    * irq_stack_ptr is not used till traps_init(), no reason to
      initialize it early.  On SMP, just leaving it NULL till proper
      initialization in setup_per_cpu_areas() works.  Dropped
      is_boot_cpu and early irq_stack_ptr initialization.

    * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
      instead of (char, irq_stack[IRQ_STACK_SIZE]).

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit. · 9eb912d1

Committed by Brian Gerst on Jan 19, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit. · 1b437c8c

Committed by Brian Gerst on Jan 19, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

16 Jan, 2009 · 10 commits

percpu: add optimized generic percpu accessors · 6dbde353

Committed by Ingo Molnar on Jan 15, 2009

It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>

x86: misc clean up after the percpu update · 004aa322

Committed by Tejun Heo on Jan 13, 2009

Do the following cleanups:

* kill x86_64_init_pda() which now is equivalent to pda_init()

* use per_cpu_offset() instead of cpu_pda() when initializing
  initial_gs

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: convert pda ops to wrappers around x86 percpu accessors · 49357d19

Committed by Tejun Heo on Jan 13, 2009

pda is now a percpu variable and there's no reason it can't use plain
x86 percpu accessors.  Add x86_test_and_clear_bit_percpu() and replace
pda op implementations with wrappers around x86 percpu accessors.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: make pda a percpu variable · b12d8db8

Committed by Tejun Heo on Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

As pda is now allocated in percpu area, it can easily be made a proper
percpu variable.  Make it so by defining per cpu symbol from linker
script and declaring it in C code for SMP and simply defining it for
UP.  This change cleans up code and brings SMP and UP closer a bit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: merge 64 and 32 SMP percpu handling · 9939ddaf

Committed by Tejun Heo on Jan 13, 2009

Now that pda is allocated as part of percpu, percpu doesn't need to be
accessed through pda.  Unify x86_64 SMP percpu access with x86_32 SMP
one.  Other than the segment register, operand size and the base of
percpu symbols, they behave identical now.

This patch replaces now unnecessary pda->data_offset with a dummy
field which is necessary to keep stack_canary at its place.  This
patch also moves per_cpu_offset initialization out of init_gdt() into
setup_per_cpu_areas().  Note that this change also necessitates
explicit per_cpu_offset initializations in voyager_smp.c.

With this change, x86_OP_percpu()'s are as efficient on x86_64 as on
x86_32 and also x86_64 can use assembly PER_CPU macros.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fold pda into percpu area on SMP · 1a51e3a0

Committed by Tejun Heo on Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately.  %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40.  To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.

After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda.  This means that setup_per_cpu_areas() needs
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().

This patch also removes now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: use static _cpu_pda array · c8f3329a

Committed by Tejun Heo on Jan 13, 2009

_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: load pointer to pda into %gs while brining up a CPU · f32ff538

Committed by Tejun Heo on Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

CPU startup code in head_64.S loaded address of a zero page into %gs
for temporary use till pda is loaded but address to the actual pda is
available at the point.  Load the real address directly instead.

This will help unifying percpu and pda handling later on.

This patch is mostly taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>

x86: make early_per_cpu() a lvalue and use it · f10fcd47

Committed by Tejun Heo on Jan 13, 2009

Make early_per_cpu() a lvalue as per_cpu() is and use it where
applicable.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fix pda_to_op() · 7de6883f

Committed by Tejun Heo on Jan 13, 2009

There's no instruction to move a 64bit immediate into memory location.
Drop "i".

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

14 Jan, 2009 · 5 commits

x86: make 32bit MAX_HARDIRQS_PER_CPU to be NR_VECTORS · b6659679

Committed by Yinghai Lu on Jan 12, 2009

Impact: clean up to be same as 64bit

32-bit is using per-cpu vector too, so don't use default NR_IRQS.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: replacing mp_config_intsrc with mpc_intsrc · c2c21745

Committed by Jaswinder Singh Rajput on Jan 12, 2009

Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: replacing mp_config_ioapic with mpc_ioapic · b5ba7e6d

Committed by Jaswinder Singh Rajput on Jan 12, 2009

Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86 PAT: consolidate old memtype new memtype check into a function · afc7d20c

Committed by venkatesh.pallipadi@intel.com on Jan 09, 2009

Impact: cleanup

Move the new memtype old memtype allowed check to header so that it can be
shared by other users. Subsequent patch uses this in pat.c in remap_pfn_range()
code path. No functionality change in this patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, generic: mark complex bitops.h inlines as __always_inline · c8399943

Committed by Andi Kleen on Jan 12, 2009

Impact: reduce kernel image size

Hugh Dickins noticed that older gcc versions when the kernel
is built for code size didn't inline some of the bitops.

Mark all complex x86 bitops that have more than a single
asm statement or two as always inline to avoid this problem.

Probably should be done for other architectures too.

Ingo then found a better fix that only requires
a single line change, but it unfortunately only
works on gcc 4.3.

On older gccs the original patch still makes a ~0.3% defconfig
difference with CONFIG_OPTIMIZE_INLINING=y.

With gcc 4.1 and a defconfig like build:

       text    data     bss     dec     hex filename
    6116998 1138540  883788 8139326  7c323e vmlinux-oi-with-patch
    6137043 1138540  883788 8159371  7c808b vmlinux-optimize-inlining

~20k / 0.3% difference.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

13 Jan, 2009 · 2 commits

x86: arch_probe_nr_irqs · 4a046d17

Committed by Yinghai Lu on Jan 12, 2009

Impact: save RAM with large NR_CPUS, get smaller nr_irqs

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Mike Travis <travis@sgi.com>

x86: fix apic.c build error on latest git · 2bc13797

Committed by Jaswinder Singh Rajput on Jan 11, 2009

Fix this by reintroducing asm/smp.h include in apic.c - later on
I will fix this by removing non-smp data from smp.h

Also fix the __inquire_remote_apic() prototype/inline.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

12 Jan, 2009 · 3 commits

cpumask, irq: non-x86 build failures · 92296c6d

Committed by Mike Travis on Jan 11, 2009

Ingo Molnar wrote:

> All non-x86 architectures fail to build:
>
> In file included from /home/mingo/tip/include/linux/random.h:11,
>                  from /home/mingo/tip/include/linux/stackprotector.h:6,
>                  from /home/mingo/tip/init/main.c:17:
> /home/mingo/tip/include/linux/irqnr.h:26:63: error: asm/irq_vectors.h: No such file or directory

Do not include asm/irq_vectors.h in generic code - it's not available
on all architectures.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

irq: initialize nr_irqs based on nr_cpu_ids · 9332fccd

Committed by Mike Travis on Jan 10, 2009

Impact: Reduce memory usage.

This is the second half of the changes to make the irq_desc_ptrs be
variable sized based on nr_cpu_ids.  This is done by adding a new
"max_nr_irqs" macro to irq_vectors.h (and a dummy in irqnr.h) to
return a max NR_IRQS value based on NR_CPUS or nr_cpu_ids.

This necessitated moving the define of MAX_IO_APICS to a separate
file (asm/apicnum.h) so it could be included without the baggage
of the other asm/apicdef.h declarations.

Signed-off-by: Mike Travis <travis@sgi.com>

x86: change flush_tlb_others to take a const struct cpumask · 4595f962

Committed by Rusty Russell on Jan 10, 2009

Impact: reduce stack usage, use new cpumask API.

This is made a little more tricky by uv_flush_tlb_others which
actually alters its argument, for an IPI to be sent to the remaining
cpus in the mask.

I solve this by allocating a cpumask_var_t for this case and falling back
to IPI should this fail.

To eliminate temporaries in the caller, all flush_tlb_others implementations
now do the this-cpu-elimination step themselves.

Note also the curious "cpus_or(f->flush_cpumask, cpumask, f->flush_cpumask)"
which has been there since pre-git and yet f->flush_cpumask is always zero
at this point.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>

11 Jan, 2009 · 4 commits

x86: smp.h move cpu_sibling_setup_mask and cpu_sibling_setup_map declartion to cpumask.h · 52811d8c

Committed by Jaswinder Singh Rajput on Jan 10, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move cpu_initialized_mask and cpu_initialized declartion to cpumask.h · 493f6ca5

Committed by Jaswinder Singh Rajput on Jan 10, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move cpu_callout_mask and cpu_callout_map declartion to cpumask.h · fb8fd077

Committed by Jaswinder Singh Rajput on Jan 10, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move cpu_callin_mask and cpu_callin_map declartion to cpumask.h · 06879033

Committed by Jaswinder Singh Rajput on Jan 10, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

10 Jan, 2009 · 1 commit

x86: make 'constant_test_bit()' take an unsigned bit number · c4295fbb

Committed by Linus Torvalds on Jan 09, 2009

Ingo noticed that using signed arithmetic seems to confuse the gcc
inliner, and make it potentially decide that it's all too complicated.

(Yeah, yeah, it's a constant. It's always positive. Still..)

Based-on: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

08 Jan, 2009 · 7 commits

x86, mtrr: fix types used in userspace exported header · 79f3b3cb

Committed by Kyle McMartin on Jan 08, 2009

Commit 932d27a7 exported some mtrr
structures without using the exportable __uX types, causing userspace
build failures.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: rename all fields of mpf_intel mpf_X to X · 1eb1b3b6

Committed by Jaswinder Singh Rajput on Jan 08, 2009

Impact: cleanup, solve 80 columns wrap problems

It would be cleaner to rename all the mpf->mpf_X fields to
mpf->X - that alone would give 4 characters per usage site.
(we already know that it's an 'mpf' entity -
no need to duplicate that in the field too)

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: rename intel_mp_floating to mpf_intel · 41401db6

Committed by Jaswinder Singh Rajput on Jan 08, 2009

Impact: cleanup, solve 80 columns wrap problems

intel_mp_floating should be renamed to mpf_intel.

The reason: the 'f' in MPF already means 'floating'
which means MP Floating pointer structure -
no need to repeat that in the type name.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move boot_cpu_id declartion to cpu.h · 6d652ea1

Committed by Jaswinder Singh Rajput on Jan 07, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move cpu_physical_id declartion to cpu.h · af8968ab

Committed by Jaswinder Singh Rajput on Jan 07, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move safe_smp_processor_id declartion to cpu.h · 96b89dc6

Committed by Jaswinder Singh Rajput on Jan 07, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move stack_processor_id declartion to cpu.h · f472cdba

Committed by Jaswinder Singh Rajput on Jan 07, 2009

Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

07 Jan, 2009 · 2 commits

x86: smp.h move prefill_possible_map declartion to cpu.h · 6e5385d4

Committed by Jaswinder Singh Rajput on Jan 07, 2009

Impact: cleanup, moving NON-SMP stuff from smp.h

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: smp.h move zap_low_mappings declartion to tlbflush.h · dacf7333

Committed by Jaswinder Singh Rajput on Jan 07, 2009

Impact: cleanup, moving NON-SMP stuff from smp.h

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>