Commits · 02cf94c370e0dc9bf408fe45eb86fe9ad58eaf7f · openeuler / raspberrypi-kernel

21 Jan, 2009 9 commits

x86: make x86_32 use tlb_64.c · 02cf94c3

Committed by Tejun Heo on Jan 21, 2009

Impact: less contention when issuing invalidate IPI, cleanup

Make x86_32 use the same tlb code as 64bit.  The 64bit code uses
multiple IPI vectors for tlb shootdown to reduce contention.  This
patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the
code paths.

Note that the usage of asmlinkage is inconsistent for x86_32 and 64
and calls for further cleanup.  This has been noted with a FIXME
comment in tlb_64.c.
Signed-off-by: Tejun Heo <tj@kernel.org>

02cf94c3
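
The contention argument in the commit above can be illustrated with a small user-space sketch. It assumes the sender CPU picks one of the 8 flush slots by taking its CPU number modulo the number of vectors; the helper names and the vector base `0xf0` are placeholders for illustration, not values taken from this commit.

```c
#include <assert.h>

/* Illustrative values only; the real vector numbers live in the x86 headers. */
#define NUM_INVALIDATE_TLB_VECTORS  8
#define INVALIDATE_TLB_VECTOR_START 0xf0 /* assumed base vector */

/* Map a sender CPU to one of the 8 flush slots. CPUs that land in
 * different slots do not contend on the same flush state. */
static unsigned int tlb_flush_slot(unsigned int sender_cpu)
{
    return sender_cpu % NUM_INVALIDATE_TLB_VECTORS;
}

/* IPI vector the sender would use for its shootdown request. */
static unsigned int tlb_flush_vector(unsigned int sender_cpu)
{
    unsigned int vector = INVALIDATE_TLB_VECTOR_START + tlb_flush_slot(sender_cpu);

    assert(vector < INVALIDATE_TLB_VECTOR_START + NUM_INVALIDATE_TLB_VECTORS);
    return vector;
}
```

With a single vector every sender would serialize on one slot; with 8 slots, only senders whose CPU numbers are congruent modulo 8 share one.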

x86: prepare for tlb merge · 6dd01bed

Committed by Tejun Heo on Jan 21, 2009

Impact: clean up, ipi vector number reordering for x86_32

Make the following changes to prepare for tlb merge.

* reorder x86_32 ipi vectors

* adjust tlb_32.c and tlb_64.c such that their logic coincides exactly
  - on spurious invalidate ipi, tlb_32 acks the irq
  - tlb_64 now has proper memory barriers around clearing
    flush_cpumask (no change in generated code)

* unexport flush_tlb_page from tlb_32.c, there's no user

* use unsigned int for cpu id

* drop unnecessary includes from tlb_64.c
Signed-off-by: Tejun Heo <tj@kernel.org>

6dd01bed

x86: uv cleanup · bdbcdd48

Committed by Tejun Heo on Jan 21, 2009

Impact: cleanup

Make the following uv related cleanups.

* collect visible uv related definitions and interfaces into uv/uv.h
  and use it.  this cleans up the messy situation where on 64bit, uv
  is defined properly, on 32bit generic it's dummy and on the rest
  undefined.  after this clean up, uv is defined on 64 and dummy on
  32.

* update uv_flush_tlb_others() such that it takes cpumask of
  to-be-flushed cpus as argument, instead of that minus self, and
  returns yet-to-be-flushed cpumask, instead of modifying the passed
  in parameter.  this interface change will ease dummy implementation
  of uv_flush_tlb_others() and makes uv tlb flush related stuff
  defined in tlb_uv proper.
Signed-off-by: Tejun Heo <tj@kernel.org>

bdbcdd48
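
The interface change described for uv_flush_tlb_others() can be sketched as follows. This is a user-space illustration under stated assumptions: `cpumask_t` is reduced to a 64-bit bitmask, `FAST_PATH_CPUS` is an invented stand-in for "CPUs the special flush path can reach", and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t cpumask_t; /* hypothetical: one bit per CPU, at most 64 CPUs */

/* Hypothetical: pretend the fast path can only reach CPUs 0-3. */
#define FAST_PATH_CPUS ((cpumask_t)0x0f)

/* Takes the full set of CPUs to flush and returns the CPUs that still
 * need flushing by another mechanism. The caller's mask is not modified. */
static cpumask_t flush_tlb_others_sketch(cpumask_t to_flush, unsigned int self)
{
    assert(self < 64);
    cpumask_t others = to_flush & ~((cpumask_t)1 << self);

    return others & ~FAST_PATH_CPUS;
}

/* Dummy variant for builds without the fast path: nothing was handled,
 * so the whole mask is handed back to the caller. */
static cpumask_t flush_tlb_others_dummy(cpumask_t to_flush, unsigned int self)
{
    (void)self;
    return to_flush;
}
```

Returning the remaining mask instead of editing the argument is what makes the dummy version a one-liner.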

x86: merge irq_regs.h · d650a514

Committed by Brian Gerst on Jan 21, 2009

Impact: cleanup, better irq_regs code generation for x86_64

Make 64-bit use the same optimizations as 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

d650a514

x86: merge mmu_context.h · 6826c8ff

Committed by Brian Gerst on Jan 21, 2009

Impact: cleanup

tj: * changed cpu to unsigned as was done on mmu_context_64.h as cpu
      id is officially unsigned int
    * added missing ';' to 32bit version of deactivate_mm()
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

6826c8ff

x86: set %fs to __KERNEL_PERCPU unconditionally for x86_32 · 0dd76d73

Committed by Brian Gerst on Jan 21, 2009

Impact: cleanup

%fs is currently set to __KERNEL_DS at boot, and conditionally
switched to __KERNEL_PERCPU for secondary cpus.  Instead, initialize
GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and
set %fs to __KERNEL_PERCPU unconditionally.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

0dd76d73

x86: fix percpu_write with 64-bit constants · 299e2699

Committed by Brian Gerst on Jan 21, 2009

Impact: slightly better code generation for percpu_to_op()

The processor will sign-extend 32-bit immediate values in 64-bit
operations.  Use the 'e' constraint ("32-bit signed integer constant,
or a symbolic reference known to fit that range") for 64-bit constants.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

299e2699
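
The sign-extension rule behind the 'e' constraint can be checked in plain C. The sketch below is not kernel code; the helper names are invented for illustration. It models which 64-bit constants can be encoded as a sign-extended 32-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v lies in the range of a signed 32-bit immediate. */
static int fits_simm32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* What a 64-bit operation sees when given a 32-bit immediate:
 * bit 31 is replicated into bits 32..63. */
static int64_t sign_extend32(uint32_t imm)
{
    return (imm & 0x80000000u) ? (int64_t)imm - 4294967296LL : (int64_t)imm;
}

/* Encode a constant as a 32-bit immediate and decode it again.
 * Only valid for constants that fit, hence the assertion. */
static int64_t imm32_round_trip(int64_t v)
{
    assert(fits_simm32(v));
    return sign_extend32((uint32_t)v);
}
```

A constant such as 0x80000000 does not fit: as a 32-bit immediate it would be sign-extended to a negative 64-bit value, so it has to be loaded another way.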

x86: clean up gdt_page definition · 06deef89

Committed by Brian Gerst on Jan 21, 2009

Impact: cleanup && more compact percpu area layout with future changes

Move 64-bit GDT to page-aligned section and clean up comment
formatting.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

06deef89

x86: update canary handling during switch · 67e68bde

Committed by Tejun Heo on Jan 21, 2009

Impact: cleanup

In switch_to(), instead of taking offset to irq_stack_union.stack,
make it a proper percpu access using __percpu_arg() and per_cpu_var().
Signed-off-by: Tejun Heo <tj@kernel.org>

67e68bde

20 Jan, 2009 7 commits

x86: remove pda.h · 0d974d45

Committed by Brian Gerst on Jan 18, 2009

Impact: cleanup
Signed-off-by: Brian Gerst <brgerst@gmail.com>

0d974d45

x86: move stack_canary into irq_stack · 947e76cd

Committed by Brian Gerst on Jan 19, 2009

Impact: x86_64 percpu area layout change, irq_stack now at the beginning

Now that the PDA is empty except for the stack canary, it can be removed.
The irqstack is moved to the start of the per-cpu section.  If the stack
protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.

tj: * updated subject
    * dropped asm relocation of irq_stack_ptr
    * updated comments a bit
    * rebased on top of stack canary changes
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

947e76cd
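
The overlap between the canary and the bottom 48 bytes of the irq stack can be pictured as a union. The sketch below is an approximation for illustration: the type and member names are invented, `IRQ_STACK_SIZE` is an assumed value, and a named inner struct is used for portability. The offset 40 is the %gs:40 location that the stack protector expects, as mentioned elsewhere in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IRQ_STACK_SIZE 16384 /* assumed size, for illustration only */

/* The canary overlays the lowest bytes of the irq stack. Placed at the
 * start of the per-cpu area, the canary ends up at offset 40. */
union irq_stack_union_sketch {
    char irq_stack[IRQ_STACK_SIZE];
    struct {
        char gs_base[40];      /* padding up to offset 40 */
        uint64_t stack_canary; /* occupies bytes 40..47 */
    } overlay;
};

static size_t canary_offset(void)
{
    return offsetof(union irq_stack_union_sketch, overlay.stack_canary);
}

static size_t overlay_size(void)
{
    return sizeof(((union irq_stack_union_sketch *)0)->overlay);
}

/* Layout check in the spirit of the offset check the kernel performs. */
static void check_canary_layout(void)
{
    assert(canary_offset() == 40);
    assert(overlay_size() == 48);
}
```

Because the stack grows down from the top of `irq_stack`, the overlay at the bottom is only touched if the stack is nearly exhausted.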

x86: rework __per_cpu_load adjustments · 8c7e58e6

Committed by Brian Gerst on Jan 19, 2009

Impact: cleanup

Use cpu_number to determine if the adjustment is necessary.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

8c7e58e6

x86: remove pda_init() · 8ce03197

Committed by Brian Gerst on Jan 19, 2009

Impact: cleanup

Copy the code to cpu_init() to satisfy the requirement that the cpu
be reinitialized.  Remove all other calls, since the segments are
already initialized in head_64.S.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

8ce03197

x86: conditionalize stack canary handling in hot path · b4a8f7a2

Committed by Tejun Heo on Jan 20, 2009

Impact: no unnecessary stack canary swapping during context switch

There's no point in moving stack_canary around during context switch
if it's not enabled.  Conditionalize it.
Signed-off-by: Tejun Heo <tj@kernel.org>

b4a8f7a2

x86: cleanup stack protector · c6e50f93

Committed by Tejun Heo on Jan 20, 2009

Impact: cleanup

Make the following cleanups.

* remove duplicate comment from boot_init_stack_canary() which fits
  better in the other place - cpu_idle().

* move stack_canary offset check from __switch_to() to
  boot_init_stack_canary().
Signed-off-by: Tejun Heo <tj@kernel.org>

c6e50f93

x86: fully honor "nolapic", fix · 5cdc5e9e

Committed by Ingo Molnar on Jan 19, 2009

Impact: build fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5cdc5e9e

18 Jan, 2009 12 commits

x86-64: Use absolute displacements for per-cpu accesses. · 87b26406

Committed by Brian Gerst on Jan 19, 2009

Accessing memory through %gs should not use rip-relative addressing.
Adding a P prefix for the argument tells gcc to not add (%rip) to
the memory references.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

87b26406

x86-64: Move isidle from PDA to per-cpu. · c2558e0e

Committed by Brian Gerst on Jan 19, 2009

tj: s/isidle/is_idle/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

c2558e0e

x86-64: Move nodenumber from PDA to per-cpu. · e7a22c1e

Committed by Brian Gerst on Jan 19, 2009

tj: * s/nodenumber/node_number/
    * removed now unused pda variable from pda_init()
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

e7a22c1e

x86-64: Move irqcount from PDA to per-cpu. · 56895530

Committed by Brian Gerst on Jan 19, 2009

tj: s/irqcount/irq_count/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

56895530

x86-64: Move oldrsp from PDA to per-cpu. · 3d1e42a7

Committed by Brian Gerst on Jan 19, 2009

tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda
      is still referenced in the file
    * s/oldrsp/old_rsp/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

3d1e42a7

x86-64: Move kernelstack from PDA to per-cpu. · 9af45651

Committed by Brian Gerst on Jan 19, 2009

Also clean up PER_CPU_VAR usage in xen-asm_64.S

tj: * remove now unused stack_thread_info()
    * s/kernelstack/kernel_stack/
    * added FIXME comment in xen-asm_64.S
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

9af45651

x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit. · c6f5e0ac

Committed by Brian Gerst on Jan 19, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

c6f5e0ac

x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit. · ea927906

Committed by Brian Gerst on Jan 19, 2009

tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

ea927906

x86-64: Convert exception stacks to per-cpu · 92d65b23

Committed by Brian Gerst on Jan 19, 2009

Move the exception stacks to per-cpu, removing specific allocation code.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

92d65b23

x86-64: Convert irqstacks to per-cpu · 26f80bd6

Committed by Brian Gerst on Jan 19, 2009

Move the irqstackptr variable from the PDA to per-cpu.  Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a separate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.

tj: * sprinkle some underbars around.

    * irq_stack_ptr is not used till traps_init(), no reason to
      initialize it early.  On SMP, just leaving it NULL till proper
      initialization in setup_per_cpu_areas() works.  Dropped
      is_boot_cpu and early irq_stack_ptr initialization.

    * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
      instead of (char, irq_stack[IRQ_STACK_SIZE]).
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

26f80bd6

x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit. · 9eb912d1

Committed by Brian Gerst on Jan 19, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

9eb912d1

x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit. · 1b437c8c

Committed by Brian Gerst on Jan 19, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

1b437c8c

17 Jan, 2009 1 commit

linker script: add missing .data.percpu.page_aligned · 74e79045

Committed by Tejun Heo on Jan 17, 2009

arm, arm/mach-integrator and powerpc were missing
.data.percpu.page_aligned in their percpu output section definitions.
Add it.
Signed-off-by: Tejun Heo <tj@kernel.org>

74e79045

16 Jan, 2009 11 commits

x86_64: initialize this_cpu_off to __per_cpu_load · cd3adf52

Committed by Tejun Heo on Jan 16, 2009

On x86_64, if get_per_cpu_var() is used before per cpu area is setup
(if lockdep is turned on, it happens), it needs this_cpu_off to point
to __per_cpu_load.  Initialize accordingly.
Signed-off-by: Tejun Heo <tj@kernel.org>

cd3adf52

x86: fix build bug introduced during merge · a338af2c

Committed by Tejun Heo on Jan 16, 2009

EXPORT_PER_CPU_SYMBOL() got misplaced during merge leading to build
failure.  Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>

a338af2c

percpu: add optimized generic percpu accessors · 6dbde353

Committed by Ingo Molnar on Jan 15, 2009

It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>

6dbde353
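
The note that the generic percpu ops were made atomic against preemption can be sketched in user space. Everything here is a stand-in: `current_cpu`, `preempt_count`, and the `*_sketch` functions are hypothetical, and the per-cpu variable is modelled as a plain array indexed by CPU number.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical user-space stand-ins for kernel state. */
static unsigned int current_cpu; /* CPU the "task" currently runs on */
static int preempt_count;        /* > 0 means preemption is disabled */
static long counter[NR_CPUS];    /* one instance of the variable per CPU */

static void preempt_disable(void) { preempt_count++; }

static void preempt_enable(void)
{
    assert(preempt_count > 0);
    preempt_count--;
}

/* Generic fallback: pin the task to its CPU for the duration of the
 * access so it cannot migrate between picking the instance and using it. */
static long percpu_read_sketch(void)
{
    long val;

    preempt_disable();
    val = counter[current_cpu];
    preempt_enable();
    return val;
}

static void percpu_add_sketch(long delta)
{
    preempt_disable();
    counter[current_cpu] += delta;
    preempt_enable();
}
```

On x86 the optimized variants avoid this bracket because, as the disassembly above shows, the whole access is a single segment-prefixed instruction.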

x86: misc clean up after the percpu update · 004aa322

Committed by Tejun Heo on Jan 13, 2009

Do the following cleanups:

* kill x86_64_init_pda() which now is equivalent to pda_init()

* use per_cpu_offset() instead of cpu_pda() when initializing
  initial_gs
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

004aa322

x86: convert pda ops to wrappers around x86 percpu accessors · 49357d19

Committed by Tejun Heo on Jan 13, 2009

pda is now a percpu variable and there's no reason it can't use plain
x86 percpu accessors.  Add x86_test_and_clear_bit_percpu() and replace
pda op implementations with wrappers around x86 percpu accessors.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

49357d19
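
For readers unfamiliar with the test-and-clear operation named above, here is a plain C sketch of its semantics. It is not the kernel implementation: the function name is invented, the sketch is not atomic, and it ignores the per-cpu addressing entirely.

```c
#include <assert.h>
#include <limits.h>

/* Clear bit nr in *word and report whether it was set beforehand. */
static int test_and_clear_bit_sketch(unsigned int nr, unsigned long *word)
{
    assert(nr < sizeof(*word) * CHAR_BIT);
    unsigned long mask = 1UL << nr;
    int old = (*word & mask) != 0;

    *word &= ~mask;
    return old;
}
```

A caller typically uses the return value to decide whether pending work was flagged, knowing the flag is already cleared for the next round.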

x86: make pda a percpu variable · b12d8db8

Committed by Tejun Heo on Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

As pda is now allocated in percpu area, it can easily be made a proper
percpu variable.  Make it so by defining per cpu symbol from linker
script and declaring it in C code for SMP and simply defining it for
UP.  This change cleans up code and brings SMP and UP closer a bit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b12d8db8

x86: merge 64 and 32 SMP percpu handling · 9939ddaf

Committed by Tejun Heo on Jan 13, 2009

Now that pda is allocated as part of percpu, percpu doesn't need to be
accessed through pda.  Unify x86_64 SMP percpu access with x86_32 SMP
one.  Other than the segment register, operand size and the base of
percpu symbols, they behave identically now.

This patch replaces now unnecessary pda->data_offset with a dummy
field which is necessary to keep stack_canary at its place.  This
patch also moves per_cpu_offset initialization out of init_gdt() into
setup_per_cpu_areas().  Note that this change also necessitates
explicit per_cpu_offset initializations in voyager_smp.c.

With this change, x86_OP_percpu()'s are as efficient on x86_64 as on
x86_32 and also x86_64 can use assembly PER_CPU macros.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9939ddaf

x86: fold pda into percpu area on SMP · 1a51e3a0

Committed by Tejun Heo on Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately.  %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40.  To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.

After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda.  This means that setup_per_cpu_areas() needs
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().

This patch also removes now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

1a51e3a0

x86: use static _cpu_pda array · c8f3329a

Committed by Tejun Heo on Jan 13, 2009

_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c8f3329a

x86: load pointer to pda into %gs while bringing up a CPU · f32ff538

Committed by Tejun Heo on Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

CPU startup code in head_64.S loaded address of a zero page into %gs
for temporary use till pda is loaded, but the address of the actual pda is
available at that point.  Load the real address directly instead.

This will help unifying percpu and pda handling later on.

This patch is mostly taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>

f32ff538

x86: make percpu symbols zerobased on SMP · 3e5d8f97

Committed by Tejun Heo on Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

This patch makes percpu symbols zerobased on x86_64 SMP by adding
PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on
the percpu output section and using it in vmlinux_64.lds.S.  A new
PHDR is added as existing ones cannot contain sections near address
zero.  PERCPU_VADDR() also adds a new symbol __per_cpu_load which
always points to the vaddr of the loaded percpu data.init region.

The following adjustments have been made to accommodate the address
change.

* code to locate percpu gdt_page in head_64.S is updated to add the
  load address to the gdt_page offset.

* __per_cpu_load is used in places where access to the init data area
  is necessary.

* pda->data_offset is initialized soon after C code is entered as zero
  value doesn't work anymore.

This patch is mostly taken from Mike Travis' "x86_64: Base percpu
variables at zero" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3e5d8f97
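
The effect of zero-based percpu symbols can be modelled in user space. In this sketch, which rests on several assumptions, `per_cpu_load` stands in for the init image that `__per_cpu_load` points to, each CPU gets a private copy, and a symbol is nothing more than an offset into that copy. All names, the area size, and the offset `SYM_OFFSET_FOO` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS     2
#define PERCPU_SIZE 64 /* assumed size of the percpu section */

/* Hypothetical stand-in for the loaded init image (__per_cpu_load). */
static unsigned char per_cpu_load[PERCPU_SIZE];
static unsigned char per_cpu_area[NR_CPUS][PERCPU_SIZE];
static unsigned char *per_cpu_base[NR_CPUS];

/* With zero-based symbols, a symbol's "address" is just its offset. */
#define SYM_OFFSET_FOO 16u /* hypothetical per-cpu symbol */

/* Give every CPU its own copy of the init image and record its base. */
static void setup_per_cpu_areas_sketch(void)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        memcpy(per_cpu_area[cpu], per_cpu_load, PERCPU_SIZE);
        per_cpu_base[cpu] = per_cpu_area[cpu];
    }
}

/* Address of a symbol on a given CPU: that CPU's base plus the offset. */
static unsigned char *per_cpu_ptr_sketch(unsigned int cpu, unsigned int sym_offset)
{
    assert(cpu < NR_CPUS && sym_offset < PERCPU_SIZE);
    return per_cpu_base[cpu] + sym_offset;
}
```

Code that needs the initial values before the areas exist has to read them from the load image instead, which is the role the commit assigns to `__per_cpu_load`.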