提交 · 625037cc405eabbfd2a39e9297e583a94886225f · openanolis / cloud-kernel

18 6月, 2009 1 次提交

x86-32: make sure clts is batched during context switch · 2fcddce1

由 Jeremy Fitzhardinge 提交于 4月 24, 2009

If we're preloading the fpu state during context switch, make sure the clts
happens while we're batching the cpu context update, then do the actual
__math_state_restore once the updates are flushed.

This allows more efficient context switches when running paravirtualized,
as all the hypercalls can be folded together into one.

[ Impact: optimise paravirtual FPU context switch ]
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>

2fcddce1

12 5月, 2009 2 次提交

x86: process.c, remove useless headers · bf78ad69

由 Amerigo Wang 提交于 5月 11, 2009

<stdarg.h> is not needed by these files, remove them.

[ Impact: cleanup ]
Signed-off-by: NWANG Cong <amwang@redhat.com>
Cc: akpm@linux-foundation.org
LKML-Reference: <20090512032956.5040.77055.sendpatchset@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bf78ad69

x86: merge process.c a bit · 9d62dcdf

由 Amerigo Wang 提交于 5月 11, 2009

Merge arch_align_stack() and arch_randomize_brk(), since
they are the same.

Tested on x86_64.

[ Impact: cleanup ]
Signed-off-by: NAmerigo Wang <amwang@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9d62dcdf

07 4月, 2009 1 次提交

x86, ds: add leakage warning · 2311f0de

由 Markus Metzger 提交于 4月 03, 2009

Add a warning in case a debug store context is not removed before
the task it is attached to is freed.

Remove the old warning at thread exit. It is too early.

Declare the debug store context field in thread_struct unconditionally.

Remove ds_copy_thread() and ds_exit_thread() and do the work directly
in process*.c.
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
Cc: roland@redhat.com
Cc: eranian@googlemail.com
Cc: oleg@redhat.com
Cc: juan.villacis@intel.com
Cc: ak@linux.jf.intel.com
LKML-Reference: <20090403144601.254472000@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2311f0de

03 4月, 2009 1 次提交

Simplify copy_thread() · 6f2c55b8

由 Alexey Dobriyan 提交于 4月 02, 2009

First argument unused since 2.3.11.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6f2c55b8

30 3月, 2009 2 次提交

x86/paravirt: finish change from lazy cpu to context switch start/end · 224101ed

由 Jeremy Fitzhardinge 提交于 2月 18, 2009

Impact: fix lazy context switch API

Pass the previous and next tasks into the context switch start
end calls, so that the called functions can properly access the
task state (esp in end_context_switch, in which the next task
is not yet completely current).
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>

224101ed

x86/pvops: replace arch_enter_lazy_cpu_mode with arch_start_context_switch · 7fd7d83d

由 Jeremy Fitzhardinge 提交于 2月 17, 2009

Impact: simplification, prepare for later changes

Make lazy cpu mode more specific to context switching, so that
it makes sense to do more context-switch specific things in
the callbacks.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>

7fd7d83d

02 3月, 2009 2 次提交

x86: unify chunks of kernel/process*.c · 389d1fb1

由 Jeremy Fitzhardinge 提交于 2月 27, 2009

With x86-32 and -64 using the same mechanism for managing the
tss io permissions bitmap, large chunks of process*.c are
trivially unifyable, including:

 - exit_thread
 - flush_thread
 - __switch_to_xtra (along with tsc enable/disable)

and as bonus pickups:

 - sys_fork
 - sys_vfork

(Note: asmlinkage expands to empty on x86-64)
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

389d1fb1

x86-32: use non-lazy io bitmap context switching · db949bba

由 Jeremy Fitzhardinge 提交于 2月 27, 2009

Impact: remove 32-bit optimization to prepare unification

x86-32 and -64 differ in the way they context-switch tasks
with io permission bitmaps.  x86-64 simply copies the next
tasks io bitmap into place (if any) on context switch.  x86-32
invalidates the bitmap on context switch, so that the next
IO instruction will fault; at that point it installs the
appropriate IO bitmap.

This makes context switching IO-bitmap-using tasks a bit more
less expensive, at the cost of making the next IO instruction
slower due to the extra fault.  This tradeoff only makes sense
if IO-bitmap-using processes are relatively common, but they
don't actually use IO instructions very often.

However, in a typical desktop system, the only process likely
to be using IO bitmaps is the X server, and nothing at all on
a server.  Therefore the lazy context switch doesn't really win
all that much, and its just a gratuitious difference from
64-bit code.

This patch removes the lazy context switch, with a view to
unifying this code in a later change.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

db949bba

18 2月, 2009 1 次提交

x86, rcu: fix strange load average and ksoftirqd behavior · bf51935f

由 Paul E. McKenney 提交于 2月 17, 2009

Damien Wyart reported high ksoftirqd CPU usage (20%) on an
otherwise idle system.

The function-graph trace Damien provided:

>   799.521187 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521371 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521555 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521738 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521934 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522068 |   1)  ksoftir-2324  |               |                rcu_check_callbacks() {
>   799.522208 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522392 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522575 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522759 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522956 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523074 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.523214 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523397 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523579 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523762 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523960 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524079 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.524220 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524403 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524587 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524770 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
> [ . . . ]

Shows rcu_check_callbacks() being invoked way too often. It should be called
once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.

Why do we need to call rcu_pending() and rcu_check_callbacks() from the
idle loop of 32-bit x86, especially given that no other architecture does
this?

The following patch removes the call to rcu_pending() and
rcu_check_callbacks() from the x86 32-bit idle loop in order to
reduce the softirq load on idle systems.
Reported-by: NDamien Wyart <damien.wyart@free.fr>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bf51935f

12 2月, 2009 1 次提交

x86: use regparm(3) for passed-in pt_regs pointer · b12bdaf1

由 Brian Gerst 提交于 2月 11, 2009

Some syscalls need to access the pt_regs structure, either to copy
user register state or to modifiy it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to take the pointer as an argument instead of relying on
the assumption that the pt_regs structure overlaps the function
arguments.

Drop the use of regparm(1) due to concern about gcc bugs, and to move
in the direction of the eventual removal of regparm(0) for asmlinkage.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

b12bdaf1

11 2月, 2009 2 次提交

x86: pass in pt_regs pointer for syscalls that need it · 253f29a4

由 Brian Gerst 提交于 2月 10, 2009

Some syscalls need to access the pt_regs structure, either to copy
user register state or to modifiy it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to regparm(1) to receive the pt_regs pointer as the
first argument.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

253f29a4

x86: fix x86_32 stack protector bugs · 5c79d2a5

由 Tejun Heo 提交于 2月 11, 2009

Impact: fix x86_32 stack protector

Brian Gerst found out that %gs was being initialized to stack_canary
instead of stack_canary - 20, which basically gave the same canary
value for all threads.  Fixing this also exposed the following bugs.

* cpu_idle() didn't call boot_init_stack_canary()

* stack canary switching in switch_to() was being done too late making
  the initial run of a new thread use the old stack canary value.

Fix all of them and while at it update comment in cpu_idle() about
calling boot_init_stack_canary().
Reported-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

5c79d2a5

10 2月, 2009 3 次提交

x86: implement x86_32 stack protector · 60a5317f

由 Tejun Heo 提交于 2月 09, 2009

Impact: stack protector for x86_32

Implement stack protector for x86_32.  GDT entry 28 is used for it.
It's set to point to stack_canary-20 and have the length of 24 bytes.
CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs
to the stack canary segment on entry.  As %gs is otherwise unused by
the kernel, the canary can be anywhere.  It's defined as a percpu
variable.

x86_32 exception handlers take register frame on stack directly as
struct pt_regs.  With -fstack-protector turned on, gcc copies the
whole structure after the stack canary and (of course) doesn't copy
back on return thus losing all changed.  For now, -fno-stack-protector
is added to all files which contain those functions.  We definitely
need something better.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

60a5317f

x86: make lazy %gs optional on x86_32 · ccbeed3a

由 Tejun Heo 提交于 2月 09, 2009

Impact: pt_regs changed, lazy gs handling made optional, add slight
        overhead to SAVE_ALL, simplifies error_code path a bit

On x86_32, %gs hasn't been used by kernel and handled lazily.  pt_regs
doesn't have place for it and gs is saved/loaded only when necessary.
In preparation for stack protector support, this patch makes lazy %gs
handling optional by doing the followings.

* Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.

* Save and restore %gs along with other registers in entry_32.S unless
  LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
  even when LAZY_GS.  However, it adds no overhead to common exit path
  and simplifies entry path with error code.

* Define different user_gs accessors depending on LAZY_GS and add
  lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
  lazy_*_gs() ops are used to save, load and clear %gs lazily.

* Define ELF_CORE_COPY_KERNEL_REGS() which always read %gs directly.

xen and lguest changes need to be verified.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ccbeed3a

x86: add %gs accessors for x86_32 · d9a89a26

由 Tejun Heo 提交于 2月 09, 2009

Impact: cleanup

On x86_32, %gs is handled lazily.  It's not saved and restored on
kernel entry/exit but only when necessary which usually is during task
switch but there are few other places.  Currently, it's done by
calling savesegment() and loadsegment() explicitly.  Define
get_user_gs(), set_user_gs() and task_user_gs() and use them instead.

While at it, clean up register access macros in signal.c.

This cleans up code a bit and will help future changes.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d9a89a26

23 1月, 2009 1 次提交

x86: remove idle_timestamp from 32bit irq_cpustat_t · 03d2989d

由 Brian Gerst 提交于 1月 23, 2009

Impact: bogus irq_cpustat field removed

idle_timestamp is left over from the removed irqbalance code.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

03d2989d

18 1月, 2009 1 次提交

x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit. · ea927906

由 Brian Gerst 提交于 1月 19, 2009

tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.
Signed-off-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

ea927906

16 1月, 2009 1 次提交

percpu: add optimized generic percpu accessors · 6dbde353

由 Ingo Molnar 提交于 1月 15, 2009

It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NTejun Heo <tj@kernel.org>

6dbde353

04 1月, 2009 1 次提交

x86: process_32.c fix style problems · befa9e78

由 Jaswinder Singh Rajput 提交于 1月 04, 2009

Impact: cleanup

Fix:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
 WARNING: Use #include <linux/io.h> instead of <asm/io.h>
 WARNING: Use #include <linux/kdebug.h> instead of <asm/kdebug.h>
 WARNING: Use #include <linux/smp.h> instead of <asm/smp.h>
 ERROR: "foo * bar" should be "foo *bar"
 ERROR: trailing whitespace
 ERROR: spaces required around that ':' (ctx:WxO)
 ERROR: spaces required around that ':' (ctx:OxW)

 total: 7 errors, 4 warnings
Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

befa9e78

20 12月, 2008 1 次提交

x86, bts: add fork and exit handling · bf53de90

由 Markus Metzger 提交于 12月 19, 2008

Impact: introduce new ptrace facility

Add arch_ptrace_untrace() function that is called when the tracer
detaches (either voluntarily or when the tracing task dies);
ptrace_disable() is only called on a voluntary detach.

Add ptrace_fork() and arch_ptrace_fork(). They are called when a
traced task is forked.

Clear DS and BTS related fields on fork.

Release DS resources and reclaim memory in ptrace_untrace(). This
releases resources already when the tracing task dies. We used to do
that when the traced task dies.
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bf53de90

12 12月, 2008 1 次提交

x86, bts: provide in-kernel branch-trace interface · c2724775

由 Markus Metzger 提交于 12月 11, 2008

Impact: cleanup

Move the BTS bits from ptrace.c into ds.c.
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c2724775

08 12月, 2008 1 次提交

tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions · 8b96f011

由 Frederic Weisbecker 提交于 12月 06, 2008

Impact: trace more functions

When the function graph tracer is configured, three more files are not
traced to prevent only four functions to be traced. And this impacts the
normal function tracer too.

arch/x86/kernel/process_64/32.c:

I had crashes when I let this file traced. After some debugging, I saw
that the "current" task point was changed inside__swtich_to(), ie:
"write_pda(pcurrent, next_p);" inside process_64.c Since the tracer store
the original return address of the function inside current, we had
crashes. Only __switch_to() has to be excluded from tracing.

kernel/module.c and kernel/extable.c:

Because of a function used internally by the function graph tracer:
__kernel_text_address()

To let the other functions inside these files to be traced, this patch
introduces the __notrace_funcgraph function prefix which is __notrace if
function graph tracer is configured and nothing if not.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

8b96f011

13 10月, 2008 1 次提交

x86: __show_registers() and __show_regs() API unification · e2ce07c8

由 Pekka Enberg 提交于 4月 03, 2008

Currently the low-level function to dump user-passed registers on i386 is
called __show_registers() whereas on x86-64 it's called __show_regs(). Unify
the API to simplify porting of kmemcheck to x86-64.
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Acked-by: NVegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e2ce07c8

24 9月, 2008 1 次提交

x86: prevent stale state of c1e_mask across CPU offline/online, fix · 1eda8149

由 Marc Dionne 提交于 9月 23, 2008

Fix build error introduced by commit 4faac97d ("x86: prevent stale
state of c1e_mask across CPU offline/online").

process_32.c needs to include idle.h to get the prototype for
c1e_remove_cpu()
Signed-off-by: NMarc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1eda8149

23 9月, 2008 1 次提交

x86: prevent stale state of c1e_mask across CPU offline/online · 4faac97d

由 Thomas Gleixner 提交于 9月 22, 2008

Impact: hang which happens across CPU offline/online on AMD C1E systems.

When a CPU goes offline then the corresponding bit in the broadcast
mask is cleared. For AMD C1E enabled CPUs we do not reenable the
broadcast when the CPU comes online again as we do not clear the
corresponding bit in the c1e_mask, which keeps track which CPUs
have been switched to broadcast already. So on those !$@#& machines
we never switch back to broadcasting after a CPU offline/online cycle.

Clear the bit when the CPU plays dead.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

4faac97d

17 9月, 2008 1 次提交

x86: print DMI information in the oops trace · 90f7d25c

由 Arjan van de Ven 提交于 9月 16, 2008

in order to diagnose hard system specific issues, it's useful to
have the system name in the oops (as provided by DMI)
Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

90f7d25c

05 9月, 2008 1 次提交

x86: build fix for !CONFIG_SMP · 913da64b

由 Alex Nixon 提交于 9月 03, 2008

Move reset_lazy_tlbstate into tlb_32.c, and define noop versions of
play_dead() in process_{32,64}.c when !CONFIG_SMP.
Signed-off-by: NAlex Nixon <alex.nixon@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

913da64b

25 8月, 2008 3 次提交

x86: unify x86_32 and x86_64 play_dead into one function · a21f5d88

由 Alex Nixon 提交于 8月 22, 2008

Add the new play_dead into smpboot.c, as it fits more cleanly in there
alongside other CONFIG_HOTPLUG functions.

Separate out the common code into its own function.
Signed-off-by: NAlex Nixon <alex.nixon@citrix.com>
Acked-by: NJeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a21f5d88

x86_32: clean up play_dead · 37900258

由 Alex Nixon 提交于 8月 22, 2008

The removal of the CPU from the various maps was redundant as it already
happened in cpu_disable.

After cleaning this up, cpu_uninit only resets the tlb state, so rename
it and create a noop version for the X86_64 case (so the two play_deads
can be unified later).
Signed-off-by: NAlex Nixon <alex.nixon@citrix.com>
Acked-by: NJeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

37900258

x86: add cpu hotplug hooks into smp_ops · 93be71b6

由 Alex Nixon 提交于 8月 22, 2008

Signed-off-by: NAlex Nixon <alex.nixon@citrix.com>
Acked-by: NJeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

93be71b6

15 8月, 2008 1 次提交

x86: invalidate caches before going into suspend · 394a1505

由 Mark Langsdorf 提交于 8月 14, 2008

When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache.  On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting.  Tie into an assembly line so the compile will not
reorder it.  Add some documentation explaining what is going
on and why we're doing this.
Signed-off-by: NMark Langsdorf <mark.langsdorf@amd.com>
Acked-by: NMark Borden <mark.borden@amd.com>
Acked-by: NMichael Hohmuth <michael.hohmuth@amd.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

394a1505

22 7月, 2008 2 次提交

x86: process_32.c declare cpu_number before they get used · fb26132b

由 Jaswinder Singh 提交于 7月 21, 2008

Moved DECLARE_PER_CPU(int, cpu_number) from CONFIG_X86_32_SMP to CONFIG_X86_32
because cpu_number is required for both.
And include asm/smp.h in process_32.c
Signed-off-by: NJaswinder Singh <jaswinder@infradead.org>

fb26132b

x86: Introducing asm/syscalls.h · bbc1f698

由 Jaswinder Singh 提交于 7月 21, 2008

Declaring arch-dependent syscalls for x86 architecture
Signed-off-by: NJaswinder Singh <jaswinder@infradead.org>

bbc1f698

19 7月, 2008 1 次提交

nohz: prevent tick stop outside of the idle loop · b8f8c3cf

由 Thomas Gleixner 提交于 7月 18, 2008

Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:

	scheduler switch to idle task
	enable interrupts

Window starts here

	----> interrupt happens (does not set NEED_RESCHED)
	      	irq_exit() stops the tick

	----> interrupt happens (does set NEED_RESCHED)

	return from schedule()
	
	cpu_idle(): preempt_disable();

Window ends here

The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.

The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremly
hard to reproduce and analyse. Kudos to Jack and Eric.

Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.

cpu_idle()
{
	preempt_disable();

	while(1) {
		 tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
		 			          are in the idle loop

		 while (!need_resched())
		       halt();

		 tick_nohz_restart_sched_tick(); <- disables NOHZ mode
		 preempt_enable_no_resched();
		 schedule();
		 preempt_disable();
	}
}

In hindsight we should have done this forever, but ... 

/me grabs a large brown paperbag.

Debugged-by: Jack Ren <jack.ren@marvell.com>, 
Debugged-by: Neric miao <eric.y.miao@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

b8f8c3cf

08 7月, 2008 1 次提交

x86: move cpu_exit_clear to process_32.c · 1481a3dd

由 Glauber Costa 提交于 6月 04, 2008

Take it out of smpboot.c, and move it to process_32.c, closer
to its only user.
Signed-off-by: NGlauber Costa <gcosta@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1481a3dd

19 6月, 2008 1 次提交

x86: fix NULL pointer deref in __switch_to · 75118a82

由 Suresh Siddha 提交于 6月 13, 2008

Patrick McHardy reported a crash:

> > I get this oops once a day, its apparently triggered by something
> > run by cron, but the process is a different one each time.
> >
> > Kernel is -git from yesterday shortly before the -rc6 release
> > (last commit is the usb-2.6 merge, the x86 patches are missing),
> > .config is attached.
> >
> > I'll retry with current -git, but the patches that have gone in
> > since I last updated don't look related.
> >
> > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
> > 000001ff
> > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
> > [62060.043009] *pde = 00000000
> > [62060.043009] Oops: 0002 [#1] PREEMPT

Vegard Nossum analyzed it:

> This decodes to
>
>    0:   0f ae 00                fxsave (%eax)
>
> so it's related to the floating-point context. This is the exact
> location of the crash:
>
> $ addr2line -e arch/x86/kernel/process_32.o -i ab0
> include/asm/i387.h:232
> include/asm/i387.h:262
> arch/x86/kernel/process_32.c:595
>
> ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
> Or maybe it never had any other value.

Somehow (as described below) TS_USEDFPU is set but the fpu is not
allocated or freed.

Another possible FPU pre-emption issue with the sleazy FPU optimization
which was benign before but not so anymore, with the dynamic FPU allocation
patch.

New task is getting exec'd and it is prempted at the below point.

flush_thread() {
	...
	/*
	* Forget coprocessor state..
	*/
	clear_fpu(tsk);
		<----- Preemption point
	clear_used_math();
	...
}

Now when it context switches in again, as the used_math() is still set
and fpu_counter can be > 5, we will do a math_state_restore() which sets
the task's TS_USEDFPU. After it continues from the above preemption point
it does clear_used_math() and much later free_thread_xstate().

Now, at the next context switch, it is quite possible that xstate is
null, used_math() is not set and TS_USEDFPU is still set. This will
trigger unlazy_fpu() causing kernel oops.

Fix this  by clearing tsk's fpu_counter before clearing task's fpu.
Reported-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

75118a82

10 6月, 2008 2 次提交

x86: move more common idle functions/variables to process.c · 00dba564

由 Thomas Gleixner 提交于 6月 09, 2008

more unification. Should cause no change in functionality.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

00dba564

x86: simplify idle selection · 6ddd2a27

由 Thomas Gleixner 提交于 6月 09, 2008

default_idle is selected in cpu_idle(), when no other idle routine is
selected. Select it in select_idle_routine() when mwait is not
selected.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

6ddd2a27

04 6月, 2008 1 次提交

x86, fpu: fix CONFIG_PREEMPT=y corruption of application's FPU stack · 870568b3

由 Suresh Siddha 提交于 6月 02, 2008

Jürgen Mell reported an FPU state corruption bug under CONFIG_PREEMPT,
and bisected it to commit v2.6.19-1363-gacc20761, "i386: add sleazy FPU
optimization".

Add tsk_used_math() checks to prevent calling math_state_restore()
which can sleep in the case of !tsk_used_math(). This prevents
making a blocking call in __switch_to().

Apparently "fpu_counter > 5" check is not enough, as in some signal handling
and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.

It's a side effect though. This is the failing scenario:

process 'A' in save_i387_ia32() just after clear_used_math()

Got an interrupt and pre-empted out.

At the next context switch to process 'A' again, kernel tries to restore
the math state proactively and sees a fpu_counter > 0 and !tsk_used_math()

This results in init_fpu() during the __switch_to()'s math_state_restore()

And resulting in fpu corruption which will be saved/restored
(save_i387_fxsave and restore_i387_fxsave) during the remaining
part of the signal handling after the context switch.
Bisected-by: NJürgen Mell <j.mell@t-online.de>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Tested-by: NJürgen Mell <j.mell@t-online.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org

870568b3

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功