Commits · a0bfa1373859e9d11dc92561a8667588803e42d8 · openeuler / raspberrypi-kernel

04 Aug, 2011 1 commit

cpuidle: stop depending on pm_idle · a0bfa137

Authored by Len Brown on Apr 01, 2011

cpuidle users should call cpuidle_idle_call() directly
rather than via the (pm_idle)() function pointer.

An architecture may choose to continue using (pm_idle)(),
but cpuidle need not depend on it:

  my_arch_cpu_idle()
	...
	if (cpuidle_idle_call())
		pm_idle();
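
The fallback pattern above can be modelled in user space as a minimal sketch. The names cpuidle_enter_stub() and default_idle_stub() are hypothetical stand-ins, not kernel functions, and the return convention (non-zero means "cpuidle did not handle the idle period") is assumed from the snippet in the commit text.

```c
/* Counts how often the legacy pm_idle() path was taken (demo only). */
static int idle_calls_via_fallback;

/* Hypothetical default idle routine standing in for an arch-specific one. */
static void default_idle_stub(void)
{
	idle_calls_via_fallback++;
}

/* Optional legacy hook; an architecture may keep using it. */
static void (*pm_idle)(void) = default_idle_stub;

/*
 * Hypothetical stand-in for the cpuidle entry point: returns 0 when
 * cpuidle handled the idle period, non-zero when no driver is available.
 */
static int cpuidle_enter_stub(int driver_present)
{
	return driver_present ? 0 : -1;
}

/* One pass of an arch idle routine: try cpuidle first, then fall back. */
static void my_arch_cpu_idle(int driver_present)
{
	if (cpuidle_enter_stub(driver_present))
		pm_idle();
}
```

In this model the function pointer is consulted only when the direct call reports failure, which is the decoupling the commit describes.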

cc: Kevin Hilman <khilman@deeprootsystems.com>
cc: Paul Mundt <lethal@linux-sh.org>
cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

10 Jun, 2011 1 commit

exec: delay address limit change until point of no return · dac853ae

Authored by Mathias Krause on Jun 09, 2011

Unconditionally changing the address limit to USER_DS and not restoring
it to its old value in the error path is wrong because it prevents us
using kernel memory on repeated calls to this function.  This, in fact,
breaks the fallback of hard coded paths to the init program from being
ever successful if the first candidate fails to load.

With this patch applied switching to USER_DS is delayed until the point
of no return is reached which makes it possible to have a multi-arch
rootfs with one arch specific init binary for each of the (hard coded)
probed paths.

Since the address limit is already set to USER_DS by the time
start_thread() is invoked, this redundancy can be safely removed.
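
The failure mode and the fix can be illustrated with a toy user-space model. This is only a sketch: struct task, exec_old(), exec_new() and probe_init() are hypothetical names, and the "address limit" is reduced to a two-value enum rather than the kernel's real mechanism.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the address limit of the calling task. */
enum addr_limit { KERNEL_DS, USER_DS };

struct task {
	enum addr_limit limit;
};

/* A path string living in kernel memory is only readable under KERNEL_DS. */
static bool can_read_kernel_path(const struct task *t)
{
	return t->limit == KERNEL_DS;
}

/* Old behaviour: switch to USER_DS up front and never restore it on error. */
static bool exec_old(struct task *t, bool binary_loads)
{
	if (!can_read_kernel_path(t))
		return false;
	t->limit = USER_DS;
	return binary_loads;
}

/* Patched behaviour: switch only once the point of no return is reached. */
static bool exec_new(struct task *t, bool binary_loads)
{
	if (!can_read_kernel_path(t))
		return false;
	if (!binary_loads)
		return false;		/* error path leaves the limit untouched */
	t->limit = USER_DS;		/* point of no return */
	return true;
}

/* Probe candidates in order, like the hard-coded init fallback; -1 if none. */
static int probe_init(bool (*exec_fn)(struct task *, bool),
		      const bool *loads, size_t n)
{
	struct task t = { KERNEL_DS };

	for (size_t i = 0; i < n; i++)
		if (exec_fn(&t, loads[i]))
			return (int)i;
	return -1;
}
```

With a first candidate that fails to load and a second that would succeed, the old variant never reaches the second one, while the patched variant does.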
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

13 Jan, 2011 1 commit

cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle... · f77cfe4e

Authored by Thomas Renninger on Jan 07, 2011

cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle events from the cpuidle layer

Currently the intel_idle and acpi_idle drivers show double cpu_idle "exit idle"
events. This patch fixes that and makes throwing cpu_idle events less complex.

It also introduces cpu_idle events for all architectures which use
the cpuidle subsystem, namely:
  - arch/arm/mach-at91/cpuidle.c
  - arch/arm/mach-davinci/cpuidle.c
  - arch/arm/mach-kirkwood/cpuidle.c
  - arch/arm/mach-omap2/cpuidle34xx.c
  - drivers/acpi/processor_idle.c (for all cases, not only mwait)
  - arch/x86/kernel/process.c (did throw events before, but was a mess)
  - drivers/idle/intel_idle.c (did throw events before)

Convention should be:
Fire cpu_idle events inside the current pm_idle function (not somewhere
down the callee tree) to keep things easy.

Current possible pm_idle functions on x86:
c1e_idle, poll_idle, cpuidle_idle_call, mwait_idle, default_idle
-> this is now really easy.

This affects userspace:
The type field of the cpu_idle power event can now be mapped directly to:
/sys/devices/system/cpu/cpuX/cpuidle/stateX/{name,desc,usage,time,...}
instead of carrying very CPU/mwait-specific values.
This change is not visible for the intel_idle driver.
For the acpi_idle driver it should only be visible if the vendor
leaves out C-states in the BIOS.
Another (perf timechart) patch reads the cpuidle info of cpu_idle
events from:
/sys/.../cpuidle/stateX/*, so the cpuidle events are mapped
to the correct C-/cpuidle state again, even if e.g. vendors leave
out C-states in their BIOS and, for example, only export C1 and C3.
-> everything is fine.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Robert Schoene <robert.schoene@tu-dresden.de>
CC: Jean Pihet <j-pihet@ti.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: linux-pm@lists.linux-foundation.org
CC: linux-acpi@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-perf-users@vger.kernel.org
CC: linux-omap@vger.kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>

04 Jan, 2011 1 commit

perf: Clean up power events by introducing new, more generic ones · 25e41933

Authored by Thomas Renninger on Jan 03, 2011

Add these new power trace events:

 power:cpu_idle
 power:cpu_frequency
 power:machine_suspend

The old C-state/idle accounting events:
  power:power_start
  power:power_end

now have a replacement (but we are still keeping the old
tracepoints for compatibility):

  power:cpu_idle

and
  power:power_frequency

is replaced with:
  power:cpu_frequency

power:machine_suspend is newly introduced.

Jean Pihet has a patch integrated into the generic layer
(kernel/power/suspend.c) which will make use of it.

The type= field was removed from both; it was never
used and the type is distinguished by the event type itself.

perf timechart userspace tool gets adjusted in a separate patch.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: rjw@sisk.pl
LKML-Reference: <1294073445-14812-3-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>

18 Jun, 2010 1 commit

x86, perf: Add power_end event to process_*.c cpu_idle routine · c882e0fe

Authored by Robert Schöne on Jun 14, 2010

Systems using the idle thread from process_32.c and process_64.c
do not generate power_end events which could be traced using
perf. This patch adds the event generation for such systems.
Signed-off-by: Robert Schoene <robert.schoene@tu-dresden.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1276515440.5441.45.camel@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

11 May, 2010 1 commit

x86: Introduce 'struct fpu' and related API · 86603283

Authored by Avi Kivity on May 06, 2010

Currently all fpu state access is through tsk->thread.xstate.  Since we wish
to generalize fpu access to non-task contexts, wrap the state in a new
'struct fpu' and convert existing access to use an fpu API.

Signal frame handlers are not converted to the API since they will remain
task context only things.
Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1273135546-29690-3-git-send-email-avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

26 Mar, 2010 1 commit

x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e

Authored by Peter Zijlstra on Mar 25, 2010

Support for the PMU's BTS features was upstreamed in
v2.6.32, but we still have the old and disabled ptrace-BTS,
as Linus noticed not so long ago.

It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
regard for other uses (perf) and doesn't provide the flexibility
needed for perf either.

Its users are ptrace-block-step and ptrace-bts. ptrace-bts
was never used, and ptrace-block-step can be implemented using a
much simpler approach.

So axe all 3000 lines of it. That includes the *locked_memory*()
APIs in mm/mlock.c as well.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Markus Metzger <markus.t.metzger@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20100325135413.938004390@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

14 Jan, 2010 1 commit

x86: Merge show_regs() · 3bef4447

Authored by Brian Gerst on Jan 13, 2010

Using kernel_stack_pointer() allows 32-bit and 64-bit versions to
be merged.  This is more correct for 64-bit, since the old %rsp is
always saved on the stack.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1263397555-27695-1-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

28 Dec, 2009 1 commit

x86: Use KERN_DEFAULT log-level in __show_regs() · d015a092

Authored by Pekka Enberg on Dec 28, 2009

Andrew Morton reported a strange looking kmemcheck warning:

  WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88004fba6c20)
  0000000000000000310000000000000000000000000000002413000000c9ffff
   u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u

   [<ffffffff810af3aa>] kmemleak_scan+0x25a/0x540
   [<ffffffff810afbcb>] kmemleak_scan_thread+0x5b/0xe0
   [<ffffffff8104d0fe>] kthread+0x9e/0xb0
   [<ffffffff81003074>] kernel_thread_helper+0x4/0x10
   [<ffffffffffffffff>] 0xffffffffffffffff

The above printout is missing the register dump completely. The
problem here is that the output comes from syslog, which doesn't
show KERN_INFO log-level messages. We didn't see this before
because both of us were testing on 32-bit kernels, which use the
_default_ log-level.

Fix that up by explicitly using KERN_DEFAULT log-level for
__show_regs() printks.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1261988819.4641.2.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

11 Dec, 2009 3 commits

x86: Merge kernel_thread() · df59e7bf

Authored by Brian Gerst on Dec 09, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-6-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

x86: Sync 32/64-bit kernel_thread · f443ff42

Authored by Brian Gerst on Dec 09, 2009

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-5-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

x86, 32-bit: Use same regs as 64-bit for kernel_thread_helper · e840227c

Authored by Brian Gerst on Dec 09, 2009

The arg should be in %eax, but that is clobbered by the return value
of clone.  The function pointer can be in any register.  Also, don't
push args onto the stack, since regparm(3) is the normal calling
convention now.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-4-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

10 Dec, 2009 2 commits

x86: Merge sys_clone · f839bbc5

Authored by Brian Gerst on Dec 09, 2009

Change 32-bit sys_clone to new PTREGSCALL stub, and merge with 64-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-7-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

x86: Merge sys_execve · 11cf88bd

Authored by Brian Gerst on Dec 09, 2009

Change 32-bit sys_execve to PTREGSCALL3, and merge with 64-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-4-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

09 Dec, 2009 1 commit

x86: Factor duplicated code out of __show_regs() into show_regs_common() · 814e2c84

Authored by Andy Isaacson on Dec 08, 2009

Unify x86_32 and x86_64 implementations of __show_regs() header,
standardizing on the x86_64 format string in the process. Also,
32-bit will now call print_modules.
Signed-off-by: Andy Isaacson <adi@hexapodia.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Richard Zidlicky <rz@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20091208082942.GA27174@hexapodia.org>
[ v2: resolved conflict ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

08 Nov, 2009 1 commit

hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events · 24f1e32c

Authored by Frederic Weisbecker on Sep 09, 2009

This patch rebases the implementation of the breakpoints API on top of
perf event instances.

Each breakpoint is now a perf event that handles the
register scheduling, thread/cpu attachment, etc.

The new layering is now made as follows:

       ptrace       kgdb      ftrace   perf syscall
          \          |          /         /
           \         |         /         /
                                        /
            Core breakpoint API        /
                                      /
                     |               /
                     |              /

              Breakpoints perf events

                     |
                     |

               Breakpoints PMU ---- Debug Register constraints handling
                                    (Part of core breakpoint API)
                     |
                     |

             Hardware debug registers

Reasons for this rewrite:

- Use the centralized/optimized pmu registers scheduling,
  implying an easier arch integration
- More powerful register handling: perf attributes (pinned/flexible
  events, exclusive/non-exclusive, tunable period, etc...)

Impact:

- New perf ABI: the hardware breakpoints counters
- Ptrace breakpoints setting remains tricky and still needs some per
  thread breakpoints references.

Todo (in order):

- Support breakpoints perf counter events for perf tools (ie: implement
  perf_bpcounter_event())
- Support from perf tools

Changes in v2:

- Follow the perf "event" rename
- The ptrace regression has been fixed (ptrace breakpoint perf events
  weren't released when a task ended)
- Drop the struct hw_breakpoint and store generic fields in
  perf_event_attr.
- Separate core and arch specific headers, drop
  asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
- Use new generic len/type for breakpoint
- Handle off case: when breakpoints api is not supported by an arch

Changes in v3:

- Fix broken CONFIG_KVM, we need to propagate the breakpoint api
  changes to kvm when we exit the guest and restore the bp registers
  to the host.

Changes in v4:

- Drop the hw_breakpoint_restore() stub as it is only used by KVM
- EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
  module
- Restore the breakpoints unconditionally on kvm guest exit:
  TIF_DEBUG_THREAD no longer covers every case of running
  breakpoints and vcpu->arch.switch_db_regs might not always be
  set when the guest used debug registers.
  (Waiting for a reliable optimization)

Changes in v5:

- Split-up the asm-generic/hw-breakpoint.h moving to
  linux/hw_breakpoint.h into a separate patch
- Optimize the breakpoints restoring while switching from kvm guest
  to host. We only want to restore the state if we have active
  breakpoints to the host, otherwise we don't care about messed-up
  address registers.
- Add asm/hw_breakpoint.h to Kbuild
- Fix bad breakpoint type in trace_selftest.c

Changes in v6:

- Fix wrong header inclusion in trace.h (triggered a build
  error with CONFIG_FTRACE_SELFTEST)
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jan Kiszka <jan.kiszka@web.de>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>

03 Nov, 2009 1 commit

x86: Make sure we also print a Code: line for show_regs() · a489ca35

Authored by Arjan van de Ven on Nov 02, 2009

show_regs() is called as a mini BUG() equivalent in some places,
specifically for the "scheduling while atomic" case.

Unfortunately right now it does not print a Code: line unlike
a real bug/oops.

This patch changes the x86 implementation of show_regs() so that
it calls the same function as oopses do to print the registers
as well as the Code: line.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20091102165915.4a980fc0@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

13 Oct, 2009 1 commit

x86: use kernel_stack_pointer() in process_32.c · def3c5d0

Authored by H. Peter Anvin on Oct 12, 2009

The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame.  For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

04 Aug, 2009 1 commit

x86, percpu: Collect hot percpu variables into one cacheline · bdf977b3

Authored by Tejun Heo on Aug 03, 2009

On x86_64, percpu variables current_task and kernel_stack are used for
get_current() and current_thread_info() respectively and thus are
often used close to each other.  Move definition of current_task to
kernel/cpu/common.c right above kernel_stack definition and align it
to cacheline so that they always fall into the same cacheline.  Two
percpu variables defined there together - irq_stack_ptr and irq_count
- are also pretty hot and will benefit from sharing the cacheline.
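
The layout idea can be sketched in plain C11. This is an illustration only: the kernel defines these as separate percpu variables placed next to each other, whereas the sketch swaps in a single cache-line-aligned struct; struct hot_percpu, CACHELINE_BYTES (assumed to be 64) and in_first_cacheline() are hypothetical names.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE_BYTES 64	/* assumption: typical x86 cache line size */

/* Hypothetical grouping of the hot fields named in the commit text. */
struct hot_percpu {
	alignas(CACHELINE_BYTES) void *current_task;
	unsigned long kernel_stack;
	char *irq_stack_ptr;
	unsigned int irq_count;
};

/* Returns 1 if the byte range [off, off + size) lies in the first cache line. */
static int in_first_cacheline(size_t off, size_t size)
{
	return off + size <= CACHELINE_BYTES;
}
```

Aligning the first member to the cache line size and keeping the members small means a lookup of one field pulls the others into the cache with it.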

For consistency, current_task definition for x86_32 is also moved to
kernel/cpu/common.c.

Putting current_task and kernel_stack into the same cacheline was
suggested by Linus Torvalds.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

18 Jun, 2009 1 commit

x86-32: make sure clts is batched during context switch · 2fcddce1

Authored by Jeremy Fitzhardinge on Apr 24, 2009

If we're preloading the fpu state during context switch, make sure the clts
happens while we're batching the cpu context update, then do the actual
__math_state_restore once the updates are flushed.

This allows more efficient context switches when running paravirtualized,
as all the hypercalls can be folded together into one.

[ Impact: optimise paravirtual FPU context switch ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>

03 Jun, 2009 1 commit

hw-breakpoints: use the new wrapper routines to access debug registers in process/thread code · 66cb5917

Authored by K.Prasad on Jun 01, 2009

This patch enables the use of abstract debug registers in
process-handling routines, according to the new hardware breakpoint
API.

[ Impact: adapt thread breakpoints handling code to the new breakpoint API ]
Original-patch-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

12 May, 2009 2 commits

x86: process.c, remove useless headers · bf78ad69

Authored by Amerigo Wang on May 11, 2009

<stdarg.h> is not needed by these files, remove them.

[ Impact: cleanup ]
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: akpm@linux-foundation.org
LKML-Reference: <20090512032956.5040.77055.sendpatchset@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: merge process.c a bit · 9d62dcdf

Authored by Amerigo Wang on May 11, 2009

Merge arch_align_stack() and arch_randomize_brk(), since
they are the same.

Tested on x86_64.

[ Impact: cleanup ]
Signed-off-by: Amerigo Wang <amwang@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

07 Apr, 2009 1 commit

x86, ds: add leakage warning · 2311f0de

Authored by Markus Metzger on Apr 03, 2009

Add a warning in case a debug store context is not removed before
the task it is attached to is freed.

Remove the old warning at thread exit. It is too early.

Declare the debug store context field in thread_struct unconditionally.

Remove ds_copy_thread() and ds_exit_thread() and do the work directly
in process*.c.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Cc: roland@redhat.com
Cc: eranian@googlemail.com
Cc: oleg@redhat.com
Cc: juan.villacis@intel.com
Cc: ak@linux.jf.intel.com
LKML-Reference: <20090403144601.254472000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

03 Apr, 2009 1 commit

Simplify copy_thread() · 6f2c55b8

Authored by Alexey Dobriyan on Apr 02, 2009

First argument unused since 2.3.11.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30 Mar, 2009 2 commits

x86/paravirt: finish change from lazy cpu to context switch start/end · 224101ed

Authored by Jeremy Fitzhardinge on Feb 18, 2009

Impact: fix lazy context switch API

Pass the previous and next tasks into the context switch start
and end calls, so that the called functions can properly access the
task state (especially in end_context_switch, in which the next task
is not yet completely current).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

x86/pvops: replace arch_enter_lazy_cpu_mode with arch_start_context_switch · 7fd7d83d

Authored by Jeremy Fitzhardinge on Feb 17, 2009

Impact: simplification, prepare for later changes

Make lazy cpu mode more specific to context switching, so that
it makes sense to do more context-switch specific things in
the callbacks.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

02 Mar, 2009 2 commits

x86: unify chunks of kernel/process*.c · 389d1fb1

Authored by Jeremy Fitzhardinge on Feb 27, 2009

With x86-32 and -64 using the same mechanism for managing the
tss io permissions bitmap, large chunks of process*.c are
trivially unifiable, including:

 - exit_thread
 - flush_thread
 - __switch_to_xtra (along with tsc enable/disable)

and as bonus pickups:

 - sys_fork
 - sys_vfork

(Note: asmlinkage expands to empty on x86-64)
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86-32: use non-lazy io bitmap context switching · db949bba

Authored by Jeremy Fitzhardinge on Feb 27, 2009

Impact: remove 32-bit optimization to prepare unification

x86-32 and -64 differ in the way they context-switch tasks
with io permission bitmaps.  x86-64 simply copies the next
task's io bitmap into place (if any) on context switch.  x86-32
invalidates the bitmap on context switch, so that the next
IO instruction will fault; at that point it installs the
appropriate IO bitmap.

This makes context switching IO-bitmap-using tasks a bit
less expensive, at the cost of making the next IO instruction
slower due to the extra fault.  This tradeoff only makes sense
if IO-bitmap-using processes are relatively common, but they
don't actually use IO instructions very often.

However, in a typical desktop system, the only process likely
to be using IO bitmaps is the X server, and nothing at all on
a server.  Therefore the lazy context switch doesn't really win
all that much, and it's just a gratuitous difference from
64-bit code.

This patch removes the lazy context switch, with a view to
unifying this code in a later change.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

18 Feb, 2009 1 commit

x86, rcu: fix strange load average and ksoftirqd behavior · bf51935f

Authored by Paul E. McKenney on Feb 17, 2009

Damien Wyart reported high ksoftirqd CPU usage (20%) on an
otherwise idle system.

The function-graph trace Damien provided:

>   799.521187 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521371 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521555 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521738 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521934 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522068 |   1)  ksoftir-2324  |               |                rcu_check_callbacks() {
>   799.522208 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522392 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522575 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522759 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522956 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523074 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.523214 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523397 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523579 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523762 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523960 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524079 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.524220 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524403 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524587 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524770 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
> [ . . . ]

The trace shows rcu_check_callbacks() being invoked way too often. It should be called
once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.
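
The arithmetic behind that estimate can be checked with a small helper. This is a sketch, not kernel code: mean_call_interval_us() and jiffy_us() are hypothetical names, and the tick rate passed to jiffy_us() is an assumption, since the commit message does not state the HZ value of the affected system.

```c
/*
 * Mean spacing between calls, in microseconds, for `calls` invocations
 * spread over `window_ms` milliseconds. `calls` must be non-zero.
 */
static double mean_call_interval_us(unsigned int calls, double window_ms)
{
	return window_ms * 1000.0 / calls;
}

/* Length of one jiffy in microseconds for a given tick rate (assumed HZ). */
static double jiffy_us(unsigned int hz)
{
	return 1000000.0 / hz;
}
```

For 22 calls in 3.5 ms the helper gives roughly 159 microseconds per call. If the tick rate were 1000 Hz (an assumption), one jiffy would be 1000 microseconds, so the observed rate would be about six times the expected one.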

Why do we need to call rcu_pending() and rcu_check_callbacks() from the
idle loop of 32-bit x86, especially given that no other architecture does
this?

The following patch removes the call to rcu_pending() and
rcu_check_callbacks() from the x86 32-bit idle loop in order to
reduce the softirq load on idle systems.
Reported-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

12 Feb, 2009 1 commit

x86: use regparm(3) for passed-in pt_regs pointer · b12bdaf1

Authored by Brian Gerst on Feb 11, 2009

Some syscalls need to access the pt_regs structure, either to copy
user register state or to modify it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to take the pointer as an argument instead of relying on
the assumption that the pt_regs structure overlaps the function
arguments.

Drop the use of regparm(1) due to concern about gcc bugs, and to move
in the direction of the eventual removal of regparm(0) for asmlinkage.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

11 Feb, 2009 2 commits

x86: pass in pt_regs pointer for syscalls that need it · 253f29a4

Authored by Brian Gerst on Feb 10, 2009

Some syscalls need to access the pt_regs structure, either to copy
user register state or to modify it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to regparm(1) to receive the pt_regs pointer as the
first argument.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fix x86_32 stack protector bugs · 5c79d2a5

Authored by Tejun Heo on Feb 11, 2009

Impact: fix x86_32 stack protector

Brian Gerst found out that %gs was being initialized to stack_canary
instead of stack_canary - 20, which basically gave the same canary
value for all threads.  Fixing this also exposed the following bugs.

* cpu_idle() didn't call boot_init_stack_canary()

* stack canary switching in switch_to() was being done too late making
  the initial run of a new thread use the old stack canary value.

Fix all of them and while at it update comment in cpu_idle() about
calling boot_init_stack_canary().
Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

10 Feb, 2009 3 commits

x86: implement x86_32 stack protector · 60a5317f

Authored by Tejun Heo on Feb 09, 2009

Impact: stack protector for x86_32

Implement stack protector for x86_32.  GDT entry 28 is used for it.
It's set to point to stack_canary-20 and has a length of 24 bytes.
CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs
to the stack canary segment on entry.  As %gs is otherwise unused by
the kernel, the canary can be anywhere.  It's defined as a percpu
variable.
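
The "-20" and "24 bytes" follow from gcc on x86_32 loading the canary from %gs:20. A minimal sketch of the address arithmetic is shown here; the helper names are hypothetical and the segment is modelled as a plain base plus offset, not as a real GDT descriptor.

```c
#include <stdint.h>

#define GCC_CANARY_OFFSET 20u	/* gcc on x86_32 reads the canary at %gs:20 */
#define CANARY_SEG_LEN    24u	/* segment length quoted in the commit text */

/* Base to program into the segment so that %gs:20 lands on the canary. */
static uintptr_t canary_segment_base(uintptr_t canary_addr)
{
	return canary_addr - GCC_CANARY_OFFSET;
}

/* Linear address touched by a %gs-relative access at `offset`. */
static uintptr_t gs_linear_address(uintptr_t seg_base, uint32_t offset)
{
	return seg_base + offset;
}

/* 1 if a 4-byte access at `offset` fits inside the segment length. */
static int access_fits(uint32_t offset)
{
	return offset + 4u <= CANARY_SEG_LEN;
}
```

A 24-byte segment is just long enough to cover the 4-byte canary at offset 20. Using the canary address itself as the base would make %gs:20 point 20 bytes past the canary.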

x86_32 exception handlers take the register frame on the stack directly as
struct pt_regs.  With -fstack-protector turned on, gcc copies the
whole structure after the stack canary and (of course) doesn't copy
it back on return, thus losing all changes.  For now, -fno-stack-protector
is added to all files which contain those functions.  We definitely
need something better.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: make lazy %gs optional on x86_32 · ccbeed3a

Authored by Tejun Heo on Feb 09, 2009

Impact: pt_regs changed, lazy gs handling made optional, add slight
        overhead to SAVE_ALL, simplifies error_code path a bit

On x86_32, %gs hasn't been used by the kernel and is handled lazily.  pt_regs
doesn't have a place for it and gs is saved/loaded only when necessary.
In preparation for stack protector support, this patch makes lazy %gs
handling optional by doing the following.

* Add CONFIG_X86_32_LAZY_GS and a place for gs in pt_regs.

* Save and restore %gs along with other registers in entry_32.S unless
  LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
  even when LAZY_GS.  However, it adds no overhead to common exit path
  and simplifies entry path with error code.

* Define different user_gs accessors depending on LAZY_GS and add
  lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
  lazy_*_gs() ops are used to save, load and clear %gs lazily.

* Define ELF_CORE_COPY_KERNEL_REGS() which always reads %gs directly.

xen and lguest changes need to be verified.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: add %gs accessors for x86_32 · d9a89a26

Authored by Tejun Heo on Feb 09, 2009

Impact: cleanup

On x86_32, %gs is handled lazily.  It's not saved and restored on
kernel entry/exit but only when necessary, which is usually during task
switch, but there are a few other places.  Currently, it's done by
calling savesegment() and loadsegment() explicitly.  Define
get_user_gs(), set_user_gs() and task_user_gs() and use them instead.

While at it, clean up register access macros in signal.c.

This cleans up code a bit and will help future changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

23 Jan, 2009 1 commit

x86: remove idle_timestamp from 32bit irq_cpustat_t · 03d2989d

Authored by Brian Gerst on Jan 23, 2009

Impact: bogus irq_cpustat field removed

idle_timestamp is left over from the removed irqbalance code.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

18 Jan, 2009 1 commit

x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit. · ea927906

Authored by Brian Gerst on Jan 19, 2009

tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

16 Jan, 2009 1 commit

percpu: add optimized generic percpu accessors · 6dbde353

Authored by Ingo Molnar on Jan 15, 2009

It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.
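
To show the shape of the accessor family, here is a user-space model. It is a sketch under loud assumptions: the kernel macros take a percpu variable and, on x86, compile to a single %gs-relative instruction, whereas this model uses a plain array indexed by a hypothetical this_cpu variable. NR_CPUS, this_cpu, counter and percpu_demo() are illustrative names only.

```c
#define NR_CPUS 4	/* hypothetical CPU count for the model */

static int this_cpu;			/* stand-in for the running CPU */
static unsigned long counter[NR_CPUS];	/* one slot per CPU */

/* Each operation touches only the slot of the current CPU. */
#define percpu_read(var)	((var)[this_cpu])
#define percpu_write(var, val)	((var)[this_cpu] = (val))
#define percpu_add(var, val)	((var)[this_cpu] += (val))
#define percpu_sub(var, val)	((var)[this_cpu] -= (val))
#define percpu_and(var, val)	((var)[this_cpu] &= (val))
#define percpu_or(var, val)	((var)[this_cpu] |= (val))
#define percpu_xor(var, val)	((var)[this_cpu] ^= (val))

/* Runs a few operations on the given CPU's slot and returns the result. */
static unsigned long percpu_demo(int cpu)
{
	this_cpu = cpu;
	percpu_write(counter, 5);
	percpu_add(counter, 3);		/* 8 */
	percpu_xor(counter, 0xF);	/* 8 ^ 15 = 7 */
	return percpu_read(counter);
}
```

The model does not capture preemption safety, which the commit notes was added to the generic fallback.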

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>

04 Jan, 2009 1 commit

x86: process_32.c fix style problems · befa9e78

Authored by Jaswinder Singh Rajput on Jan 04, 2009

Impact: cleanup

Fix:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
 WARNING: Use #include <linux/io.h> instead of <asm/io.h>
 WARNING: Use #include <linux/kdebug.h> instead of <asm/kdebug.h>
 WARNING: Use #include <linux/smp.h> instead of <asm/smp.h>
 ERROR: "foo * bar" should be "foo *bar"
 ERROR: trailing whitespace
 ERROR: spaces required around that ':' (ctx:WxO)
 ERROR: spaces required around that ':' (ctx:OxW)

 total: 7 errors, 4 warnings
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>