1. 09 12月, 2009 1 次提交
  2. 08 11月, 2009 1 次提交
    • F
      hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events · 24f1e32c
      Frederic Weisbecker 提交于
      This patch rebase the implementation of the breakpoints API on top of
      perf events instances.
      
      Each breakpoints are now perf events that handle the
      register scheduling, thread/cpu attachment, etc..
      
      The new layering is now made as follows:
      
             ptrace       kgdb      ftrace   perf syscall
                \          |          /         /
                 \         |         /         /
                                              /
                  Core breakpoint API        /
                                            /
                           |               /
                           |              /
      
                    Breakpoints perf events
      
                           |
                           |
      
                     Breakpoints PMU ---- Debug Register constraints handling
                                          (Part of core breakpoint API)
                           |
                           |
      
                   Hardware debug registers
      
      Reasons of this rewrite:
      
      - Use the centralized/optimized pmu registers scheduling,
        implying an easier arch integration
      - More powerful register handling: perf attributes (pinned/flexible
        events, exclusive/non-exclusive, tunable period, etc...)
      
      Impact:
      
      - New perf ABI: the hardware breakpoints counters
      - Ptrace breakpoints setting remains tricky and still needs some per
        thread breakpoints references.
      
      Todo (in the order):
      
      - Support breakpoints perf counter events for perf tools (ie: implement
        perf_bpcounter_event())
      - Support from perf tools
      
      Changes in v2:
      
      - Follow the perf "event " rename
      - The ptrace regression have been fixed (ptrace breakpoint perf events
        weren't released when a task ended)
      - Drop the struct hw_breakpoint and store generic fields in
        perf_event_attr.
      - Separate core and arch specific headers, drop
        asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
      - Use new generic len/type for breakpoint
      - Handle off case: when breakpoints api is not supported by an arch
      
      Changes in v3:
      
      - Fix broken CONFIG_KVM, we need to propagate the breakpoint api
        changes to kvm when we exit the guest and restore the bp registers
        to the host.
      
      Changes in v4:
      
      - Drop the hw_breakpoint_restore() stub as it is only used by KVM
      - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
        module
      - Restore the breakpoints unconditionally on kvm guest exit:
        TIF_DEBUG_THREAD doesn't anymore cover every cases of running
        breakpoints and vcpu->arch.switch_db_regs might not always be
        set when the guest used debug registers.
        (Waiting for a reliable optimization)
      
      Changes in v5:
      
      - Split-up the asm-generic/hw-breakpoint.h moving to
        linux/hw_breakpoint.h into a separate patch
      - Optimize the breakpoints restoring while switching from kvm guest
        to host. We only want to restore the state if we have active
        breakpoints to the host, otherwise we don't care about messed-up
        address registers.
      - Add asm/hw_breakpoint.h to Kbuild
      - Fix bad breakpoint type in trace_selftest.c
      
      Changes in v6:
      
      - Fix wrong header inclusion in trace.h (triggered a build
        error with CONFIG_FTRACE_SELFTEST
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jan Kiszka <jan.kiszka@web.de>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      24f1e32c
  3. 03 11月, 2009 1 次提交
  4. 13 10月, 2009 1 次提交
    • H
      x86: use kernel_stack_pointer() in process_32.c · def3c5d0
      H. Peter Anvin 提交于
      The way to obtain a kernel-mode stack pointer from a struct pt_regs in
      32-bit mode is "subtle": the stack doesn't actually contain the stack
      pointer, but rather the location where it would have been marks the
      actual previous stack frame.  For clarity, use kernel_stack_pointer()
      instead of coding this weirdness explicitly.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      def3c5d0
  5. 04 8月, 2009 1 次提交
    • T
      x86, percpu: Collect hot percpu variables into one cacheline · bdf977b3
      Tejun Heo 提交于
      On x86_64, percpu variables current_task and kernel_stack are used for
      get_current() and current_thread_info() respectively and thus are
      often used close to each other.  Move definition of current_task to
      kernel/cpu/common.c right above kernel_stack definition and align it
      to cacheline so that they always fall into the same cacheline.  Two
      percpu variables defined there together - irq_stack_ptr and irq_count
      - are also pretty hot and will benefit from sharing the cacheline.
      
      For consistency, current_task definition for x86_32 is also moved to
      kernel/cpu/common.c.
      
      Putting current_task and kernel_stack into the same cacheline was
      suggested by Linus Torvalds.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      bdf977b3
  6. 18 6月, 2009 1 次提交
    • J
      x86-32: make sure clts is batched during context switch · 2fcddce1
      Jeremy Fitzhardinge 提交于
      If we're preloading the fpu state during context switch, make sure the clts
      happens while we're batching the cpu context update, then do the actual
      __math_state_restore once the updates are flushed.
      
      This allows more efficient context switches when running paravirtualized,
      as all the hypercalls can be folded together into one.
      
      [ Impact: optimise paravirtual FPU context switch ]
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      2fcddce1
  7. 03 6月, 2009 1 次提交
  8. 12 5月, 2009 2 次提交
  9. 07 4月, 2009 1 次提交
    • M
      x86, ds: add leakage warning · 2311f0de
      Markus Metzger 提交于
      Add a warning in case a debug store context is not removed before
      the task it is attached to is freed.
      
      Remove the old warning at thread exit. It is too early.
      
      Declare the debug store context field in thread_struct unconditionally.
      
      Remove ds_copy_thread() and ds_exit_thread() and do the work directly
      in process*.c.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Cc: roland@redhat.com
      Cc: eranian@googlemail.com
      Cc: oleg@redhat.com
      Cc: juan.villacis@intel.com
      Cc: ak@linux.jf.intel.com
      LKML-Reference: <20090403144601.254472000@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2311f0de
  10. 03 4月, 2009 1 次提交
  11. 30 3月, 2009 2 次提交
  12. 02 3月, 2009 2 次提交
    • J
      x86: unify chunks of kernel/process*.c · 389d1fb1
      Jeremy Fitzhardinge 提交于
      With x86-32 and -64 using the same mechanism for managing the
      tss io permissions bitmap, large chunks of process*.c are
      trivially unifyable, including:
      
       - exit_thread
       - flush_thread
       - __switch_to_xtra (along with tsc enable/disable)
      
      and as bonus pickups:
      
       - sys_fork
       - sys_vfork
      
      (Note: asmlinkage expands to empty on x86-64)
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      389d1fb1
    • J
      x86-32: use non-lazy io bitmap context switching · db949bba
      Jeremy Fitzhardinge 提交于
      Impact: remove 32-bit optimization to prepare unification
      
      x86-32 and -64 differ in the way they context-switch tasks
      with io permission bitmaps.  x86-64 simply copies the next
      tasks io bitmap into place (if any) on context switch.  x86-32
      invalidates the bitmap on context switch, so that the next
      IO instruction will fault; at that point it installs the
      appropriate IO bitmap.
      
      This makes context switching IO-bitmap-using tasks a bit more
      less expensive, at the cost of making the next IO instruction
      slower due to the extra fault.  This tradeoff only makes sense
      if IO-bitmap-using processes are relatively common, but they
      don't actually use IO instructions very often.
      
      However, in a typical desktop system, the only process likely
      to be using IO bitmaps is the X server, and nothing at all on
      a server.  Therefore the lazy context switch doesn't really win
      all that much, and its just a gratuitious difference from
      64-bit code.
      
      This patch removes the lazy context switch, with a view to
      unifying this code in a later change.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      db949bba
  13. 18 2月, 2009 1 次提交
    • P
      x86, rcu: fix strange load average and ksoftirqd behavior · bf51935f
      Paul E. McKenney 提交于
      Damien Wyart reported high ksoftirqd CPU usage (20%) on an
      otherwise idle system.
      
      The function-graph trace Damien provided:
      
      >   799.521187 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.521371 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.521555 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.521738 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.521934 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.522068 |   1)  ksoftir-2324  |               |                rcu_check_callbacks() {
      >   799.522208 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.522392 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.522575 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.522759 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.522956 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.523074 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
      >   799.523214 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.523397 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.523579 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.523762 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.523960 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.524079 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
      >   799.524220 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.524403 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.524587 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      >   799.524770 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
      > [ . . . ]
      
      Shows rcu_check_callbacks() being invoked way too often. It should be called
      once per jiffy, and here it is called no less than 22 times in about
      3.5 milliseconds, meaning one call every 160 microseconds or so.
      
      Why do we need to call rcu_pending() and rcu_check_callbacks() from the
      idle loop of 32-bit x86, especially given that no other architecture does
      this?
      
      The following patch removes the call to rcu_pending() and
      rcu_check_callbacks() from the x86 32-bit idle loop in order to
      reduce the softirq load on idle systems.
      Reported-by: NDamien Wyart <damien.wyart@free.fr>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bf51935f
  14. 12 2月, 2009 1 次提交
    • B
      x86: use regparm(3) for passed-in pt_regs pointer · b12bdaf1
      Brian Gerst 提交于
      Some syscalls need to access the pt_regs structure, either to copy
      user register state or to modifiy it.  This patch adds stubs to load
      the address of the pt_regs struct into the %eax register, and changes
      the syscalls to take the pointer as an argument instead of relying on
      the assumption that the pt_regs structure overlaps the function
      arguments.
      
      Drop the use of regparm(1) due to concern about gcc bugs, and to move
      in the direction of the eventual removal of regparm(0) for asmlinkage.
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      b12bdaf1
  15. 11 2月, 2009 2 次提交
  16. 10 2月, 2009 3 次提交
    • T
      x86: implement x86_32 stack protector · 60a5317f
      Tejun Heo 提交于
      Impact: stack protector for x86_32
      
      Implement stack protector for x86_32.  GDT entry 28 is used for it.
      It's set to point to stack_canary-20 and have the length of 24 bytes.
      CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs
      to the stack canary segment on entry.  As %gs is otherwise unused by
      the kernel, the canary can be anywhere.  It's defined as a percpu
      variable.
      
      x86_32 exception handlers take register frame on stack directly as
      struct pt_regs.  With -fstack-protector turned on, gcc copies the
      whole structure after the stack canary and (of course) doesn't copy
      back on return thus losing all changed.  For now, -fno-stack-protector
      is added to all files which contain those functions.  We definitely
      need something better.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      60a5317f
    • T
      x86: make lazy %gs optional on x86_32 · ccbeed3a
      Tejun Heo 提交于
      Impact: pt_regs changed, lazy gs handling made optional, add slight
              overhead to SAVE_ALL, simplifies error_code path a bit
      
      On x86_32, %gs hasn't been used by kernel and handled lazily.  pt_regs
      doesn't have place for it and gs is saved/loaded only when necessary.
      In preparation for stack protector support, this patch makes lazy %gs
      handling optional by doing the followings.
      
      * Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.
      
      * Save and restore %gs along with other registers in entry_32.S unless
        LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
        even when LAZY_GS.  However, it adds no overhead to common exit path
        and simplifies entry path with error code.
      
      * Define different user_gs accessors depending on LAZY_GS and add
        lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
        lazy_*_gs() ops are used to save, load and clear %gs lazily.
      
      * Define ELF_CORE_COPY_KERNEL_REGS() which always read %gs directly.
      
      xen and lguest changes need to be verified.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ccbeed3a
    • T
      x86: add %gs accessors for x86_32 · d9a89a26
      Tejun Heo 提交于
      Impact: cleanup
      
      On x86_32, %gs is handled lazily.  It's not saved and restored on
      kernel entry/exit but only when necessary which usually is during task
      switch but there are few other places.  Currently, it's done by
      calling savesegment() and loadsegment() explicitly.  Define
      get_user_gs(), set_user_gs() and task_user_gs() and use them instead.
      
      While at it, clean up register access macros in signal.c.
      
      This cleans up code a bit and will help future changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9a89a26
  17. 23 1月, 2009 1 次提交
  18. 18 1月, 2009 1 次提交
  19. 16 1月, 2009 1 次提交
    • I
      percpu: add optimized generic percpu accessors · 6dbde353
      Ingo Molnar 提交于
      It is an optimization and a cleanup, and adds the following new
      generic percpu methods:
      
        percpu_read()
        percpu_write()
        percpu_add()
        percpu_sub()
        percpu_and()
        percpu_or()
        percpu_xor()
      
      and implements support for them on x86. (other architectures will fall
      back to a default implementation)
      
      The advantage is that for example to read a local percpu variable,
      instead of this sequence:
      
       return __get_cpu_var(var);
      
       ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
       ffffffff8102ca32:	81
       ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
       ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax
      
      We can get a single instruction by using the optimized variants:
      
       return percpu_read(var);
      
       ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax
      
      I also cleaned up the x86-specific APIs and made the x86 code use
      these new generic percpu primitives.
      
      tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
          * added percpu_and() for completeness's sake
          * made generic percpu ops atomic against preemption
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      6dbde353
  20. 04 1月, 2009 1 次提交
    • J
      x86: process_32.c fix style problems · befa9e78
      Jaswinder Singh Rajput 提交于
      Impact: cleanup
      
      Fix:
      
       WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
       WARNING: Use #include <linux/io.h> instead of <asm/io.h>
       WARNING: Use #include <linux/kdebug.h> instead of <asm/kdebug.h>
       WARNING: Use #include <linux/smp.h> instead of <asm/smp.h>
       ERROR: "foo * bar" should be "foo *bar"
       ERROR: trailing whitespace
       ERROR: spaces required around that ':' (ctx:WxO)
       ERROR: spaces required around that ':' (ctx:OxW)
      
       total: 7 errors, 4 warnings
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      befa9e78
  21. 20 12月, 2008 1 次提交
    • M
      x86, bts: add fork and exit handling · bf53de90
      Markus Metzger 提交于
      Impact: introduce new ptrace facility
      
      Add arch_ptrace_untrace() function that is called when the tracer
      detaches (either voluntarily or when the tracing task dies);
      ptrace_disable() is only called on a voluntary detach.
      
      Add ptrace_fork() and arch_ptrace_fork(). They are called when a
      traced task is forked.
      
      Clear DS and BTS related fields on fork.
      
      Release DS resources and reclaim memory in ptrace_untrace(). This
      releases resources already when the tracing task dies. We used to do
      that when the traced task dies.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bf53de90
  22. 12 12月, 2008 1 次提交
  23. 08 12月, 2008 1 次提交
    • F
      tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions · 8b96f011
      Frederic Weisbecker 提交于
      Impact: trace more functions
      
      When the function graph tracer is configured, three more files are not
      traced to prevent only four functions to be traced. And this impacts the
      normal function tracer too.
      
      arch/x86/kernel/process_64/32.c:
      
      I had crashes when I let this file traced. After some debugging, I saw
      that the "current" task point was changed inside__swtich_to(), ie:
      "write_pda(pcurrent, next_p);" inside process_64.c Since the tracer store
      the original return address of the function inside current, we had
      crashes. Only __switch_to() has to be excluded from tracing.
      
      kernel/module.c and kernel/extable.c:
      
      Because of a function used internally by the function graph tracer:
      __kernel_text_address()
      
      To let the other functions inside these files to be traced, this patch
      introduces the __notrace_funcgraph function prefix which is __notrace if
      function graph tracer is configured and nothing if not.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8b96f011
  24. 13 10月, 2008 1 次提交
  25. 24 9月, 2008 1 次提交
  26. 23 9月, 2008 1 次提交
    • T
      x86: prevent stale state of c1e_mask across CPU offline/online · 4faac97d
      Thomas Gleixner 提交于
      Impact: hang which happens across CPU offline/online on AMD C1E systems.
      
      When a CPU goes offline then the corresponding bit in the broadcast
      mask is cleared. For AMD C1E enabled CPUs we do not reenable the
      broadcast when the CPU comes online again as we do not clear the
      corresponding bit in the c1e_mask, which keeps track which CPUs
      have been switched to broadcast already. So on those !$@#& machines
      we never switch back to broadcasting after a CPU offline/online cycle.
      
      Clear the bit when the CPU plays dead.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      4faac97d
  27. 17 9月, 2008 1 次提交
  28. 05 9月, 2008 1 次提交
  29. 25 8月, 2008 3 次提交
  30. 15 8月, 2008 1 次提交
  31. 22 7月, 2008 2 次提交