1. 03 Feb 2009: 1 commit
  2. 17 Dec 2008: 1 commit
  3. 03 Dec 2008: 2 commits
  4. 02 Dec 2008: 1 commit
    • tracing/function-graph-tracer: support for x86-64 · 48d68b20
      Committed by Frederic Weisbecker
      Impact: extend and enable the function graph tracer to 64-bit x86
      
      This patch implements support for the function graph tracer on
      x86-64. Both static and dynamic tracing are supported.
      
      This requires a small amount of CPP-conditional asm in
      arch/x86/kernel/ftrace.c. I wanted to use probe_kernel_read/write to
      make the return-address saving/patching code more generic, but that
      causes tracing recursion.
      
      It would perhaps be useful to implement notrace versions of these
      functions for other arch ports.
      
      Note that arch/x86/process_64.c is not traced, as on x86-32. I first
      thought __switch_to() was responsible for crashes during tracing
      because I believed the current task was changed inside it, but that's
      actually not the case (the task is switched there, but not the
      "current" pointer).
      
      So I will have to investigate further to find the functions that cause
      trouble here, so that tracing of the other functions in this file can
      be enabled (there is no issue for now, as long as process_64.c stays
      out of the -pg flags).
      
      A small possible race condition is also fixed in this patch. When the
      tracer allocates a return stack dynamically, the current depth was
      initialized only after, not before, the allocation was published. An
      interrupt could occur in that window and, seeing that the return stack
      is allocated, try to trace it with a random, uninitialized depth. This
      is a preventive fix; I haven't actually hit the problem.
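      A minimal sketch of the resulting init-before-publish ordering (the
      field and function names are assumptions, mirroring my reading of
      kernel/trace/ftrace.c from this era, not the verbatim patch):
      
          /* Hypothetical condensation of the fix: the depth must be valid
           * before an interrupt can ever observe ret_stack != NULL. */
          static void graph_init_ret_stack(struct task_struct *t,
                                           struct ftrace_ret_stack *stack)
          {
                  t->curr_ret_stack = -1; /* initialize the depth first...  */
                  smp_wmb();              /* ...order it before the publish */
                  t->ret_stack = stack;   /* an IRQ now sees a valid depth  */
          }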
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tim Bird <tim.bird@am.sony.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 28 Nov 2008: 3 commits
  6. 27 Nov 2008: 3 commits
  7. 24 Nov 2008: 1 commit
  8. 23 Nov 2008: 3 commits
  9. 22 Nov 2008: 4 commits
  10. 21 Nov 2008: 3 commits
    • x86: entry_64.S: rename · 14ae22ba
      Committed by Ingo Molnar
      Impact: cleanup
      
      Rename:
      
         CFI_PUSHQ  =>  pushq_cfi
         CFI_POPQ   =>  popq_cfi
         CFI_MOVQ   =>  movq_cfi
      
      To make it blend better into regular assembly code.
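      For reference, a sketch of what such a renamed helper plausibly
      expands to (an assumption based on later trees' dwarf2.h, not quoted
      from this commit; it also assumes the CFI_ADJUST_CFA_OFFSET macro is
      in scope), shown as file-scope asm:
      
          /* Hypothetical shape of pushq_cfi: a push plus the matching
           * dwarf2 CFA adjustment, so call sites stay one line. */
          asm(
          ".macro pushq_cfi reg\n"
          "       pushq \\reg\n"
          "       CFI_ADJUST_CFA_OFFSET 8\n"
          ".endm\n"
          );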
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: clean up after: move entry_64.S register saving out of the macros, fix · e8a0e276
      Committed by Ingo Molnar
      Impact: build fix
      
      This broke builds with older binutils (2.16.1):
      
       arch/x86/kernel/entry_64.S: Assembler messages:
       arch/x86/kernel/entry_64.S:282: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:283: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:284: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:285: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:286: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:287: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:288: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:289: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:290: Error: too many positional arguments
      
      It took some time to figure out the detail that GAS chokes on:
      negative offsets. Rearrange the calculations to make sure we never go
      negative.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: clean up after: move entry_64.S register saving out of the macros · dcd072e2
      Committed by Alexander van Heukelum
      This add-on patch to "x86: move entry_64.S register saving out of the
      macros" visually cleans up the appearance of the code by introducing
      some basic helper macros. It also adds some cfi annotations which were
      missing.
      Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 20 Nov 2008: 1 commit
    • x86: move entry_64.S register saving out of the macros · d99015b1
      Committed by Alexander van Heukelum
      Here is a combined patch that moves "save_args" out-of-line for
      the interrupt macro and moves "error_entry" mostly out-of-line
      for the zeroentry and errorentry macros.
      
      The save_args function becomes really straightforward and easy to
      understand, with the possible exception of the stack-switch code,
      which now needs to copy the return address of the calling function.
      Normal interrupts arrive with ((~vector)-0x80) on the stack, which
      gets adjusted in common_interrupt:
      
      <common_interrupt>:
      (5)  addq   $0xffffffffffffff80,(%rsp)		/* -> ~(vector) */
      (4)  sub    $0x50,%rsp				/* space for registers */
      (5)  callq  ffffffff80211290 <save_args>
      (5)  callq  ffffffff80214290 <do_IRQ>
      <ret_from_intr>:
           ...
      
      An APIC interrupt stub now looks like this:
      
      <thermal_interrupt>:
      (5)  pushq  $0xffffffffffffff05			/* ~(vector) */
      (4)  sub    $0x50,%rsp				/* space for registers */
      (5)  callq  ffffffff80211290 <save_args>
      (5)  callq  ffffffff80212b8f <smp_thermal_interrupt>
      (5)  jmpq   ffffffff80211f93 <ret_from_intr>
      
      Similarly, the exception handler register saving function becomes
      simpler, without the need for any parameter shuffling. The stub for an
      exception without an error code looks like this:
      
      <overflow>:
      (6)  callq  *0x1cad12(%rip)        # ffffffff803dd448 <pv_irq_ops+0x38>
      (2)  pushq  $0xffffffffffffffff			/* no syscall */
      (4)  sub    $0x78,%rsp				/* space for registers */
      (5)  callq  ffffffff8030e3b0 <error_entry>
      (3)  mov    %rsp,%rdi				/* pt_regs pointer */
      (2)  xor    %esi,%esi				/* no error code */
      (5)  callq  ffffffff80213446 <do_overflow>
      (5)  jmpq   ffffffff8030e460 <error_exit>
      
      And one for an exception with an error code looks like this:
      
      <segment_not_present>:
      (6)  callq  *0x1cab92(%rip)        # ffffffff803dd448 <pv_irq_ops+0x38>
      (4)  sub    $0x78,%rsp				/* space for registers */
      (5)  callq  ffffffff8030e3b0 <error_entry>
      (3)  mov    %rsp,%rdi				/* pt_regs pointer */
      (5)  mov    0x78(%rsp),%rsi			/* load error code */
      (9)  movq   $0xffffffffffffffff,0x78(%rsp)	/* no syscall */
      (5)  callq  ffffffff80213209 <do_segment_not_present>
      (5)  jmpq   ffffffff8030e460 <error_exit>
      
      Unfortunately, this last type is more than 32 bytes (the listed
      instruction sizes sum to 42). But the total space savings due to this
      patch are about 2500 bytes on an SMP configuration, and I think the
      code is clearer than it was before. The tested kernels were
      non-paravirt ones (i.e., without the indirect call at the top of the
      exception handlers).
      
      Anyhow, I tested this patch on top of a recent -tip. The machine was a
      2x4-core Xeon at 2333 MHz. Measured were the delays between
      (almost-)adjacent rdtsc instructions. The graphs show how much time is
      spent outside of the program as a function of the measured delay; the
      area under the graph represents the total time spent outside the
      program. Eight instances of the rdtsc test were started, each pinned
      to a single cpu, and the histograms were added together. For each
      kernel two measurements were done: one in mostly idle condition, the
      other while running "bonnie++ -f" bound to cpu 0. Each measurement
      took 40 minutes of runtime. See the attached graphs for the results;
      they overlap almost everywhere, but there are small differences.
      Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 17 Nov 2008: 1 commit
  13. 14 Nov 2008: 1 commit
  14. 13 Nov 2008: 1 commit
  15. 12 Nov 2008: 1 commit
    • x86: 64 bits: shrink and align IRQ stubs · 939b7871
      Committed by H. Peter Anvin
      Move the IRQ stub generation to assembly to simplify it and for
      consistency with 32 bits.  Doing it in a C file with asm() statements
      doesn't help clarity, and it prevents some optimizations.
      
      Shrink the IRQ stubs down to just over four bytes each (seven fit into
      a 32-byte chunk). This shrinks the total icache consumption of the IRQ
      stubs down to an even kilobyte, if all of them are in active use; a
      back-of-envelope check follows below.
      
      The downside is that we end up with a double jump, which could have a
      negative effect on some pipelines. The double jump is always inside
      the same cacheline on any modern chip, though.
      
      To get the most effect, cache-align the IRQ stubs.
      
      This makes the 64-bit code match changes already done to the 32-bit
      code, and should open up irqinit*.c for unification.
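      A quick sanity check of the "even kilobyte" figure (the vector count
      of 224 is an assumption: 256 vectors minus the 32 reserved exception
      vectors, as was conventional on x86 at the time):
      
          #include <stdio.h>
          
          /* Back-of-envelope: seven ~4.5-byte stubs per 32-byte chunk,
           * over all external interrupt vectors. */
          int main(void)
          {
                  int vectors   = 256 - 32;  /* assumed external vectors */
                  int per_chunk = 7;         /* stubs per 32-byte chunk  */
                  int chunks    = (vectors + per_chunk - 1) / per_chunk;
          
                  printf("%d bytes\n", chunks * 32);  /* prints 1024 */
                  return 0;
          }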
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  16. 06 Nov 2008: 1 commit
    • ftrace: add quick function trace stop · 60a7ecf4
      Committed by Steven Rostedt
      Impact: quick start and stop of function tracer
      
      This patch adds a way to disable the function tracer quickly, without
      the need to run kstop_machine. It adds a new variable called
      function_trace_stop which, when set, stops the calls to functions from
      mcount. This is just an on/off switch and does not handle recursion
      (nesting) the way preempt_disable() does.
      
      Its main purpose is to help other tracers/debuggers start and stop
      tracing functions without the need to call kstop_machine.
      
      The config option HAVE_FUNCTION_TRACE_MCOUNT_TEST is added for archs
      that implement the test of function_trace_stop in the arch-dependent
      mcount code. Otherwise, the test is done in the C code.
      
      x86 is the only arch that supports this at the moment.
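      A minimal sketch of what the C-side fallback test plausibly looks like
      (the wrapper name and the __ftrace_trace_function indirection reflect
      my reading of kernel/trace/ftrace.c in this period, not a verbatim
      quote of the patch):
      
          /* Hypothetical sketch: skip the traced call while the
           * function_trace_stop switch is set. */
          static void ftrace_test_stop_func(unsigned long ip,
                                            unsigned long parent_ip)
          {
                  if (!function_trace_stop)
                          __ftrace_trace_function(ip, parent_ip);
          }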
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 31 Oct 2008: 1 commit
  18. 21 Oct 2008: 1 commit
  19. 14 Oct 2008: 1 commit
    • ftrace: x86 mcount stub · 0a37605c
      Committed by Steven Rostedt
      x86 now sets up the mcount locations through the build and no longer
      needs to record the ip when the function is executed. This patch
      changes the initial mcount to simply return; there's no need to do any
      other work. If the ftrace start-up test fails, the original mcount is
      what everything will use, so having it as fast as possible is a good
      thing.
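      The resulting stub reduces to a bare return. A sketch of that shape,
      expressed as file-scope asm (an illustration of the idea; the real
      code lives in arch/x86/kernel/entry_64.S):
      
          /* Hypothetical condensation: the default mcount does nothing
           * but return, so untraced kernels pay almost nothing per call. */
          asm(
          "       .globl mcount\n"
          "mcount:\n"
          "       retq\n"
          );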
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  20. 13 Oct 2008: 4 commits
  21. 24 Jul 2008: 4 commits
    • x86, 64-bit, dwarf2: push pushes 8 bytes and popf pops 8 · e0a5a5d9
      Committed by Alexander van Heukelum
      The CFI_ADJUST_CFA_OFFSET dwarf2 annotation of a push/popf pair in
      ret_from_fork wrongly used a value of 4; on x86-64, push and popf move
      the stack pointer by 8 bytes, not 4. Fix that.
      Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: heukelum@fastmail.fm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86_64 ia32 syscall audit fast-path · 5cbf1565
      Committed by Roland McGrath
      This adds fast paths for 32-bit syscall entry and exit when
      TIF_SYSCALL_AUDIT is set, but no other kind of syscall tracing is
      enabled. These paths do not need to save and restore all registers as
      the general case of tracing does. Avoiding the iret return path when
      syscall audit is enabled helps performance a lot.
      Signed-off-by: Roland McGrath <roland@redhat.com>
    • x86_64 syscall audit fast-path · 86a1c34a
      Committed by Roland McGrath
      This adds a fast path for 64-bit syscall entry and exit when
      TIF_SYSCALL_AUDIT is set, but no other kind of syscall tracing is
      enabled. This path does not need to save and restore all registers as
      the general case of tracing does. Avoiding the iret return path when
      syscall audit is enabled helps performance a lot; a sketch of the
      gating condition follows.
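      The gate for the fast path is, in effect, "audit is the only pending
      work bit". A C rendering of that test (the real check is done against
      TI_flags in entry_64.S; this standalone sketch with illustrative bit
      values is an assumption, not the kernel's code):
      
          /* Hypothetical illustration of the fast-path condition. */
          #define TIF_SYSCALL_AUDIT  (1u << 7)   /* illustrative value */
          #define TIF_ALLWORK_MASK   0x0000ffffu /* illustrative value */
          
          static int can_take_audit_fastpath(unsigned int ti_flags)
          {
                  /* no work bit other than audit may be pending */
                  return (ti_flags & TIF_SYSCALL_AUDIT) &&
                         !(ti_flags & (TIF_ALLWORK_MASK & ~TIF_SYSCALL_AUDIT));
          }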
      Signed-off-by: Roland McGrath <roland@redhat.com>
    • x86_64: remove bogus optimization in sysret_signal · 15e8f348
      Committed by Roland McGrath
      This short-circuit path in sysret_signal looks wrong to me. AFAICT, in
      practice the branch is never taken, and if it were, it would go wrong.
      To wit, try loading a module whose init function does
      set_thread_flag(TIF_IRET), and watch insmod crash (presumably with a
      wrong user stack pointer).
      
      This is because the FIXUP_TOP_OF_STACK work hasn't been done yet when
      we jump around the call to ptregscall_common and get to
      int_with_check, where it expects the user RSP, SS, CS and EFLAGS to
      have been stored by FIXUP_TOP_OF_STACK.
      
      I don't think it's normally possible to get to sysret_signal with no
      _TIF_DO_NOTIFY_MASK bits set anyway, so these two instructions are
      already superfluous.  If it ever did happen, it is harmless to call
      do_notify_resume with nothing for it to do.
      Signed-off-by: Roland McGrath <roland@redhat.com>
  22. 17 Jul 2008: 1 commit
    • x86 ptrace: unify syscall tracing · d4d67150
      Committed by Roland McGrath
      This unifies and cleans up the syscall tracing code on i386 and x86_64.
      
      Using a single function for entry and exit tracing on 32-bit made
      do_syscall_trace() into some terrible spaghetti. The logic is clear
      and simple using separate syscall_trace_enter() and
      syscall_trace_leave() functions, as on 64-bit.
      
      The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
      on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
      tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.
      
      Changing syscall_trace_enter() to return the syscall number shortens
      all the assembly paths, while adding the SYSEMU feature in a simple
      way; a condensed sketch of the hook follows.
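      A condensed sketch of the entry hook's new shape (the body below is my
      paraphrase, under the assumptions that the return value feeds the asm
      dispatch and that -1 suppresses the syscall; it is not the verbatim
      patch):
      
          /* Hypothetical condensation of syscall_trace_enter(). */
          asmlinkage long syscall_trace_enter(struct pt_regs *regs)
          {
                  long ret = regs->orig_ax;  /* default: run the syscall */
          
                  if (test_thread_flag(TIF_SYSCALL_EMU))
                          ret = -1L;         /* SYSEMU: skip the syscall */
          
                  if (test_thread_flag(TIF_SYSCALL_TRACE) &&
                      tracehook_report_syscall_entry(regs))
                          ret = -1L;         /* tracer asked for a skip  */
          
                  return ret;
          }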
      Signed-off-by: Roland McGrath <roland@redhat.com>