1. 13 2月, 2013 1 次提交
    • S
      tracing/syscalls: Allow archs to ignore tracing compat syscalls · f431b634
      Steven Rostedt 提交于
      The tracing of ia32 compat system calls has been a bit of a pain as they
      use different system call numbers than the 64bit equivalents.
      
      I wrote a simple 'lls' program that lists files. I compiled it as a i686
      ELF binary and ran it under a x86_64 box. This is the result:
      
      echo 0 > /debug/tracing/tracing_on
      echo 1 > /debug/tracing/events/syscalls/enable
      echo 1 > /debug/tracing/tracing_on ; ./lls ; echo 0 > /debug/tracing/tracing_on
      
      grep lls /debug/tracing/trace
      
      [.. skipping calls before TS_COMPAT is set ...]
      
                   lls-1127  [005] d...   936.409188: sys_recvfrom(fd: 0, ubuf: 4d560fc4, size: 0, flags: 8048034, addr: 8, addr_len: f7700420)
                   lls-1127  [005] d...   936.409190: sys_recvfrom -> 0x8a77000
                   lls-1127  [005] d...   936.409211: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)
                   lls-1127  [005] d...   936.409215: sys_lgetxattr -> 0xf76ff000
                   lls-1127  [005] d...   936.409223: sys_dup2(oldfd: 4d55ae9b, newfd: 4)
                   lls-1127  [005] d...   936.409228: sys_dup2 -> 0xfffffffffffffffe
                   lls-1127  [005] d...   936.409236: sys_newfstat(fd: 4d55b085, statbuf: 80000)
                   lls-1127  [005] d...   936.409242: sys_newfstat -> 0x3
                   lls-1127  [005] d...   936.409243: sys_removexattr(pathname: 3, name: ffcd0060)
                   lls-1127  [005] d...   936.409244: sys_removexattr -> 0x0
                   lls-1127  [005] d...   936.409245: sys_lgetxattr(pathname: 0, name: 19614, value: 1, size: 2)
                   lls-1127  [005] d...   936.409248: sys_lgetxattr -> 0xf76e5000
                   lls-1127  [005] d...   936.409248: sys_newlstat(filename: 3, statbuf: 19614)
                   lls-1127  [005] d...   936.409249: sys_newlstat -> 0x0
                   lls-1127  [005] d...   936.409262: sys_newfstat(fd: f76fb588, statbuf: 80000)
                   lls-1127  [005] d...   936.409279: sys_newfstat -> 0x3
                   lls-1127  [005] d...   936.409279: sys_close(fd: 3)
                   lls-1127  [005] d...   936.421550: sys_close -> 0x200
                   lls-1127  [005] d...   936.421558: sys_removexattr(pathname: 3, name: ffcd00d0)
                   lls-1127  [005] d...   936.421560: sys_removexattr -> 0x0
                   lls-1127  [005] d...   936.421569: sys_lgetxattr(pathname: 4d564000, name: 1b1abc, value: 5, size: 802)
                   lls-1127  [005] d...   936.421574: sys_lgetxattr -> 0x4d564000
                   lls-1127  [005] d...   936.421575: sys_capget(header: 4d70f000, dataptr: 1000)
                   lls-1127  [005] d...   936.421580: sys_capget -> 0x0
                   lls-1127  [005] d...   936.421580: sys_lgetxattr(pathname: 4d710000, name: 3000, value: 3, size: 812)
                   lls-1127  [005] d...   936.421589: sys_lgetxattr -> 0x4d710000
                   lls-1127  [005] d...   936.426130: sys_lgetxattr(pathname: 4d713000, name: 2abc, value: 3, size: 32)
                   lls-1127  [005] d...   936.426141: sys_lgetxattr -> 0x4d713000
                   lls-1127  [005] d...   936.426145: sys_newlstat(filename: 3, statbuf: f76ff3f0)
                   lls-1127  [005] d...   936.426146: sys_newlstat -> 0x0
                   lls-1127  [005] d...   936.431748: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)
      
      Obviously I'm not calling newfstat with a fd of 4d55b085. The calls are
      obviously incorrect, and confusing.
      
      Other efforts have been made to fix this:
      
      https://lkml.org/lkml/2012/3/26/367
      
      But the real solution is to rewrite the syscall internals and come up
      with a fixed solution. One that doesn't require all the kluge that the
      current solution has.
      
      Thus for now, instead of outputting incorrect data, simply ignore them.
      With this patch the changes now have:
      
       #> grep lls /debug/tracing/trace
       #>
      
      Compat system calls simply are not traced. If users need compat
      syscalls, then they should just use the raw syscall tracepoints.
      
      For an architecture to make their compat syscalls ignored, it must
      define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS (done in asm/ftrace.h) and also
      define an arch_trace_is_compat_syscall() function that will return true
      if the current task should ignore tracing the syscall.
      
      I want to stress that this change does not affect actual syscalls in any
      way, shape or form. It is only used within the tracing system and
      doesn't interfere with the syscall logic at all. The changes are
      consolidated nicely into trace_syscalls.c and asm/ftrace.h.
      
      I had to make one small modification to asm/thread_info.h and that was
      to remove the include of asm/ftrace.h. As asm/ftrace.h required the
      current_thread_info() it was causing include hell. That include was
      added back in 2008 when the function graph tracer was added:
      
       commit caf4b323 "tracing, x86: add low level support for ftrace return tracing"
      
      It does not need to be included there.
      
      Link: http://lkml.kernel.org/r/1360703939.21867.99.camel@gandalf.local.homeAcked-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f431b634
  2. 22 1月, 2013 1 次提交
  3. 23 8月, 2012 1 次提交
    • S
      ftrace/x86: Add support for -mfentry to x86_64 · d57c5d51
      Steven Rostedt 提交于
      If the kernel is compiled with gcc 4.6.0 which supports -mfentry,
      then use that instead of mcount.
      
      With mcount, frame pointers are forced with the -pg option and we
      get something like:
      
      <can_vma_merge_before>:
             55                      push   %rbp
             48 89 e5                mov    %rsp,%rbp
             53                      push   %rbx
             41 51                   push   %r9
             e8 fe 6a 39 00          callq  ffffffff81483d00 <mcount>
             31 c0                   xor    %eax,%eax
             48 89 fb                mov    %rdi,%rbx
             48 89 d7                mov    %rdx,%rdi
             48 33 73 30             xor    0x30(%rbx),%rsi
             48 f7 c6 ff ff ff f7    test   $0xfffffffff7ffffff,%rsi
      
      With -mfentry, frame pointers are no longer forced and the call looks
      like this:
      
      <can_vma_merge_before>:
             e8 33 af 37 00          callq  ffffffff81461b40 <__fentry__>
             53                      push   %rbx
             48 89 fb                mov    %rdi,%rbx
             31 c0                   xor    %eax,%eax
             48 89 d7                mov    %rdx,%rdi
             41 51                   push   %r9
             48 33 73 30             xor    0x30(%rbx),%rsi
             48 f7 c6 ff ff ff f7    test   $0xfffffffff7ffffff,%rsi
      
      This adds the ftrace hook at the beginning of the function before a
      frame is set up, and allows the function callbacks to be able to access
      parameters. As kprobes now can use function tracing (at least on x86)
      this speeds up the kprobe hooks that are at the beginning of the
      function.
      
      Link: http://lkml.kernel.org/r/20120807194100.130477900@goodmis.orgAcked-by: NIngo Molnar <mingo@kernel.org>
      Reviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d57c5d51
  4. 20 7月, 2012 4 次提交
  5. 01 6月, 2012 1 次提交
    • S
      ftrace: Synchronize variable setting with breakpoints · a192cd04
      Steven Rostedt 提交于
      When the function tracer starts modifying the code via breakpoints
      it sets a variable (modifying_ftrace_code) to inform the breakpoint
      handler to call the ftrace int3 code.
      
      But there's no synchronization between setting this code and the
      handler, thus it is possible for the handler to be called on another
      CPU before it sees the variable. This will cause a kernel crash as
      the int3 handler will not know what to do with it.
      
      I originally added smp_mb()'s to force the visibility of the variable
      but H. Peter Anvin suggested that I just make it atomic.
      
      [ Added comments as suggested by Peter Zijlstra ]
      Suggested-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a192cd04
  6. 28 4月, 2012 1 次提交
    • S
      ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine · 08d636b6
      Steven Rostedt 提交于
      This method changes x86 to add a breakpoint to the mcount locations
      instead of calling stop machine.
      
      Now that iret can be handled by NMIs, we perform the following to
      update code:
      
      1) Add a breakpoint to all locations that will be modified
      
      2) Sync all cores
      
      3) Update all locations to be either a nop or call (except breakpoint
         op)
      
      4) Sync all cores
      
      5) Remove the breakpoint with the new code.
      
      6) Sync all cores
      
      [
        Added updates that Masami suggested:
         Use unlikely(modifying_ftrace_code) in int3 trap to keep kprobes efficient.
         Don't use NOTIFY_* in ftrace handler in int3 as it is not a notifier.
      ]
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      08d636b6
  7. 17 5月, 2011 2 次提交
  8. 27 8月, 2009 1 次提交
    • J
      tracing: Remove FTRACE_SYSCALL_MAX definitions · 117226d1
      Jason Baron 提交于
      Remove the FTRACE_SYSCALL_MAX definitions now that we have converted the
      syscall event tracing code to use NR_syscalls.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Josh Stone <jistone@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anwin <hpa@zytor.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <f2240cdc8f0b1ca7617390c8f5ec90ba2bd348cf.1251146513.git.jbaron@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      117226d1
  9. 12 8月, 2009 1 次提交
    • J
      tracing: Update FTRACE_SYSCALL_MAX · 9daa77e2
      Jason Baron 提交于
      update FTRACE_SYSCALL_MAX to the current number of syscalls
      
      FTRACE_SYSCALL_MAX is a temporary solution to get the number of
      syscalls supported by the arch until we find a more dynamic way
      to get this number.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      9daa77e2
  10. 13 3月, 2009 1 次提交
  11. 19 2月, 2009 1 次提交
  12. 17 12月, 2008 1 次提交
  13. 26 11月, 2008 1 次提交
  14. 23 11月, 2008 1 次提交
  15. 16 11月, 2008 1 次提交
    • S
      ftrace: pass module struct to arch dynamic ftrace functions · 31e88909
      Steven Rostedt 提交于
      Impact: allow archs more flexibility on dynamic ftrace implementations
      
      Dynamic ftrace has largly been developed on x86. Since x86 does not
      have the same limitations as other architectures, the ftrace interaction
      between the generic code and the architecture specific code was not
      flexible enough to handle some of the issues that other architectures
      have.
      
      Most notably, module trampolines. Due to the limited branch distance
      that archs make in calling kernel core code from modules, the module
      load code must create a trampoline to jump to what will make the
      larger jump into core kernel code.
      
      The problem arises when this happens to a call to mcount. Ftrace checks
      all code before modifying it and makes sure the current code is what
      it expects. Right now, there is not enough information to handle modifying
      module trampolines.
      
      This patch changes the API between generic dynamic ftrace code and
      the arch dependent code. There is now two functions for modifying code:
      
        ftrace_make_nop(mod, rec, addr) - convert the code at rec->ip into
             a nop, where the original text is calling addr. (mod is the
             module struct if called by module init)
      
        ftrace_make_caller(rec, addr) - convert the code rec->ip that should
             be a nop into a caller to addr.
      
      The record "rec" now has a new field called "arch" where the architecture
      can add any special attributes to each call site record.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      31e88909
  16. 11 11月, 2008 1 次提交
    • F
      tracing, x86: add low level support for ftrace return tracing · caf4b323
      Frederic Weisbecker 提交于
      Impact: add infrastructure for function-return tracing
      
      Add low level support for ftrace return tracing.
      
      This plug-in stores return addresses on the thread_info structure of
      the current task.
      
      The index of the current return address is initialized when the task
      is the first one (init) and when a process forks (the child). It is
      not needed when a task does a sys_execve because after this syscall,
      it still needs to return on the kernel functions it called.
      
      Note that the code of return_to_handler has been suggested by Steven
      Rostedt as almost all of the ideas of improvements in this V3.
      
      For purpose of security, arch/x86/kernel/process_32.c is not traced
      because __switch_to() changes the current task during its execution.
      That could cause inconsistency in the stored return address of this
      function even if I didn't have any crash after testing with tracing on
      this function enabled.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      caf4b323
  17. 03 11月, 2008 1 次提交
  18. 31 10月, 2008 2 次提交
    • S
      ftrace: nmi safe code clean ups · a26a2a27
      Steven Rostedt 提交于
      Impact: cleanup
      
      This patch cleans up the NMI safe code for dynamic ftrace as suggested
      by Andrew Morton.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a26a2a27
    • S
      ftrace: nmi safe code modification · 17666f02
      Steven Rostedt 提交于
      Impact: fix crashes that can occur in NMI handlers, if their code is modified
      
      Modifying code is something that needs special care. On SMP boxes,
      if code that is being modified is also being executed on another CPU,
      that CPU will have undefined results.
      
      The dynamic ftrace uses kstop_machine to make the system act like a
      uniprocessor system. But this does not address NMIs, that can still
      run on other CPUs.
      
      One approach to handle this is to make all code that are used by NMIs
      not be traced. But NMIs can call notifiers that spread throughout the
      kernel and this will be very hard to maintain, and the chance of missing
      a function is very high.
      
      The approach that this patch takes is to have the NMIs modify the code
      if the modification is taking place. The way this works is that just
      writing to code executing on another CPU is not harmful if what is
      written is the same as what exists.
      
      Two buffers are used: an IP buffer and a "code" buffer.
      
      The steps that the patcher takes are:
      
       1) Put in the instruction pointer into the IP buffer
          and the new code into the "code" buffer.
       2) Set a flag that says we are modifying code
       3) Wait for any running NMIs to finish.
       4) Write the code
       5) clear the flag.
       6) Wait for any running NMIs to finish.
      
      If an NMI is executed, it will also write the pending code.
      Multiple writes are OK, because what is being written is the same.
      Then the patcher must wait for all running NMIs to finish before
      going to the next line that must be patched.
      
      This is basically the RCU approach to code modification.
      
      Thanks to Ingo Molnar for suggesting the idea, and to Arjan van de Ven
      for his guidence on what is safe and what is not.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      17666f02
  19. 23 10月, 2008 2 次提交
  20. 21 10月, 2008 1 次提交
  21. 14 10月, 2008 1 次提交
    • S
      ftrace: mcount call site on boot nops core · 68bf21aa
      Steven Rostedt 提交于
      This is the infrastructure to the converting the mcount call sites
      recorded by the __mcount_loc section into nops on boot. It also allows
      for using these sites to enable tracing as normal. When the __mcount_loc
      section is used, the "ftraced" kernel thread is disabled.
      
      This uses the current infrastructure to record the mcount call sites
      as well as convert them to nops. The mcount function is kept as a stub
      on boot up and not converted to the ftrace_record_ip function. We use the
      ftrace_record_ip to only record from the table.
      
      This patch does not handle modules. That comes with a later patch.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      68bf21aa
  22. 23 7月, 2008 1 次提交
    • V
      x86: consolidate header guards · 77ef50a5
      Vegard Nossum 提交于
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers are turned into single
         underscores.
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      77ef50a5
  23. 18 7月, 2008 1 次提交
  24. 24 6月, 2008 1 次提交