1. 02 Dec, 2014 1 commit
  2. 20 Nov, 2014 1 commit
    • S
      ftrace/x86/extable: Add is_ftrace_trampoline() function · aec0be2d
      Steven Rostedt (Red Hat) committed
      Stack traces that happen from function tracing check if the address
      on the stack is a __kernel_text_address(). That is, is the address
      kernel code. This calls core_kernel_text() which returns true
      if the address is part of the builtin kernel code. It also calls
      is_module_text_address() which returns true if the address belongs
      to module code.
      
      But what is missing is ftrace dynamically allocated trampolines.
      These trampolines are allocated for individual ftrace_ops that
      call the ftrace_ops callback functions directly. But if they do a
      stack trace, the code checking the stack won't detect them as they
      are neither core kernel code nor module address space.
      
      By adding another field to ftrace_ops that also stores the size of
      the trampoline assigned to it, we can create a new function called
      is_ftrace_trampoline() that returns true if the address is a
      dynamically allocated ftrace trampoline. Note, it ignores trampolines
      that are not dynamically allocated as they will return true with
      the core_kernel_text() function.
      
      Link: http://lkml.kernel.org/r/20141119034829.497125839@goodmis.org
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      aec0be2d
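The range check described in this commit can be sketched in plain C. This is a minimal userspace sketch, not the kernel code: `struct ops_sketch`, its fields, and `is_trampoline_sketch()` are hypothetical names standing in for ftrace_ops and is_ftrace_trampoline(), and the list walk ignores the locking the kernel would need.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct ftrace_ops. */
struct ops_sketch {
    unsigned long trampoline;      /* start address, 0 if none */
    unsigned long trampoline_size; /* bytes; 0 if not dynamically allocated */
    struct ops_sketch *next;
};

/* Returns true if addr lies inside any dynamically allocated trampoline. */
bool is_trampoline_sketch(const struct ops_sketch *list, unsigned long addr)
{
    for (const struct ops_sketch *op = list; op; op = op->next) {
        /* Trampolines without a recorded size are skipped here; the
         * commit notes they are already covered by core_kernel_text(). */
        if (!op->trampoline || !op->trampoline_size)
            continue;
        if (addr >= op->trampoline &&
            addr < op->trampoline + op->trampoline_size)
            return true;
    }
    return false;
}
```

A stack unwinder would then accept an address if it is core kernel text, module text, or passes a check like this one.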
  3. 12 Nov, 2014 2 commits
    • S
      ftrace: Add more information to ftrace_bug() output · 4fd3279b
      Steven Rostedt (Red Hat) committed
      With the introduction of the dynamic trampolines, it is useful that if
      things go wrong that ftrace_bug() produces more information about what
      the current state is. This can help debug issues that may arise.
      
      Ftrace has lots of checks to make sure that the state of the system it
      touches is exactly what it expects it to be. When it detects an abnormality
      it calls ftrace_bug() and disables itself to prevent any further damage.
      It is crucial that ftrace_bug() produces sufficient information that
      can be used to debug the situation.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Borislav Petkov <bp@suse.de>
      Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      4fd3279b
    • S
      ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines · 12cce594
      Steven Rostedt (Red Hat) committed
      When the static ftrace_ops (like function tracer) enables tracing, and it
      is the only callback that is referencing a function, a trampoline is
      dynamically allocated to the function that calls the callback directly
      instead of calling a loop function that iterates over all the registered
      ftrace ops (if more than one ops is registered).
      
      But when it comes to dynamically allocated ftrace_ops, where they may be
      freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe
      to free the trampoline. If a task was preempted while executing on the
      trampoline, there's currently no way to know when it will be off that
      trampoline.
      
      But this is not true when it comes to !CONFIG_PREEMPT. The current method
      of calling schedule_on_each_cpu() will force tasks off the trampoline,
      because they cannot schedule while on it (kernel preemption is not
      configured). That means it is safe to free a dynamically allocated
      ftrace ops trampoline when CONFIG_PREEMPT is not configured.
      
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      12cce594
  4. 01 Nov, 2014 2 commits
    • S
      ftrace/x86: Show trampoline call function in enabled_functions · 15d5b02c
      Steven Rostedt (Red Hat) committed
      The file /sys/kernel/debug/tracing/enabled_functions is used to debug
      ftrace function hooks. Add to the output what function is being called
      by the trampoline if the arch supports it.
      
      Add support for this feature in x86_64.
      
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      15d5b02c
    • S
      ftrace/x86: Add dynamic allocated trampoline for ftrace_ops · f3bea491
      Steven Rostedt (Red Hat) committed
      The current method of handling multiple function callbacks is to register
      a list function callback that calls all the other callbacks based on
      their hash tables and compare it to the function that the callback was
      called on. But this is very inefficient.
      
      For example, if you are tracing all functions in the kernel and then
      add a kprobe to a function such that the kprobe uses ftrace, the
      mcount trampoline will switch from calling the function trace callback
      to calling the list callback that will iterate over all registered
      ftrace_ops (in this case, the function tracer and the kprobes callback).
      That means for every function being traced it checks the hash of the
      ftrace_ops for function tracing and kprobes, even though the kprobes
      is only set at a single function. The kprobes ftrace_ops is checked
      for every function being traced!
      
      Instead of calling the list function for functions that are only being
      traced by a single callback, we can call a dynamically allocated
      trampoline that calls the callback directly. The function graph tracer
      already uses a direct call trampoline when it is being traced by itself
      but it is not dynamically allocated. Its trampoline is static in the
      kernel core. The infrastructure that called the function graph trampoline
      can also be used to call a dynamically allocated one.
      
      For now, only ftrace_ops that are not dynamically allocated can have
      a trampoline. That is, users such as function tracer or stack tracer.
      kprobes and perf allocate their ftrace_ops, and until there's a safe
      way to free the trampoline, it can not be used. The dynamically allocated
      ftrace_ops may, however, use the trampoline if the kernel is not
      compiled with CONFIG_PREEMPT. But that will come later.
      Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f3bea491
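The cost difference between the list callback and a direct trampoline can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `struct ops_sketch`, the single-address filter that stands in for the per-ops hash, and the `hash_checks` counter are hypothetical and do not reflect the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*callback_fn)(unsigned long ip);

struct ops_sketch {
    callback_fn func;
    /* Hypothetical filter: 0 traces every function,
     * otherwise only the function at filter_ip. */
    unsigned long filter_ip;
};

int hash_checks;      /* counts filter lookups */
int calls_a, calls_b; /* counts callback invocations */

void callback_a(unsigned long ip) { (void)ip; calls_a++; }
void callback_b(unsigned long ip) { (void)ip; calls_b++; }

bool ops_matches(const struct ops_sketch *op, unsigned long ip)
{
    hash_checks++;
    return op->filter_ip == 0 || op->filter_ip == ip;
}

/* List function: every traced call checks every registered ops. */
void list_dispatch(const struct ops_sketch *ops, size_t n, unsigned long ip)
{
    for (size_t i = 0; i < n; i++)
        if (ops_matches(&ops[i], ip))
            ops[i].func(ip);
}

/* Direct trampoline: the call site is wired to exactly one callback,
 * so no filter lookup happens at call time. */
void direct_dispatch(const struct ops_sketch *op, unsigned long ip)
{
    op->func(ip);
}
```

With one ops tracing everything and a second ops attached to a single function, the list path performs two lookups per traced call, while the direct path performs none.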
  5. 17 Jul, 2014 1 commit
  6. 04 Jun, 2014 1 commit
  7. 14 May, 2014 3 commits
    • S
      ftrace: Remove FTRACE_UPDATE_MODIFY_CALL_REGS flag · f1b2f2bd
      Steven Rostedt (Red Hat) committed
      As the decision to what needs to be done (converting a call to the
      ftrace_caller to ftrace_caller_regs or to convert from ftrace_caller_regs
      to ftrace_caller) can easily be determined from the rec->flags of
      FTRACE_FL_REGS and FTRACE_FL_REGS_EN, there's no need to have the
      ftrace_check_record() return either a UPDATE_MODIFY_CALL_REGS or a
      UPDATE_MODIFY_CALL. Just the latter is enough. This added flag causes
      more complexity than is required. Remove it.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f1b2f2bd
    • S
      ftrace: Make get_ftrace_addr() and get_ftrace_addr_old() global · 7413af1f
      Steven Rostedt (Red Hat) committed
      Move and rename get_ftrace_addr() and get_ftrace_addr_old() to
      ftrace_get_addr_new() and ftrace_get_addr_curr() respectively.
      
      This moves these two helper functions into the generic code, out of
      the arch-specific code, and gives them better generic names. This
      will allow other archs to use them and makes it a bit easier to
      work on getting separate trampolines for different functions.
      
      ftrace_get_addr_new() returns the trampoline address that the mcount
      call address will be converted to.
      
      ftrace_get_addr_curr() returns the trampoline address of what the
      mcount call address currently jumps to.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      7413af1f
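The relationship between the two helpers can be pictured with a short sketch. It assumes that one record flag says which caller the record should use and another says which caller is currently enabled, as the FTRACE_FL_REGS and FTRACE_FL_REGS_EN pair in the previous entry suggests. The flag values, the placeholder addresses, and all `_sketch` names are hypothetical.

```c
/* Hypothetical flag bits modelled on FTRACE_FL_REGS / FTRACE_FL_REGS_EN. */
#define FL_REGS_SKETCH    (1UL << 0) /* record wants the regs-saving caller */
#define FL_REGS_EN_SKETCH (1UL << 1) /* record currently calls the regs caller */

/* Placeholder trampoline addresses, for illustration only. */
#define CALLER_ADDR_SKETCH      0x1000UL
#define REGS_CALLER_ADDR_SKETCH 0x2000UL

/* Address the mcount call site will be converted to. */
unsigned long get_addr_new_sketch(unsigned long flags)
{
    return (flags & FL_REGS_SKETCH) ? REGS_CALLER_ADDR_SKETCH
                                    : CALLER_ADDR_SKETCH;
}

/* Address the mcount call site currently jumps to. */
unsigned long get_addr_curr_sketch(unsigned long flags)
{
    return (flags & FL_REGS_EN_SKETCH) ? REGS_CALLER_ADDR_SKETCH
                                       : CALLER_ADDR_SKETCH;
}

/* The call needs rewriting only when the two answers differ. */
int needs_call_update_sketch(unsigned long flags)
{
    return get_addr_new_sketch(flags) != get_addr_curr_sketch(flags);
}
```

Because the direction of the change falls out of comparing the two addresses, a single "modify call" update type is enough in this model.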
    • S
      ftrace/x86: Get the current mcount addr for add_breakpoint() · 94792ea0
      Steven Rostedt (Red Hat) committed
      The add_breakpoint() code in the ftrace updating gets the address
      of what the call will become, but if the mcount address is changing
      from regs to non-regs ftrace_caller or vice versa, it will use what
      the record currently is.
      
      This is rather silly as the code should always use what is currently
      there regardless of if it's changing the regs function or just converting
      to a nop.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      94792ea0
  8. 22 Apr, 2014 1 commit
  9. 07 Mar, 2014 4 commits
  10. 04 Mar, 2014 2 commits
    • P
      ftrace/x86: One more missing sync after fixup of function modification failure · 12729f14
      Petr Mladek committed
      If a failure occurs while modifying an ftrace function, the code bails out
      and removes the tracepoints to restore what the code originally was.
      
      The final sync run across the CPUs is missing after the fix up is done
      and before the ftrace int3 handler flag is reset.
      
      Here's the description of the problem:
      
      	CPU0				CPU1
      	----				----
        remove_breakpoint();
        modifying_ftrace_code = 0;
      
      				[still sees breakpoint]
      				<takes trap>
      				[sees modifying_ftrace_code as zero]
      				[no breakpoint handler]
      				[goto failed case]
      				[trap exception - kernel breakpoint, no
      				 handler]
      				BUG()
      
      Link: http://lkml.kernel.org/r/1393258342-29978-2-git-send-email-pmladek@suse.cz
      
      Fixes: 8a4d0a68 "ftrace: Use breakpoint method to update ftrace caller"
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Petr Mladek <pmladek@suse.cz>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      12729f14
    • S
      ftrace/x86: Run a sync after fixup on failure · c932c6b7
      Steven Rostedt (Red Hat) committed
      If a failure occurs while enabling a trace, it bails out and will remove
      the tracepoints to be back to what the code originally was. But the fix
      up had some bugs in it. By injecting a failure in the code, the fix up
      ran to completion, but shortly afterward the system rebooted.
      
      There were two bugs here.
      
      The first was that there was no final sync run across the CPUs after the
      fix up was done, and before the ftrace int3 handler flag was reset. That
      means that other CPUs could still see the breakpoint and trigger on it
      long after the flag was cleared, and the int3 handler would think it was
      a spurious interrupt. Worse yet, the int3 handler could hit other breakpoints
      because the ftrace int3 handler flag would have prevented the int3 handler
      from going further.
      
      Here's a description of the issue:
      
      	CPU0				CPU1
      	----				----
        remove_breakpoint();
        modifying_ftrace_code = 0;
      
      				[still sees breakpoint]
      				<takes trap>
      				[sees modifying_ftrace_code as zero]
      				[no breakpoint handler]
      				[goto failed case]
      				[trap exception - kernel breakpoint, no
      				 handler]
      				BUG()
      
      The second bug was that the removal of the breakpoints required the
      "within()" logic updates instead of accessing the ip address directly.
      The kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set, and
      the removal of the breakpoint is a modification of the kernel text.
      The ftrace_write() includes the "within()" logic, whereas
      probe_kernel_write() does not. This prevented the breakpoint from being
      removed at all.
      
      Link: http://lkml.kernel.org/r/1392650573-3390-1-git-send-email-pmladek@suse.cz
      
      Reported-by: Petr Mladek <pmladek@suse.cz>
      Tested-by: Petr Mladek <pmladek@suse.cz>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      c932c6b7
  11. 12 Feb, 2014 1 commit
    • S
      ftrace/x86: Use breakpoints for converting function graph caller · 87fbb2ac
      Steven Rostedt (Red Hat) committed
      When the conversion was made to remove stop machine and use the breakpoint
      logic instead, the modification of the function graph caller is still
      done directly as though it was being done under stop machine.
      
      As it is not converted via stop machine anymore, there is a possibility
      that the code could be laid across cache lines and if another CPU is
      accessing that function graph call when it is being updated, it could
      cause a General Protection Fault.
      
      Convert the update of the function graph caller to use the breakpoint
      method as well.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: stable@vger.kernel.org # 3.5+
      Fixes: 08d636b6 "ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine"
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      87fbb2ac
  12. 06 Nov, 2013 1 commit
  13. 17 Nov, 2012 1 commit
  14. 20 Jul, 2012 2 commits
    • S
      ftrace/x86: Add save_regs for i386 function calls · 4de72395
      Steven Rostedt committed
      Add saving full regs for function tracing on i386.
      The saving of regs was influenced by patches sent out by
      Masami Hiramatsu.
      
      Link: http://lkml.kernel.org/r/20120711195745.379060003@goodmis.org
      
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      4de72395
    • S
      ftrace/x86: Add separate function to save regs · 08f6fba5
      Steven Rostedt committed
      Add a way to have different functions calling different trampolines.
      If a ftrace_ops wants regs saved on the return, then have only the
      functions with ops registered to save regs. Functions registered by
      other ops would not be affected, unless the functions overlap.
      
      If one ftrace_ops registered functions A, B and C and another ops
      registered functions to save regs on A, and D, then only functions
      A and D would be saving regs. Function B and C would work as normal.
      Although A is registered by both ops: normal and saves regs; this is fine
      as saving the regs is needed to satisfy one of the ops that calls it
      but the regs are ignored by the other ops function.
      
      x86_64 implements the full regs saving, and i386 just passes a NULL
      for regs to satisfy the ftrace_ops passing. An arch must supply
      both regs and ftrace_ops parameters, even if regs is just NULL.
      
      It is OK for an arch to pass NULL regs. All function trace users that
      require regs passing must add the flag FTRACE_OPS_FL_SAVE_REGS when
      registering the ftrace_ops. If the arch does not support saving regs
      then the ftrace_ops will fail to register. The flag
      FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED may be set that will prevent the
      ftrace_ops from failing to register. In this case, the handler may
      either check if regs is not NULL or check if ARCH_SUPPORTS_FTRACE_SAVE_REGS.
      If the arch supports passing regs it will set this macro and pass regs
      for ops that request them. All other archs will just pass NULL.
      
      Link: http://lkml.kernel.org/r/20120711195745.107705970@goodmis.org
      
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      08f6fba5
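The registration rule for the two flags can be sketched as follows. This is a simplified model under stated assumptions: the flag values, `register_ops_sketch()`, `handler_sketch()`, and `struct regs_sketch` are hypothetical, and the `arch_saves_regs` parameter stands in for what the commit expresses with ARCH_SUPPORTS_FTRACE_SAVE_REGS.

```c
#include <stddef.h>

/* Hypothetical flag values; the names mirror the flags in the commit text. */
#define OPS_FL_SAVE_REGS_SKETCH              (1u << 0)
#define OPS_FL_SAVE_REGS_IF_SUPPORTED_SKETCH (1u << 1)

struct regs_sketch {
    unsigned long ip;
};

/* Returns 0 on success, -1 when the ops insists on regs that the arch
 * cannot supply. The IF_SUPPORTED flag turns that failure into success. */
int register_ops_sketch(unsigned int flags, int arch_saves_regs)
{
    if (!arch_saves_regs &&
        (flags & OPS_FL_SAVE_REGS_SKETCH) &&
        !(flags & OPS_FL_SAVE_REGS_IF_SUPPORTED_SKETCH))
        return -1;
    return 0;
}

/* A handler registered with IF_SUPPORTED must tolerate regs == NULL.
 * Here it falls back to the ip argument when no regs were saved. */
unsigned long handler_sketch(unsigned long ip, const struct regs_sketch *regs)
{
    return regs ? regs->ip : ip;
}
```

The NULL check in the handler corresponds to the first of the two options the commit message gives for handlers that opt into best-effort regs saving.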
  15. 01 Jun, 2012 2 commits
    • S
      ftrace: Use breakpoint method to update ftrace caller · 8a4d0a68
      Steven Rostedt committed
      On boot up and module load, it is fine to modify the code directly,
      without the use of breakpoints. This is because boot up modification
      is done before SMP is initialized, thus the modification is serial,
      and module load is done before the module executes.
      
      But after that we must use a SMP safe method to modify running code.
      Otherwise, if we are running the function tracer and update its
      function (by starting off the stack tracer, or perf tracing)
      the change of the function called by the ftrace trampoline is done
      directly. If this is being executed on another CPU, that CPU may
      take a GPF and crash the kernel.
      
      The breakpoint method is used to change the nops at all the functions, but
      the change of the ftrace callback handler itself was still using a
      direct modification. If tracing was enabled and the function callback
      was changed then another CPU could fault if it was currently calling
      the original callback. This modification must use the breakpoint method
      too.
      
      Note, the direct method is still used for boot up and module load.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      8a4d0a68
    • S
      ftrace: Synchronize variable setting with breakpoints · a192cd04
      Steven Rostedt committed
      When the function tracer starts modifying the code via breakpoints
      it sets a variable (modifying_ftrace_code) to inform the breakpoint
      handler to call the ftrace int3 code.
      
      But there's no synchronization between setting this code and the
      handler, thus it is possible for the handler to be called on another
      CPU before it sees the variable. This will cause a kernel crash as
      the int3 handler will not know what to do with it.
      
      I originally added smp_mb()'s to force the visibility of the variable
      but H. Peter Anvin suggested that I just make it atomic.
      
      [ Added comments as suggested by Peter Zijlstra ]
      Suggested-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      a192cd04
  16. 17 May, 2012 1 commit
  17. 04 May, 2012 1 commit
  18. 28 Apr, 2012 2 commits
    • S
      ftrace/x86: Remove the complex ftrace NMI handling code · 4a6d70c9
      Steven Rostedt committed
      As ftrace function tracing would require modifying code that could
      be executed in NMI context, which is not stopped with stop_machine(),
      ftrace had to do a complex algorithm with various stages of setup
      and memory barriers to make it work.
      
      With the new breakpoint method, this is no longer required. The changes
      to the code can be done without any problem in NMI context, as well as
      without stop machine altogether. Remove the complex code as it is
      no longer needed.
      
      Also, a lot of the notrace annotations could be removed from the
      NMI code as it is now safe to trace them. With the exception of
      do_nmi itself, which does some special work to handle running in
      the debug stack. The breakpoint method can cause NMIs to double
      nest the debug stack if it's not setup properly, and that is done
      in do_nmi(), thus that function must not be traced.
      
      (Note the arch sh may want to do the same)
      
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      4a6d70c9
    • S
      ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine · 08d636b6
      Steven Rostedt committed
      This method changes x86 to add a breakpoint to the mcount locations
      instead of calling stop machine.
      
      Now that iret can be handled by NMIs, we perform the following to
      update code:
      
      1) Add a breakpoint to all locations that will be modified
      
      2) Sync all cores
      
      3) Update all locations to be either a nop or call (except breakpoint
         op)
      
      4) Sync all cores
      
      5) Remove the breakpoint with the new code.
      
      6) Sync all cores
      
      [
        Added updates that Masami suggested:
         Use unlikely(modifying_ftrace_code) in int3 trap to keep kprobes efficient.
         Don't use NOTIFY_* in ftrace handler in int3 as it is not a notifier.
      ]
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      08d636b6
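The six steps listed in this commit can be walked through on a plain byte buffer. The sketch below patches a single 5-byte site in userspace, whereas the commit applies each step to all locations before syncing. `patch_with_breakpoint()` and `sync_cores_sketch()` are hypothetical names, and the sync is only a counter standing in for the real cross-CPU synchronization.

```c
#include <stddef.h>
#include <string.h>

#define INSN_SIZE 5   /* size of the nop or call at an mcount site */
#define INT3      0xCC /* x86 breakpoint opcode */

int sync_count; /* stands in for "sync all cores" */

void sync_cores_sketch(void)
{
    sync_count++;
}

/* Replace the instruction at site with new_insn without ever exposing
 * a half-written instruction behind a valid first byte. */
void patch_with_breakpoint(unsigned char *site, const unsigned char *new_insn)
{
    site[0] = INT3;                                /* 1) add a breakpoint */
    sync_cores_sketch();                           /* 2) sync all cores */
    memcpy(site + 1, new_insn + 1, INSN_SIZE - 1); /* 3) update all but the
                                                    *    breakpoint byte */
    sync_cores_sketch();                           /* 4) sync all cores */
    site[0] = new_insn[0];                         /* 5) replace the breakpoint
                                                    *    with the new code */
    sync_cores_sketch();                           /* 6) sync all cores */
}
```

Any CPU that executes the site between steps 1 and 5 hits the breakpoint instead of a torn instruction, which is why the int3 handler has to know that ftrace is modifying code.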
  19. 26 May, 2011 1 commit
  20. 19 Apr, 2011 1 commit
  21. 10 Mar, 2011 1 commit
    • S
      ftrace/graph: Trace function entry before updating index · 722b3c74
      Steven Rostedt committed
      Currently the index to the ret_stack is updated and the real return address
      is saved in the ret_stack. Then we call the trace function. The trace
      function could decide that it doesn't want to trace this function
      (ex. set_graph_function does not match) and it will return 0 which means
      not to trace this call.
      
      The normal function graph tracer has this code:
      
      	if (!(trace->depth || ftrace_graph_addr(trace->func)) ||
      	      ftrace_graph_ignore_irqs())
      		return 0;
      
      What this states is, if the trace depth (which is curr_ret_stack)
      is zero (top of nested functions) then test if we want to trace this
      function. If this function is not to be traced, then return  0 and
      the rest of the function graph tracer logic will not trace this function.
      
      The problem arises when an interrupt comes in after we updated the
      curr_ret_stack. The next function that gets called will have a trace->depth
      of 1. Which fools this trace code into thinking that we are in a nested
      function, and that we should trace. This causes interrupts to be traced
      when they should not be.
      
      The solution is to trace the function first and then update the ret_stack.
      Reported-by: zhiping zhong <xzhong86@163.com>
      Reported-by: wu zhangjin <wuzhangjin@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      722b3c74
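The ordering problem can be reproduced in a small single-threaded model. The sketch below is an illustration under assumptions, not the tracer's code: `depth` stands in for curr_ret_stack, the interrupt is simulated by a direct call inside the window, the filter accepts only a hypothetical function id 1, and the ftrace_graph_ignore_irqs() part of the quoted check is left out.

```c
#include <stdbool.h>

int depth;      /* stand-in for curr_ret_stack */
int traced_irq; /* how often the interrupt's function was traced */

/* Mirrors the quoted check: trace if nested, or if the function matches.
 * Hypothetical filter: only function id 1 is selected. */
bool entry_wanted(int func, int cur_depth)
{
    return cur_depth > 0 || func == 1;
}

/* A function (id 99) called from an interrupt enters the tracer. */
void irq_function_entry(void)
{
    if (entry_wanted(99, depth))
        traced_irq++;
}

/* Old order: bump the index, then ask the tracer. An interrupt landing
 * in between sees depth == 1 and is traced by mistake. */
void enter_old_order(int func, bool irq_in_window)
{
    depth++;
    if (irq_in_window)
        irq_function_entry();
    if (!entry_wanted(func, depth - 1))
        depth--; /* tracer declined, undo the push */
}

/* New order: ask the tracer first, update the index only if it agrees. */
void enter_new_order(int func, bool irq_in_window)
{
    if (irq_in_window)
        irq_function_entry(); /* still sees depth == 0 */
    if (entry_wanted(func, depth))
        depth++;
}
```

In the old order an unselected function briefly raises the depth, and that brief window is enough for an interrupt to be treated as nested.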
  22. 30 Dec, 2010 1 commit
  23. 18 Nov, 2010 1 commit
    • M
      x86: Add RO/NX protection for loadable kernel modules · 84e1c6bb
      matthieu castet committed
      This patch is a logical extension of the protection provided by
      CONFIG_DEBUG_RODATA to LKMs. The protection is provided by
      splitting module_core and module_init into three logical parts
      each and setting appropriate page access permissions for each
      individual section:
      
       1. Code: RO+X
       2. RO data: RO+NX
       3. RW data: RW+NX
      
      In order to achieve proper protection, layout_sections() has
      been modified to align each of the three parts mentioned above
      onto page boundary. Next, the corresponding page access
      permissions are set right before successful exit from
      load_module(). Further, free_module() and sys_init_module have
      been modified to set module_core and module_init as RW+NX right
      before calling module_free().
      
      By default, the original section layout and access flags are
      preserved. When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y,
      the patch will page-align each group of sections to ensure that
      each page contains only one type of content and will enforce
      RO/NX for each group of pages.
      
        -v1: Initial proof-of-concept patch.
        -v2: The patch has been re-written to reduce the number of #ifdefs
             and to make it architecture-agnostic. Code formatting has also
             been corrected.
        -v3: Opportunistic RO/NX protection is now unconditional. Section
             page-alignment is enabled when CONFIG_DEBUG_RODATA=y.
        -v4: Removed most macros and improved coding style.
        -v5: Changed page-alignment and RO/NX section size calculation
        -v6: Fixed comments. Restricted RO/NX enforcement to x86 only
        -v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added
             calls to set_all_modules_text_rw() and set_all_modules_text_ro()
             in ftrace
        -v8: updated for compatibility with linux 2.6.33-rc5
        -v9: coding style fixes
       -v10: more coding style fixes
       -v11: minor adjustments for -tip
       -v12: minor adjustments for v2.6.35-rc2-tip
       -v13: minor adjustments for v2.6.37-rc1-tip
      Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
      Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Reviewed-by: James Morris <jmorris@namei.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4CE2F914.9070106@free.fr>
      [ minor cleanliness edits, -v14: build failure fix ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      84e1c6bb
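The page alignment of the three parts can be sketched with a little arithmetic. This is an illustrative model only: the 4096-byte page size, `struct module_layout_sketch`, and `layout_sketch()` are assumptions and do not reproduce what layout_sections() does in the kernel.

```c
#define PAGE_SIZE_SKETCH 4096UL /* assumed page size */

struct module_layout_sketch {
    unsigned long text_end; /* end of the code part (RO+X) */
    unsigned long ro_end;   /* end of the read-only data part (RO+NX) */
    unsigned long total;    /* end of the writable data part (RW+NX) */
};

/* Round x up to the next page boundary. */
unsigned long page_align_up(unsigned long x)
{
    return (x + PAGE_SIZE_SKETCH - 1) & ~(PAGE_SIZE_SKETCH - 1);
}

/* Place the three parts back to back, each ending on a page boundary,
 * so that no page mixes two kinds of content. */
struct module_layout_sketch layout_sketch(unsigned long text_size,
                                          unsigned long ro_size,
                                          unsigned long rw_size)
{
    struct module_layout_sketch l;

    l.text_end = page_align_up(text_size);
    l.ro_end = page_align_up(l.text_end + ro_size);
    l.total = page_align_up(l.ro_end + rw_size);
    return l;
}
```

Once every boundary sits on a page edge, permissions can be applied per page range without one part inheriting the flags of its neighbour.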
  24. 21 Sep, 2010 1 commit
  25. 25 Feb, 2010 1 commit
    • S
      ftrace: Remove memory barriers from NMI code when not needed · 0c54dd34
      Steven Rostedt committed
      The code in stop_machine that modifies the kernel text has a bit
      of logic to handle the case of NMIs. stop_machine does not prevent
      NMIs from executing, and if an NMI were to trigger on another CPU
      as the modifying CPU is changing the NMI text, a GPF could result.
      
      To prevent the GPF, the NMI calls ftrace_nmi_enter() which may
      modify the code first, then any other NMIs will just change the
      text to the same content which will do no harm. The code that
      stop_machine called must wait for NMIs to finish while it changes
      each location in the kernel. That code may also change the text
      to what the NMI changed it to. The key is that the text will never
      change content while another CPU is executing it.
      
      To make the above work, the call to ftrace_nmi_enter() must also
      do a smp_mb() as well as atomic_inc().  But for applications like
      perf that require a high number of NMIs for profiling, this can have
      a dramatic effect on the system. Not only is it doing a full memory
      barrier on both nmi_enter() as well as nmi_exit() it is also
      modifying a global variable with an atomic operation. This kills
      performance on large SMP machines.
      
      Since the memory barriers are only needed when ftrace is in the
      process of modifying the text (which is seldom), this patch
      adds a "modifying_code" variable that gets set before stop machine
      is executed and cleared afterwards.
      
      The NMIs will check this variable and store it in a per CPU
      "save_modifying_code" variable that it will use to check if it
      needs to do the memory barriers and atomic dec on NMI exit.
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      0c54dd34
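The fast path this commit introduces can be modelled with C11 atomics. The sketch is a single-CPU illustration under assumptions: the function names and the `barriers` counter are hypothetical, `save_modifying_code` is a plain global here although the commit describes it as per CPU, and `atomic_thread_fence()` stands in for smp_mb().

```c
#include <stdatomic.h>

atomic_int nmi_running;  /* NMIs currently inside the slow path */
int modifying_code;      /* set around the stop machine text update */
int save_modifying_code; /* per CPU in the commit; one copy here */
int barriers;            /* counts full memory barriers issued */

void nmi_enter_sketch(void)
{
    /* Remember what was seen on entry so exit makes the same choice. */
    save_modifying_code = modifying_code;
    if (!save_modifying_code)
        return; /* fast path: no atomic operation, no barrier */

    atomic_fetch_add(&nmi_running, 1);
    atomic_thread_fence(memory_order_seq_cst);
    barriers++;
}

void nmi_exit_sketch(void)
{
    if (!save_modifying_code)
        return; /* fast path */

    atomic_thread_fence(memory_order_seq_cst);
    barriers++;
    atomic_fetch_sub(&nmi_running, 1);
}
```

Saving the flag on entry matters because the modifier may clear modifying_code while the NMI is still running, and the exit path must still undo the increment it made.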
  26. 17 Feb, 2010 1 commit
  27. 03 Nov, 2009 1 commit
  28. 14 Oct, 2009 1 commit
    • F
      tracing: Move syscalls metadata handling from arch to core · c44fc770
      Frederic Weisbecker committed
      Most of the syscalls metadata processing is done from arch.
      But these operations are mostly generic across archs. Especially now
      that we have a common variable name that expresses the number of
      syscalls supported by an arch: NR_syscalls, the only remaining bits
      that need to reside in arch is the syscall nr to addr translation.
      
      v2: Compare syscalls symbols only after the "sys" prefix so that we
          avoid spurious mismatches with archs that have syscalls wrappers,
          in which case syscalls symbols have "SyS" prefixed aliases.
          (Reported by: Heiko Carstens)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      c44fc770
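The v2 note about comparing symbols only after the prefix can be shown with a short helper. This is a minimal sketch: `syscall_name_matches()` is a hypothetical name, and the fixed three-character skip is an assumption based on the "sys" and "SyS" prefixes named in the commit text.

```c
#include <stdbool.h>
#include <string.h>

#define PREFIX_LEN 3 /* length of "sys" or "SyS" */

/* Compare a kernel symbol with a syscall name while ignoring the first
 * three characters, so "SyS_read" and "sys_read" are treated as equal. */
bool syscall_name_matches(const char *symbol, const char *name)
{
    if (strlen(symbol) < PREFIX_LEN || strlen(name) < PREFIX_LEN)
        return false;
    return strcmp(symbol + PREFIX_LEN, name + PREFIX_LEN) == 0;
}
```

For instance, `syscall_name_matches("SyS_read", "sys_read")` holds under this rule, while a plain strcmp() on the full names would report a mismatch.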
  29. 12 Oct, 2009 1 commit