1. 21 12月, 2011 6 次提交
    • S
      ftrace: Allow access to the boot time function enabling · 2a85a37f
      Steven Rostedt 提交于
      Change set_ftrace_early_filter() to ftrace_set_early_filter()
      and make it a global function. This will allow other subsystems
      in the kernel to be able to enable function tracing at start
      up and reuse the ftrace function parsing code.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      2a85a37f
    • S
      ftrace: Decouple hash items from showing filtered functions · 69a3083c
      Steven Rostedt 提交于
      The set_ftrace_filter shows "hashed" functions, which are functions
      that are added with operations to them (like traceon and traceoff).
      
      As other subsystems may be able to show what functions they are
      using for function tracing, the hash items should no longer
      be shown just because the FILTER flag is set. As they have nothing
      to do with other subsystems filters.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      69a3083c
    • S
      ftrace: Allow other users of function tracing to use the output listing · fc13cb0c
      Steven Rostedt 提交于
      The function tracer is set up to allow any other subsystem (like perf)
      to use it. Ftrace already has a way to list what functions are enabled
      by the global_ops. It would be very helpful to let other users of
      the function tracer to be able to use the same code.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      fc13cb0c
    • S
      ftrace: Replace record newlist with record page list · 85ae32ae
      Steven Rostedt 提交于
      As new functions come in to be initalized from mcount to nop,
      they are done by groups of pages. Whether it is the core kernel
      or a module. There's no need to keep track of these on a per record
      basis.
      
      At startup, and as any module is loaded, the functions to be
      traced are stored in a group of pages and added to the function
      list at the end. We just need to keep a pointer to the first
      page of the list that was added, and use that to know where to
      start on the list for initializing functions.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      85ae32ae
    • S
      ftrace: Remove usage of "freed" records · 32082309
      Steven Rostedt 提交于
      Records that are added to the function trace table are
      permanently there, except for modules. By separating out the
      modules to their own pages that can be freed in one shot
      we can remove the "freed" flag and simplify some of the record
      management.
      
      Another benefit of doing this is that we can also move the
      records around; sort them.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      32082309
    • S
      ftrace: Allow archs to modify code without stop machine · c88fd863
      Steven Rostedt 提交于
      The stop machine method to modify all functions in the kernel
      (some 20,000 of them) is the safest way to do so across all archs.
      But some archs may not need this big hammer approach to modify code
      on SMP machines, and can simply just update the code it needs.
      
      Adding a weak function arch_ftrace_update_code() that now does the
      stop machine, will also let any arch override this method.
      
      If the arch needs to check the system and then decide if it can
      avoid stop machine, it can still call ftrace_run_stop_machine() to
      use the old method.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c88fd863
  2. 01 11月, 2011 1 次提交
    • P
      include: replace linux/module.h with "struct module" wherever possible · de477254
      Paul Gortmaker 提交于
      The <linux/module.h> pretty much brings in the kitchen sink along
      with it, so it should be avoided wherever reasonably possible in
      terms of being included from other commonly used <linux/something.h>
      files, as it results in a measureable increase on compile times.
      
      The worst culprit was probably device.h since it is used everywhere.
      This file also had an implicit dependency/usage of mutex.h which was
      masked by module.h, and is also fixed here at the same time.
      
      There are over a dozen other headers that simply declare the
      struct instead of pulling in the whole file, so follow their lead
      and simply make it a few more.
      
      Most of the implicit dependencies on module.h being present by
      these headers pulling it in have been now weeded out, so we can
      finally make this change with hopefully minimal breakage.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      de477254
  3. 11 7月, 2011 1 次提交
  4. 07 7月, 2011 1 次提交
    • S
      ftrace: Fix regression of :mod:module function enabling · 43dd61c9
      Steven Rostedt 提交于
      The new code that allows different utilities to pick and choose
      what functions they trace broke the :mod: hook that allows users
      to trace only functions of a particular module.
      
      The reason is that the :mod: hook bypasses the hash that is setup
      to allow individual users to trace their own functions and uses
      the global hash directly. But if the global hash has not been
      set up, it will cause a bug:
      
      echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter
      
      produces:
      
       [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
       [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
       BUG: unable to handle kernel paging request at ffffffff8160ec90
       IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
       PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
       Oops: 0003 [#1] SMP Jul  7 04:02:28 phyllis kernel: [55303.858604] CPU 1
       Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt
      
       Pid: 10344, comm: bash Tainted: G        WC  3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
       RIP: 0010:[<ffffffff810d9136>]  [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
       RSP: 0018:ffff88003a96bda8  EFLAGS: 00010246
       RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
       RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
       R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
       FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
       Stack:
        0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
        ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
        ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
       Call Trace:
        [<ffffffff810d92f5>] match_records+0x155/0x1b0
        [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
        [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
        [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
        [<ffffffff81166e48>] vfs_write+0xc8/0x190
        [<ffffffff81167001>] sys_write+0x51/0x90
        [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
       Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
       RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
        RSP <ffff88003a96bda8>
       CR2: ffffffff8160ec90
       ---[ end trace a5d031828efdd88e ]---
      Reported-by: NBrian Marete <marete@toshnix.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      43dd61c9
  5. 19 5月, 2011 7 次提交
    • S
      ftrace: Modify ftrace_set_filter/notrace to take ops · 936e074b
      Steven Rostedt 提交于
      Since users of the function tracer can now pick and choose which
      functions they want to trace agnostically from other users of the
      function tracer, we need to pass the ops struct to the ftrace_set_filter()
      functions.
      
      The functions ftrace_set_global_filter() and ftrace_set_global_notrace()
      is added to keep the old filter functions which are used to modify
      the generic function tracers.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      936e074b
    • S
      ftrace: Allow dynamically allocated function tracers · cdbe61bf
      Steven Rostedt 提交于
      Now that functions may be selected individually, it only makes sense
      that we should allow dynamically allocated trace structures to
      be traced. This will allow perf to allocate a ftrace_ops structure
      at runtime and use it to pick and choose which functions that
      structure will trace.
      
      Note, a dynamically allocated ftrace_ops will always be called
      indirectly instead of being called directly from the mcount in
      entry.S. This is because there's no safe way to prevent mcount
      from being preempted before calling the function, unless we
      modify every entry.S to do so (not likely). Thus, dynamically allocated
      functions will now be called by the ftrace_ops_list_func() that
      loops through the ops that are allocated if there are more than
      one op allocated at a time. This loop is protected with a
      preempt_disable.
      
      To determine if an ftrace_ops structure is allocated or not, a new
      util function was added to the kernel/extable.c called
      core_kernel_data(), which returns 1 if the address is between
      _sdata and _edata.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      cdbe61bf
    • S
      ftrace: Implement separate user function filtering · b848914c
      Steven Rostedt 提交于
      ftrace_ops that are registered to trace functions can now be
      agnostic to each other in respect to what functions they trace.
      Each ops has their own hash of the functions they want to trace
      and a hash to what they do not want to trace. A empty hash for
      the functions they want to trace denotes all functions should
      be traced that are not in the notrace hash.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b848914c
    • S
      ftrace: Use counters to enable functions to trace · ed926f9b
      Steven Rostedt 提交于
      Every function has its own record that stores the instruction
      pointer and flags for the function to be traced. There are only
      two flags: enabled and free. The enabled flag states that tracing
      for the function has been enabled (actively traced), and the free
      flag states that the record no longer points to a function and can
      be used by new functions (loaded modules).
      
      These flags are now moved to the MSB of the flags (actually just
      the top 32bits). The rest of the bits (30 bits) are now used as
      a ref counter. Everytime a tracer register functions to trace,
      those functions will have its counter incremented.
      
      When tracing is enabled, to determine if a function should be traced,
      the counter is examined, and if it is non-zero it is set to trace.
      
      When a ftrace_ops is registered to trace functions, its hashes
      are examined. If the ftrace_ops filter_hash count is zero, then
      all functions are set to be traced, otherwise only the functions
      in the hash are to be traced. The exception to this is if a function
      is also in the ftrace_ops notrace_hash. Then that function's counter
      is not incremented for this ftrace_ops.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ed926f9b
    • S
      ftrace: Create a global_ops to hold the filter and notrace hashes · f45948e8
      Steven Rostedt 提交于
      Combine the filter and notrace hashes to be accessed by a single entity,
      the global_ops. The global_ops is a ftrace_ops structure that is passed
      to different functions that can read or modify the filtering of the
      function tracer.
      
      The ftrace_ops structure was modified to hold a filter and notrace
      hashes so that later patches may allow each ftrace_ops to have its own
      set of rules to what functions may be filtered.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f45948e8
    • S
      ftrace: Use hash instead for FTRACE_FL_FILTER · 1cf41dd7
      Steven Rostedt 提交于
      When multiple users are allowed to have their own set of functions
      to trace, having the FTRACE_FL_FILTER flag will not be enough to
      handle the accounting of those users. Each user will need their own
      set of functions.
      
      Replace the FTRACE_FL_FILTER with a filter_hash instead. This is
      temporary until the rest of the function filtering accounting
      gets in.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1cf41dd7
    • S
      ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions · b448c4e3
      Steven Rostedt 提交于
      To prepare for the accounting system that will allow multiple users of
      the function tracer, having the FTRACE_FL_NOTRACE as a flag in the
      dyn_trace record does not make sense.
      
      All ftrace_ops will soon have a hash of functions they should trace
      and not trace. By making a global hash of functions not to trace makes
      this easier for the transition.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b448c4e3
  6. 30 4月, 2011 2 次提交
    • S
      ftrace: Remove FTRACE_FL_CONVERTED flag · d2c8c3ea
      Steven Rostedt 提交于
      Since we disable all function tracer processing if we detect
      that a modification of a instruction had failed, we do not need
      to track that the record has failed. No more ftrace processing
      is allowed, and the FTRACE_FL_CONVERTED flag is pointless.
      
      The FTRACE_FL_CONVERTED flag was used to denote records that were
      successfully converted from mcount calls into nops. But if a single
      record fails, all of ftrace is disabled.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d2c8c3ea
    • S
      ftrace: Remove FTRACE_FL_FAILED flag · 45a4a237
      Steven Rostedt 提交于
      Since we disable all function tracer processing if we detect
      that a modification of a instruction had failed, we do not need
      to track that the record has failed. No more ftrace processing
      is allowed, and the FTRACE_FL_FAILED flag is pointless.
      
      Removing this flag simplifies some of the code, but some ftrace_disabled
      checks needed to be added or move around a little.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      45a4a237
  7. 12 2月, 2011 1 次提交
    • S
      ftrace: Fix memory leak with function graph and cpu hotplug · 868baf07
      Steven Rostedt 提交于
      When the fuction graph tracer starts, it needs to make a special
      stack for each task to save the real return values of the tasks.
      All running tasks have this stack created, as well as any new
      tasks.
      
      On CPU hot plug, the new idle task will allocate a stack as well
      when init_idle() is called. The problem is that cpu hotplug does
      not create a new idle_task. Instead it uses the idle task that
      existed when the cpu went down.
      
      ftrace_graph_init_task() will add a new ret_stack to the task
      that is given to it. Because a clone will make the task
      have a stack of its parent it does not check if the task's
      ret_stack is already NULL or not. When the CPU hotplug code
      starts a CPU up again, it will allocate a new stack even
      though one already existed for it.
      
      The solution is to treat the idle_task specially. In fact, the
      function_graph code already does, just not at init_idle().
      Instead of using the ftrace_graph_init_task() for the idle task,
      which that function expects the task to be a clone, have a
      separate ftrace_graph_init_idle_task(). Also, we will create a
      per_cpu ret_stack that is used by the idle task. When we call
      ftrace_graph_init_idle_task() it will check if the idle task's
      ret_stack is NULL, if it is, then it will assign it the per_cpu
      ret_stack.
      Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stable Tree <stable@kernel.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      868baf07
  8. 21 7月, 2010 1 次提交
  9. 04 5月, 2010 1 次提交
    • S
      tracing: Convert nop macros to static inlines · 4dbf6bc2
      Steven Rostedt 提交于
      The ftrace.h file contains several functions as macros when the
      functions are disabled due to config options. This patch converts
      most of them to static inlines.
      
      There are two exceptions:
      
        register_ftrace_function() and unregister_ftrace_function()
      
      This is because their parameter "ops" must not be evaluated since
      code using the function is allowed to #ifdef out the creation of
      the parameter.
      
      This also fixes an error caused by recent changes:
      
       kernel/trace/trace_irqsoff.c: In function 'start_irqsoff_tracer':
       kernel/trace/trace_irqsoff.c:571: error: expected expression before 'do'
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4dbf6bc2
  10. 28 4月, 2010 1 次提交
  11. 22 4月, 2010 1 次提交
    • F
      tracing: Dump either the oops's cpu source or all cpus buffers · cecbca96
      Frederic Weisbecker 提交于
      The ftrace_dump_on_oops kernel parameter, sysctl and sysrq let one
      dump every cpu buffers when an oops or panic happens.
      
      It's nice when you have few cpus but it may take ages if have many,
      plus you miss the real origin of the problem in all the cpu traces.
      
      Sometimes, all you need is to dump the cpu buffer that triggered the
      opps, most of the time it is our main interest.
      
      This patch modifies ftrace_dump_on_oops to handle this choice.
      
      The ftrace_dump_on_oops kernel parameter, when it comes alone, has
      the same behaviour than before. But ftrace_dump_on_oops=orig_cpu
      will only dump the buffer of the cpu that oops'ed.
      
      Similarly, sysctl kernel.ftrace_dump_on_oops=1 and
      echo 1 > /proc/sys/kernel/ftrace_dump_on_oops keep their previous
      behaviour. But setting 2 jumps into cpu origin dump mode.
      
      v2: Fix double setup
      v3: Fix spelling issues reported by Randy Dunlap
      v4: Also update __ftrace_dump in the selftests
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      cecbca96
  12. 26 3月, 2010 1 次提交
    • P
      x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Peter Zijlstra 提交于
      Support for the PMU's BTS features has been upstreamed in
      v2.6.32, but we still have the old and disabled ptrace-BTS,
      as Linus noticed it not so long ago.
      
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      
      Its users are ptrace-block-step and ptrace-bts, since ptrace-bts
      was never used and ptrace-block-step can be implemented using a
      much simpler approach.
      
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      faa4602e
  13. 17 2月, 2010 1 次提交
  14. 04 2月, 2010 2 次提交
    • M
      ftrace: Remove record freezing · f24bb999
      Masami Hiramatsu 提交于
      Remove record freezing. Because kprobes never puts probe on
      ftrace's mcount call anymore, it doesn't need ftrace to check
      whether kprobes on it.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100202214925.4694.73469.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f24bb999
    • M
      ftrace/alternatives: Introducing *_text_reserved functions · 2cfa1978
      Masami Hiramatsu 提交于
      Introducing *_text_reserved functions for checking the text
      address range is partially reserved or not. This patch provides
      checking routines for x86 smp alternatives and dynamic ftrace.
      Since both functions modify fixed pieces of kernel text, they
      should reserve and protect those from other dynamic text
      modifier, like kprobes.
      
      This will also be extended when introducing other subsystems
      which modify fixed pieces of kernel text. Dynamic text modifiers
      should avoid those.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <20100202214911.4694.16587.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2cfa1978
  15. 08 10月, 2009 1 次提交
  16. 24 9月, 2009 1 次提交
  17. 20 9月, 2009 1 次提交
  18. 19 6月, 2009 1 次提交
    • S
      function-graph: add stack frame test · 71e308a2
      Steven Rostedt 提交于
      In case gcc does something funny with the stack frames, or the return
      from function code, we would like to detect that.
      
      An arch may implement passing of a variable that is unique to the
      function and can be saved on entering a function and can be tested
      when exiting the function. Usually the frame pointer can be used for
      this purpose.
      
      This patch also implements this for x86. Where it passes in the stack
      frame of the parent function, and will test that frame on exit.
      
      There was a case in x86_32 with optimize for size (-Os) where, for a
      few functions, gcc would align the stack frame and place a copy of the
      return address into it. The function graph tracer modified the copy and
      not the actual return address. On return from the funtion, it did not go
      to the tracer hook, but returned to the parent. This broke the function
      graph tracer, because the return of the parent (where gcc did not do
      this funky manipulation) returned to the location that the child function
      was suppose to. This caused strange kernel crashes.
      
      This test detected the problem and pointed out where the issue was.
      
      This modifies the parameters of one of the functions that the arch
      specific code calls, so it includes changes to arch code to accommodate
      the new prototype.
      
      Note, I notice that the parsic arch implements its own push_return_trace.
      This is now a generic function and the ftrace_push_return_trace should be
      used instead. This patch does not touch that code.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      71e308a2
  19. 18 4月, 2009 1 次提交
    • S
      tracing: add same level recursion detection · 261842b7
      Steven Rostedt 提交于
      The tracing infrastructure allows for recursion. That is, an interrupt
      may interrupt the act of tracing an event, and that interrupt may very well
      perform its own trace. This is a recursive trace, and is fine to do.
      
      The problem arises when there is a bug, and the utility doing the trace
      calls something that recurses back into the tracer. This recursion is not
      caused by an external event like an interrupt, but by code that is not
      expected to recurse. The result could be a lockup.
      
      This patch adds a bitmask to the task structure that keeps track
      of the trace recursion. To find the interrupt depth, the following
      algorithm is used:
      
        level = hardirq_count() + softirq_count() + in_nmi;
      
      Here, level will be the depth of interrutps and softirqs, and even handles
      the nmi. Then the corresponding bit is set in the recursion bitmask.
      If the bit was already set, we know we had a recursion at the same level
      and we warn about it and fail the writing to the buffer.
      
      After the data has been committed to the buffer, we clear the bit.
      No atomics are needed. The only races are with interrupts and they reset
      the bitmask before returning anywy.
      
      [ Impact: detect same irq level trace recursion ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      261842b7
  20. 17 4月, 2009 1 次提交
    • S
      ftrace: use module notifier for function tracer · 93eb677d
      Steven Rostedt 提交于
      The hooks in the module code for the function tracer must be called
      before any of that module code runs. The function tracer hooks
      modify the module (replacing calls to mcount to nops). If the code
      is executed while the change occurs, then the CPU can take a GPF.
      
      To handle the above with a bit of paranoia, I originally implemented
      the hooks as calls directly from the module code.
      
      After examining the notifier calls, it looks as though the start up
      notify is called before any of the module's code is executed. This makes
      the use of the notify safe with ftrace.
      
      Only the startup notify is required to be "safe". The shutdown simply
      removes the entries from the ftrace function list, and does not modify
      any code.
      
      This change has another benefit. It removes a issue with a reverse dependency
      in the mutexes of ftrace_lock and module_mutex.
      
      [ Impact: fix lock dependency bug, cleanup ]
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      93eb677d
  21. 09 4月, 2009 1 次提交
    • F
      tracing/syscalls: use a dedicated file header · 47788c58
      Frederic Weisbecker 提交于
      Impact: fix build warnings and possibe compat misbehavior on IA64
      
      Building a kernel on ia64 might trigger these ugly build warnings:
      
      CC      arch/ia64/ia32/sys_ia32.o
      In file included from arch/ia64/ia32/sys_ia32.c:55:
      arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
      In file included from include/linux/elf.h:7,
                       from include/linux/module.h:14,
                       from include/linux/ftrace.h:8,
                       from include/linux/syscalls.h:68,
                       from arch/ia64/ia32/sys_ia32.c:18:
      arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition
      [...]
      
      sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
      to import the syscalls tracing prototypes.
      
      But including ftrace.h can pull too much things for a low level file,
      especially on ia64 where the ia32 private headers conflict with higher
      level headers.
      
      Now we isolate the syscall tracing headers in their own lightweight file.
      Reported-by: NTony Luck <tony.luck@intel.com>
      Tested-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Michael Davidson <md@google.com>
      LKML-Reference: <20090408184058.GB6017@nowhere>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      47788c58
  22. 08 4月, 2009 1 次提交
    • T
      tracing: append a comma to INIT_FTRACE_GRAPH · f876d346
      Tetsuo Handa 提交于
      Impact: dont break future extensions of INIT_TASK
      
      While not a problem right now, due to lack of a comma, build fails if
      elements are appended to INIT_TASK() macro in development code:
      
       arch/x86/kernel/init_task.c:33: error: request for member `XXXXXXXXXX' in something not a structure or union
       arch/x86/kernel/init_task.c:33: error: initializer element is not constant
       arch/x86/kernel/init_task.c:33: error: (near initialization for `init_task.ret_stack')
       make[1]: *** [arch/x86/kernel/init_task.o] Error 1
       make: *** [arch/x86/kernel] Error 2
      Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: srostedt@redhat.com
      LKML-Reference: <200904080505.n3855hcn017109@www262.sakura.ne.jp>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f876d346
  23. 07 4月, 2009 1 次提交
    • S
      function-graph: add proper initialization for init task · 5ac9f622
      Steven Rostedt 提交于
      Impact: fix to crash going to kexec
      
      The init task did not properly initialize the function graph pointers.
      Altough these pointers are NULL, they can not be assumed to be NULL
      for the init task, and must still be properly initialize.
      
      This usually is not an issue since a problem only arises when a task
      exits, and the init tasks do not usually exit. But when doing tests
      with kexec, the init tasks do exit, and the bug appears.
      
      This patch properly initializes the init tasks function graph data
      structures.
      Reported-and-Tested-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <alpine.DEB.2.00.0903252053080.5675@gandalf.stny.rr.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5ac9f622
  24. 25 3月, 2009 3 次提交
    • S
      function-graph: add option to calculate graph time or not · a2a16d6a
      Steven Rostedt 提交于
      graph time is the time that a function is executing another function.
      Thus if function A calls B, if graph-time is set, then the time for
      A includes B. This is the default behavior. But if graph-time is off,
      then the time spent executing B is subtracted from A.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      a2a16d6a
    • S
      tracing: move function profiler data out of function struct · 493762fc
      Steven Rostedt 提交于
      Impact: reduce size of memory in function profiler
      
      The function profiler originally introduces its counters into the
      function records itself. There is 20 thousand different functions on
      a normal system, and that is adding 20 thousand counters for profiling
      event when not needed.
      
      A normal run of the profiler yields only a couple of thousand functions
      executed, depending on what is being profiled. This means we have around
      18 thousand useless counters.
      
      This patch rectifies this by moving the data out of the function
      records used by dynamic ftrace. Data is preallocated to hold the functions
      when the profiling begins. Checks are made during profiling to see if
      more recorcds should be allocated, and they are allocated if it is safe
      to do so.
      
      This also removes the dependency from using dynamic ftrace, and also
      removes the overhead by having it enabled.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      493762fc
    • S
      tracing: add function profiler · bac429f0
      Steven Rostedt 提交于
      Impact: new profiling feature
      
      This patch adds a function profiler. In debugfs/tracing/ two new
      files are created.
      
        function_profile_enabled  - to enable or disable profiling
      
        trace_stat/functions   - the profiled functions.
      
      For example:
      
        echo 1 > /debugfs/tracing/function_profile_enabled
        ./hackbench 50
        echo 0 > /debugfs/tracing/function_profile_enabled
      
      yields:
      
        cat /debugfs/tracing/trace_stat/functions
      
        Function                               Hit
        --------                               ---
        _spin_lock                        10106442
        _spin_unlock                      10097492
        kfree                              6013704
        _spin_unlock_irqrestore            4423941
        _spin_lock_irqsave                 4406825
        __phys_addr                        4181686
        __slab_free                        4038222
        dput                               4030130
        path_put                           4023387
        unroll_tree_refs                   4019532
      [...]
      
      The most hit functions are listed first. Functions that are not
      hit are not listed.
      
      This feature depends on and uses dynamic function tracing. When the
      function profiling is disabled, no overhead occurs. But it still
      takes up around 300KB to hold the data, thus it is not recomended
      to keep it enabled for systems low on memory.
      
      When a '1' is echoed into the function_profile_enabled file, the
      counters for is function is reset back to zero. Thus you can see what
      functions are hit most by different programs.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      bac429f0
  25. 24 3月, 2009 1 次提交