1. 15 3月, 2013 1 次提交
    • S
      ftrace: Clean up function probe methods · e67efb93
      Steven Rostedt (Red Hat) 提交于
      When a function probe is created, each function that the probe is
      attached to, a "callback" method is called. On release of the probe,
      each function entry calls the "free" method.
      
      First, "callback" is a confusing name and does not really match what
      it does. Callback sounds like it will be called when the probe
      triggers. But that's not the case. This is really an "init" function,
      so lets rename it as such.
      
      Secondly, both "init" and "free" do not pass enough information back
      to the handlers. Pass back the ops, ip and data for each time the
      method is called. We have the information, might as well use it.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e67efb93
  2. 22 1月, 2013 1 次提交
  3. 18 12月, 2012 1 次提交
  4. 31 7月, 2012 4 次提交
  5. 20 7月, 2012 4 次提交
    • S
      ftrace/x86: Add separate function to save regs · 08f6fba5
      Steven Rostedt 提交于
      Add a way to have different functions calling different trampolines.
      If a ftrace_ops wants regs saved on the return, then have only the
      functions with ops registered to save regs. Functions registered by
      other ops would not be affected, unless the functions overlap.
      
      If one ftrace_ops registered functions A, B and C and another ops
      registered fucntions to save regs on A, and D, then only functions
      A and D would be saving regs. Function B and C would work as normal.
      Although A is registered by both ops: normal and saves regs; this is fine
      as saving the regs is needed to satisfy one of the ops that calls it
      but the regs are ignored by the other ops function.
      
      x86_64 implements the full regs saving, and i386 just passes a NULL
      for regs to satisfy the ftrace_ops passing. Where an arch must supply
      both regs and ftrace_ops parameters, even if regs is just NULL.
      
      It is OK for an arch to pass NULL regs. All function trace users that
      require regs passing must add the flag FTRACE_OPS_FL_SAVE_REGS when
      registering the ftrace_ops. If the arch does not support saving regs
      then the ftrace_ops will fail to register. The flag
      FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED may be set that will prevent the
      ftrace_ops from failing to register. In this case, the handler may
      either check if regs is not NULL or check if ARCH_SUPPORTS_FTRACE_SAVE_REGS.
      If the arch supports passing regs it will set this macro and pass regs
      for ops that request them. All other archs will just pass NULL.
      
      Link: Link: http://lkml.kernel.org/r/20120711195745.107705970@goodmis.org
      
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Reviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      08f6fba5
    • S
      ftrace: Return pt_regs to function trace callback · a1e2e31d
      Steven Rostedt 提交于
      Return as the 4th paramater to the function tracer callback the pt_regs.
      
      Later patches that implement regs passing for the architectures will require
      having the ftrace_ops set the SAVE_REGS flag, which will tell the arch
      to take the time to pass a full set of pt_regs to the ftrace_ops callback
      function. If the arch does not support it then it should pass NULL.
      
      If an arch can pass full regs, then it should define:
       ARCH_SUPPORTS_FTRACE_SAVE_REGS to 1
      
      Link: http://lkml.kernel.org/r/20120702201821.019966811@goodmis.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a1e2e31d
    • S
      ftrace: Consolidate arch dependent functions with 'list' function · ccf3672d
      Steven Rostedt 提交于
      As the function tracer starts to get more features, the support for
      theses features will spread out throughout the different architectures
      over time. These features boil down to what each arch does in the
      mcount trampoline (the ftrace_caller).
      
      Currently there's two features that are not the same throughout the
      archs.
      
       1) Support to stop function tracing before the callback
       2) passing of the ftrace ops
      
      Both of these require placing an indirect function to support the
      features if the mcount trampoline does not.
      
      On a side note, for all architectures, when more than one callback
      is registered to the function tracer, an intermediate 'list' function
      is called by the mcount trampoline to iterate through the callbacks
      that are registered.
      
      Instead of making a separate function for each of these features,
      and requiring several indirect calls, just use the single 'list' function
      as the intermediate, to handle all cases. If an arch does not support
      the 'stop function tracing' or the passing of ftrace ops, just force
      it to use the list function that will handle the features required.
      
      This makes the code cleaner and simpler and removes a lot of
       #ifdefs in the code.
      
      Link: http://lkml.kernel.org/r/20120612225424.495625483@goodmis.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ccf3672d
    • S
      ftrace: Pass ftrace_ops as third parameter to function trace callback · 2f5f6ad9
      Steven Rostedt 提交于
      Currently the function trace callback receives only the ip and parent_ip
      of the function that it traced. It would be more powerful to also return
      the ops that registered the function as well. This allows the same function
      to act differently depending on what ftrace_ops registered it.
      
      Link: http://lkml.kernel.org/r/20120612225424.267254552@goodmis.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      2f5f6ad9
  6. 17 5月, 2012 3 次提交
  7. 08 5月, 2012 1 次提交
  8. 28 4月, 2012 1 次提交
    • S
      ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine · 08d636b6
      Steven Rostedt 提交于
      This method changes x86 to add a breakpoint to the mcount locations
      instead of calling stop machine.
      
      Now that iret can be handled by NMIs, we perform the following to
      update code:
      
      1) Add a breakpoint to all locations that will be modified
      
      2) Sync all cores
      
      3) Update all locations to be either a nop or call (except breakpoint
         op)
      
      4) Sync all cores
      
      5) Remove the breakpoint with the new code.
      
      6) Sync all cores
      
      [
        Added updates that Masami suggested:
         Use unlikely(modifying_ftrace_code) in int3 trap to keep kprobes efficient.
         Don't use NOTIFY_* in ftrace handler in int3 as it is not a notifier.
      ]
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      08d636b6
  9. 22 2月, 2012 2 次提交
    • J
      ftrace, perf: Add filter support for function trace event · 5500fa51
      Jiri Olsa 提交于
      Adding support to filter function trace event via perf
      interface. It is now possible to use filter interface
      in the perf tool like:
      
        perf record -e ftrace:function --filter="(ip == mm_*)" ls
      
      The filter syntax is restricted to the the 'ip' field only,
      and following operators are accepted '==' '!=' '||', ending
      up with the filter strings like:
      
        ip == f1[, ]f2 ... || ip != f3[, ]f4 ...
      
      with comma ',' or space ' ' as a function separator. If the
      space ' ' is used as a separator, the right side of the
      assignment needs to be enclosed in double quotes '"', e.g.:
      
        perf record -e ftrace:function --filter '(ip == do_execve,sys_*,ext*)' ls
        perf record -e ftrace:function --filter '(ip == "do_execve,sys_*,ext*")' ls
        perf record -e ftrace:function --filter '(ip == "do_execve sys_* ext*")' ls
      
      The '==' operator adds trace filter with same effect as would
      be added via set_ftrace_filter file.
      
      The '!=' operator adds trace filter with same effect as would
      be added via set_ftrace_notrace file.
      
      The right side of the '!=', '==' operators is list of functions
      or regexp. to be added to filter separated by space.
      
      The '||' operator is used for connecting multiple filter definitions
      together. It is possible to have more than one '==' and '!='
      operators within one filter string.
      
      Link: http://lkml.kernel.org/r/1329317514-8131-8-git-send-email-jolsa@redhat.comSigned-off-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5500fa51
    • J
      ftrace: Add enable/disable ftrace_ops control interface · e248491a
      Jiri Olsa 提交于
      Adding a way to temporarily enable/disable ftrace_ops. The change
      follows the same way as 'global' ftrace_ops are done.
      
      Introducing 2 global ftrace_ops - control_ops and ftrace_control_list
      which take over all ftrace_ops registered with FTRACE_OPS_FL_CONTROL
      flag. In addition new per cpu flag called 'disabled' is also added to
      ftrace_ops to provide the control information for each cpu.
      
      When ftrace_ops with FTRACE_OPS_FL_CONTROL is registered, it is
      set as disabled for all cpus.
      
      The ftrace_control_list contains all the registered 'control' ftrace_ops.
      The control_ops provides function which iterates ftrace_control_list
      and does the check for 'disabled' flag on current cpu.
      
      Adding 3 inline functions:
        ftrace_function_local_disable/ftrace_function_local_enable
        - enable/disable the ftrace_ops on current cpu
        ftrace_function_local_disabled
        - get disabled ftrace_ops::disabled value for current cpu
      
      Link: http://lkml.kernel.org/r/1329317514-8131-2-git-send-email-jolsa@redhat.comAcked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e248491a
  10. 03 2月, 2012 1 次提交
  11. 08 1月, 2012 1 次提交
  12. 21 12月, 2011 6 次提交
    • S
      ftrace: Allow access to the boot time function enabling · 2a85a37f
      Steven Rostedt 提交于
      Change set_ftrace_early_filter() to ftrace_set_early_filter()
      and make it a global function. This will allow other subsystems
      in the kernel to be able to enable function tracing at start
      up and reuse the ftrace function parsing code.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      2a85a37f
    • S
      ftrace: Decouple hash items from showing filtered functions · 69a3083c
      Steven Rostedt 提交于
      The set_ftrace_filter shows "hashed" functions, which are functions
      that are added with operations to them (like traceon and traceoff).
      
      As other subsystems may be able to show what functions they are
      using for function tracing, the hash items should no longer
      be shown just because the FILTER flag is set. As they have nothing
      to do with other subsystems filters.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      69a3083c
    • S
      ftrace: Allow other users of function tracing to use the output listing · fc13cb0c
      Steven Rostedt 提交于
      The function tracer is set up to allow any other subsystem (like perf)
      to use it. Ftrace already has a way to list what functions are enabled
      by the global_ops. It would be very helpful to let other users of
      the function tracer to be able to use the same code.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      fc13cb0c
    • S
      ftrace: Replace record newlist with record page list · 85ae32ae
      Steven Rostedt 提交于
      As new functions come in to be initalized from mcount to nop,
      they are done by groups of pages. Whether it is the core kernel
      or a module. There's no need to keep track of these on a per record
      basis.
      
      At startup, and as any module is loaded, the functions to be
      traced are stored in a group of pages and added to the function
      list at the end. We just need to keep a pointer to the first
      page of the list that was added, and use that to know where to
      start on the list for initializing functions.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      85ae32ae
    • S
      ftrace: Remove usage of "freed" records · 32082309
      Steven Rostedt 提交于
      Records that are added to the function trace table are
      permanently there, except for modules. By separating out the
      modules to their own pages that can be freed in one shot
      we can remove the "freed" flag and simplify some of the record
      management.
      
      Another benefit of doing this is that we can also move the
      records around; sort them.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      32082309
    • S
      ftrace: Allow archs to modify code without stop machine · c88fd863
      Steven Rostedt 提交于
      The stop machine method to modify all functions in the kernel
      (some 20,000 of them) is the safest way to do so across all archs.
      But some archs may not need this big hammer approach to modify code
      on SMP machines, and can simply just update the code it needs.
      
      Adding a weak function arch_ftrace_update_code() that now does the
      stop machine, will also let any arch override this method.
      
      If the arch needs to check the system and then decide if it can
      avoid stop machine, it can still call ftrace_run_stop_machine() to
      use the old method.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c88fd863
  13. 01 11月, 2011 1 次提交
    • P
      include: replace linux/module.h with "struct module" wherever possible · de477254
      Paul Gortmaker 提交于
      The <linux/module.h> pretty much brings in the kitchen sink along
      with it, so it should be avoided wherever reasonably possible in
      terms of being included from other commonly used <linux/something.h>
      files, as it results in a measureable increase on compile times.
      
      The worst culprit was probably device.h since it is used everywhere.
      This file also had an implicit dependency/usage of mutex.h which was
      masked by module.h, and is also fixed here at the same time.
      
      There are over a dozen other headers that simply declare the
      struct instead of pulling in the whole file, so follow their lead
      and simply make it a few more.
      
      Most of the implicit dependencies on module.h being present by
      these headers pulling it in have been now weeded out, so we can
      finally make this change with hopefully minimal breakage.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      de477254
  14. 11 7月, 2011 1 次提交
  15. 07 7月, 2011 1 次提交
    • S
      ftrace: Fix regression of :mod:module function enabling · 43dd61c9
      Steven Rostedt 提交于
      The new code that allows different utilities to pick and choose
      what functions they trace broke the :mod: hook that allows users
      to trace only functions of a particular module.
      
      The reason is that the :mod: hook bypasses the hash that is setup
      to allow individual users to trace their own functions and uses
      the global hash directly. But if the global hash has not been
      set up, it will cause a bug:
      
      echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter
      
      produces:
      
       [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
       [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
       BUG: unable to handle kernel paging request at ffffffff8160ec90
       IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
       PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
       Oops: 0003 [#1] SMP Jul  7 04:02:28 phyllis kernel: [55303.858604] CPU 1
       Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt
      
       Pid: 10344, comm: bash Tainted: G        WC  3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
       RIP: 0010:[<ffffffff810d9136>]  [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
       RSP: 0018:ffff88003a96bda8  EFLAGS: 00010246
       RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
       RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
       R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
       FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
       Stack:
        0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
        ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
        ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
       Call Trace:
        [<ffffffff810d92f5>] match_records+0x155/0x1b0
        [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
        [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
        [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
        [<ffffffff81166e48>] vfs_write+0xc8/0x190
        [<ffffffff81167001>] sys_write+0x51/0x90
        [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
       Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
       RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
        RSP <ffff88003a96bda8>
       CR2: ffffffff8160ec90
       ---[ end trace a5d031828efdd88e ]---
      Reported-by: NBrian Marete <marete@toshnix.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      43dd61c9
  16. 19 5月, 2011 7 次提交
    • S
      ftrace: Modify ftrace_set_filter/notrace to take ops · 936e074b
      Steven Rostedt 提交于
      Since users of the function tracer can now pick and choose which
      functions they want to trace agnostically from other users of the
      function tracer, we need to pass the ops struct to the ftrace_set_filter()
      functions.
      
      The functions ftrace_set_global_filter() and ftrace_set_global_notrace()
      is added to keep the old filter functions which are used to modify
      the generic function tracers.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      936e074b
    • S
      ftrace: Allow dynamically allocated function tracers · cdbe61bf
      Steven Rostedt 提交于
      Now that functions may be selected individually, it only makes sense
      that we should allow dynamically allocated trace structures to
      be traced. This will allow perf to allocate a ftrace_ops structure
      at runtime and use it to pick and choose which functions that
      structure will trace.
      
      Note, a dynamically allocated ftrace_ops will always be called
      indirectly instead of being called directly from the mcount in
      entry.S. This is because there's no safe way to prevent mcount
      from being preempted before calling the function, unless we
      modify every entry.S to do so (not likely). Thus, dynamically allocated
      functions will now be called by the ftrace_ops_list_func() that
      loops through the ops that are allocated if there are more than
      one op allocated at a time. This loop is protected with a
      preempt_disable.
      
      To determine if an ftrace_ops structure is allocated or not, a new
      util function was added to the kernel/extable.c called
      core_kernel_data(), which returns 1 if the address is between
      _sdata and _edata.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      cdbe61bf
    • S
      ftrace: Implement separate user function filtering · b848914c
      Steven Rostedt 提交于
      ftrace_ops that are registered to trace functions can now be
      agnostic to each other in respect to what functions they trace.
      Each ops has their own hash of the functions they want to trace
      and a hash to what they do not want to trace. A empty hash for
      the functions they want to trace denotes all functions should
      be traced that are not in the notrace hash.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b848914c
    • S
      ftrace: Use counters to enable functions to trace · ed926f9b
      Steven Rostedt 提交于
      Every function has its own record that stores the instruction
      pointer and flags for the function to be traced. There are only
      two flags: enabled and free. The enabled flag states that tracing
      for the function has been enabled (actively traced), and the free
      flag states that the record no longer points to a function and can
      be used by new functions (loaded modules).
      
      These flags are now moved to the MSB of the flags (actually just
      the top 32bits). The rest of the bits (30 bits) are now used as
      a ref counter. Everytime a tracer register functions to trace,
      those functions will have its counter incremented.
      
      When tracing is enabled, to determine if a function should be traced,
      the counter is examined, and if it is non-zero it is set to trace.
      
      When a ftrace_ops is registered to trace functions, its hashes
      are examined. If the ftrace_ops filter_hash count is zero, then
      all functions are set to be traced, otherwise only the functions
      in the hash are to be traced. The exception to this is if a function
      is also in the ftrace_ops notrace_hash. Then that function's counter
      is not incremented for this ftrace_ops.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ed926f9b
    • S
      ftrace: Create a global_ops to hold the filter and notrace hashes · f45948e8
      Steven Rostedt 提交于
      Combine the filter and notrace hashes to be accessed by a single entity,
      the global_ops. The global_ops is a ftrace_ops structure that is passed
      to different functions that can read or modify the filtering of the
      function tracer.
      
      The ftrace_ops structure was modified to hold a filter and notrace
      hashes so that later patches may allow each ftrace_ops to have its own
      set of rules to what functions may be filtered.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f45948e8
    • S
      ftrace: Use hash instead for FTRACE_FL_FILTER · 1cf41dd7
      Steven Rostedt 提交于
      When multiple users are allowed to have their own set of functions
      to trace, having the FTRACE_FL_FILTER flag will not be enough to
      handle the accounting of those users. Each user will need their own
      set of functions.
      
      Replace the FTRACE_FL_FILTER with a filter_hash instead. This is
      temporary until the rest of the function filtering accounting
      gets in.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1cf41dd7
    • S
      ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions · b448c4e3
      Steven Rostedt 提交于
      To prepare for the accounting system that will allow multiple users of
      the function tracer, having the FTRACE_FL_NOTRACE as a flag in the
      dyn_trace record does not make sense.
      
      All ftrace_ops will soon have a hash of functions they should trace
      and not trace. By making a global hash of functions not to trace makes
      this easier for the transition.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b448c4e3
  17. 30 4月, 2011 2 次提交
    • S
      ftrace: Remove FTRACE_FL_CONVERTED flag · d2c8c3ea
      Steven Rostedt 提交于
      Since we disable all function tracer processing if we detect
      that a modification of a instruction had failed, we do not need
      to track that the record has failed. No more ftrace processing
      is allowed, and the FTRACE_FL_CONVERTED flag is pointless.
      
      The FTRACE_FL_CONVERTED flag was used to denote records that were
      successfully converted from mcount calls into nops. But if a single
      record fails, all of ftrace is disabled.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d2c8c3ea
    • S
      ftrace: Remove FTRACE_FL_FAILED flag · 45a4a237
      Steven Rostedt 提交于
      Since we disable all function tracer processing if we detect
      that a modification of a instruction had failed, we do not need
      to track that the record has failed. No more ftrace processing
      is allowed, and the FTRACE_FL_FAILED flag is pointless.
      
      Removing this flag simplifies some of the code, but some ftrace_disabled
      checks needed to be added or move around a little.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      45a4a237
  18. 12 2月, 2011 1 次提交
    • S
      ftrace: Fix memory leak with function graph and cpu hotplug · 868baf07
      Steven Rostedt 提交于
      When the fuction graph tracer starts, it needs to make a special
      stack for each task to save the real return values of the tasks.
      All running tasks have this stack created, as well as any new
      tasks.
      
      On CPU hot plug, the new idle task will allocate a stack as well
      when init_idle() is called. The problem is that cpu hotplug does
      not create a new idle_task. Instead it uses the idle task that
      existed when the cpu went down.
      
      ftrace_graph_init_task() will add a new ret_stack to the task
      that is given to it. Because a clone will make the task
      have a stack of its parent it does not check if the task's
      ret_stack is already NULL or not. When the CPU hotplug code
      starts a CPU up again, it will allocate a new stack even
      though one already existed for it.
      
      The solution is to treat the idle_task specially. In fact, the
      function_graph code already does, just not at init_idle().
      Instead of using the ftrace_graph_init_task() for the idle task,
      which that function expects the task to be a clone, have a
      separate ftrace_graph_init_idle_task(). Also, we will create a
      per_cpu ret_stack that is used by the idle task. When we call
      ftrace_graph_init_idle_task() it will check if the idle task's
      ret_stack is NULL, if it is, then it will assign it the per_cpu
      ret_stack.
      Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stable Tree <stable@kernel.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      868baf07
  19. 21 7月, 2010 1 次提交