1. 21 Apr, 2017 4 commits
  2. 18 Apr, 2017 1 commit
  3. 17 Apr, 2017 1 commit
    • S
      ftrace: Fix indexing of t_hash_start() from t_next() · fcdc7125
      Steven Rostedt (VMware) committed
      t_hash_start() does not increment *pos, whereas t_next() must. But when
      t_next() does increment *pos, it must still pass the original *pos to
      t_hash_start(); otherwise it will skip the first instance:
      
       # cd /sys/kernel/debug/tracing
       # echo schedule:traceoff > set_ftrace_filter
       # echo do_IRQ:traceoff > set_ftrace_filter
       # echo call_rcu > set_ftrace_filter
       # cat set_ftrace_filter
      call_rcu
      schedule:traceoff:unlimited
      do_IRQ:traceoff:unlimited
      
      The above called t_hash_start() from t_start() as there was only one
      function (call_rcu), but if we add another function:
      
       # echo xfrm_policy_destroy_rcu >> set_ftrace_filter
       # cat set_ftrace_filter
      call_rcu
      xfrm_policy_destroy_rcu
      do_IRQ:traceoff:unlimited
      
      The "schedule:traceoff" disappears.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      fcdc7125
  4. 16 Apr, 2017 1 commit
    • S
      ftrace: Fix removing of second function probe · acceb72e
      Steven Rostedt (VMware) committed
      When two function probes are added to set_ftrace_filter, and then one of
      them is removed, the update to the function locations is not performed, and
      the record keeping of the function states is corrupted, which causes an
      ftrace_bug() to occur.
      
      This is easily reproducible by adding two probes, removing one, and then
      adding it back again.
      
       # cd /sys/kernel/debug/tracing
       # echo schedule:traceoff > set_ftrace_filter
       # echo do_IRQ:traceoff > set_ftrace_filter
       # echo \!do_IRQ:traceoff > /debug/tracing/set_ftrace_filter
       # echo do_IRQ:traceoff > set_ftrace_filter
      
      Causes:
       ------------[ cut here ]------------
       WARNING: CPU: 2 PID: 1098 at kernel/trace/ftrace.c:2369 ftrace_get_addr_curr+0x143/0x220
       Modules linked in: [...]
       CPU: 2 PID: 1098 Comm: bash Not tainted 4.10.0-test+ #405
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       Call Trace:
        dump_stack+0x68/0x9f
        __warn+0x111/0x130
        ? trace_irq_work_interrupt+0xa0/0xa0
        warn_slowpath_null+0x1d/0x20
        ftrace_get_addr_curr+0x143/0x220
        ? __fentry__+0x10/0x10
        ftrace_replace_code+0xe3/0x4f0
        ? ftrace_int3_handler+0x90/0x90
        ? printk+0x99/0xb5
        ? 0xffffffff81000000
        ftrace_modify_all_code+0x97/0x110
        arch_ftrace_update_code+0x10/0x20
        ftrace_run_update_code+0x1c/0x60
        ftrace_run_modify_code.isra.48.constprop.62+0x8e/0xd0
        register_ftrace_function_probe+0x4b6/0x590
        ? ftrace_startup+0x310/0x310
        ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
        ? update_stack_state+0x88/0x110
        ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320
        ? preempt_count_sub+0x18/0xd0
        ? mutex_lock_nested+0x104/0x800
        ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320
        ? __unwind_start+0x1c0/0x1c0
        ? _mutex_lock_nest_lock+0x800/0x800
        ftrace_trace_probe_callback.isra.3+0xc0/0x130
        ? func_set_flag+0xe0/0xe0
        ? __lock_acquire+0x642/0x1790
        ? __might_fault+0x1e/0x20
        ? trace_get_user+0x398/0x470
        ? strcmp+0x35/0x60
        ftrace_trace_onoff_callback+0x48/0x70
        ftrace_regex_write.isra.43.part.44+0x251/0x320
        ? match_records+0x420/0x420
        ftrace_filter_write+0x2b/0x30
        __vfs_write+0xd7/0x330
        ? do_loop_readv_writev+0x120/0x120
        ? locks_remove_posix+0x90/0x2f0
        ? do_lock_file_wait+0x160/0x160
        ? __lock_is_held+0x93/0x100
        ? rcu_read_lock_sched_held+0x5c/0xb0
        ? preempt_count_sub+0x18/0xd0
        ? __sb_start_write+0x10a/0x230
        ? vfs_write+0x222/0x240
        vfs_write+0xef/0x240
        SyS_write+0xab/0x130
        ? SyS_read+0x130/0x130
        ? trace_hardirqs_on_caller+0x182/0x280
        ? trace_hardirqs_on_thunk+0x1a/0x1c
        entry_SYSCALL_64_fastpath+0x18/0xad
       RIP: 0033:0x7fe61c157c30
       RSP: 002b:00007ffe87890258 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
       RAX: ffffffffffffffda RBX: ffffffff8114a410 RCX: 00007fe61c157c30
       RDX: 0000000000000010 RSI: 000055814798f5e0 RDI: 0000000000000001
       RBP: ffff8800c9027f98 R08: 00007fe61c422740 R09: 00007fe61ca53700
       R10: 0000000000000073 R11: 0000000000000246 R12: 0000558147a36400
       R13: 00007ffe8788f160 R14: 0000000000000024 R15: 00007ffe8788f15c
        ? trace_hardirqs_off_caller+0xc0/0x110
       ---[ end trace 99fa09b3d9869c2c ]---
       Bad trampoline accounting at: ffffffff81cc3b00 (do_IRQ+0x0/0x150)
      
      Cc: stable@vger.kernel.org
      Fixes: 59df055f ("ftrace: trace different functions with a different tracer")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      acceb72e
  5. 07 Apr, 2017 1 commit
    • S
      ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines · 0598e4f0
      Steven Rostedt (VMware) committed
      The function tracer needs to be more careful than other subsystems when it
      comes to freeing data, especially if that data is actually executable code.
      When a single function is traced, a trampoline can be dynamically allocated
      which is called to jump to the function trace callback. When the callback is
      no longer needed, the dynamically allocated trampoline needs to be freed. This
      is where the issues arise. The dynamically allocated trampoline must not be
      used again. As function tracing can trace all subsystems, including
      subsystems that are used to serialize aspects of freeing (namely RCU), it
      must take extra care when doing the freeing.
      
      Before synchronize_rcu_tasks() was around, there was no way for the function
      tracer to know that nothing was using the dynamically allocated trampoline
      when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
      preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
      it will wait till all tasks have either voluntarily scheduled (not on the
      trampoline) or gone into userspace (not on the trampoline). Then it is safe
      to free the trampoline even with CONFIG_PREEMPT set.
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      0598e4f0
  6. 04 Apr, 2017 1 commit
    • S
      ftrace: Have init/main.c call ftrace directly to free init memory · b80f0f6c
      Steven Rostedt (VMware) committed
      Relying on free_reserved_area() to call ftrace to free init memory proved to
      not be sufficient. The issue is that on x86, when debug_pagealloc is
      enabled, the init memory is not freed, but simply set as not present. Since
      ftrace was uninformed of this, starting function tracing still tries to
      update pages that are not present according to the page tables, causing
      ftrace to bug, as well as killing the kernel itself.
      
      Instead of relying on free_reserved_area(), have init/main.c call ftrace
      directly just before it frees the init memory. Then it needs to use
      __init_begin and __init_end to know where the init memory location is.
      Looking at all archs (and testing what I can), it appears that this should
      work for each of them.
      Reported-by: kernel test robot <xiaolong.ye@intel.com>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      b80f0f6c
  7. 01 Apr, 2017 5 commits
    • S
      ftrace: Create separate t_func_next() to simplify the function / hash logic · 5bd84629
      Steven Rostedt (VMware) committed
      I noticed that if I use dd to read the set_ftrace_filter file, the first
      hash command is repeated.
      
       # cd /sys/kernel/debug/tracing
       # echo schedule > set_ftrace_filter
       # echo do_IRQ >> set_ftrace_filter
       # echo schedule:traceoff >> set_ftrace_filter
       # echo do_IRQ:traceoff >> set_ftrace_filter
      
       # cat set_ftrace_filter
       schedule
       do_IRQ
       schedule:traceoff:unlimited
       do_IRQ:traceoff:unlimited
      
       # dd if=set_ftrace_filter bs=1
       schedule
       do_IRQ
       schedule:traceoff:unlimited
       schedule:traceoff:unlimited
       do_IRQ:traceoff:unlimited
       98+0 records in
       98+0 records out
       98 bytes copied, 0.00265011 s, 37.0 kB/s
      
      This happens because both t_start() and the seq_file code call t_next(),
      and the state is slightly different between the two. Namely, t_start()
      calls t_next() with a local "pos" variable.
      
      By separating out the function listing from t_next() into its own function,
      we can have better control of outputting the functions and the hash of
      triggers. This simplifies the code.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      5bd84629
    • S
      ftrace: Update func_pos in t_start() when all functions are enabled · 43ff926a
      Steven Rostedt (VMware) committed
      If all functions are enabled, there's a comment displayed in the file to
      denote that:
      
        # cd /sys/kernel/debug/tracing
        # cat set_ftrace_filter
       #### all functions enabled ####
      
      If a function trigger is set, those are displayed as well:
      
        # echo schedule:traceoff >> /debug/tracing/set_ftrace_filter
        # cat set_ftrace_filter
       #### all functions enabled ####
       schedule:traceoff:unlimited
      
      But if you read that file with dd, the output can change:
      
        # dd if=/debug/tracing/set_ftrace_filter bs=1
       #### all functions enabled ####
       32+0 records in
       32+0 records out
       32 bytes copied, 7.0237e-05 s, 456 kB/s
      
      This is because the "pos" variable is updated for the comment, but func_pos
      is not. "func_pos" is used by the triggers (or hashes) to know how many
      functions were printed, and the trigger index is computed as pos - func_pos.
      func_pos should be 1 to account for the comment printed. But since it is not,
      t_hash_start() thinks that one trigger was already printed.
      
      The cat gets to t_hash_start() via t_next(), which updates both pos and
      func_pos, rather than via t_start().
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      43ff926a
    • S
      ftrace: Return NULL at end of t_start() instead of calling t_hash_start() · 2d71d989
      Steven Rostedt (VMware) committed
      The loop in t_start() of calling t_next() will call t_hash_start() if the
      pos is beyond the functions and enters the hash items. There's no reason to
      check if p is NULL and call t_hash_start(), as that would be redundant.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      2d71d989
    • S
      ftrace: Assign iter->hash to filter or notrace hashes on seq read · c20489da
      Steven Rostedt (VMware) committed
      Instead of testing if the hash to use is the filter_hash or the notrace_hash
      at each iteration, do the test at open, and set the iter->hash to point to
      the corresponding filter or notrace hash. Then use that directly instead of
      testing which hash needs to be used each iteration.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      c20489da
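The "resolve once at open" idea in commit c20489da can be illustrated with a small user-space model. This is only a sketch: `hash_model`, `ops_model`, `iter_model`, `ITER_NOTRACE`, and both functions are hypothetical stand-ins, not the kernel's structures or names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct hash_model {
	int count;
};

struct ops_model {
	struct hash_model filter_hash;
	struct hash_model notrace_hash;
};

#define ITER_NOTRACE 0x1	/* hypothetical flag: the notrace file was opened */

struct iter_model {
	unsigned int flags;
	const struct hash_model *hash;	/* resolved once, at open time */
};

/* Decide which hash to walk when the file is opened. */
static void iter_open(struct iter_model *iter, const struct ops_model *ops,
		      unsigned int flags)
{
	assert(iter && ops);
	iter->flags = flags;
	iter->hash = (flags & ITER_NOTRACE) ? &ops->notrace_hash
					    : &ops->filter_hash;
}

/* Every later step reads iter->hash directly, with no per-step flag test. */
static int iter_entry_count(const struct iter_model *iter)
{
	return iter->hash->count;
}
```

Opening with `ITER_NOTRACE` set makes every later call see the notrace hash; opening without it selects the filter hash.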
    • S
      ftrace: Clean up __seq_open_private() return check · c1bc5919
      Steven Rostedt (VMware) committed
      The return status check of __seq_open_private() is rather strange:
      
      	iter = __seq_open_private();
      	if (iter) {
      		/* do stuff */
      	}
      
      	return iter ? 0 : -ENOMEM;
      
      It makes much more sense to do the return of failure right away:
      
      	iter = __seq_open_private();
      	if (!iter)
      		return -ENOMEM;
      
      	/* do stuff */
      
      	return 0;
      
      This clean up will make updates to this code a bit nicer.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      c1bc5919
  8. 25 Mar, 2017 1 commit
  9. 03 Mar, 2017 2 commits
  10. 02 Mar, 2017 1 commit
  11. 28 Feb, 2017 1 commit
  12. 03 Feb, 2017 7 commits
    • S
      ftrace: Have set_graph_function handle multiple functions in one write · e704eff3
      Steven Rostedt (VMware) committed
      Currently, only one function can be written to set_graph_function and
      set_graph_notrace. Only the last function in the list is saved, even
      though the other functions are added and then removed.
      
      Change the behavior to be the same as set_ftrace_function, allowing
      multiple functions to be written. If any one fails, none of them will be
      added. The addition of the functions is done at the end when the file is
      closed.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      e704eff3
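The "if any one fails, none of them will be added" behavior from commit e704eff3 resembles a stage-then-commit pattern. The sketch below is a user-space illustration under that assumption; `func_set`, `known_function()`, and `commit_all_or_nothing()` are hypothetical names and do not reflect how the kernel stores or parses the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FUNCS 16

/* Hypothetical model of the function list behind set_graph_function. */
struct func_set {
	const char *names[MAX_FUNCS];
	size_t count;
};

/* Stand-in for symbol lookup: is `name` among the traceable functions? */
static bool known_function(const char *name,
			   const char *const *avail, size_t n_avail)
{
	for (size_t i = 0; i < n_avail; i++)
		if (strcmp(name, avail[i]) == 0)
			return true;
	return false;
}

/*
 * Stage all names in a scratch copy and publish it only when every name
 * was accepted, so a single bad name leaves the live set untouched.
 */
static bool commit_all_or_nothing(struct func_set *live,
				  const char *const *names, size_t n,
				  const char *const *avail, size_t n_avail)
{
	struct func_set staged;

	assert(live);
	staged = *live;
	for (size_t i = 0; i < n; i++) {
		if (staged.count == MAX_FUNCS ||
		    !known_function(names[i], avail, n_avail))
			return false;	/* live set is unchanged */
		staged.names[staged.count++] = names[i];
	}
	*live = staged;		/* the commit point, analogous to file close */
	return true;
}
```

A write containing one unknown name returns false and leaves `live->count` as it was before the call.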
    • S
      ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock · 649b988b
      Steven Rostedt (VMware) committed
      The hashes ftrace_graph_hash and ftrace_graph_notrace_hash are modified
      while the graph_lock is held. Holding a pointer to them and passing them
      along can lead to a use of a stale pointer (fgd->hash). Move assigning the
      pointer and its use to within the holding of the lock. Note, it's
      rcu_sched protected data, and other instances of referencing them are done
      with preemption disabled. But the file manipulation code must be protected by
      the lock.
      
      The fgd->hash pointer is set to NULL when the lock is being released.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      649b988b
    • S
      ftrace: Have set_graph_functions handle write with RDWR · ae98d27a
      Steven Rostedt (VMware) committed
      Reading set_graph_functions uses seq functions, which set the
      file->private_data pointer to a seq_file descriptor. On writes,
      file->private_data is set to the ftrace_graph_data descriptor. But if the file
      is opened for RDWR, the ftrace_graph_write() will incorrectly use the
      file->private_data descriptor instead of the
      ((struct seq_file *)file->private_data)->private pointer, and this can crash
      the kernel.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      ae98d27a
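The pointer mix-up described in commit ae98d27a can be modelled in user space. In this sketch, `file_model`, `seq_file_model`, `graph_data_model`, `MODE_READ`, `MODE_WRITE`, and `graph_data_of()` are all hypothetical stand-ins for the kernel's `struct file`, `struct seq_file`, `ftrace_graph_data`, and `FMODE_*` flags; it only illustrates the extra indirection needed when the file was opened for reading.

```c
#include <assert.h>
#include <stddef.h>

#define MODE_READ  0x1	/* stand-in for FMODE_READ */
#define MODE_WRITE 0x2	/* stand-in for FMODE_WRITE */

struct graph_data_model {
	int id;
};

struct seq_file_model {
	void *priv;		/* models seq_file->private */
};

struct file_model {
	unsigned int f_mode;
	void *private_data;
};

/*
 * When the file was opened for reading, private_data points at the seq_file
 * wrapper and the graph data hangs off its private pointer. For write-only
 * opens, private_data is the graph data itself.
 */
static struct graph_data_model *graph_data_of(const struct file_model *file)
{
	assert(file && file->private_data);
	if (file->f_mode & MODE_READ) {
		const struct seq_file_model *m = file->private_data;
		return m->priv;
	}
	return file->private_data;
}
```

With this helper, a write handler reaches the same graph data for a write-only open and for a read-write open.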
    • S
      ftrace: Reset fgd->hash in ftrace_graph_write() · d4ad9a1c
      Steven Rostedt (VMware) committed
      fgd->hash is saved and then freed, but is never reset to either
      ftrace_graph_hash or ftrace_graph_notrace_hash. If multiple writes are
      performed, then the freed hash could be accessed again.
      
       # cd /sys/kernel/debug/tracing
       # head -1000 available_filter_functions > /tmp/funcs
       # cat /tmp/funcs > set_graph_function
      
      Causes:
      
       general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
       Modules linked in:  [...]
       CPU: 2 PID: 1337 Comm: cat Not tainted 4.10.0-rc2-test-00010-g6b052e9 #32
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       task: ffff880113a12200 task.stack: ffffc90001940000
       RIP: 0010:free_ftrace_hash+0x7c/0x160
       RSP: 0018:ffffc90001943db0 EFLAGS: 00010246
       RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: 6b6b6b6b6b6b6b6b
       RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8800ce1e1d40
       RBP: ffff8800ce1e1d50 R08: 0000000000000000 R09: 0000000000006400
       R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
       R13: ffff8800ce1e1d40 R14: 0000000000004000 R15: 0000000000000001
       FS:  00007f9408a07740(0000) GS:ffff88011e500000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000aee1f0 CR3: 0000000116bb4000 CR4: 00000000001406e0
       Call Trace:
        ? ftrace_graph_write+0x150/0x190
        ? __vfs_write+0x1f6/0x210
        ? __audit_syscall_entry+0x17f/0x200
        ? rw_verify_area+0xdb/0x210
        ? _cond_resched+0x2b/0x50
        ? __sb_start_write+0xb4/0x130
        ? vfs_write+0x1c8/0x330
        ? SyS_write+0x62/0xf0
        ? do_syscall_64+0xa3/0x1b0
        ? entry_SYSCALL64_slow_path+0x25/0x25
       Code: 01 48 85 db 0f 84 92 00 00 00 b8 01 00 00 00 d3 e0 85 c0 7e 3f 83 e8 01 48 8d 6f 10 45 31 e4 4c 8d 34 c5 08 00 00 00 49 8b 45 08 <4a> 8b 34 20 48 85 f6 74 13 48 8b 1e 48 89 ef e8 20 fa ff ff 48
       RIP: free_ftrace_hash+0x7c/0x160 RSP: ffffc90001943db0
       ---[ end trace 999b48216bf4b393 ]---
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      d4ad9a1c
    • S
      ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY · 555fc781
      Steven Rostedt (VMware) committed
      When the set_graph_function or set_graph_notrace contains no records, a
      banner is displayed of either "#### all functions enabled ####" or
      "#### all functions disabled ####" respectively. To tell the seq operations
      to do this, (void *)1 is passed as a return value. Instead of using a
      hardcoded, meaningless value, define it as a macro.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      555fc781
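A named sentinel of the kind commit 555fc781 introduces might look like the following user-space sketch. Only the macro name FTRACE_GRAPH_EMPTY and the two banner strings come from the commit message; `graph_start()` and `graph_show()` are hypothetical helpers and not the kernel's seq operations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Named sentinel replacing a bare (void *)1; it is never dereferenced. */
#define FTRACE_GRAPH_EMPTY ((void *)1)

/* Hypothetical start routine: hand back the sentinel when there are no records. */
static void *graph_start(const char **records, size_t n)
{
	if (n == 0)
		return FTRACE_GRAPH_EMPTY;
	return (void *)records[0];
}

/* Hypothetical show routine: print the banner for the sentinel, else the record. */
static const char *graph_show(void *v, int notrace)
{
	if (v == FTRACE_GRAPH_EMPTY)
		return notrace ? "#### all functions disabled ####"
			       : "#### all functions enabled ####";
	return (const char *)v;
}
```

The comparison against the macro makes the intent of the special return value readable at both the producer and the consumer.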
    • S
      ftrace: Create a slight optimization on searching the ftrace_hash · 2b2c279c
      Steven Rostedt (VMware) committed
      This is a micro-optimization, but as it has to deal with a fast path of the
      function tracer, these optimizations can be noticed.
      
      The ftrace_lookup_ip() returns true if the given ip is found in the hash. If
      it's not found or the hash is NULL, it returns false. But there are some cases
      where a NULL hash counts as true, and the ftrace_hash_empty() is tested before
      calling ftrace_lookup_ip() in those cases. But as ftrace_lookup_ip() tests
      that first, that adds a few extra unneeded instructions in those cases.
      
      A new static "always_inlined" function is created that does not perform the
      hash empty test. This must only be used by callers that do the check first
      anyway, as an empty or NULL hash could cause a crash if a lookup is
      performed on it.
      
      Also add kernel doc for the ftrace_lookup_ip() main function.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      2b2c279c
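The split between a checked lookup and an unchecked fast-path lookup in commit 2b2c279c can be sketched in user space as follows. The flat-array `hash_model` and the names `hash_empty()`, `lookup_ip_nocheck()`, `lookup_ip()`, and `filter_matches()` are hypothetical; the kernel's hash uses buckets and its own function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat model of a hash; the real one uses hashed buckets. */
struct hash_model {
	const unsigned long *ips;
	size_t count;
};

static bool hash_empty(const struct hash_model *h)
{
	return !h || h->count == 0;
}

/* Fast path: the caller must already have ruled out a NULL or empty hash. */
static inline bool lookup_ip_nocheck(const struct hash_model *h,
				     unsigned long ip)
{
	for (size_t i = 0; i < h->count; i++)
		if (h->ips[i] == ip)
			return true;
	return false;
}

/* General entry point: a NULL or empty hash means "not found". */
static bool lookup_ip(const struct hash_model *h, unsigned long ip)
{
	if (hash_empty(h))
		return false;
	return lookup_ip_nocheck(h, ip);
}

/*
 * A caller for which an empty filter means "everything matches": it tests
 * emptiness itself, so it can use the unchecked variant and skip the
 * second emptiness test.
 */
static bool filter_matches(const struct hash_model *filter, unsigned long ip)
{
	return hash_empty(filter) || lookup_ip_nocheck(filter, ip);
}
```

`filter_matches()` shows the case the commit message describes, where the emptiness test has a different meaning for the caller and would otherwise be repeated inside the lookup.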
    • S
      tracing: Add ftrace_hash_key() helper function · 2b0cce0e
      Steven Rostedt (VMware) committed
      Replace the couple of use cases that have small logic to produce the ftrace
      function key id with a helper function. No need for duplicate code.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      2b0cce0e
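A key helper of the kind commit 2b0cce0e describes could look like the sketch below. It assumes the duplicated logic was "hash the ip down to size_bits bits, or use key 0 when the hash has a single bucket"; that reading, the `hash_model` struct, and `mix_bits()` (a stand-in multiplicative hash, not the kernel's hash_long()) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the table has 2^size_bits buckets. */
struct hash_model {
	unsigned int size_bits;
};

/*
 * Stand-in for the kernel's hashing primitive: a multiplicative hash that
 * keeps the top `bits` bits. Valid for bits in 1..63.
 */
static uint64_t mix_bits(uint64_t val, unsigned int bits)
{
	return (val * UINT64_C(0x9E3779B97F4A7C15)) >> (64 - bits);
}

/* One helper computes the bucket key, so call sites no longer duplicate it. */
static uint64_t hash_key(const struct hash_model *hash, uint64_t ip)
{
	assert(hash && hash->size_bits < 64);
	if (hash->size_bits > 0)
		return mix_bits(ip, hash->size_bits);
	return 0;	/* a single bucket: every ip maps to key 0 */
}
```

Call sites that add an entry and call sites that look one up can then share `hash_key()` and stay consistent by construction.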
  13. 21 Jan, 2017 3 commits
  14. 25 Dec, 2016 1 commit
  15. 16 Nov, 2016 1 commit
  16. 15 Nov, 2016 3 commits
  17. 02 Sep, 2016 1 commit
  18. 05 Jul, 2016 1 commit
  19. 20 Jun, 2016 1 commit
  20. 21 May, 2016 1 commit
    • S
      ftrace: Don't disable irqs when taking the tasklist_lock read_lock · 6112a300
      Soumya PN committed
      In ftrace.c, inside the function alloc_retstack_tasklist() (which will be
      invoked when function_graph tracing is on), the tasklist_lock is being
      held as reader while iterating through a list of threads. Here the lock
      is being held as reader with irqs disabled. The tasklist_lock is never
      write_locked in interrupt context, so it is safe to not disable interrupts
      for the duration of read_lock in this block, which can be significant
      given the block of code iterates through all threads. Hence changing the
      code to call read_lock() and read_unlock() instead of read_lock_irqsave()
      and read_unlock_irqrestore().
      
      A similar change was made in commits: 8063e41d ("tracing: Change
      syscall_*regfunc() to check PF_KTHREAD and use for_each_process_thread()")
      and 3472eaa1 ("sched: normalize_rt_tasks(): Don't use _irqsave for
      tasklist_lock, use task_rq_lock()")
      
      Link: http://lkml.kernel.org/r/1463500874-77480-1-git-send-email-soumya.p.n@hpe.com
      Signed-off-by: Soumya PN <soumya.p.n@hpe.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      6112a300
  21. 27 Apr, 2016 1 commit
    • T
      ftrace: Match dot symbols when searching functions on ppc64 · 7132e2d6
      Thiago Jung Bauermann committed
      In the ppc64 big endian ABI, function symbols point to function
      descriptors. The symbols which point to the function entry points
      have a dot in front of the function name. Consequently, when the
      ftrace filter mechanism searches for the symbol corresponding to
      an entry point address, it gets the dot symbol.
      
      As a result, ftrace filter users have to be aware of this ABI detail on
      ppc64 and prepend a dot to the function name when setting the filter.
      
      The perf probe command insulates the user from this by ignoring the dot
      in front of the symbol name when matching function names to symbols,
      but the sysfs interface does not. This patch makes the ftrace filter
      mechanism do the same when searching symbols.
      
      Fixes the following failure in ftracetest's kprobe_ftrace.tc:
      
        .../kprobe_ftrace.tc: line 9: echo: write error: Invalid argument
      
      That failure is on this line of kprobe_ftrace.tc:
      
        echo _do_fork > set_ftrace_filter
      
      This is because there's no _do_fork entry in the functions list:
      
        # cat available_filter_functions | grep _do_fork
        ._do_fork
      
      This change introduces no regressions on the perf and ftracetest
      testsuite results.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7132e2d6
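The dot-insensitive matching described in commit 7132e2d6 can be approximated in user space as shown below. This is a sketch of the idea only: `match_adjust()` and `symbol_matches()` are hypothetical names, and exact string comparison stands in for the filter's real matching rules.

```c
#include <assert.h>
#include <string.h>

/*
 * If the symbol starts with a dot and the search string does not, compare
 * from the character after the dot, so "_do_fork" matches "._do_fork".
 */
static const char *match_adjust(const char *sym, const char *search)
{
	if (sym[0] == '.' && search[0] != '.')
		return sym + 1;
	return sym;
}

/* Exact-match comparison using the adjusted symbol name. */
static int symbol_matches(const char *sym, const char *search)
{
	return strcmp(match_adjust(sym, search), search) == 0;
}
```

A user who explicitly writes the dotted name still gets a match, because the adjustment is skipped when the search string itself begins with a dot.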
  22. 14 Apr, 2016 1 commit