1. 13 4月, 2013 10 次提交
  2. 09 2月, 2013 12 次提交
    • O
      uprobes/perf: Avoid uprobe_apply() whenever possible · b2fe8ba6
      Oleg Nesterov 提交于
      uprobe_perf_open/close call the costly uprobe_apply() every time,
      we can avoid it if:
      
      	- "nr_systemwide != 0" is not changed.
      
      	- There is another process/thread with the same ->mm.
      
      	- copy_proccess() does inherit_event(). dup_mmap() preserves the
      	  inserted breakpoints.
      
      	- event->attr.enable_on_exec == T, we can rely on uprobe_mmap()
      	  called by exec/mmap paths.
      
      	- tp_target is exiting. Only _close() checks PF_EXITING, I don't
      	  think TRACE_REG_PERF_OPEN can hit the dying task too often.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      b2fe8ba6
    • O
      uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE · f42d24a1
      Oleg Nesterov 提交于
      Change uprobe_trace_func() and uprobe_perf_func() to return "int". Change
      uprobe_dispatcher() to return "trace_ret | perf_ret" although this is not
      needed, currently TP_FLAG_TRACE/TP_FLAG_PROFILE are mutually exclusive.
      
      The only functional change is that uprobe_perf_func() checks the filtering
      too and returns UPROBE_HANDLER_REMOVE if nobody wants to trace current.
      
      Testing:
      
      	# perf probe -x /lib/libc.so.6 syscall
      
      	# perf record -e probe_libc:syscall -i perl -e 'fork; syscall -1 for 1..10; wait'
      
      	# perf report --show-total-period
      		100.00%            10     perl  libc-2.8.so    [.] syscall
      
      Before this patch:
      
      	# cat /sys/kernel/debug/tracing/uprobe_profile
      		/lib/libc.so.6 syscall				20
      
      A child process doesn't have a counter, but still it hits this breakoint
      "copied" by dup_mmap().
      
      After the patch:
      
      	# cat /sys/kernel/debug/tracing/uprobe_profile
      		/lib/libc.so.6 syscall				11
      
      The child process hits this int3 only once and does unapply_uprobe().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      f42d24a1
    • O
      uprobes/perf: Teach trace_uprobe/perf code to pre-filter · 31ba3348
      Oleg Nesterov 提交于
      Finally implement uprobe_perf_filter() which checks ->nr_systemwide or
      ->perf_events to figure out whether we need to insert the breakpoint.
      
      uprobe_perf_open/close are changed to do uprobe_apply(true/false) when
      the new perf event comes or goes away.
      
      Note that currently this is very suboptimal:
      
      	- uprobe_register() called by TRACE_REG_PERF_REGISTER becomes a
      	  heavy nop, consumer->filter() always returns F at this stage.
      
      	  As it was already discussed we need uprobe_register_only() to
      	  avoid the costly register_for_each_vma() when possible.
      
      	- uprobe_apply() is oftenly overkill. Unless "nr_systemwide != 0"
      	  changes we need uprobe_apply_mm(), unapply_uprobe() is almost
      	  what we need.
      
      	- uprobe_apply() can be simply avoided sometimes, see the next
      	  changes.
      
      Testing:
      
      	# perf probe -x /lib/libc.so.6 syscall
      
      	# perl -e 'syscall -1 while 1' &
      	[1] 530
      
      	# perf record -e probe_libc:syscall perl -e 'syscall -1 for 1..10; sleep 1'
      
      	# perf report --show-total-period
      		100.00%            10     perl  libc-2.8.so    [.] syscall
      
      Before this patch:
      
      	# cat /sys/kernel/debug/tracing/uprobe_profile
      		/lib/libc.so.6 syscall				79291
      
      A huge ->nrhit == 79291 reflects the fact that the background process
      530 constantly hits this breakpoint too, even if doesn't contribute to
      the output.
      
      After the patch:
      
      	# cat /sys/kernel/debug/tracing/uprobe_profile
      		/lib/libc.so.6 syscall				10
      
      This shows that only the target process was punished by int3.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      31ba3348
    • O
      uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's · 736288ba
      Oleg Nesterov 提交于
      Introduce "struct trace_uprobe_filter" which records the "active"
      perf_event's attached to ftrace_event_call. For the start we simply
      use list_head, we can optimize this later if needed. For example, we
      do not really need to record an event with ->parent != NULL, we can
      rely on parent->child_list. And we can certainly do some optimizations
      for the case when 2 events have the same ->tp_target or tp_target->mm.
      
      Change trace_uprobe_register() to process TRACE_REG_PERF_OPEN/CLOSE
      and add/del this perf_event to the list.
      
      We can probably avoid any locking, but lets start with the "obvioulsy
      correct" trace_uprobe_filter->rwlock which protects everything.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      736288ba
    • O
      uprobes/perf: Always increment trace_uprobe->nhit · 1b47aefd
      Oleg Nesterov 提交于
      Move tu->nhit++ from uprobe_trace_func() to uprobe_dispatcher().
      
      ->nhit counts how many time we hit the breakpoint inserted by this
      uprobe, we do not want to loose this info if uprobe was enabled by
      sys_perf_event_open().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      1b47aefd
    • O
      uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe · a932b738
      Oleg Nesterov 提交于
      trace_uprobe->consumer and "struct uprobe_trace_consumer" add the
      unnecessary indirection and complicate the code for no reason.
      
      This patch simply embeds uprobe_consumer into "struct trace_uprobe",
      all other changes only fix the compilation errors.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      a932b738
    • O
      uprobes/tracing: Introduce is_trace_uprobe_enabled() · b64b0077
      Oleg Nesterov 提交于
      probe_event_enable/disable() check tu->consumer != NULL to avoid the
      wrong uprobe_register/unregister().
      
      We are going to kill this pointer and "struct uprobe_trace_consumer",
      so we add the new helper, is_trace_uprobe_enabled(), which can rely
      on TP_FLAG_TRACE/TP_FLAG_PROFILE instead.
      
      Note: the current logic doesn't look optimal, it is not clear why
      TP_FLAG_TRACE/TP_FLAG_PROFILE are mutually exclusive, we will probably
      change this later.
      
      Also kill the unused TP_FLAG_UPROBE.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      b64b0077
    • O
      uprobes/tracing: Ensure inode != NULL in create_trace_uprobe() · 7e4e28c5
      Oleg Nesterov 提交于
      probe_event_enable/disable() check tu->inode != NULL at the start.
      This is ugly, if igrab() can fail create_trace_uprobe() should not
      succeed and "postpone" the failure.
      
      And S_ISREG(inode->i_mode) check added by d24d7dbf is not safe.
      
      Note: alloc_uprobe() should probably check igrab() != NULL as well.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      7e4e28c5
    • O
      uprobes/tracing: Fully initialize uprobe_trace_consumer before uprobe_register() · 4161824f
      Oleg Nesterov 提交于
      probe_event_enable() does uprobe_register() and only after that sets
      utc->tu and tu->consumer/flags. This can race with uprobe_dispatcher()
      which can miss these assignments or see them out of order. Nothing
      really bad can happen, but this doesn't look clean/safe.
      
      And this does not allow to use uprobe_consumer->filter() we are going
      to add, it is called by uprobe_register() and it needs utc->tu.
      
      Change this code to initialize everything before uprobe_register(), and
      reset tu->consumer/flags if it fails. We can't race with event_disable(),
      the caller holds event_mutex, and if we could the code would be wrong
      anyway.
      
      In fact I think uprobe_trace_consumer should die, it buys nothing but
      complicates the code. We can simply add uprobe_consumer into trace_uprobe.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      4161824f
    • O
      uprobes/tracing: Fix dentry/mount leak in create_trace_uprobe() · 84d7ed79
      Oleg Nesterov 提交于
      create_trace_uprobe() does kern_path() to find ->d_inode, but forgets
      to do path_put(). We can do this right after igrab().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      84d7ed79
    • O
      uprobes: Change handle_swbp() to expose bp_vaddr to handler_chain() · 74e59dfc
      Oleg Nesterov 提交于
      Change handle_swbp() to set regs->ip = bp_vaddr in advance, this is
      what consumer->handler() needs but uprobe_get_swbp_addr() is not
      exported.
      
      This also simplifies the code and makes it more consistent across
      the supported architectures. handle_swbp() becomes the only caller
      of uprobe_get_swbp_addr().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      74e59dfc
    • O
      uprobes: Kill uprobe_consumer->filter() · fe20d71f
      Oleg Nesterov 提交于
      uprobe_consumer->filter() is pointless in its current form, kill it.
      
      We will add it back, but with the different signature/semantics. Perhaps
      we will even re-introduce the callsite in handler_chain(), but not to
      just skip uc->handler().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      fe20d71f
  3. 22 1月, 2013 1 次提交
  4. 18 12月, 2012 1 次提交
  5. 01 11月, 2012 1 次提交
  6. 25 10月, 2012 1 次提交
  7. 31 7月, 2012 1 次提交
  8. 07 5月, 2012 1 次提交
    • S
      tracing: Provide trace events interface for uprobes · f3f096cf
      Srikar Dronamraju 提交于
      Implements trace_event support for uprobes. In its current form
      it can be used to put probes at a specified offset in a file and
      dump the required registers when the code flow reaches the
      probed address.
      
      The following example shows how to dump the instruction pointer
      and %ax a register at the probed text address.  Here we are
      trying to probe zfree in /bin/zsh:
      
       # cd /sys/kernel/debug/tracing/
       # cat /proc/`pgrep  zsh`/maps | grep /bin/zsh | grep r-xp
       00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
       # objdump -T /bin/zsh | grep -w zfree
       0000000000446420 g    DF .text  0000000000000012  Base
       zfree # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
       # cat uprobe_events
       p:uprobes/p_zsh_0x46420 /bin/zsh:0x0000000000046420
       # echo 1 > events/uprobes/enable
       # sleep 20
       # echo 0 > events/uprobes/enable
       # cat trace
       # tracer: nop
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
                    zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120411103043.GB29437@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f3f096cf