1. 13 Dec, 2009 1 commit
  2. 02 Dec, 2009 6 commits
  3. 26 Nov, 2009 14 commits
    • tracepoint: Add signal loss events · ba005e1f
      Committed by Masami Hiramatsu
      Add signal_overflow_fail and signal_lose_info tracepoints
      for signal-lost events.
      
      Changes in v3:
       - Add docbook style comments
      
      Changes in v2:
       - Use siginfo string macro
      Suggested-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      LKML-Reference: <20091124215658.30449.9934.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ba005e1f
    • tracepoint: Add signal deliver event · f9d4257e
      Committed by Masami Hiramatsu
      Add a tracepoint where a process gets a signal. This tracepoint
      shows signal-number, sa-handler and sa-flag.
      
      Changes in v3:
       - Add docbook style comments
      
      Changes in v2:
       - Add siginfo argument
       - Fix comment
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Reviewed-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      LKML-Reference: <20091124215651.30449.20926.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f9d4257e
    • tracepoint: Move signal sending tracepoint to events/signal.h · d1eb650f
      Committed by Masami Hiramatsu
      Move signal sending event to events/signal.h. This patch also
      renames sched_signal_send event to signal_generate.
      
      Changes in v4:
       - Fix a typo of task_struct pointer.
      
      Changes in v3:
       - Add docbook style comments
      
      Changes in v2:
       - Add siginfo argument
       - Add siginfo storing macro
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Reviewed-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      LKML-Reference: <20091124215645.30449.60208.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d1eb650f
    • tracing: Restore original format of sched events · 470dda74
      Committed by Li Zefan
      The original format for sched_stat_iowait and sched_stat_sleep:
      
        $ cat events/sched/sched_stat_iowait/format
        ...
        print fmt: "comm=%s pid=%d delay=%Lu [ns]", ...
        $ cat events/sched/sched_stat_sleep/format
        ...
        print fmt: "comm=%s pid=%d delay=%Lu [ns]", ...
      
      But commit 75ec29ab
      ("tracing: Convert some sched trace events to DEFINE_EVENT and
      _PRINT") broke the format:
      
        $ cat events/sched/sched_stat_iowait/format
        print fmt: "task: %s:%d iowait: %Lu [ns]", ...
        $ cat events/sched/sched_stat_sleep/format
        print fmt: "task: %s:%d sleep: %Lu [ns]", ...
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E2951.9050800@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      470dda74
    • tracing: Convert some ext4 events to DEFINE_TRACE · b5eb34c3
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
       294695    6104     340  301139   49853 fs/ext4/ext4.o.old
       289983    6104     324  296411   485db fs/ext4/ext4.o
      
      5 events are converted:
      
        ext4__write_begin: ext4_write_begin, ext4_da_write_begin
        ext4__write_end: ext4_{ordered, writeback, journalled}_write_end
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E2938.2040708@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b5eb34c3
    • tracing: Convert some jbd2 events to DEFINE_EVENT · 071688f3
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        34903    1693     448   37044    90b4 fs/jbd2/journal.o.old
        31931    1693     416   34040    84f8 fs/jbd2/journal.o
      
      Four events are converted:
      
        jbd2_commit: jbd2_start_commit,
                     jbd2_commit_{locking, flushing, logging}
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E290F.7030909@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      071688f3
    • tracing: Convert some block events to DEFINE_EVENT · 77ca1e02
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        53570    3284     184   57038    dece block/blk-core.o.old
        43702    3284     144   47130    b81a block/blk-core.o
      
      12 events are converted:
      
        block_rq: block_rq_insert, block_rq_issue
        block_rq_with_error: block_rq_{abort, requeue, complete}
        block_bio: block_bio_{backmerge, frontmerge, queue}
        block_get_rq: block_getrq, block_sleeprq
        block_unplug: block_unplug_timer, block_unplug_io
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E28E6.7060609@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      77ca1e02
    • tracing: Convert some power events to DEFINE_EVENT · 7703466b
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
         4312     524      12    4848    12f0 kernel/trace/power-traces.o.old
         3455     524       8    3987     f93 kernel/trace/power-traces.o
      
      Two events are converted:
      
        power: power_start, power_frequency
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <4B0E28C2.1090906@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7703466b
    • tracing: Convert some workqueue events to DEFINE_EVENT · 382ece71
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        13171     800      72   14043    36db kernel/workqueue.o.old
        12243     800      68   13111    3337 kernel/workqueue.o
      
      Two events are converted:
      
        workqueue: workqueue_insertion, workqueue_execution
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E289F.5010104@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      382ece71
    • tracing: Convert softirq events to DEFINE_EVENT · c467307c
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        12781     952      36   13769    35c9 kernel/softirq.o.old
        11981     952      32   12965    32a5 kernel/softirq.o
      
      Two events are converted:
      
        softirq: softirq_entry, softirq_exit
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E287F.4030708@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c467307c
    • tracing: Convert some kmem events to DEFINE_EVENT · 53d0422c
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
       333987   69800   27228  431015   693a7 mm/built-in.o.old
       330030   69800   27228  427058   68432 mm/built-in.o
      
      8 events are converted:
      
        kmem_alloc: kmalloc, kmem_cache_alloc
        kmem_alloc_node: kmalloc_node, kmem_cache_alloc_node
        kmem_free: kfree, kmem_cache_free
        mm_page: mm_page_alloc_zone_locked, mm_page_pcpu_drain
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      53d0422c
    • tracing: Convert module refcnt events to DEFINE_EVENT · 925684d6
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        29854    1980     128   31962    7cda kernel/module.o.old
        28750    1980     128   30858    788a kernel/module.o
      
      Two events are converted:
      
        module_refcnt: module_get, module_put
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E283B.3010508@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      925684d6
    • events: Rename TRACE_EVENT_TEMPLATE() to DECLARE_EVENT_CLASS() · 091ad365
      Committed by Ingo Molnar
      It is not quite obvious at first sight what TRACE_EVENT_TEMPLATE
      does: does it define an event as well beyond defining a template?
      
      To clarify this, rename it to DECLARE_EVENT_CLASS, which follows
      the various 'DECLARE_*()' idioms we already have in the kernel:
      
        DECLARE_EVENT_CLASS(class)
      
          DEFINE_EVENT(class, event1)
          DEFINE_EVENT(class, event2)
          DEFINE_EVENT(class, event3)
      
      To complete this logic we should also rename TRACE_EVENT() to:
      
        DEFINE_SINGLE_EVENT(single_event)
      
      ... but in a more quiet moment of the kernel cycle.
      
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      091ad365
    • tracing: Separate raw syscall from syscall tracer · b8007ef7
      Committed by Lai Jiangshan
      The current syscall tracer mixes raw syscalls and real syscalls.
      
      echo 1 > events/syscalls/enable
      And we get these from the output:
      
      (XXXX stands in for "            grep-20914 [001] 588211.446347" etc.)
      
      XXXX: sys_read(fd: 3, buf: 80609a8, count: 7000)
      XXXX: sys_enter: NR 3 (3, 80609a8, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      XXXX: sys_read(fd: 3, buf: 8060ae0, count: 7000)
      XXXX: sys_enter: NR 3 (3, 8060ae0, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      
      There are 2 drawbacks here.
      A) two almost identical records are saved in ringbuffer
         when a syscall enters or exits. (4 records for every syscall)
         This wastes precious space in the ring buffer.
      B) the lines including "sys_enter/sys_exit" produce
         hardly any useful information in the output (no labels).
      
      The user can use this method to prevent these drawbacks:
      echo 1 > events/syscalls/enable
      echo 0 > events/syscalls/sys_enter/enable
      echo 0 > events/syscalls/sys_exit/enable
      
      But this is not user friendly. So we separate raw syscall
      from syscall tracer.
      
      After this fix is applied:
      syscall tracer's output (echo 1 > events/syscalls/enable):
      
      XXXX: sys_read(fd: 3, buf: bfe87d88, count: 200)
      XXXX: sys_read -> 0x200
      XXXX: sys_fstat64(fd: 3, statbuf: bfe87c98)
      XXXX: sys_fstat64 -> 0x0
      XXXX: sys_close(fd: 3)
      
      raw syscall tracer's output (echo 1 > events/raw_syscalls/enable):
      
      XXXX: sys_enter: NR 175 (0, bf92bf18, bf92bf98, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 175 (2, bf92bf98, 0, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 3 (9, bf927f9c, 4000, b77e2518, b77dce60, bf92bff8)
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4AEFC37C.5080609@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      b8007ef7
  4. 25 Nov, 2009 3 commits
    • tracing: Convert some sched trace events to DEFINE_EVENT and _PRINT · 75ec29ab
      Committed by Steven Rostedt
      Converting some of the scheduler trace events to use the
      TRACE_EVENT_TEMPLATE, DEFINE_EVENT and DEFINE_EVENT_PRINT helped to
      save some space:
      
      $ size kernel/sched.o-*
         text	   data	    bss	    dec	    hex	filename
        79299	   6776	   2520	  88595	  15a13	kernel/sched.o-notrace
       101941	  11896	   2584	 116421	  1c6c5	kernel/sched.o-templ
       104779	  11896	   2584	 119259	  1d1db	kernel/sched.o-trace
      
      sched.o-notrace is without any tracepoints compiled
      sched.o-templ is with this patch
      sched.o-trace is the tracepoints before this patch
      
      The trace events converted to DEFINE_EVENT:
      
      sched_wakeup, sched_wakeup_new, sched_process_free, sched_process_exit,
      and sched_stat_wait.
      
      The trace events converted to DEFINE_EVENT_PRINT:
      
      sched_stat_sleep and sched_stat_iowait.
      
      Note, since the TRACE_EVENT_TEMPLATE always uses a print, the
      sched_stat_wait print format is defined in the template and this
      template is used by sched_stat_sleep and sched_stat_iowait. But the
      latter two override the print format.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      75ec29ab
    • tracing: Create new DEFINE_EVENT_PRINT · e5bc9721
      Committed by Steven Rostedt
      After creating the TRACE_EVENT_TEMPLATE I started to look at other
      trace points to see what duplication was made. I noticed that there
      are several trace points where they are almost identical except for
      the name and the output format. Since TRACE_EVENT_TEMPLATE was successful
      in bringing down the size of trace events, I added a DEFINE_EVENT_PRINT.
      
      DEFINE_EVENT_PRINT is used just like DEFINE_EVENT is. That is, the
      DEFINE_EVENT_PRINT also uses a TRACE_EVENT_TEMPLATE, but it allows the
      developer to overwrite the print format. If there are two or more
      TRACE_EVENTS that are identical except for the name and print, then
      they can be converted to use a TRACE_EVENT_TEMPLATE. Since the
      TRACE_EVENT_TEMPLATE already does the print output, the first trace event
      would have its print format held in the TRACE_EVENT_TEMPLATE and
      be defined with a DEFINE_EVENT. The rest will use the DEFINE_EVENT_PRINT
      and override the print format.
      
      Converting the sched trace points to both DEFINE_EVENT and
      DEFINE_EVENT_PRINT. Five were converted to DEFINE_EVENT and two were
      converted to DEFINE_EVENT_PRINT.
      
      I was able to get the following:
      
      $ size kernel/sched.o-*
         text	   data	    bss	    dec	    hex	filename
        79299	   6776	   2520	  88595	  15a13	kernel/sched.o-notrace
       101941	  11896	   2584	 116421	  1c6c5	kernel/sched.o-templ
       104779	  11896	   2584	 119259	  1d1db	kernel/sched.o-trace
      
      sched.o-notrace is the scheduler compiled with no trace points.
      sched.o-templ is with the use of DEFINE_EVENT and DEFINE_EVENT_PRINT
      sched.o-trace is the current trace events.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      e5bc9721
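
The print override described in this commit can be illustrated with a small userspace sketch. This is not the kernel macro: `struct stat_rec`, `SKETCH_DEFINE_EVENT_PRINT` and the `print_*` functions are hypothetical names, and `%llu` replaces the kernel's `%Lu` so that the code is standard C. The record layout is shared by both events, as it would be through a template, while each event supplies only its own format string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Record layout shared by every event of the (hypothetical) template. */
struct stat_rec {
    const char *comm;
    int pid;
    unsigned long long delay;
};

/*
 * Hypothetical stand-in for DEFINE_EVENT_PRINT: the record layout is
 * reused, only the print format differs per event.
 */
#define SKETCH_DEFINE_EVENT_PRINT(name, fmt)                                 \
    static int print_##name(char *buf, size_t len, const struct stat_rec *r) \
    {                                                                        \
        assert(buf != NULL && r != NULL);                                    \
        return snprintf(buf, len, fmt, r->comm, r->pid, r->delay);           \
    }

SKETCH_DEFINE_EVENT_PRINT(sched_stat_sleep,  "task: %s:%d sleep: %llu [ns]")
SKETCH_DEFINE_EVENT_PRINT(sched_stat_iowait, "task: %s:%d iowait: %llu [ns]")
```

Calling `print_sched_stat_sleep()` and `print_sched_stat_iowait()` on the same record yields two differently labelled lines from one shared structure, which is the effect the commit attributes to DEFINE_EVENT_PRINT.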
    • tracing: Create new TRACE_EVENT_TEMPLATE · ff038f5c
      Committed by Steven Rostedt
      There are some places in the kernel that define several tracepoints and
      they are all identical besides the name. The code to enable, disable and
      record is created for every trace point even if most of the code is
      identical.
      
      This patch adds TRACE_EVENT_TEMPLATE that lets the developer create
      a template TRACE_EVENT and create trace points with DEFINE_EVENT, which
      is based off of a given template. Each trace point used by this
      will share most of the code, and bring down the size of the kernel
      when there are several duplicate events.
      
      Usage is:
      
      TRACE_EVENT_TEMPLATE(name, proto, args, tstruct, assign, print);
      
      Which would be the same as defining a normal TRACE_EVENT.
      
      To create the trace events that the trace points will use, call:

      DEFINE_EVENT(template, name, proto, args);

      The template is the name of the TRACE_EVENT_TEMPLATE to use. The name is
      the name of the trace point. The parameters proto and args must be the
      same as the proto and args of the template. If they are not the same,
      then a compile error will result. I tried hard to remove this duplication,
      but the C preprocessor is not powerful enough (or my CPP magic
      experience points are not at a high enough level) to avoid needing them.
      
      A lot of trace events are coming in with new XFS development. Most of
      the trace points are identical except for the name. The following shows
      the advantage of having TRACE_EVENT_TEMPLATE:
      
      $ size fs/xfs/xfs.o.*
          text          data     bss     dec     hex filename
        452114          2788    3520  458422   6feb6 fs/xfs/xfs.o.old
        638482         38116    3744  680342   a6196 fs/xfs/xfs.o.template
        996954         38116    4480 1039550   fdcbe fs/xfs/xfs.o.trace
      
      xfs.o.old is without any tracepoints.
      xfs.o.template uses the new TRACE_EVENT_TEMPLATE.
      xfs.o.trace uses the current TRACE_EVENT macros.
      Requested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ff038f5c
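
The template/event split described in this commit can be sketched in plain userspace C. This is only an illustration of the code-sharing idea, not the kernel implementation: `SKETCH_EVENT_TEMPLATE`, `SKETCH_DEFINE_EVENT`, the generated function names, and the `softirq` example with its `vec=%d` format are all assumptions chosen for the sketch. The formatting code is emitted once per template, and each event adds only a thin wrapper that passes its own name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for TRACE_EVENT_TEMPLATE: the shared formatting
 * code is generated once per template.
 */
#define SKETCH_EVENT_TEMPLATE(tmpl, fmt)                               \
    static int tmpl##_emit(char *buf, size_t len,                      \
                           const char *event, int value)               \
    {                                                                  \
        assert(buf != NULL && event != NULL);                          \
        return snprintf(buf, len, "%s: " fmt, event, value);           \
    }

/* Hypothetical stand-in for DEFINE_EVENT: a thin per-event wrapper. */
#define SKETCH_DEFINE_EVENT(tmpl, name)                                \
    static int trace_##name(char *buf, size_t len, int value)          \
    {                                                                  \
        return tmpl##_emit(buf, len, #name, value);                    \
    }

SKETCH_EVENT_TEMPLATE(softirq, "vec=%d")
SKETCH_DEFINE_EVENT(softirq, softirq_entry)
SKETCH_DEFINE_EVENT(softirq, softirq_exit)
```

Adding a third event to the same template costs one more wrapper rather than a second copy of the formatting code, which mirrors the size savings reported in the tables above.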
  5. 23 Nov, 2009 1 commit
  6. 22 Nov, 2009 1 commit
    • tracing: Use the perf recursion protection from trace event · ce71b9df
      Committed by Frederic Weisbecker
      When we commit a trace to perf, we first check if we are
      recursing in the same buffer so that we don't mess-up the buffer
      with a recursing trace. But later on, we do the same check from
      perf to avoid commit recursion. The recursion check is desired
      early before we touch the buffer but we want to do this check
      only once.
      
      Then export the recursion protection from perf and use it from
      the trace events before submitting a trace.
      
      v2: Put appropriate Reported-by tag
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1258864015-10579-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ce71b9df
  7. 19 Nov, 2009 1 commit
  8. 14 Nov, 2009 1 commit
    • tracing: Fix event format export · 811cb50b
      Committed by Johannes Berg
      For some reason the export of the event print format to userspace
      uses '#fmt' which breaks if the format string is anything but a plain
      string, for example if it is built with macros then the macro names
      are exported instead of their contents.
      
      Use
      	"\"%s\"", fmt
      instead of
      	"%s", #fmt
      to export the string and not the way it is built.
      
      For example, in net/mac80211/driver-trace.h for the trace event drv_start
      there is:
      
              TP_printk(
                      LOCAL_PR_FMT, LOCAL_PR_ARG
              )
      
      Which used to produce:
      
       print fmt: LOCAL_PR_FMT, REC->wiphy_name
      
      Now produces:
      
       print fmt: "%s", REC->wiphy_name
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      LKML-Reference: <20091113224009.GB23942@elte.hu>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      811cb50b
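
The difference between `#fmt` and `fmt` in this commit comes down to how the C preprocessor's stringification operator treats macro arguments. A minimal userspace sketch follows, assuming a stand-in definition of `LOCAL_PR_FMT` as `"%s"` and two hypothetical helper macros, `EXPORT_FMT_OLD` and `EXPORT_FMT_NEW`, which are not the kernel's own export code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for a format string that is built from a macro. */
#define LOCAL_PR_FMT "%s"

/* Old behaviour: '#fmt' stringifies the argument as written, so the
 * macro *name* is exported instead of its expansion. */
#define EXPORT_FMT_OLD(buf, len, fmt) snprintf((buf), (len), "%s", #fmt)

/* New behaviour: the expanded string is passed as data and the
 * surrounding quotes are added explicitly. */
#define EXPORT_FMT_NEW(buf, len, fmt) snprintf((buf), (len), "\"%s\"", fmt)
```

With `LOCAL_PR_FMT` as the argument, the old macro writes the text `LOCAL_PR_FMT` into the buffer, while the new one writes `"%s"` including the quotes, matching the before and after output quoted in the commit message.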
  9. 13 Nov, 2009 1 commit
  10. 08 Nov, 2009 1 commit
    • tracing, perf_events: Protect the buffer from recursion in perf · 444a2a3b
      Committed by Frederic Weisbecker
      While tracing using events with perf, if one enables the
      lockdep:lock_acquire event, it will infect every other perf
      trace event.
      
      Basically, you can enable whatever set of trace events through
      perf but if this event is part of the set, the only result we
      can get is a long list of lock_acquire events of rcu read lock,
      and only that.
      
      This is because of a recursion inside perf.
      
      1) When a trace event is triggered, it will fill a per cpu
         buffer and submit it to perf.
      
      2) Perf will commit this event but will also protect some data
         using rcu_read_lock
      
      3) A recursion appears: rcu_read_lock triggers a lock_acquire
         event that will fill the per cpu event and then submit the
         buffer to perf.
      
      4) Perf detects a recursion and ignores it
      
      5) Perf continues its work on the previous event, but its buffer
         has been overwritten by the lock_acquire event, it has then
         been turned into a lock_acquire event of rcu read lock
      
      Such a scenario also happens with lock_release with
      rcu_read_unlock().

      We could turn the rcu_read_lock() into __rcu_read_lock() to drop
      the lock debugging from the perf fast path, but that would make us
      lose the rcu debugging and would not prevent other possible kinds
      of recursion from perf in the future.
      
      This patch adds a recursion protection based on a counter on the
      perf trace per cpu buffers to solve the problem.
      
      -v2: Fixed lost whitespace, added reviewed-by tag
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1257477185-7838-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      444a2a3b
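
The counter-based protection described in this commit can be sketched in single-threaded userspace C. This is a simplified model under stated assumptions: the real code works on per-cpu buffers inside the kernel, whereas `struct trace_buf`, `trace_submit()` and `commit_taking_lock()` are hypothetical names invented here. The point it shows is that the recursion check happens before the buffer is touched, so a nested event is dropped instead of overwriting the event in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical stand-in for one per-cpu perf trace buffer. */
struct trace_buf {
    int  recursion;        /* non-zero while an event is being submitted */
    int  dropped;          /* nested events refused by the guard */
    char data[BUF_SIZE];
};

typedef void (*commit_fn)(struct trace_buf *buf);

/* Record an event; returns false if it was dropped due to recursion. */
static bool trace_submit(struct trace_buf *buf, const char *event,
                         commit_fn commit)
{
    assert(buf != NULL && event != NULL);

    /* Check the counter before writing, so the buffer stays intact. */
    if (buf->recursion > 0) {
        buf->dropped++;
        return false;
    }
    buf->recursion++;

    snprintf(buf->data, sizeof(buf->data), "%s", event);
    if (commit)
        commit(buf);       /* may itself trigger another trace event */

    buf->recursion--;
    return true;
}

/* Models the commit path taking rcu_read_lock(), which fires lock_acquire. */
static void commit_taking_lock(struct trace_buf *buf)
{
    trace_submit(buf, "lock_acquire", NULL);
}
```

Submitting an event with `commit_taking_lock` as the commit callback leaves the original event in `data` and increments `dropped` once, instead of turning the record into a lock_acquire event as in step 5 above.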
  11. 15 Oct, 2009 1 commit
    • events: Harmonize event field names and print output names · 434a83c3
      Committed by Ingo Molnar
      Now that we can filter based on fields via perf record, people
      will start using filter expressions and will expect them to
      be obvious.
      
      The primary way to see which fields are available is by looking
      at the trace output, such as:
      
        gcc-18676 [000]   343.011728: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.012727: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.032692: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.033690: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.034687: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.035686: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.036684: irq_handler_entry: irq=0 handler=timer
      
      While 'irq==0' filters work, the 'handler==<x>' filter expression
      does not work:
      
        $ perf record -R -f -a -e irq:irq_handler_entry --filter handler=timer sleep 1
         Error: failed to set filter with 22 (Invalid argument)
      
      The problem is that while an 'irq' field exists and is recognized
      as a filter field - 'handler' does not exist - its name is 'name'
      in the output.
      
      To solve this, we need to synchronize the printout and the field
      names, wherever possible.
      
      In cases where the printout prints a non-field, we enclose
      that information in square brackets, such as:
      
        perf-1380  [013]   724.903505: softirq_exit: vec=9 [action=RCU]
        perf-1380  [013]   724.904482: softirq_exit: vec=1 [action=TIMER]
      
      This way users can use filter expressions more intuitively: all
      fields that show up as 'primary' (non-bracketed) information are
      filterable.
      
      This patch harmonizes the field names for all irq, bkl, power,
      sched and timer events.
      
      We might in fact think about dropping the print format bit of
      generic tracepoints altogether, and just print the fields that
      are being recorded.
      
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      434a83c3
  12. 14 Oct, 2009 1 commit
    • tracing: Move syscalls metadata handling from arch to core · c44fc770
      Committed by Frederic Weisbecker
      Most of the syscalls metadata processing is done from arch.
      But these operations are mostly generic across archs, especially now
      that we have a common variable name that expresses the number of
      syscalls supported by an arch: NR_syscalls. The only remaining bit
      that needs to reside in arch is the syscall nr to addr translation.
      
      v2: Compare syscalls symbols only after the "sys" prefix so that we
          avoid spurious mismatches with archs that have syscalls wrappers,
          in which case syscalls symbols have "SyS" prefixed aliases.
          (Reported by: Heiko Carstens)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      c44fc770
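
The v2 note in this commit, comparing syscall symbols only after the "sys" prefix, can be sketched as a small helper. This is an illustration rather than the kernel function: `syscall_name_match()` is a hypothetical name, and the length check is an added safety assumption for userspace use.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Compare two syscall symbol names while skipping the three-character
 * prefix, so that "SyS_read" (wrapper alias) matches "sys_read".
 */
static bool syscall_name_match(const char *sym, const char *name)
{
    assert(sym != NULL && name != NULL);

    /* Both names must be long enough to carry the prefix. */
    if (strlen(sym) < 3 || strlen(name) < 3)
        return false;

    return strcmp(sym + 3, name + 3) == 0;
}
```

With this comparison, `syscall_name_match("SyS_read", "sys_read")` is true, while different syscalls such as `sys_read` and `sys_write` still do not match.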
  13. 13 Oct, 2009 1 commit
    • perf_event, x86, mce: Use TRACE_EVENT() for MCE logging · 8968f9d3
      Committed by Hidetoshi Seto
      This approach is the first baby step towards solving many of the
      structural problems the x86 MCE logging code is having today:
      
       - It has a private ring-buffer implementation that has a number
         of limitations and has been historically fragile and buggy.
      
       - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE
         specific. /dev/mcelog is not part of any larger logging
         framework and hence has remained on the fringes for many years.
      
       - The MCE logging code is still very unclean partly due to its ABI
         limitations. Fields are being reused for multiple purposes, and
         the whole message structure is limited and x86 specific to begin
         with.
      
      All in all, the x86 tree would like to move away from this private
      implementation of an event logging facility to a broader framework.
      
      By using perf events we gain the following advantages:
      
       - Multiple user-space agents can access MCE events. We can have an
         mcelog daemon running but also a system-wide tracer capturing
         important events in flight-recorder mode.
      
       - Sampling support: the kernel and the user-space call-chain of MCE
         events can be stored and analyzed as well. This way actual patterns
         of bad behavior can be matched to precisely what kind of activity
         happened in the kernel (and/or in the app) around that moment in
         time.
      
       - Coupling with other hardware and software events: the PMU can track a
         number of other anomalies - monitoring software might choose to
         monitor those plus the MCE events as well - in one coherent stream of
         events.
      
       - Discovery of MCE sources - tracepoints are enumerated and tools can
         act upon the existence (or non-existence) of various channels of MCE
         information.
      
       - Filtering support: we just subscribe to and act upon the events we
         are interested in. Then even on a per event source basis there are
         in-kernel filter expressions available that can restrict the amount
         of data that hits the event channel.

       - Arbitrarily deep per cpu buffering of events - we can buffer 32
         entries or we can buffer as much as we want, as long as we have
         the RAM.
      
       - An NMI-safe ring-buffer implementation - mappable to user-space.
      
       - Built-in support for timestamping of events, PID markers, CPU
         markers, etc.
      
       - A rich ABI accessible over system call interface. Per cpu, per task
         and per workload monitoring of MCE events can be done this way. The
         ABI itself has a nice, meaningful structure.
      
       - Extensible ABI: new fields can be added without breaking tooling.
         New tracepoints can be added as the hardware side evolves. There
         are various parsers that can be used.

       - Lots of scheduling/buffering/batching modes of operation for MCE
         events. poll() support. mmap() support. read() support. You name it.
      
       - Rich tooling support: even without any MCE specific extensions added
         the 'perf' tool today offers various views of MCE data: perf report,
         perf stat, perf trace can all be used to view logged MCE events and
         perhaps correlate them to certain user-space usage patterns. But it
         can be used directly as well, for user-space agents and policy action
         in mcelog, etc.
      
      With this we hope to achieve significant code cleanup and feature
      improvements in the MCE code, and we hope to be able to drop the
      /dev/mcelog facility in the end.
      
      This patch is just a plain dumb dump of mce_log() records to
      the tracepoints / perf events framework - a first proof of
      concept step.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8968f9d3
  14. 06 Oct, 2009 1 commit
  15. 02 Oct, 2009 2 commits
  16. 30 Sep, 2009 4 commits