1. 09 May, 2010 3 commits
    • tracing: Factorize lock events in a lock class · 2c193c73
      Committed by Frederic Weisbecker
      lock_acquired, lock_contended and lock_release now share the
      same prototype and format. Let's factorize them into a lock
      event class.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      2c193c73
    • tracing: Drop the nested field from lock_release event · 93135439
      Committed by Frederic Weisbecker
      Drop the nested field as we don't use it. Every nested state can
      already be computed by a state machine during post-processing.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      93135439
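The post-processing state machine mentioned in 93135439 might look roughly like the Python sketch below. It assumes, as a simplification the commit does not state, that a release counts as "nested" when the released lock is the most recently acquired lock still held by the task. The event tuple layout and the name `annotate_nested` are hypothetical.

```python
from collections import defaultdict


def annotate_nested(events):
    """Recompute a 'nested' flag for every release event.

    `events` is an iterable of (task, kind, lock) tuples in trace order,
    where kind is "acquire" or "release" (hypothetical layout).
    Assumption: a release is nested when it releases the most recently
    acquired lock still held by that task.
    """
    held = defaultdict(list)  # task -> stack of currently held locks
    result = []
    for task, kind, lock in events:
        stack = held[task]
        if kind == "acquire":
            stack.append(lock)
        elif kind == "release":
            nested = bool(stack) and stack[-1] == lock
            if lock in stack:
                # Drop the most recent acquisition of this lock.
                idx = len(stack) - 1 - stack[::-1].index(lock)
                del stack[idx]
            result.append((task, lock, nested))
    return result


trace = [
    ("A", "acquire", "l1"),
    ("A", "acquire", "l2"),
    ("A", "release", "l2"),  # l2 is on top of the stack
    ("A", "acquire", "l3"),
    ("A", "release", "l1"),  # l3 is still on top
    ("A", "release", "l3"),
]
print(annotate_nested(trace))
```

Keeping one stack per task means the flag is derived purely from event order, so the trace itself does not need to carry it.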
    • tracing: Drop lock_acquired waittime field · 883a2a31
      Committed by Frederic Weisbecker
      Drop the waittime field from the lock_acquired event; we can
      calculate it by subtracting the matching lock_acquire event
      timestamp from the lock_acquired one.
      
      It is not needed and wastes space in the traces.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      883a2a31
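The subtraction described in 883a2a31 can be done in a post-processing script. The sketch below is only an illustration: the tuple layout, the integer nanosecond timestamps and the name `wait_times` are assumptions, not part of any kernel or perf tool.

```python
def wait_times(events):
    """Pair lock_acquire with lock_acquired events and return wait times.

    `events` is an iterable of (timestamp_ns, task, event_name, lock)
    tuples in trace order (hypothetical layout).
    """
    pending = {}  # (task, lock) -> timestamp of the lock_acquire event
    waits = []
    for ts, task, name, lock in events:
        key = (task, lock)
        if name == "lock_acquire":
            pending[key] = ts
        elif name == "lock_acquired" and key in pending:
            # waittime = lock_acquired timestamp - lock_acquire timestamp
            waits.append((task, lock, ts - pending.pop(key)))
    return waits


sample = [
    (1000, "t1", "lock_acquire", "a"),
    (1200, "t2", "lock_acquire", "a"),
    (1500, "t1", "lock_acquired", "a"),
    (2100, "t2", "lock_acquired", "a"),
]
print(wait_times(sample))
```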
  2. 09 Mar, 2010 1 commit
  3. 01 Mar, 2010 2 commits
  4. 31 Jan, 2010 1 commit
  5. 10 Dec, 2009 1 commit
    • itimer: Fix the itimer trace print format · e9c0748b
      Committed by Thomas Gleixner
      Compiling powerpc64 results in:
      
      include/trace/events/timer.h:279: warning:
      format '%lu' expects type 'long unsigned int', but argument 4 has type 'cputime_t'
      ....
      
      cputime_t on power is u64, which triggers the above warning.
      
      Cast the cputime_t to unsigned long long and fix the print format
      string. That works on both 32 and 64 bit architectures.
      
      While at it, change the print format for long variables from %lu to %ld.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      e9c0748b
  6. 01 Jan, 2010 1 commit
  7. 23 Dec, 2009 1 commit
  8. 26 Nov, 2009 14 commits
    • tracepoint: Add signal loss events · ba005e1f
      Committed by Masami Hiramatsu
      Add signal_overflow_fail and signal_lose_info tracepoints
      for signal-lost events.
      
      Changes in v3:
       - Add docbook style comments
      
      Changes in v2:
       - Use siginfo string macro
      Suggested-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      LKML-Reference: <20091124215658.30449.9934.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ba005e1f
    • tracepoint: Add signal deliver event · f9d4257e
      Committed by Masami Hiramatsu
      Add a tracepoint where a process gets a signal. This tracepoint
      shows signal-number, sa-handler and sa-flag.
      
      Changes in v3:
       - Add docbook style comments
      
      Changes in v2:
       - Add siginfo argument
       - Fix comment
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Reviewed-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      LKML-Reference: <20091124215651.30449.20926.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f9d4257e
    • tracepoint: Move signal sending tracepoint to events/signal.h · d1eb650f
      Committed by Masami Hiramatsu
      Move signal sending event to events/signal.h. This patch also
      renames sched_signal_send event to signal_generate.
      
      Changes in v4:
       - Fix a typo of task_struct pointer.
      
      Changes in v3:
       - Add docbook style comments
      
      Changes in v2:
       - Add siginfo argument
       - Add siginfo storing macro
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Reviewed-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      LKML-Reference: <20091124215645.30449.60208.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d1eb650f
    • tracing: Restore original format of sched events · 470dda74
      Committed by Li Zefan
      The original format for sched_stat_iowait and sched_stat_sleep:
      
        $ cat events/sched/sched_stat_iowait/format
        ...
        print fmt: "comm=%s pid=%d delay=%Lu [ns]", ...
        $ cat events/sched/sched_stat_sleep/format
        ...
        print fmt: "comm=%s pid=%d delay=%Lu [ns]", ...
      
      But commit 75ec29ab
      ("tracing: Convert some sched trace events to DEFINE_EVENT and
      _PRINT") broke the format:
      
        $ cat events/sched/sched_stat_iowait/format
        print fmt: "task: %s:%d iowait: %Lu [ns]", ...
        $ cat events/sched/sched_stat_sleep/format
        print fmt: "task: %s:%d sleep: %Lu [ns]", ...
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E2951.9050800@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      470dda74
    • tracing: Convert some ext4 events to DEFINE_TRACE · b5eb34c3
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
       294695    6104     340  301139   49853 fs/ext4/ext4.o.old
       289983    6104     324  296411   485db fs/ext4/ext4.o
      
      Five events are converted:
      
        ext4__write_begin: ext4_write_begin, ext4_da_write_begin
        ext4__write_end: ext4_{ordered, writeback, journalled}_write_end
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E2938.2040708@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b5eb34c3
    • tracing: Convert some jbd2 events to DEFINE_EVENT · 071688f3
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        34903    1693     448   37044    90b4 fs/jbd2/journal.o.old
        31931    1693     416   34040    84f8 fs/jbd2/journal.o
      
      Four events are converted:
      
        jbd2_commit: jbd2_start_commit,
                     jbd2_commit_{locking, flushing, logging}
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E290F.7030909@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      071688f3
    • tracing: Convert some block events to DEFINE_EVENT · 77ca1e02
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        53570    3284     184   57038    dece block/blk-core.o.old
        43702    3284     144   47130    b81a block/blk-core.o
      
      12 events are converted:
      
        block_rq: block_rq_insert, block_rq_issue
        block_rq_with_error: block_rq_{abort, requeue, complete}
        block_bio: block_bio_{backmerge, frontmerge, queue}
        block_get_rq: block_getrq, block_sleeprq
        block_unplug: block_unplug_timer, block_unplug_io
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E28E6.7060609@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      77ca1e02
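The `size` tables quoted in these conversion commits can be compared mechanically. The sketch below computes the text-segment saving for the blk-core.o figures from 77ca1e02; the function name `text_savings` is hypothetical, and the code assumes exactly a header row followed by the old and new rows.

```python
def text_savings(size_output):
    """Return (old_text, new_text, saved_bytes) from two `size` rows.

    Expects the header row followed by the ".old" row and the new row,
    in the layout quoted in the commit message.
    """
    rows = [line.split() for line in size_output.strip().splitlines()[1:]]
    old_text, new_text = int(rows[0][0]), int(rows[1][0])
    return old_text, new_text, old_text - new_text


SIZE_OUTPUT = """
   text    data     bss     dec     hex filename
  53570    3284     184   57038    dece block/blk-core.o.old
  43702    3284     144   47130    b81a block/blk-core.o
"""

old, new, saved = text_savings(SIZE_OUTPUT)
print(f"text shrank by {saved} bytes ({100 * saved / old:.1f}%)")
```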
    • tracing: Convert some power events to DEFINE_EVENT · 7703466b
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
         4312     524      12    4848    12f0 kernel/trace/power-traces.o.old
         3455     524       8    3987     f93 kernel/trace/power-traces.o
      
      Two events are converted:
      
        power: power_start, power_frequency
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <4B0E28C2.1090906@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7703466b
    • tracing: Convert some workqueue events to DEFINE_EVENT · 382ece71
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        13171     800      72   14043    36db kernel/workqueue.o.old
        12243     800      68   13111    3337 kernel/workqueue.o
      
      Two events are converted:
      
        workqueue: workqueue_insertion, workqueue_execution
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E289F.5010104@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      382ece71
    • tracing: Convert softirq events to DEFINE_EVENT · c467307c
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        12781     952      36   13769    35c9 kernel/softirq.o.old
        11981     952      32   12965    32a5 kernel/softirq.o
      
      Two events are converted:
      
        softirq: softirq_entry, softirq_exit
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E287F.4030708@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c467307c
    • tracing: Convert some kmem events to DEFINE_EVENT · 53d0422c
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
       333987   69800   27228  431015   693a7 mm/built-in.o.old
       330030   69800   27228  427058   68432 mm/built-in.o
      
      8 events are converted:
      
        kmem_alloc: kmalloc, kmem_cache_alloc
        kmem_alloc_node: kmalloc_node, kmem_cache_alloc_node
        kmem_free: kfree, kmem_cache_free
        mm_page: mm_page_alloc_zone_locked, mm_page_pcpu_drain
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      53d0422c
    • tracing: Convert module refcnt events to DEFINE_EVENT · 925684d6
      Committed by Li Zefan
      Use DECLARE_EVENT_CLASS to remove duplicate code:
      
         text    data     bss     dec     hex filename
        29854    1980     128   31962    7cda kernel/module.o.old
        28750    1980     128   30858    788a kernel/module.o
      
      Two events are converted:
      
        module_refcnt: module_get, module_put
      
      No change in functionality.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E283B.3010508@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      925684d6
    • events: Rename TRACE_EVENT_TEMPLATE() to DECLARE_EVENT_CLASS() · 091ad365
      Committed by Ingo Molnar
      It is not quite obvious at first sight what TRACE_EVENT_TEMPLATE
      does: does it define an event as well beyond defining a template?
      
      To clarify this, rename it to DECLARE_EVENT_CLASS, which follows
      the various 'DECLARE_*()' idioms we already have in the kernel:
      
        DECLARE_EVENT_CLASS(class)
      
          DEFINE_EVENT(class, event1)
          DEFINE_EVENT(class, event2)
          DEFINE_EVENT(class, event3)
      
      To complete this logic we should also rename TRACE_EVENT() to:
      
        DEFINE_SINGLE_EVENT(single_event)
      
      ... but in a more quiet moment of the kernel cycle.
      
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      091ad365
    • tracing: Separate raw syscall from syscall tracer · b8007ef7
      Committed by Lai Jiangshan
      The current syscall tracer mixes raw syscalls and real syscalls.
      
      echo 1 > events/syscalls/enable
      And we get these from the output:
      
      (XXXX stands for "            grep-20914 [001] 588211.446347", etc.)
      
      XXXX: sys_read(fd: 3, buf: 80609a8, count: 7000)
      XXXX: sys_enter: NR 3 (3, 80609a8, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      XXXX: sys_read(fd: 3, buf: 8060ae0, count: 7000)
      XXXX: sys_enter: NR 3 (3, 8060ae0, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      
      There are two drawbacks here:
      A) Two almost identical records are saved in the ring buffer
         when a syscall enters or exits (four records for every syscall).
         This wastes precious space in the ring buffer.
      B) The lines containing "sys_enter/sys_exit" add hardly any
         useful information to the output (no labels).
      
      The user can avoid these drawbacks as follows:
      echo 1 > events/syscalls/enable
      echo 0 > events/syscalls/sys_enter/enable
      echo 0 > events/syscalls/sys_exit/enable
      
      But this is not user friendly. So we separate raw syscall
      from syscall tracer.
      
      After this fix is applied:
      syscall tracer's output (echo 1 > events/syscalls/enable):
      
      XXXX: sys_read(fd: 3, buf: bfe87d88, count: 200)
      XXXX: sys_read -> 0x200
      XXXX: sys_fstat64(fd: 3, statbuf: bfe87c98)
      XXXX: sys_fstat64 -> 0x0
      XXXX: sys_close(fd: 3)
      
      raw syscall tracer's output (echo 1 > events/raw_syscalls/enable):
      
      XXXX: sys_enter: NR 175 (0, bf92bf18, bf92bf98, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 175 (2, bf92bf98, 0, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 3 (9, bf927f9c, 4000, b77e2518, b77dce60, bf92bff8)
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4AEFC37C.5080609@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      b8007ef7
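The raw syscall records shown in b8007ef7 carry no labels, so a consumer has to pair each sys_enter with its sys_exit itself. The sketch below is one way to do that in Python. It assumes, based only on the quoted output, that arguments are printed in hexadecimal without a prefix and return values in decimal, and that all lines belong to one task; the name `pair_raw_syscalls` is hypothetical.

```python
import re

ENTER_RE = re.compile(r"sys_enter: NR (\d+) \(([^)]*)\)")
EXIT_RE = re.compile(r"sys_exit: NR (\d+) = (-?\d+)")


def pair_raw_syscalls(lines):
    """Pair each sys_enter record with the next sys_exit of the same NR.

    Assumes the lines come from a single task, so entries and exits
    alternate; a real trace would also need to key on the task.
    """
    pending = {}  # syscall number -> argument list of the open sys_enter
    calls = []
    for line in lines:
        m = ENTER_RE.search(line)
        if m:
            # Assumption: arguments are hexadecimal without a 0x prefix.
            pending[int(m.group(1))] = [int(a, 16) for a in m.group(2).split(", ")]
            continue
        m = EXIT_RE.search(line)
        if m and int(m.group(1)) in pending:
            nr = int(m.group(1))
            calls.append((nr, pending.pop(nr), int(m.group(2))))
    return calls


raw = [
    "XXXX: sys_enter: NR 175 (0, bf92bf18, bf92bf98, 8, b748cff4, bf92bef8)",
    "XXXX: sys_exit: NR 175 = 0",
    "XXXX: sys_enter: NR 3 (3, 80609a8, 7000, a, 1000, bfce8ef8)",
    "XXXX: sys_exit: NR 3 = 312",
]
for nr, args, ret in pair_raw_syscalls(raw):
    print(nr, [hex(a) for a in args], ret)
```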
  9. 25 Nov, 2009 2 commits
    • tracing: Convert some sched trace events to DEFINE_EVENT and _PRINT · 75ec29ab
      Committed by Steven Rostedt
      Converting some of the scheduler trace events to use the
      TRACE_EVENT_TEMPLATE, DEFINE_EVENT and DEFINE_EVENT_PRINT helped to
      save some space:
      
      $ size kernel/sched.o-*
         text	   data	    bss	    dec	    hex	filename
        79299	   6776	   2520	  88595	  15a13	kernel/sched.o-notrace
       101941	  11896	   2584	 116421	  1c6c5	kernel/sched.o-templ
       104779	  11896	   2584	 119259	  1d1db	kernel/sched.o-trace
      
      sched.o-notrace is without any tracepoints compiled
      sched.o-templ is with this patch
      sched.o-trace is the tracepoints before this patch
      
      The trace events converted to DEFINE_EVENT:
      
      sched_wakeup, sched_wakeup_new, sched_process_free, sched_process_exit,
      and sched_stat_wait.
      
      The trace events converted to DEFINE_EVENT_PRINT:
      
      sched_stat_sleep and sched_stat_iowait.
      
      Note, since the TRACE_EVENT_TEMPLATE always uses a print, the
      sched_stat_wait print format is defined in the template and this
      template is used by sched_stat_sleep and sched_stat_iowait. But the
      latter two override the print format.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      75ec29ab
    • ext4: remove encountered_congestion trace · b4d72415
      Committed by Wu Fengguang
      It is no longer set and is scheduled to be removed.
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      b4d72415
  10. 23 Nov, 2009 3 commits
  11. 13 Nov, 2009 1 commit
  12. 15 Oct, 2009 1 commit
    • events: Harmonize event field names and print output names · 434a83c3
      Committed by Ingo Molnar
      Now that we can filter based on fields via perf record, people
      will start using filter expressions and will expect them to
      be obvious.
      
      The primary way to see which fields are available is by looking
      at the trace output, such as:
      
        gcc-18676 [000]   343.011728: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.012727: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.032692: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.033690: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.034687: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.035686: irq_handler_entry: irq=0 handler=timer
        cc1-18677 [000]   343.036684: irq_handler_entry: irq=0 handler=timer
      
      While 'irq==0' filters work, the 'handler==<x>' filter expression
      does not work:
      
        $ perf record -R -f -a -e irq:irq_handler_entry --filter handler=timer sleep 1
         Error: failed to set filter with 22 (Invalid argument)
      
      The problem is that while an 'irq' field exists and is recognized
      as a filter field - 'handler' does not exist - its name is 'name'
      in the output.
      
      To solve this, we need to synchronize the printout and the field
      names, wherever possible.
      
      In cases where the printout prints a non-field, we enclose
      that information in square brackets, such as:
      
        perf-1380  [013]   724.903505: softirq_exit: vec=9 [action=RCU]
        perf-1380  [013]   724.904482: softirq_exit: vec=1 [action=TIMER]
      
      This way users can use filter expressions more intuitively: all
      fields that show up as 'primary' (non-bracketed) information are
      filterable.
      
      This patch harmonizes the field names for all irq, bkl, power,
      sched and timer events.
      
      We might in fact think about dropping the print format bit of
      generic tracepoints altogether, and just print the fields that
      are being recorded.
      
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      434a83c3
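With the bracket convention from 434a83c3, a tool can tell filterable fields from print-only information by looking at the output alone. The Python sketch below illustrates the idea on the event payload (the part after the event name); the name `filterable_fields` is hypothetical, and the `irq=0 name=timer` sample assumes the harmonized printout uses the real field name.

```python
import re

FIELD_RE = re.compile(r"(\w+)=(\S+)")


def filterable_fields(payload):
    """Return the key=value pairs that appear outside square brackets."""
    # Drop bracketed segments such as "[action=RCU]"; under the convention
    # described above they are print-only and not real event fields.
    primary = re.sub(r"\[[^\]]*\]", "", payload)
    return dict(FIELD_RE.findall(primary))


print(filterable_fields("irq=0 name=timer"))
print(filterable_fields("vec=9 [action=RCU]"))
```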
  13. 13 Oct, 2009 1 commit
    • perf_event, x86, mce: Use TRACE_EVENT() for MCE logging · 8968f9d3
      Committed by Hidetoshi Seto
      This approach is the first baby step towards solving many of the
      structural problems the x86 MCE logging code is having today:
      
       - It has a private ring-buffer implementation that has a number
         of limitations and has been historically fragile and buggy.
      
       - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE
         specific. /dev/mcelog is not part of any larger logging
         framework and hence has remained on the fringes for many years.
      
       - The MCE logging code is still very unclean partly due to its ABI
         limitations. Fields are being reused for multiple purposes, and
         the whole message structure is limited and x86 specific to begin
         with.
      
      All in all, the x86 tree would like to move away from this private
      implementation of an event logging facility to a broader framework.
      
      By using perf events we gain the following advantages:
      
       - Multiple user-space agents can access MCE events. We can have an
         mcelog daemon running but also a system-wide tracer capturing
         important events in flight-recorder mode.
      
       - Sampling support: the kernel and the user-space call-chain of MCE
         events can be stored and analyzed as well. This way actual patterns
         of bad behavior can be matched to precisely what kind of activity
         happened in the kernel (and/or in the app) around that moment in
         time.
      
       - Coupling with other hardware and software events: the PMU can track a
         number of other anomalies - monitoring software might choose to
         monitor those plus the MCE events as well - in one coherent stream of
         events.
      
       - Discovery of MCE sources - tracepoints are enumerated and tools can
         act upon the existence (or non-existence) of various channels of MCE
         information.
      
       - Filtering support: we just subscribe to and act upon the events we
         are interested in. Then even on a per event source basis there are
         in-kernel filter expressions available that can restrict the amount
         of data that hits the event channel.
      
       - Arbitrarily deep per cpu buffering of events - we can buffer 32
         entries or we can buffer as much as we want, as long as we have
         the RAM.
      
       - An NMI-safe ring-buffer implementation - mappable to user-space.
      
       - Built-in support for timestamping of events, PID markers, CPU
         markers, etc.
      
       - A rich ABI accessible over system call interface. Per cpu, per task
         and per workload monitoring of MCE events can be done this way. The
         ABI itself has a nice, meaningful structure.
      
       - Extensible ABI: new fields can be added without breaking tooling.
         New tracepoints can be added as the hardware side evolves. There
         are various parsers that can be used.
      
       - Lots of scheduling/buffering/batching modes of operation for MCE
         events. poll() support. mmap() support. read() support. You name it.
      
       - Rich tooling support: even without any MCE specific extensions added
         the 'perf' tool today offers various views of MCE data: perf report,
         perf stat, perf trace can all be used to view logged MCE events and
         perhaps correlate them to certain user-space usage patterns. But it
         can be used directly as well, for user-space agents and policy action
         in mcelog, etc.
      
      With this we hope to achieve significant code cleanup and feature
      improvements in the MCE code, and we hope to be able to drop the
      /dev/mcelog facility in the end.
      
      This patch is just a plain dumb dump of mce_log() records to
      the tracepoints / perf events framework - a first proof of
      concept step.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8968f9d3
  14. 02 Oct, 2009 2 commits
  15. 30 Sep, 2009 4 commits
  16. 24 Sep, 2009 1 commit
    • tracing/bkl: Add bkl ftrace events · 96a2c464
      Committed by Frederic Weisbecker
      Add two events, lock_kernel() and unlock_kernel(), to trace the bkl uses.
      This opens the door for userspace tools to gather statistics about
      the callsites that use it, dependencies with other locks (by pairing
      the trace with lock events), recursive use and so on...
      
      The {__reacquire,release}_kernel_lock() events are not traced because
      these are called from schedule, thus the sched events are sufficient
      to trace them.
      
      Example of a trace:
      
      hald-addon-stor-4152  [000]   165.875501: unlock_kernel: depth: 0, fs/block_dev.c:1358 __blkdev_put()
      hald-addon-stor-4152  [000]   167.832974: lock_kernel: depth: 0, fs/block_dev.c:1167 __blkdev_get()
      
      How to get the callsites that acquire it recursively:
      
      cd /debug/tracing/events/bkl
      echo "lock_depth > 0" > filter
      
      firefox-4951  [001]   206.276967: unlock_kernel: depth: 1, fs/reiserfs/super.c:575 reiserfs_dirty_inode()
      
      You can also filter by file and/or line.
      
      v2: Use of FILTER_PTR_STRING attribute for files and lines fields to
          make them traceable.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      96a2c464
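The statistics that 96a2c464 anticipates from userspace tools could start from something like the Python sketch below, which mirrors the in-kernel `lock_depth > 0` filter on already captured trace lines and counts the recursive callsites. The regular expression is inferred from the two sample lines in the commit message, and the name `recursive_callsites` is hypothetical.

```python
import re
from collections import Counter

BKL_RE = re.compile(
    r"\b(unlock_kernel|lock_kernel): depth: (-?\d+), (\S+):(\d+) (\w+)\(\)"
)


def recursive_callsites(lines):
    """Count (file, line, function) callsites recorded with depth > 0."""
    counts = Counter()
    for line in lines:
        m = BKL_RE.search(line)
        if m and int(m.group(2)) > 0:
            counts[(m.group(3), int(m.group(4)), m.group(5))] += 1
    return counts


bkl_trace = [
    "hald-addon-stor-4152  [000]   165.875501: unlock_kernel: depth: 0, fs/block_dev.c:1358 __blkdev_put()",
    "hald-addon-stor-4152  [000]   167.832974: lock_kernel: depth: 0, fs/block_dev.c:1167 __blkdev_get()",
    "firefox-4951  [001]   206.276967: unlock_kernel: depth: 1, fs/reiserfs/super.c:575 reiserfs_dirty_inode()",
]
for site, count in recursive_callsites(bkl_trace).items():
    print(site, count)
```

Filtering in the kernel, as the commit shows, keeps unwanted records out of the ring buffer entirely; a script like this is only useful when the unfiltered trace has already been recorded.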
  17. 23 Sep, 2009 1 commit