1. 26 10月, 2010 2 次提交
    • A
      perf python scripting: Improve the syscalls-counts script · 6545aaa5
      Arnaldo Carvalho de Melo 提交于
      . Print message at script start telling how to get te summary
      . Print the syscall name
      
      Now it looks like this:
      
      [root@emilia ~]# perf trace syscall-counts
      Press control+C to stop and show the summary
      ^C
      syscall events:
      
      event                                          count
      ----------------------------------------  -----------
      read                                          102752
      open                                            1293
      close                                            878
      write                                            319
      stat                                             185
      fstat                                            149
      getdents                                         116
      mmap                                              98
      brk                                               80
      rt_sigaction                                      66
      munmap                                            42
      mprotect                                          24
      lseek                                             21
      lstat                                              7
      rt_sigprocmask                                     4
      futex                                              3
      statfs                                             3
      ioctl                                              3
      readlink                                           2
      select                                             2
      getegid                                            1
      geteuid                                            1
      getgid                                             1
      getuid                                             1
      getrlimit                                          1
      fcntl                                              1
      uname                                              1
      [root@emilia ~]#
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6545aaa5
    • A
      perf python scripting: Improve the failed-syscalls-by-pid script · 6cc73614
      Arnaldo Carvalho de Melo 提交于
      . Print message at script start telling how to get te summary
      . Print the syscall name using the audit-lib-python package, if
        installed
      . Print the errno string
      . Accept both pid (if numeric) or COMM name
      
      Now it looks like this:
      
      [root@emilia ~]# perf trace failed-syscalls-by-pid
      Press control+C to stop and show the summary
      ^C
      syscall errors:
      
      comm [pid]                           count
      ------------------------------  ----------
      
      automount [1670]
        syscall: futex
          err = ETIMEDOUT                     39
      
      irqbalance [1462]
        syscall: openat
          err = ENOENT                         4
      
      perf [7888]
        syscall: lseek
          err = ESPIPE                         1
        syscall: open
          err = ENOENT                        24
      
      perf [7889]
        syscall: ioctl
          err = EINVAL                         1
        syscall: readlink
          err = EINVAL                         2
        syscall: open
          err = ENOENT                       389
        syscall: stat
          err = ENOENT                       141
        syscall: lseek
          err = ESPIPE                         3
      [root@emilia ~]#
      
      [root@emilia ~]# perf trace failed-syscalls-by-pid 1670
      Press control+C to stop and show the summary
      ^C
      syscall errors:
      
      comm [pid]                           count
      ------------------------------  ----------
      
      automount [1670]
        syscall: futex
          err = ETIMEDOUT                      2
      [root@emilia ~]#
      [root@emilia ~]#
      [root@emilia ~]#
      [root@emilia ~]# perf trace failed-syscalls-by-pid automount
      Press control+C to stop and show the summary
      ^C
      syscall errors for automount:
      
      comm [pid]                           count
      ------------------------------  ----------
      
      automount [1669]
        syscall: futex
          err = ETIMEDOUT                      1
      
      automount [1670]
        syscall: futex
          err = ETIMEDOUT                      5
      [root@emilia ~]#
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6cc73614
  2. 24 10月, 2010 4 次提交
  3. 23 10月, 2010 1 次提交
    • A
      perf tools: Remove direct slang.h include · 8bfb5e7d
      Arnaldo Carvalho de Melo 提交于
      We wrap it in libslang.h because we need to deal with older slang release
      where HAVE_LONG_LONG is referenced as:
      
      So we need to define it.
      
      Noticed when rebuilding the perf tools on a RHEL5 machine.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8bfb5e7d
  4. 22 10月, 2010 7 次提交
    • M
      perf probe: Add basic module support · 469b9b88
      Masami Hiramatsu 提交于
      Add basic module probe support on perf probe. This introduces "--module
      <MODNAME>" option to perf probe for putting probes and showing lines and
      variables in the given module.
      
      Currently, this supports only probing on running modules.  Supporting off-line
      module probing is the next step.
      
      e.g.)
      [show lines]
       # ./perf probe --module drm -L drm_vblank_info
      <drm_vblank_info:0>
            0  int drm_vblank_info(struct seq_file *m, void *data)
            1  {
                      struct drm_info_node *node = (struct drm_info_node *) m->private
            3         struct drm_device *dev = node->minor->dev;
       ...
      [show vars]
       # ./perf probe --module drm -V drm_vblank_info:3
      Available variables at drm_vblank_info:3
              @<drm_vblank_info+20>
                      (unknown_type)  data
                      struct drm_info_node*   node
                      struct seq_file*        m
      [put a probe]
       # ./perf probe --module drm drm_vblank_info:3 node m
      Add new event:
        probe:drm_vblank_info (on drm_vblank_info:3 with node m)
      
      You can now use it on all perf tools, such as:
      
              perf record -e probe:drm_vblank_info -aR sleep 1
      [list probes]
       # ./perf probe -l
      probe:drm_vblank_info (on drm_vblank_info:3@drivers/gpu/drm/drm_info.c with ...
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20101021101341.3542.71638.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      469b9b88
    • M
      perf probe: Show accessible global variables · fb8c5a56
      Masami Hiramatsu 提交于
      Add --externs for allowing --vars to show accessible global (externally
      defined) variables from a given probe point too.
      
      This will give you a hint which globals can be accessible from the probe point.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20101021101335.3542.31003.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fb8c5a56
    • M
      perf probe: Function style fix · c82ec0a2
      Masami Hiramatsu 提交于
      Just change the order of function arguments for ease of read; moving optional
      bool flag to the last.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20101021101329.3542.51200.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c82ec0a2
    • M
      perf probe: Show accessible local variables · cf6eb489
      Masami Hiramatsu 提交于
      Add -V (--vars) option for listing accessible local variables at given probe
      point. This will help finding which local variables are available for event
      arguments.
      
      e.g.)
       # perf probe -V call_timer_fn:23
       Available variables at call_timer_fn:23
               @<run_timer_softirq+345>
                       function_type*  fn
                       int     preempt_count
                       long unsigned int       data
                       struct list_head        work_list
                       struct list_head*       head
                       struct timer_list*      timer
                       struct tvec_base*       base
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20101021101323.3542.40282.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cf6eb489
    • M
      perf probe: Support global variables · 632941c4
      Masami Hiramatsu 提交于
      Allow users to set external defined global variables as event arguments (e.g.
      jiffies).
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <20101021101316.3542.1999.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      632941c4
    • M
      perf probe: Fix local variable searching loop · 378eeaad
      Masami Hiramatsu 提交于
      Fix to check the die's address and search into the die only if it has given
      address.
      
      This will avoid finding wrong variables in wrong basic block.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <20101021101309.3542.46434.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      378eeaad
    • M
      perf probe: Fix type searching · 4046b8bb
      Masami Hiramatsu 提交于
      Fix to get the actual type die of variables by using dwarf_attr_integrate()
      which gets attribute from die even if the type die is connected by
      DW_AT_abstract_origin.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <20101021101302.3542.38549.stgit@ltc236.sdl.hitachi.co.jp>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4046b8bb
  5. 21 10月, 2010 1 次提交
    • T
      tracing: Cleanup the convoluted softirq tracepoints · f4bc6bb2
      Thomas Gleixner 提交于
      With the addition of trace_softirq_raise() the softirq tracepoint got
      even more convoluted. Why the tracepoints take two pointers to assign
      an integer is beyond my comprehension.
      
      But adding an extra case which treats the first pointer as an unsigned
      long when the second pointer is NULL including the back and forth
      type casting is just horrible.
      
      Convert the softirq tracepoints to take a single unsigned int argument
      for the softirq vector number and fix the call sites.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <alpine.LFD.2.00.1010191428560.6815@localhost6.localdomain6>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Acked-by: mathieu.desnoyers@efficios.com
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      f4bc6bb2
  6. 20 10月, 2010 1 次提交
  7. 19 10月, 2010 19 次提交
    • S
      tracing: Fix compile issue for trace_sched_wakeup.c · 7e40798f
      Steven Rostedt 提交于
      The function start_func_tracer() was incorrectly added in the
       #ifdef CONFIG_FUNCTION_TRACER condition, but is still used even
      when function tracing is not enabled.
      
      The calls to register_ftrace_function() and register_ftrace_graph()
      become nops (and their arguments are even ignored), thus there is
      no reason to hide start_func_tracer() when function tracing is
      not enabled.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      7e40798f
    • H
      [S390] hardirq: remove pointless header file includes · 3f7edb16
      Heiko Carstens 提交于
      Remove a couple of pointless header file includes.
      Fixes a compile bug caused by header file include dependencies with
      "irq: Add tracepoint to softirq_raise" within linux-next.
      Reported-by: NSachin Sant <sachinp@in.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      [ cherry-picked from the s390 tree to fix "2bf2160d: irq: Add tracepoint to softirq_raise" ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3f7edb16
    • T
      [IA64] Move local_softirq_pending() definition · 3c4ea5b4
      Tony Luck 提交于
      Ugly #include dependencies. We need to have local_softirq_pending()
      defined before it gets used in <linux/interrupt.h>. But <asm/hardirq.h>
      provides the definition *after* this #include chain:
        <linux/irq.h>
          <asm/irq.h>
            <asm/hw_irq.h>
              <linux/interrupt.h>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      [ cherry-picked from the ia64 tree to fix "2bf2160d: irq: Add tracepoint to softirq_raise" ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3c4ea5b4
    • P
      perf, powerpc: Fix power_pmu_event_init to not use event->ctx · 57fa7214
      Paul Mackerras 提交于
      Commit c3f00c70 ("perf: Separate find_get_context() from event
      initialization") changed the generic perf_event code to call
      perf_event_alloc, which calls the arch-specific event_init code,
      before looking up the context for the new event.  Unfortunately,
      power_pmu_event_init uses event->ctx->task to see whether the
      new event is a per-task event or a system-wide event, and thus
      crashes since event->ctx is NULL at the point where
      power_pmu_event_init gets called.
      
      (The reason it needs to know whether it is a per-task event is
      because there are some hardware events on Power systems which
      only count when the processor is not idle, and there are some
      fixed-function counters which count such events.  For example,
      the "run cycles" event counts cycles when the processor is not
      idle.  If the user asks to count cycles, we can use "run cycles"
      if this is a per-task event, since the processor is running when
      the task is running, by definition.  We can't use "run cycles"
      if the user asks for "cycles" on a system-wide counter.)
      
      Fortunately the information we need is in the
      event->attach_state field, so we just use that instead.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20101019055535.GA10398@drongo>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Reported-by: NAlexey Kardashevskiy <aik@au1.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      57fa7214
    • I
      Merge branch 'tip/perf/recordmcount-2' of... · 1fa41266
      Ingo Molnar 提交于
      Merge branch 'tip/perf/recordmcount-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
      1fa41266
    • S
      ftrace: Remove recursion between recordmcount and scripts/mod/empty · d7b4d6de
      Steven Rostedt 提交于
      When DYNAMIC_FTRACE is enabled and we use the C version of recordmcount,
      all objects are run through the recordmcount program to create a
      separate section that stores all the callers of mcount.
      
      The build process has a special file: scripts/mod/empty.o. This is
      built from empty.c which is literally an empty file (except for a
      single comment). This file is used to find information about the target
      elf format, like endianness and word size.
      
      The problem comes up when we need to build recordmcount. The
      build process requires that empty.o is built first. The build rules
      for empty.o will try to execute recordmcount on the empty.o file.
      We get an error that recordmcount does not exist.
      
      To avoid this recursion, the build file will skip running recordmcount
      if the file that it is building is script/mod/empty.o.
      
      [ extra comment Suggested-by: Sam Ravnborg <sam@ravnborg.org> ]
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Tested-by: NIngo Molnar <mingo@elte.hu>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: linux-kbuild@vger.kernel.org
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d7b4d6de
    • P
      jump_label: Add COND_STMT(), reducer wrappery · ebf31f50
      Peter Zijlstra 提交于
      The use of the JUMP_LABEL() construct ends up creating endless silly
      wrappers, create a higher level construct to reduce this clutter.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ebf31f50
    • P
      perf: Optimize sw events · 7e54a5a0
      Peter Zijlstra 提交于
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7e54a5a0
    • P
      perf: Use jump_labels to optimize the scheduler hooks · 82cd6def
      Peter Zijlstra 提交于
      Trades a call + conditional + ret for an unconditional jmp.
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20101014203625.501657727@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      82cd6def
    • P
      jump_label: Add atomic_t interface · 8b92538d
      Peter Zijlstra 提交于
      Add an interface to allow usage of jump_labels with atomic counters.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20101014203625.501657727@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8b92538d
    • P
      jump_label: Use more consistent naming · 3b6e901f
      Peter Zijlstra 提交于
      Now that there's still only a few users around, rename things to make
      them more consistent.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20101014203625.448565169@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3b6e901f
    • P
      perf, hw_breakpoint: Fix crash in hw_breakpoint creation · d580ff86
      Peter Zijlstra 提交于
      hw_breakpoint creation needs to account stuff per-task to ensure there
      is always sufficient hardware resources to back these things due to
      ptrace.
      
      With the perf per pmu context changes the event initialization no
      longer has access to the event context, for the simple reason that we
      need to first find the pmu (result of initialization) before we can
      find the context.
      
      This makes hw_breakpoints unhappy, because it can no longer do per
      task accounting, cure this by frobbing a task pointer in the event::hw
      bits for now...
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20101014203625.391543667@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d580ff86
    • P
      perf: Find task before event alloc · c6be5a5c
      Peter Zijlstra 提交于
      So that we can pass the task pointer to the event allocation, so that
      we can use task associated data during event initialization.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20101014203625.340789919@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c6be5a5c
    • P
      perf: Fix task refcount bugs · e7d0bc04
      Peter Zijlstra 提交于
      Currently it looks like find_lively_task_by_vpid() takes a task ref
      and relies on find_get_context() to drop it.
      
      The problem is that perf_event_create_kernel_counter() shouldn't be
      dropping task refs.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NMatt Helsley <matthltc@us.ibm.com>
      LKML-Reference: <20101014203625.278436085@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e7d0bc04
    • P
      perf: Fix group moving · 74c3337c
      Peter Zijlstra 提交于
      Matt found we trigger the WARN_ON_ONCE() in perf_group_attach() when we take
      the move_group path in perf_event_open().
      
      Since we cannot de-construct the group (we rely on it to move the events), we
      have to simply ignore the double attach. The group state is context invariant
      and doesn't need changing.
      Reported-by: NMatt Fleming <matt@console-pimps.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1287135757.29097.1368.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      74c3337c
    • P
      irq_work: Add generic hardirq context callbacks · e360adbe
      Peter Zijlstra 提交于
      Provide a mechanism that allows running code in IRQ context. It is
      most useful for NMI code that needs to interact with the rest of the
      system -- like wakeup a task to drain buffers.
      
      Perf currently has such a mechanism, so extract that and provide it as
      a generic feature, independent of perf so that others may also
      benefit.
      
      The IRQ context callback is generated through self-IPIs where
      possible, or on architectures like powerpc the decrementer (the
      built-in timer facility) is set to generate an interrupt immediately.
      
      Architectures that don't have anything like this get to do with a
      callback from the timer tick. These architectures can call
      irq_work_run() at the tail of any IRQ handlers that might enqueue such
      work (like the perf IRQ handler) to avoid undue latencies in
      processing the work.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NKyle McMartin <kyle@mcmartin.ca>
      Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      [ various fixes ]
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e360adbe
    • S
      perf_events: Fix transaction recovery in group_sched_in() · 8e5fc1a7
      Stephane Eranian 提交于
      The group_sched_in() function uses a transactional approach to schedule
      a group of events. In a group, either all events can be scheduled or
      none are. To schedule each event in, the function calls event_sched_in().
      In case of error, event_sched_out() is called on each event in the group.
      
      The problem is that event_sched_out() does not completely cancel the
      effects of event_sched_in(). Furthermore event_sched_out() changes the
      state of the event as if it had run which is not true is this particular
      case.
      
      Those inconsistencies impact time tracking fields and may lead to events
      in a group not all reporting the same time_enabled and time_running values.
      This is demonstrated with the example below:
      
      $ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
      1946101 unhalted_core_cycles (32.85% scaling, ena=829181, run=556827)
        11423 baclears (32.85% scaling, ena=829181, run=556827)
         7671 baclears (0.00% scaling, ena=556827, run=556827)
      
      2250443 unhalted_core_cycles (57.83% scaling, ena=962822, run=405995)
        11705 baclears (57.83% scaling, ena=962822, run=405995)
        11705 baclears (57.83% scaling, ena=962822, run=405995)
      
      Notice that in the first group, the last baclears event does not
      report the same timings as its siblings.
      
      This issue comes from the fact that tstamp_stopped is updated
      by event_sched_out() as if the event had actually run.
      
      To solve the issue, we must ensure that, in case of error, there is
      no change in the event state whatsoever. That means timings must
      remain as they were when entering group_sched_in().
      
      To do this we defer updating tstamp_running until we know the
      transaction succeeded. Therefore, we have split event_sched_in()
      in two parts separating the update to tstamp_running.
      
      Similarly, in case of error, we do not want to update tstamp_stopped.
      Therefore, we have split event_sched_out() in two parts separating
      the update to tstamp_stopped.
      
      With this patch, we now get the following output:
      
      $ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
      2492050 unhalted_core_cycles (71.75% scaling, ena=1093330, run=308841)
        11243 baclears (71.75% scaling, ena=1093330, run=308841)
        11243 baclears (71.75% scaling, ena=1093330, run=308841)
      
      1852746 unhalted_core_cycles (0.00% scaling, ena=784489, run=784489)
         9253 baclears (0.00% scaling, ena=784489, run=784489)
         9253 baclears (0.00% scaling, ena=784489, run=784489)
      
      Note that the uneven timing between groups is a side effect of
      the process spending most of its time sleeping, i.e., not enough
      event rotations (but that's a separate issue).
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4cb86b4c.41e9d80a.44e9.3e19@mx.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8e5fc1a7
    • S
      perf_events: Fix bogus AMD64 generic TLB events · ba0cef3d
      Stephane Eranian 提交于
      PERF_COUNT_HW_CACHE_DTLB:READ:MISS had a bogus umask value of 0 which
      counts nothing. Needed to be 0x7 (to count all possibilities).
      
      PERF_COUNT_HW_CACHE_ITLB:READ:MISS had a bogus umask value of 0 which
      counts nothing. Needed to be 0x3 (to count all possibilities).
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: <stable@kernel.org> # as far back as it applies
      LKML-Reference: <4cb85478.41e9d80a.44e2.3f00@mx.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ba0cef3d
    • S
      perf_events: Fix bogus context time tracking · c530ccd9
      Stephane Eranian 提交于
      You can only call update_context_time() when the context
      is active, i.e., the thread it is attached to is still running.
      
      However, perf_event_read() can be called even when the context
      is inactive, e.g., user read() the counters. The call to
      update_context_time() must be conditioned on the status of
      the context, otherwise, bogus time_enabled, time_running may
      be returned. Here is an example on AMD64. The task program
      is an example from libpfm4. The -p prints deltas every 1s.
      
      $ task -p -e cpu_clk_unhalted sleep 5
          2,266,610 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
      	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
      	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
      	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
      	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
      5,242,358,071 cpu_clk_unhalted (99.95% scaling, ena=5,000,359,984, run=2,319,270)
      
      Whereas if you don't read deltas, e.g., no call to perf_event_read() until
      the process terminates:
      
      $ task -e cpu_clk_unhalted sleep 5
          2,497,783 cpu_clk_unhalted (0.00% scaling, ena=2,376,899, run=2,376,899)
      
      Notice that time_enable, time_running are bogus in the first example
      causing bogus scaling.
      
      This patch fixes the problem, by conditionally calling update_context_time()
      in perf_event_read().
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@kernel.org
      LKML-Reference: <4cb856dc.51edd80a.5ae0.38fb@mx.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c530ccd9
  8. 18 10月, 2010 5 次提交
新手
引导
客服 返回
顶部