1. 19 1月, 2011 2 次提交
    • O
      perf: Validate cpu early in perf_event_alloc() · 66832eb4
      Oleg Nesterov 提交于
      Starting from perf_event_alloc()->perf_init_event(), the kernel
      assumes that event->cpu is either -1 or the valid CPU number.
      
      Change perf_event_alloc() to validate this argument early. This
      also means we can remove the similar check in
      find_get_context().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: gregkh@suse.de
      Cc: stable@kernel.org
      LKML-Reference: <20110118161032.GC693@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      66832eb4
    • O
      perf: Find_get_context: fix the per-cpu-counter check · 22a4ec72
      Oleg Nesterov 提交于
      If task == NULL, find_get_context() should always check that cpu
      is correct.
      
      Afaics, the bug was introduced by 38a81da2 "perf events: Clean
      up pid passing", but even before that commit "&& cpu != -1" was
      not exactly right, -ESRCH from find_task_by_vpid() is not
      accurate.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: gregkh@suse.de
      Cc: stable@kernel.org
      LKML-Reference: <20110118161008.GB693@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      22a4ec72
  2. 18 1月, 2011 3 次提交
    • P
      perf: Fix contexted inheritance · c5ed5145
      Peter Zijlstra 提交于
      Linus reported that the RCU lockdep annotation bits triggered for this
      rcu_dereference() because we're not holding rcu_read_lock().
      
      Going over the code I cannot convince myself its correct:
      
       - holding a ref on the parent_ctx, doesn't avoid it being uncloned
         concurrently (as the comment says), so we can race with a free.
      
       - holding parent_ctx->mutex doesn't avoid the above free from taking
         place either, it would at best avoid parent_ctx from being freed.
      
      I.e. the warning is correct. To fix the bug, serialize against the
      unclone_ctx() call by extending the reach of the parent_ctx->lock.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c5ed5145
    • A
      perf tools: Fix tracepoint id to string perf.data header table · ad7f4e3f
      Arnaldo Carvalho de Melo 提交于
      It was broken by f006d25a that passed just the event name, not the complete
      sys:event that it expected to open the /sys/.../sys/sys:event/id file to get
      the id.
      
      Fix it by moving it to after parse_events in cmd_record, as at that point
      we can just traverse the evsel_list and use evsel->attr.config +
      event_name(evsel) instead of re-opening the /id file.
      Reported-by: NFranck Bui-Huu <vagabon.xyz@gmail.com>
      Cc: Franck Bui-Huu <vagabon.xyz@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Han Pingtian <phan@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20110117202801.GG2085@ghostprotocols.net>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ad7f4e3f
    • A
      perf tools: Fix handling of wildcards in tracepoint event selectors · dd9a9ad5
      Arnaldo Carvalho de Melo 提交于
      It wasn't accounting the ':' when consuming bytes in the the event
      selector string, so parse_events() would fail in this test:
      
                      if (!(*str == 0 || *str == ',' || isspace(*str)))
                              return -1;
      
      as *str would be pointing to '*', the last character in the '-e' arg in:
      
      $ perf record -q -a -D -e sched:sched_* | perf script -i - -s perf-script.py
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dd9a9ad5
  3. 17 1月, 2011 1 次提交
    • A
      powerpc: perf: Fix frequency calculation for overflowing counters · 4bca770e
      Anton Blanchard 提交于
      When profiling a benchmark that is almost 100% userspace, I noticed some wildly
      inaccurate profiles that showed almost all time spent in the kernel.
      
      Closer examination shows we were programming a tiny number of cycles into the
      PMU after each overflow (about ~200 away from the next overflow). This gets us
      stuck in a loop which we eventually break out of by throttling the PMU (there
      are regular throttle/unthrottle events in the log).
      
      It looks like we aren't setting event->hw.last_period to something same and the
      frequency to period calculations in perf are going haywire.
      
      With the following patch we find the correct period after a few interrupts and
      stay there. I also see no more throttle events.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      LKML-Reference: <20110117161742.5feb3761@kryten>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4bca770e
  4. 15 1月, 2011 3 次提交
    • I
      Merge branch 'tip/perf/urgent' of... · 7c46d8da
      Ingo Molnar 提交于
      Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
      7c46d8da
    • L
      tracing: Remove syscall_exit_fields · 7f85803a
      Lai Jiangshan 提交于
      There is no need for syscall_exit_fields as the syscall
      exit event class can already host the fields in its structure,
      like most other trace events do by default. Use that
      default behavior instead.
      
      Following this scheme, we don't need anymore to override the
      get_fields() callback of the syscall exit event class either.
      
      Hence both syscall_exit_fields and syscall_get_exit_fields() can
      be removed.
      
      Also changed some indentation to keep the following under 80
      characters:
      
      ".fields		= LIST_HEAD_INIT(event_class_syscall_exit.fields),"
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4D301C0E.8090408@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      7f85803a
    • S
      tracing: Only process module tracepoints once · c94fbe1d
      Steven Rostedt 提交于
      The commit:
      
       9f987b3141f086de27832514aad9f50a53f754
       tracing: Include module.h in define_trace.h
      
      only solved half the problem. If the trace/events/module.h header is
      included at the time of define_trace.h (or in ftrace.h within it),
      the module.h TRACE_SYSTEM will override the current TRACE_SYSTEM
      macro.
      
      Since define_trace.h is included when CREATE_TRACE_POINTS is set,
      and the first thing it does is to #undef CREATE_TRACE_POINTS,
      by placing the module.h TRACE_SYSTEM inside a
       #ifdef CREATE_TRACE_POINTS
      we can prevent it from overriding the TRACE_SYSTEM that is
      being processed, and still process the module.h tracepoints
      when the module code defines CREATE_TRACE_POINTS and includes
      the trace/events/module.h header.
      
      As with commit 9f987b3141, this is only an issue if module.h
      is not included before the trace/events/<event>.h file is
      included, which (luckily) has not happened yet.
      Reported-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c94fbe1d
  5. 13 1月, 2011 2 次提交
    • K
      perf record: Add "nodelay" mode, disabled by default · acac03fa
      Kirill Smelkov 提交于
      Sometimes there is a need to use perf in "live-log" mode. The problem
      is, for seldom events, actual info output is largely delayed because
      perf-record reads sample data in whole pages.
      
      So for such scenarious, add flag for perf-record to go in "nodelay"
      mode. To track e.g. what's going on in icmp_rcv while ping is running
      Use it with something like this:
      
      (1) $ perf probe -L icmp_rcv | grep -U8 '^ *43\>'
                                          goto error;
                          }
               38         if (!pskb_pull(skb, sizeof(*icmph)))
                                  goto error;
                          icmph = icmp_hdr(skb);
      
               43         ICMPMSGIN_INC_STATS_BH(net, icmph->type);
                          /*
                           *      18 is the highest 'known' ICMP type. Anything else is a mystery
                           *
                           *      RFC 1122: 3.2.2  Unknown ICMP messages types MUST be silently
                           *                discarded.
                           */
               50         if (icmph->type > NR_ICMP_TYPES)
                                  goto error;
      
          $ perf probe icmp_rcv:43 'type=icmph->type'
      
      (2) $ cat trace-icmp.py
          [...]
          def trace_begin():
                  print "in trace_begin"
      
          def trace_end():
                  print "in trace_end"
      
          def probe__icmp_rcv(event_name, context, common_cpu,
                  common_secs, common_nsecs, common_pid, common_comm,
                  __probe_ip, type):
                          print_header(event_name, common_cpu, common_secs, common_nsecs,
                                  common_pid, common_comm)
      
                          print "__probe_ip=%u, type=%u\n" % \
                          (__probe_ip, type),
          [...]
      
      (3) $ perf record -a -D -e probe:icmp_rcv -o - | \
            perf script -i - -s trace-icmp.py
      
      Thanks to Peter Zijlstra for pointing how to do it.
      
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20110112140613.GA11698@tugrik.mns.mnsspb.ru>
      Signed-off-by: NKirill Smelkov <kirr@mns.spb.ru>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      acac03fa
    • S
      perf sched: Fix list of events, dropping unsupported ':r' modifier · 9710118b
      Stephane Eranian 提交于
      Looks to me like the :r modifier is not supported anymore, so remove it from
      the list of events.
      
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <AANLkTim=jawJyBj0iFd0r4-LCKzvjFW+NddzJMD5GUB9@mail.gmail.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9710118b
  6. 12 1月, 2011 2 次提交
  7. 11 1月, 2011 4 次提交
    • A
      perf evsel: Fix order of event list deletion · bd3bfe9e
      Arnaldo Carvalho de Melo 提交于
      We need to defer calling perf_evsel_list__delete() till after atexit
      registered routines, because we need to traverse the events being
      recorded at that time at least on 'perf record'.
      
      This fixes the problem reported by Thomas Renninger where cmd_record
      called by cmd_timechart would not write the tracing data to the perf.data
      file header because the evsel_list at atexit (control+C on 'perf timechart
      record') time would be empty, being already deleted by run_builtin(),
      and thus 'perf timechart' when trying to process such perf.data file would
      die with:
      
      "no trace data in the file"
      
      Problem introduced in 70d544d0.
      Reported-by: NThomas Renninger <trenn@suse.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bd3bfe9e
    • A
      perf session: Fix infinite loop in __perf_session__process_events · 3d03e2ea
      Arnaldo Carvalho de Melo 提交于
      In this if statement:
      
              if (head + event->header.size >= mmap_size) {
                      if (mmaps[map_idx]) {
                              munmap(mmaps[map_idx], mmap_size);
                              mmaps[map_idx] = NULL;
                      }
      
                      page_offset = page_size * (head / page_size);
                      file_offset += page_offset;
                      head -= page_offset;
                      goto remap;
              }
      
      With, for instance, these values:
      
      head=2992
      event->header.size=48
      mmap_size=3040
      
      We end up endlessly looping back to remap. Off by one.
      
      Problem introduced in 55b4462.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Reported-by: NDavid Ahern <daahern@cisco.com>
      Bisected-by: NDavid Ahern <daahern@cisco.com>
      Tested-by: NDavid Ahern <daahern@cisco.com>
      Cc: David Ahern <daahern@cisco.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3d03e2ea
    • A
      perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1) · 0252208e
      Arnaldo Carvalho de Melo 提交于
      And a test for it:
      
      [acme@felicio linux]$ perf test
       1: vmlinux symtab matches kallsyms: Ok
       2: detect open syscall event: Ok
       3: detect open syscall event on all cpus: Ok
      [acme@felicio linux]$
      
      Translating C the test does:
      
      1. generates different number of open syscalls on each CPU
         by using sched_setaffinity
      2. Verifies that the expected number of events is generated
         on each CPU
      
      It works as expected.
      
      LKML-Reference: <new-submission>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0252208e
    • J
      perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() fail · 12f7e036
      Jiri Pirko 提交于
      on ppc64:
      /usr/include/bits/local_lim.h:#define PTHREAD_STACK_MIN	131072
      
      therefore following set of commands:
      
      gives:
      perf.2.6.37test: builtin-sched.c:493: create_tasks: Assertion `!(err)' failed.
      
      So make sure we do not set stack size lower than PTHREAD_STACK_MIN.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <20110110160417.GB2685@psychotron.brq.redhat.com>
      Signed-off-by: NJiri Pirko <jpirko@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      12f7e036
  8. 10 1月, 2011 3 次提交
    • A
      perf tools: Emit clearer message for sys_perf_event_open ENOENT return · aa7bc7ef
      Arnaldo Carvalho de Melo 提交于
      Improve sys_perf_event_open ENOENT return handling in top and record, just
      like 5a3446bc does for stat.
      
      Cc: David Ahern <daahern@cisco.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      aa7bc7ef
    • D
      perf stat: better error message for unsupported events · 5a3446bc
      David Ahern 提交于
      For unsupported events (e.g., H/W events when running in a VM)
      perf stat currently fails with the error message:
      
            Error: open_counter returned with 2 (No such file or directory).
          /bin/dmesg may provide additional information.
      
            Fatal: Not all events could be opened.
      
      dmesg is of no help and it is not clear as to why it fails to
      open the counter. This patch changes the error message to
      
            Error: cache-misses event is not supported.
            Fatal: Not all events could be opened.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: a.p.zijlstra@chello.nl
      LPU-Reference: <1294597272-17335-1-git-send-email-daahern@cisco.com>
      Signed-off-by: NDavid Ahern <daahern@cisco.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5a3446bc
    • A
      perf sched: Fix allocation result check · e462dc55
      Arnaldo Carvalho de Melo 提交于
      Bug introduced in ce47dc56.
      Reported-by: NMike Galbraith <efault@gmx.de>
      Cc: Chris Samuel <chris@csamuel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e462dc55
  9. 09 1月, 2011 2 次提交
    • I
      Merge branch 'tip/perf/core' of... · 4385428a
      Ingo Molnar 提交于
      Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
      4385428a
    • C
      perf, x86: P4 PMU - Fix unflagged overflows handling · 047a3772
      Cyrill Gorcunov 提交于
      Don found that P4 PMU reads CCCR register instead of counter
      itself (in attempt to catch unflagged event) this makes P4
      NMI handler to consume all NMIs it observes. So the other
      NMI users such as kgdb simply have no chance to get NMI
      on their hands.
      
      Side note: at moment there is no way to run nmi-watchdog
      together with perf tool. This is because both 'perf top' and
      nmi-watchdog use same event. So while nmi-watchdog reserves
      one event/counter for own needs there is no room for perf tool
      left (there is a way to disable nmi-watchdog on boot of course).
      
      Ming has tested this patch with the following results
      
       | 1. watchdog disabled
       |
       | kgdb tests on boot OK
       | perf works OK
       |
       | 2. watchdog enabled, without patch perf-x86-p4-nmi-4
       |
       | kgdb tests on boot hang
       |
       | 3. watchdog enabled, without patch perf-x86-p4-nmi-4 and do not run kgdb
       | tests on boot
       |
       | "perf top" partialy works
       |   cpu-cycles            no
       |   instructions          yes
       |   cache-references      no
       |   cache-misses          no
       |   branch-instructions   no
       |   branch-misses         yes
       |   bus-cycles            no
       |
       | 4. watchdog enabled, with patch perf-x86-p4-nmi-4 applied
       |
       | kgdb tests on boot OK
       | perf does not work, NMI "Dazed and confused" messages show up
       |
      
      Which means we still have problems with p4 box due to 'unknown'
      nmi happens but at least it should fix kgdb test cases.
      Reported-by: NJason Wessel <jason.wessel@windriver.com>
      Reported-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: NDon Zickus <dzickus@redhat.com>
      Acked-by: NLin Ming <ming.m.lin@intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4D275E7E.3040903@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      047a3772
  10. 08 1月, 2011 8 次提交
  11. 07 1月, 2011 10 次提交