1. 23 9月, 2012 5 次提交
  2. 03 7月, 2012 1 次提交
    • D
      perf kvm: Fix segfault with report and mixed guestmount use · 7ed97ad4
      David Ahern 提交于
      Using the guestmount option on record:
          $ perf kvm --guest --host --guestmount=/tmp/guest-mount record -ag
      
      But not the subsequent report:
          $ perf kvm report
      
      causes a SEGFAULT in the usual place:
      (gdb) bt
      0  0x0000000000470356 in machine__mmap_name (self=0x0, bf=0x7fffffffbdb0 " z\370\367\377\177", size=
          4096) at util/map.c:712
      1  0x00000000004453e8 in perf_event__process_kernel_mmap (tool=0x7fffffffde10, event=0x7ffff7f87e38,
          machine=0x0) at util/event.c:550
      2  0x00000000004458c9 in perf_event__process_mmap (tool=0x7fffffffde10, event=0x7ffff7f87e38, sample=
          0x7fffffffd2a0, machine=0x0) at util/event.c:656
      3  0x00000000004733e0 in perf_session_deliver_event (session=0x91aca0, event=0x7ffff7f87e38, sample=
          0x7fffffffd2a0, tool=0x7fffffffde10, file_offset=7736) at util/session.c:979
      ...
      
      The MMAP events in this case already contain the full path to the
      module. No need to require it for the report path to.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1341241977-71535-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7ed97ad4
  3. 02 7月, 2012 2 次提交
    • D
      perf kvm: Fix regression with guest machine creation · 207b5792
      David Ahern 提交于
      Commit 743eb868 reworked when the
      machines were created. Prior to this commit guest machines could be
      created in perf_event__process_kernel_mmap() while processing kernel
      MMAP events. This commit assumes that the machines exist by the time
      perf_session_deliver_event is called (e.g., during processing of build
      id events) - which is not always correct.
      
      One example is the use of default guest args (--guestkallsyms and
      --guestmodules) for short times where no samples hit within a guest
      module. For this case no build id is added to the file header. No build
      id == no machine created. That leads to the next example -- the use of
      no-buildid (-B) on the record for all perf-kvm invocations. In both
      cases perf report dies with a SEGFAULT of the form:
      
      (gdb) bt
      0  0x000000000046dd7b in machine__mmap_name (self=0x0, bf=0x7fffffffbd20 "q\021", size=4096) at util/map.c:715
      1  0x0000000000444161 in perf_event__process_kernel_mmap (tool=0x7fffffffdd80, event=0x7ffff7fb4120, machine=0x0) at util/event.c:562
      2  0x0000000000444642 in perf_event__process_mmap (tool=0x7fffffffdd80, event=0x7ffff7fb4120, sample=0x7fffffffd210, machine=0x0)
          at util/event.c:668
      3  0x0000000000470e0b in perf_session_deliver_event (session=0x915ca0, event=0x7ffff7fb4120, sample=0x7fffffffd210, tool=0x7fffffffdd80,
          file_offset=8480) at util/session.c:979
      4  0x000000000047032e in flush_sample_queue (s=0x915ca0, tool=0x7fffffffdd80) at util/session.c:679
      5  0x0000000000471c8d in __perf_session__process_events (session=0x915ca0, data_offset=400, data_size=150448, file_size=150848, tool=
          0x7fffffffdd80) at util/session.c:1363
      6  0x0000000000471d42 in perf_session__process_events (self=0x915ca0, tool=0x7fffffffdd80) at util/session.c:1379
      7  0x000000000042484a in __cmd_report (rep=0x7fffffffdd80) at builtin-report.c:368
      8  0x0000000000425bf1 in cmd_report (argc=0, argv=0x915b00, prefix=0x0) at builtin-report.c:756
      9  0x0000000000438505 in __cmd_report (argc=4, argv=0x7fffffffe260) at builtin-kvm.c:84
      10 0x000000000043882a in cmd_kvm (argc=4, argv=0x7fffffffe260, prefix=0x0) at builtin-kvm.c:131
      11 0x00000000004152cd in run_builtin (p=0x7a54e8, argc=9, argv=0x7fffffffe260) at perf.c:273
      12 0x00000000004154c7 in handle_internal_command (argc=9, argv=0x7fffffffe260) at perf.c:345
      13 0x0000000000415613 in run_argv (argcp=0x7fffffffe14c, argv=0x7fffffffe140) at perf.c:389
      14 0x0000000000415899 in main (argc=9, argv=0x7fffffffe260) at perf.c:487
      
      Fix by allowing the machine to be created in perf_session_deliver_event.
      
      Tested with --guestmount option and default guest args, with and without
      -B arg on record for both and for short (10 seconds) and long (10
      minutes) windows.
      Reported-by: NPradeep Kumar Surisetty <psuriset@linux.vnet.ibm.com>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pradeep Kumar Surisetty <psuriset@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1341180697-64515-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      207b5792
    • D
      perf script: Fix format regression due to libtraceevent merge · 76a8349d
      David Ahern 提交于
      Consider the commands:
          perf record -e sched:sched_switch -fo /tmp/perf.data  -a -- sleep 1
          perf script -i /tmp/perf.data
      
      In v3.4 the output has the form (lines wrapped here)
          perf 29214 [005] 821043.582596: sched_switch:
      prev_comm=perf prev_pid=29214 prev_prio=120
      prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
      
      In 3.5 that same line has become:
          perf 29214 [005] 821043.582596: sched_switch:
      <...>-29214 [005]     0.000000000: sched_switch:
      prev_comm=perf prev_pid=29214 prev_prio=120
      prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
      
      Note the duplicates in the output -- pid, cpu, event name. With
      this patch the v3.4 output is restored:
          perf 29214 [005] 821043.582596: sched_switch:
      prev_comm=perf prev_pid=29214 prev_prio=120
      prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
      
      v3:
      Remove that pesky newline too. Output now matches v3.4 (pre-libtracevent).
      
      v2:
      Change print_trace_event function local to perf per Steve's comments.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1339698977-68962-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      76a8349d
  4. 12 6月, 2012 1 次提交
    • A
      perf tools: Fix synthesizing tracepoint names from the perf.data headers · cb9dd49e
      Arnaldo Carvalho de Melo 提交于
      We need to use the per event info snapshoted at record time to
      synthesize the events name, so do it just after reading the perf.data
      headers, when we already processed the /sys events data, otherwise we'll
      end up using the local /sys that only by sheer luck will have the same
      tracepoint ID -> real event association.
      
      Example:
      
        # uname -a
        Linux felicio.ghostprotocols.net 3.4.0-rc5+ #1 SMP Sat May 19 15:27:11 BRT 2012 x86_64 x86_64 x86_64 GNU/Linux
        # perf record -e sched:sched_switch usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.015 MB perf.data (~648 samples) ]
        # cat /t/events/sched/sched_switch/id
        279
        # perf evlist -v
        sched:sched_switch: sample_freq=1, type: 2, config: 279, size: 80, sample_type: 1159, read_format: 7, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      So on the above machine the sched:sched_switch has tracepoint id 279, but on
      the machine were we'll analyse it it has a different id:
      
        $ cat /t/events/sched/sched_switch/id
        56
        $ perf evlist -i /tmp/perf.data
        kmem:mm_balancedirty_writeout
        $ cat /t/events/kmem/mm_balancedirty_writeout/id
        279
      
      With this fix:
      
        $ perf evlist -i /tmp/perf.data
        sched:sched_switch
      Reported-by: NDmitry Antipov <dmitry.antipov@linaro.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-auwks8fpuhmrdpiefs55o5oz@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cb9dd49e
  5. 11 6月, 2012 2 次提交
    • S
      perf stat: Fix default output file · fc3e4d07
      Stephane Eranian 提交于
      The following commit:
      
      commit 56f3bae7
      Author: Jim Cromie <jim.cromie@gmail.com>
      Date:   Wed Sep 7 17:14:00 2011 -0600
      
          perf stat: Add --log-fd <N> option to redirect stderr elsewhere
      
      introduced a bug in the way perf stat outputs the results by default,
      i.e., without the --log-fd or --output option. It would default to
      writing to file descriptor 0, i.e., stdin. Writing to stdin is allowed
      and is equivalent to writing to stdout. However, there is a major
      difference for any script that was already capturing the output of perf
      stat via redirection:
      
          perf stat >/tmp/log .... or perf stat 2>/tmp/log ....
      
      They would not capture anything anymore. They would have to do:
          perf stat 0>/tmp/log ...
      
      This breaks compatibility with existing scripts and does not look very
      natural.
      
      This patch fixes the problem by looking at output_fd only when it was
      modified by user (> 0). It also checks that the value if positive.
      Passing --log-fd 0 is ignored.
      
      I would also argue that defaulting to stderr for the results is not the
      right thing to do, though this patch does not address this specific
      issue.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jim Cromie <jim.cromie@gmail.com>
      Link: http://lkml.kernel.org/r/20120515111111.GA9870@quadSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fc3e4d07
    • D
      perf tools: Fix endianity swapping for adds_features bitmask · 80c0120a
      David Ahern 提交于
      Based on Jiri's latest attempt:
      https://lkml.org/lkml/2012/5/16/61
      
      Basically, adds_features should be byte swapped assuming unsigned
      longs are either 8-bytes (u64) or 4-bytes (u32).
      
          Fixes 32-bit ppc dumping 64-bit x86 feature data:
           ========
           captured on: Sun May 20 19:23:23 2012
           hostname : nxos-vdc-dev3
           os release : 3.4.0-rc7+
           perf version : 3.4.rc4.137.g978da3
           arch : x86_64
           nrcpus online : 16
           nrcpus avail : 16
           cpudesc : Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
           cpuid : GenuineIntel,6,26,5
           total memory : 24680324 kB
          ...
      
      Verified 64-bit x86 can still dump feature data for 32-bit ppc.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Reviewed-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/4FBBB539.5010805@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      80c0120a
  6. 09 6月, 2012 1 次提交
  7. 04 6月, 2012 2 次提交
    • L
      tools/power turbostat: fix IVB support · 650a37f3
      Len Brown 提交于
      Initial IVB support went into turbostat in Linux-3.1:
      553575f1
      (tools turbostat: recognize and run properly on IVB)
      
      However, when running on IVB, turbostat would fail
      to report the new couters added with SNB, c7, pc2 and pc7.
      So in scenarios where these counters are non-zero on IVB,
      turbostat would report erroneous residencey results.
      
      In particular c7 time would be added to c1 time,
      since c1 time is calculated as "that which is left over".
      
      Also, turbostat reports MHz capabilities when passed
      the "-v" option, and it would incorrectly report 133MHz
      bclk instead of 100MHz bclk for IVB, which would inflate
      GHz reported with that option.
      
      This patch is a backport of a fix already included in turbostat v2.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      650a37f3
    • L
      tools/power turbostat: fix un-intended affinity of forked program · d15cf7c1
      Len Brown 提交于
      Linux 3.4 included a modification to turbostat to
      lower cross-call overhead by using scheduler affinity:
      
      15aaa346
      (tools turbostat: reduce measurement overhead due to IPIs)
      
      In the use-case where turbostat forks a child program,
      that change had the un-intended side-effect of binding
      the child to the last cpu in the system.
      
      This change removed the binding before forking the child.
      
      This is a back-port of a fix already included in turbostat v2.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      d15cf7c1
  8. 01 6月, 2012 3 次提交
    • C
      syscalls, x86: add __NR_kcmp syscall · d97b46a6
      Cyrill Gorcunov 提交于
      While doing the checkpoint-restore in the user space one need to determine
      whether various kernel objects (like mm_struct-s of file_struct-s) are
      shared between tasks and restore this state.
      
      The 2nd step can be solved by using appropriate CLONE_ flags and the
      unshare syscall, while there's currently no ways for solving the 1st one.
      
      One of the ways for checking whether two tasks share e.g.  mm_struct is to
      provide some mm_struct ID of a task to its proc file, but showing such
      info considered to be not that good for security reasons.
      
      Thus after some debates we end up in conclusion that using that named
      'comparison' syscall might be the best candidate.  So here is it --
      __NR_kcmp.
      
      It takes up to 5 arguments - the pids of the two tasks (which
      characteristics should be compared), the comparison type and (in case of
      comparison of files) two file descriptors.
      
      Lookups for pids are done in the caller's PID namespace only.
      
      At moment only x86 is supported and tested.
      
      [akpm@linux-foundation.org: fix up selftests, warnings]
      [akpm@linux-foundation.org: include errno.h]
      [akpm@linux-foundation.org: tweak comment text]
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Vasiliy Kulikov <segoon@openwall.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Valdis.Kletnieks@vt.edu
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d97b46a6
    • D
      tools/selftests: add mq_perf_tests · 7820b071
      Doug Ledford 提交于
      Add the mq_perf_tests tool I used when creating my mq performance patch.
      Also add a local .gitignore to keep the binaries from showing up in git
      status output.
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7820b071
    • D
      selftests: add mq_open_tests · 50069a58
      Doug Ledford 提交于
      Add a directory to house POSIX message queue subsystem specific tests.
      Add first test which checks the operation of mq_open() under various
      corner conditions.
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Joe Korty <joe.korty@ccur.com>
      Cc: Amerigo Wang <amwang@redhat.com>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      50069a58
  9. 31 5月, 2012 14 次提交
    • S
      perf uprobes: Remove unnecessary check before strlist__delete · a23c4dc4
      Srikar Dronamraju 提交于
      Since strlist__delete() itself checks, the additional check before
      calling strlist__delete() is redundant.
      
      No Functional change.
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20120531114643.23691.38666.sendpatchset@srdronam.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a23c4dc4
    • S
      perf symbols: Check for valid dso before creating map · 378474e4
      Srikar Dronamraju 提交于
      dso__new() can return NULL. Hence verify dso before creating a new map.
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20120531114656.23691.54223.sendpatchset@srdronam.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      378474e4
    • J
      perf evsel: Fix 32 bit values endianity swap for sample_id_all header · 37073f9e
      Jiri Olsa 提交于
      We swap the sample_id_all header by u64 pointers. Some members of the
      header happen to be 32 bit values. We need to handle them separatelly.
      
      Together with other endianity patches, this change fixies perf report
      discrepancies on origin and target systems as described in test 1 below,
      e.g. following perf report diff:
      
      ...
            0.12%               ps  [kernel.kallsyms]    [k] clear_page
      -     0.12%              awk  bash                 [.] alloc_word_desc
      +     0.12%              awk  bash                 [.] yyparse
            0.11%   beah-rhts-task  libpython2.6.so.1.0  [.] 0x5560e
            0.10%             perf  libc-2.12.so         [.] __ctype_toupper_loc
      -     0.09%  rhts-test-runne  bash                 [.] maybe_make_export_env
      +     0.09%  rhts-test-runne  bash                 [.] 0x385a0
            0.09%               ps  [kernel.kallsyms]    [k] page_fault
      ...
      
      Note, running following to test perf endianity handling:
      test 1)
        - origin system:
          # perf record -a -- sleep 10 (any perf record will do)
          # perf report > report.origin
          # perf archive perf.data
      
        - copy the perf.data, report.origin and perf.data.tar.bz2
          to a target system and run:
          # tar xjvf perf.data.tar.bz2 -C ~/.debug
          # perf report > report.target
          # diff -u report.origin report.target
      
        - the diff should produce no output
          (besides some white space stuff and possibly different
           date/TZ output)
      
      test 2)
        - origin system:
          # perf record -ag -fo /tmp/perf.data -- sleep 1
        - mount origin system root to the target system on /mnt/origin
        - target system:
          # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
           --kallsyms /mnt/origin/proc/kallsyms
        - complete perf.data header is displayed
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338380624-7443-4-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      37073f9e
    • J
      perf session: Handle endianity swap on sample_id_all header data · 268fb20f
      Jiri Olsa 提交于
      Adding endianity swapping for event header attached via sample_id_all.
      
      Currently we dont do that and it's causing wrong data to be read when
      running report on architecture with different endianity than the record.
      
      The perf is currently able to process 32-bit PPC samples on 32-bit
      and 64-bit x86.
      
      Together with other endianity patches, this change fixies perf report
      discrepancies on origin and target systems as described in test 1
      below, e.g. following perf report diff:
      
      ...
            0.12%               ps  [kernel.kallsyms]    [k] clear_page
      -     0.12%              awk  bash                 [.] alloc_word_desc
      +     0.12%              awk  bash                 [.] yyparse
            0.11%   beah-rhts-task  libpython2.6.so.1.0  [.] 0x5560e
            0.10%             perf  libc-2.12.so         [.] __ctype_toupper_loc
      -     0.09%  rhts-test-runne  bash                 [.] maybe_make_export_env
      +     0.09%  rhts-test-runne  bash                 [.] 0x385a0
            0.09%               ps  [kernel.kallsyms]    [k] page_fault
      ...
      
      Note, running following to test perf endianity handling:
      test 1)
        - origin system:
          # perf record -a -- sleep 10 (any perf record will do)
          # perf report > report.origin
          # perf archive perf.data
      
        - copy the perf.data, report.origin and perf.data.tar.bz2
          to a target system and run:
          # tar xjvf perf.data.tar.bz2 -C ~/.debug
          # perf report > report.target
          # diff -u report.origin report.target
      
        - the diff should produce no output
          (besides some white space stuff and possibly different
           date/TZ output)
      
      test 2)
        - origin system:
          # perf record -ag -fo /tmp/perf.data -- sleep 1
        - mount origin system root to the target system on /mnt/origin
        - target system:
          # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
           --kallsyms /mnt/origin/proc/kallsyms
        - complete perf.data header is displayed
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338380624-7443-3-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      268fb20f
    • J
      perf symbols: Handle different endians properly during symbol load · 8db4841f
      Jiri Olsa 提交于
      Currently we dont care about the file object's endianness. It's possible
      we read buildid file object from different architecture than we are
      currentlly running on. So we need to care about properly reading such
      object's data - handle different endianness properly.
      
      Adding:
      	needs_swap DSO field
      	dso__swap_init function to initialize DSO's needs_swap
      	DSO__SWAP to read the data with proper swaps
      
      Together with other endianity patches, this change fixies perf report
      discrepancies on origin and target systems as described in test 1 below,
      e.g. following perf report diff:
      
      ...
            0.12%               ps  [kernel.kallsyms]    [k] clear_page
      -     0.12%              awk  bash                 [.] alloc_word_desc
      +     0.12%              awk  bash                 [.] yyparse
            0.11%   beah-rhts-task  libpython2.6.so.1.0  [.] 0x5560e
            0.10%             perf  libc-2.12.so         [.] __ctype_toupper_loc
      -     0.09%  rhts-test-runne  bash                 [.] maybe_make_export_env
      +     0.09%  rhts-test-runne  bash                 [.] 0x385a0
            0.09%               ps  [kernel.kallsyms]    [k] page_fault
      ...
      
      Note, running following to test perf endianity handling:
      test 1)
        - origin system:
          # perf record -a -- sleep 10 (any perf record will do)
          # perf report > report.origin
          # perf archive perf.data
      
        - copy the perf.data, report.origin and perf.data.tar.bz2
          to a target system and run:
          # tar xjvf perf.data.tar.bz2 -C ~/.debug
          # perf report > report.target
          # diff -u report.origin report.target
      
        - the diff should produce no output
          (besides some white space stuff and possibly different
           date/TZ output)
      
      test 1)
        - origin system:
          # perf record -ag -fo /tmp/perf.data -- sleep 1
        - mount origin system root to the target system on /mnt/origin
        - target system:
          # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
           --kallsyms /mnt/origin/proc/kallsyms
        - complete perf.data header is displayed
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338380624-7443-2-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8db4841f
    • N
      perf evlist: Pass third argument to ioctl explicitly · 55da8005
      Namhyung Kim 提交于
      The ioctl on perf event fd wants 3 arguments but we only passed 2. As
      the only user of the functions is perf record and it calls them for
      every event (regardless of group setting), just pass 0 for now.
      Signed-off-by: NNamhyung Kim <namhyung.kim@lge.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338443506-25009-3-git-send-email-namhyung.kim@lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      55da8005
    • N
      perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP · a59e64a1
      Namhyung Kim 提交于
      The ioctl interface of perf event fd receives 3 arguments to control
      event group behavior but it lacked documentation.
      Signed-off-by: NNamhyung Kim <namhyung.kim@lge.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338443506-25009-2-git-send-email-namhyung.kim@lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a59e64a1
    • A
      perf tools: Make --version show kernel version instead of pull req tag · aa5cdd30
      Arnaldo Carvalho de Melo 提交于
      Before:
      
        $ perf --version
        perf version perf.urgent.for.mingo.5.g37da28
      
      After:
      
        $ perf --version
        perf version 3.4.8941.g37da28.dirty
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-vc9b4e6023iegz9kabr3yvyv@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      aa5cdd30
    • N
      perf tools: Check if callchain is corrupted · 114067b6
      Namhyung Kim 提交于
      We faced segmentation fault on perf top -G at very high sampling rate
      due to a corrupted callchain. While the root cause was not revealed (I
      failed to figure it out), this patch tries to protect us from the
      segfault on such cases.
      Reported-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NNamhyung Kim <namhyung.kim@lge.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sunjin Yang <fan4326@gmail.com>
      Link: http://lkml.kernel.org/r/1338443007-24857-2-git-send-email-namhyung.kim@lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      114067b6
    • N
      perf callchain: Make callchain cursors TLS · 47260645
      Namhyung Kim 提交于
      perf top -G has a race on callchain cursor between main thread and
      display thread. Since the callchain cursors are used locally make them
      thread-local data would solve the problem.
      Signed-off-by: NNamhyung Kim <namhyung.kim@lge.com>
      Reported-by: NSunjin Yang <fan4326@gmail.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sunjin Yang <fan4326@gmail.com>
      Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      47260645
    • A
      perf tools: Fix pager on minimal-install embedded systems · ea1b3eba
      Avik Sil 提交于
      Some Distributions may lack "less" package being included by default,
      e.g., Linaro nano rootfs. In those cases use the portable "pager"
      command instead of "less".
      Signed-off-by: NAvik Sil <avik.sil@linaro.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338287725-26382-1-git-send-email-avik.sil@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ea1b3eba
    • A
      perf tools: Fix make tarballs · f1439c31
      Arnaldo Carvalho de Melo 提交于
      The patch series that introduced the top level tools/ makefile and the
      libtraceevent broke this feature where files needed to build in a
      detached tarball were not included in the MANIFEST file and thus not
      included in the tarball.
      
      Fix it by adding the relevant files to the MANIFEST.
      
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-z3mjj74927xvqwhlmu18kj80@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f1439c31
    • D
      perf script: Fix regression in callchain dso name · 52deff71
      David Ahern 提交于
      $ perf script -i /tmp/perf.data
      ...
      gcc 13623 544315.062858: context-switches:
          ffffffff815f65c9 __schedule ([kernel.kallsyms])
          ffffffff81087cea __cond_resched ([kernel.kallsyms])
          ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
          ffffffff815fb87a do_page_fault ([kernel.kallsyms])
          ffffffff815f8465 page_fault ([kernel.kallsyms])
              2b7a71ea0303 _dl_lookup_symbol_x ([kernel.kallsyms])
              2b7a71ea1eb5 _dl_relocate_object ([kernel.kallsyms])
              2b7a71e99b2e dl_main ([kernel.kallsyms])
              2b7a71eab7f4 _dl_sysdep_start ([kernel.kallsyms])
      
      All DSO's in a callchain are printed as [kernel.kallsyms].
      
      git bisect chased it to:
      
      547a92e0 is the first bad commit
      commit 547a92e0
      Author: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Date:   Mon Jan 30 13:42:57 2012 +0900
      
          perf script: Unify the expressions indicating "unknown"
      
          The perf script command uses various expressions to indicate "unknown".
      
          It is unfriendly for user scripts to parse it. So, this patch unifies
          the expressions to "[unknown]".
      
      Looks like a copy-paste in that the other references use al.map but this one
      should be node->map.
      
      With this patch you get:
      
      $ perf script -i /tmp/perf.data
      ...
      gcc 13623 544315.062858: context-switches:
          ffffffff815f65c9 __schedule ([kernel.kallsyms])
          ffffffff81087cea __cond_resched ([kernel.kallsyms])
          ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
          ffffffff815fb87a do_page_fault ([kernel.kallsyms])
          ffffffff815f8465 page_fault ([kernel.kallsyms])
              2b7a71ea0303 _dl_lookup_symbol_x (/lib64/ld-2.14.90.so)
              2b7a71ea1eb5 _dl_relocate_object (/lib64/ld-2.14.90.so)
              2b7a71e99b2e dl_main (/lib64/ld-2.14.90.so)
              2b7a71eab7f4 _dl_sysdep_start (/lib64/ld-2.14.90.so)
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1338353906-60706-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      52deff71
    • A
      perf stat: Initialize default events wrt exclude_{guest,host} · 79695e1b
      Arnaldo Carvalho de Melo 提交于
      When no event is specified the tools use perf_evlist__add_default(), that will
      call event_attr_init to initialize the KVM exclusion bits.
      
      When the change was made to the tools so that by default guest samples would be
      excluded, the changes were made just to the parsing routines and to
      perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far
      just by perf stat to add multiple events, according to the level of detail
      specified.
      
      Recently the tools were changed to reconstruct the event name from all the
      details in perf_event_attr, not just from .type and .config, but taking into
      account all the feature bits (.exclude_{guest,host,user,kernel,etc},
      .precise_ip, etc).
      
      That is when we noticed that the default for perf stat wasn't the one for the
      rest of the tools, i.e. the .exclude_guest bit wasn't being set.
      
      I.e. the default, that doesn't call event_attr_init was showing the :HG
      modifier:
      
        $ perf stat usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  0.942119 task-clock                #    0.454 CPUs utilized
                         1 context-switches          #    0.001 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       126 page-faults               #    0.134 M/sec
                   693,193 cycles:HG                 #    0.736 GHz                     [40.11%]
                   407,461 stalled-cycles-frontend:HG #   58.78% frontend cycles idle    [72.29%]
                   365,403 stalled-cycles-backend:HG #   52.71% backend  cycles idle
                   465,982 instructions:HG           #    0.67  insns per cycle
                                                     #    0.87  stalled cycles per insn
                    89,760 branches:HG               #   95.275 M/sec
                     6,178 branch-misses:HG          #    6.88% of all branches
      
               0.002077228 seconds time elapsed
      
      While if one explicitely specifies the same events, which will make the parsing code
      to be called and thus event_attr_init is called:
      
        $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  1.040349 task-clock                #    0.500 CPUs utilized
                         2 context-switches          #    0.002 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       127 page-faults               #    0.122 M/sec
                   587,966 cycles                    #    0.565 GHz                     [13.18%]
                   459,167 stalled-cycles-frontend   #   78.09% frontend cycles idle
                   390,249 stalled-cycles-backend    #   66.37% backend  cycles idle
                   504,006 instructions              #    0.86  insns per cycle
                                                     #    0.91  stalled cycles per insn
                    96,455 branches                  #   92.714 M/sec
                     6,522 branch-misses             #    6.76% of all branches         [96.12%]
      
               0.002078681 seconds time elapsed
      
      Fix it by introducing a perf_evlist__add_default_attrs method that will call
      evlist_attr_init in all the perf_event_attr entries before adding the events.
      Reported-by: NIngo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      79695e1b
  10. 30 5月, 2012 9 次提交