1. 31 5月, 2012 5 次提交
    • N
      perf callchain: Make callchain cursors TLS · 47260645
      Namhyung Kim 提交于
      perf top -G has a race on callchain cursor between main thread and
      display thread. Since the callchain cursors are used locally make them
      thread-local data would solve the problem.
      Signed-off-by: NNamhyung Kim <namhyung.kim@lge.com>
      Reported-by: NSunjin Yang <fan4326@gmail.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sunjin Yang <fan4326@gmail.com>
      Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      47260645
    • A
      perf tools: Fix pager on minimal-install embedded systems · ea1b3eba
      Avik Sil 提交于
      Some Distributions may lack "less" package being included by default,
      e.g., Linaro nano rootfs. In those cases use the portable "pager"
      command instead of "less".
      Signed-off-by: NAvik Sil <avik.sil@linaro.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338287725-26382-1-git-send-email-avik.sil@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ea1b3eba
    • A
      perf tools: Fix make tarballs · f1439c31
      Arnaldo Carvalho de Melo 提交于
      The patch series that introduced the top level tools/ makefile and the
      libtraceevent broke this feature where files needed to build in a
      detached tarball were not included in the MANIFEST file and thus not
      included in the tarball.
      
      Fix it by adding the relevant files to the MANIFEST.
      
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-z3mjj74927xvqwhlmu18kj80@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f1439c31
    • D
      perf script: Fix regression in callchain dso name · 52deff71
      David Ahern 提交于
      $ perf script -i /tmp/perf.data
      ...
      gcc 13623 544315.062858: context-switches:
          ffffffff815f65c9 __schedule ([kernel.kallsyms])
          ffffffff81087cea __cond_resched ([kernel.kallsyms])
          ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
          ffffffff815fb87a do_page_fault ([kernel.kallsyms])
          ffffffff815f8465 page_fault ([kernel.kallsyms])
              2b7a71ea0303 _dl_lookup_symbol_x ([kernel.kallsyms])
              2b7a71ea1eb5 _dl_relocate_object ([kernel.kallsyms])
              2b7a71e99b2e dl_main ([kernel.kallsyms])
              2b7a71eab7f4 _dl_sysdep_start ([kernel.kallsyms])
      
      All DSO's in a callchain are printed as [kernel.kallsyms].
      
      git bisect chased it to:
      
      547a92e0 is the first bad commit
      commit 547a92e0
      Author: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Date:   Mon Jan 30 13:42:57 2012 +0900
      
          perf script: Unify the expressions indicating "unknown"
      
          The perf script command uses various expressions to indicate "unknown".
      
          It is unfriendly for user scripts to parse it. So, this patch unifies
          the expressions to "[unknown]".
      
      Looks like a copy-paste in that the other references use al.map but this one
      should be node->map.
      
      With this patch you get:
      
      $ perf script -i /tmp/perf.data
      ...
      gcc 13623 544315.062858: context-switches:
          ffffffff815f65c9 __schedule ([kernel.kallsyms])
          ffffffff81087cea __cond_resched ([kernel.kallsyms])
          ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
          ffffffff815fb87a do_page_fault ([kernel.kallsyms])
          ffffffff815f8465 page_fault ([kernel.kallsyms])
              2b7a71ea0303 _dl_lookup_symbol_x (/lib64/ld-2.14.90.so)
              2b7a71ea1eb5 _dl_relocate_object (/lib64/ld-2.14.90.so)
              2b7a71e99b2e dl_main (/lib64/ld-2.14.90.so)
              2b7a71eab7f4 _dl_sysdep_start (/lib64/ld-2.14.90.so)
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1338353906-60706-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      52deff71
    • A
      perf stat: Initialize default events wrt exclude_{guest,host} · 79695e1b
      Arnaldo Carvalho de Melo 提交于
      When no event is specified the tools use perf_evlist__add_default(), that will
      call event_attr_init to initialize the KVM exclusion bits.
      
      When the change was made to the tools so that by default guest samples would be
      excluded, the changes were made just to the parsing routines and to
      perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far
      just by perf stat to add multiple events, according to the level of detail
      specified.
      
      Recently the tools were changed to reconstruct the event name from all the
      details in perf_event_attr, not just from .type and .config, but taking into
      account all the feature bits (.exclude_{guest,host,user,kernel,etc},
      .precise_ip, etc).
      
      That is when we noticed that the default for perf stat wasn't the one for the
      rest of the tools, i.e. the .exclude_guest bit wasn't being set.
      
      I.e. the default, that doesn't call event_attr_init was showing the :HG
      modifier:
      
        $ perf stat usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  0.942119 task-clock                #    0.454 CPUs utilized
                         1 context-switches          #    0.001 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       126 page-faults               #    0.134 M/sec
                   693,193 cycles:HG                 #    0.736 GHz                     [40.11%]
                   407,461 stalled-cycles-frontend:HG #   58.78% frontend cycles idle    [72.29%]
                   365,403 stalled-cycles-backend:HG #   52.71% backend  cycles idle
                   465,982 instructions:HG           #    0.67  insns per cycle
                                                     #    0.87  stalled cycles per insn
                    89,760 branches:HG               #   95.275 M/sec
                     6,178 branch-misses:HG          #    6.88% of all branches
      
               0.002077228 seconds time elapsed
      
      While if one explicitely specifies the same events, which will make the parsing code
      to be called and thus event_attr_init is called:
      
        $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  1.040349 task-clock                #    0.500 CPUs utilized
                         2 context-switches          #    0.002 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       127 page-faults               #    0.122 M/sec
                   587,966 cycles                    #    0.565 GHz                     [13.18%]
                   459,167 stalled-cycles-frontend   #   78.09% frontend cycles idle
                   390,249 stalled-cycles-backend    #   66.37% backend  cycles idle
                   504,006 instructions              #    0.86  insns per cycle
                                                     #    0.91  stalled cycles per insn
                    96,455 branches                  #   92.714 M/sec
                     6,522 branch-misses             #    6.76% of all branches         [96.12%]
      
               0.002078681 seconds time elapsed
      
      Fix it by introducing a perf_evlist__add_default_attrs method that will call
      evlist_attr_init in all the perf_event_attr entries before adding the events.
      Reported-by: NIngo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      79695e1b
  2. 30 5月, 2012 7 次提交
  3. 29 5月, 2012 2 次提交
  4. 26 5月, 2012 3 次提交
    • S
      perf record: Fix branch_stack type in perf_record_opts · a00dc319
      Stephane Eranian 提交于
      The attr.branch_sample_type field is defined as u64 by the API.  As
      such, we need to ensure the variable holding the value of the branch
      stack filters is also u64 otherwise we may lose bits in the future.
      
      Note also that the bogus definition of the field in perf_record_opts
      caused problems on big-endian PPC systems.  Thanks to Anshuman Khandual
      for tracking the problem on PPC.
      Reported-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120525211344.GA7729@quadSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a00dc319
    • A
      perf tools: Reconstruct event with modifiers from perf_event_attr · c410431c
      Arnaldo Carvalho de Melo 提交于
      The modifiers:
      
        k		kernel space
        u		user space
        h		hypervisor
        G		guest
        H		host
        p, pp, ppp    precision level (PEBS)
      
      that can be suffixed to an event were lost when tools used event_name()
      to reconstruct them from the perf_event_attr entries in a perf.data
      file.
      
      Fix it by following the defaults used for these modifiers in the current
      codebase, so:
      
       $ perf record -e instructions:u usleep 1 2> /dev/null
       $ perf evlist
       instructions:u
       $ perf record -e cycles:k usleep 1 2> /dev/null
       $ perf evlist
       cycles:k
       $ perf record -e cycles:kh usleep 1 2> /dev/null
       $ perf evlist
       cycles:kh
       $ perf record -e cache-misses:G usleep 1 2> /dev/null
       $ perf evlist
       cache-misses:G
       $ perf record -e cycles:ppk usleep 1 2> /dev/null
       $ perf evlist
       cycles:kpp
       $
      
      Also works with 'top', 'report', etc.
      
      More work needed to cover tracepoints and software events while not
      dragging lots of baggage to the python binding, this is a minimal fix
      for v3.5.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4hl5glle0hxlklw4usva1mkt@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c410431c
    • A
      perf top: Fix counter name fixup when fallbacking to cpu-clock · 895d9766
      Arnaldo Carvalho de Melo 提交于
      In 40491eaa "perf top: Update event name when falling back to cpu-clock"
      we freed counter->name but didn't reset it to NULL, then when setting it
      to the result of event_name(), event_name() would use the cached value,
      which by now was overwritten and thus we got garbage or a zero lenght
      string.
      
      Fix it by just freeing and setting counter->name to NULL, this way
      event_name() when called afterwards, will find the right counter name
      and cache it again.
      
      Found while trying 'cycles:pp' on a machine were :pp couldn't be
      honoured. Probably the best fallback here is to tell the user that that
      level of precision is not available on the PMU and then go removing 'p',
      levels of precision till we get to play 'cycles' and if even that fails,
      _then_ get to 'cpu-clock'.
      
      But that is the matter for another patch, this one just needs to fix the
      caching issue, which in the end will show 'cpu-clock' when tools ask for
      the event name being used, which clarifies things for the user, that
      will see that 'cycles:pp' or whatever not support event is not being
      used, some sort of fallback happened.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-w1neie2dqli89we1bzwkf4id@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      895d9766
  5. 25 5月, 2012 2 次提交
  6. 24 5月, 2012 1 次提交
  7. 23 5月, 2012 3 次提交
    • A
      perf evlist: Show event attribute details · 26252ea6
      Arnaldo Carvalho de Melo 提交于
      There was no easy way to see the frequency used, and with the change of
      default, we better provide one.
      
      [root@sandy linux]# perf evlist -F
      cycles: sample_freq=4000
      [root@sandy linux]# perf evlist -v
      cycles: sample_freq=4000, size: 80, sample_type: 391, read_format: 7, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
      [root@sandy linux]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-e1p9poez3nwrgycbmwqmhlsu@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      26252ea6
    • A
      perf tools: Bump default sample freq to 4 kHz · 447a6013
      Arnaldo Carvalho de Melo 提交于
      Quoting Ingo:
      
      "While at it I'd also suggest increasing the default sampling frequency,
      from 1000 Hz per CPU to at least 4Khz auto-freq or so - this should work
      well all across the board I think. CPUs are getting faster and command/app
      run times are getting shorter, 1Khz is a bit low IMO."
      Requested-by: NIngo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-2jafa6mkrufyekny9ei59lpu@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      447a6013
    • S
      perf buildid-list: Work better with pipe mode · 299c3452
      Stephane Eranian 提交于
      In order for perf buildid-list to work with pipe-mode files, it needs to
      process buildids and event attr structs.
      
      $ perf record -o - noploop 2 | ./perf inject -b | perf buildid-list -i - -H
      noploop for 2 seconds
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.084 MB - (~3678 samples) ]
      0000000000000000000000000000000000000000 [kernel.kallsyms]
      3a0d0629efe74a8da3eeba372cdbd74ad9b8f5d5 /usr/local/bin/noploop
      
      The reason [kernel.kallsyms] shows a 0 build-id comes from the
      way buildids are injected in the stream.
      
      The buildid for the kernel is provided by a BUILD_ID record. The
      [kernel.kallsyms] is provided by a MMAP record. There is no clean and
      obvious way to link the two, unfortunately.
      
      In regular mode, the kernel buildid is generated from reading the ELF
      image or kallsyms and perf knows to associate [kernel.kallsyms] to it.
      Later on, when perf processes the [kernel.kallsyms] MMAP record, it will
      already have a dso for it.
      
      So for now, make sure perf buildid-list shows the buildids for
      everything but the kernel image.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1337081295-10303-6-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      299c3452
  8. 22 5月, 2012 16 次提交
  9. 19 5月, 2012 1 次提交
    • D
      perf evsel: Create events initially disabled -- again · 5e1c81d9
      David Ahern 提交于
      764e16a3 changed perf-record to create events disabled by default and
      enable them once perf initializations are done. This setting was dropped
      by 0f82ebc4. Now perf events are once again generated during perf's
      initialization phase (e.g., generating maps).
      
      As an example, perf opens a lot of files at startup. Unpatched:
      
      perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.087 MB /tmp/perf.data (~3798 samples) ]
      
      Using perf-script to look at the samples shows the perf command generating
      563 of the 566 total events.
      
      Patched:
      
      perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.028 MB /tmp/perf.data (~1206 samples) ]
      
      Using perf-script to look at the samples does not show perf command.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Link: http://lkml.kernel.org/r/1336968088-11531-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5e1c81d9