1. 31 May, 2012 5 commits
    • perf callchain: Make callchain cursors TLS · 47260645
      Committed by Namhyung Kim
      perf top -G has a race on the callchain cursor between the main thread and
      the display thread. Since the callchain cursors are only used locally,
      making them thread-local data solves the problem.
      Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
      Reported-by: Sunjin Yang <fan4326@gmail.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sunjin Yang <fan4326@gmail.com>
      Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
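The thread-local approach described in this commit can be illustrated with a minimal sketch. It assumes a simplified, hypothetical `struct demo_cursor` (perf's real `struct callchain_cursor` is different) and uses the GCC/Clang `__thread` storage class together with POSIX threads; all `demo_*` names are made up for illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_DEPTH 64

/* Hypothetical, simplified cursor; not perf's actual data structure. */
struct demo_cursor {
    size_t nr;                         /* entries appended so far */
    unsigned long ips[DEMO_MAX_DEPTH]; /* recorded addresses */
};

/* One independent instance per thread, so no locking is needed. */
static __thread struct demo_cursor demo_cursor;

static void demo_cursor_reset(void)
{
    demo_cursor.nr = 0;
}

static int demo_cursor_append(unsigned long ip)
{
    if (demo_cursor.nr >= DEMO_MAX_DEPTH)
        return -1;
    demo_cursor.ips[demo_cursor.nr++] = ip;
    return 0;
}

struct demo_job {
    unsigned long base; /* first address this thread records */
    size_t count;       /* how many entries it records */
    int ok;             /* set to 1 if the thread saw only its own entries */
};

static void *demo_worker(void *arg)
{
    struct demo_job *job = arg;
    size_t i;

    demo_cursor_reset();
    for (i = 0; i < job->count; i++)
        demo_cursor_append(job->base + i);

    /* Verify that no other thread touched this thread's cursor. */
    job->ok = (demo_cursor.nr == job->count);
    for (i = 0; i < demo_cursor.nr; i++)
        if (demo_cursor.ips[i] != job->base + i)
            job->ok = 0;
    return NULL;
}

/* Runs two threads concurrently; returns 1 if each cursor stayed private. */
int demo_run_two_threads(void)
{
    struct demo_job a = { 0x1000, 32, 0 };
    struct demo_job b = { 0x2000, 48, 0 };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, demo_worker, &a))
        return 0;
    if (pthread_create(&tb, NULL, demo_worker, &b)) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return a.ok && b.ok;
}
```

With a single shared cursor, the two workers could interleave their resets and appends; with per-thread storage each worker only ever sees its own entries. Older toolchains may need `-pthread` when building.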
    • perf tools: Fix pager on minimal-install embedded systems · ea1b3eba
      Committed by Avik Sil
      Some distributions may not include the "less" package by default,
      e.g. the Linaro nano rootfs. In those cases, use the portable "pager"
      command instead of "less".
      Signed-off-by: Avik Sil <avik.sil@linaro.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1338287725-26382-1-git-send-email-avik.sil@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
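A pager fallback of this kind might be sketched as follows. The selection order (an explicitly configured pager first, then a portable `pager` binary if it is executable, then `less`) and the function name `demo_pick_pager` are assumptions made for illustration, not the code from the patch.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: choose which pager command to run.
 *   env_pager      value the user configured (may be NULL or empty)
 *   portable_path  path of a distribution-provided "pager" binary
 */
const char *demo_pick_pager(const char *env_pager, const char *portable_path)
{
    /* An explicit user choice always wins. */
    if (env_pager && *env_pager)
        return env_pager;

    /* Prefer the portable pager when it exists and is executable. */
    if (portable_path && access(portable_path, X_OK) == 0)
        return portable_path;

    /* Assumed last resort for systems that do ship "less". */
    return "less";
}
```

A caller could pass `getenv("PAGER")` and a path such as `/usr/bin/pager`; both the environment variable and the path are examples, not values taken from the commit.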
    • perf tools: Fix make tarballs · f1439c31
      Committed by Arnaldo Carvalho de Melo
      The patch series that introduced the top level tools/ makefile and
      libtraceevent broke this feature: files needed to build from a detached
      tarball were not listed in the MANIFEST file and thus were not included
      in the tarball.
      
      Fix it by adding the relevant files to the MANIFEST.
      
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-z3mjj74927xvqwhlmu18kj80@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Fix regression in callchain dso name · 52deff71
      Committed by David Ahern
      $ perf script -i /tmp/perf.data
      ...
      gcc 13623 544315.062858: context-switches:
          ffffffff815f65c9 __schedule ([kernel.kallsyms])
          ffffffff81087cea __cond_resched ([kernel.kallsyms])
          ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
          ffffffff815fb87a do_page_fault ([kernel.kallsyms])
          ffffffff815f8465 page_fault ([kernel.kallsyms])
              2b7a71ea0303 _dl_lookup_symbol_x ([kernel.kallsyms])
              2b7a71ea1eb5 _dl_relocate_object ([kernel.kallsyms])
              2b7a71e99b2e dl_main ([kernel.kallsyms])
              2b7a71eab7f4 _dl_sysdep_start ([kernel.kallsyms])
      
      All DSOs in a callchain are printed as [kernel.kallsyms].
      
      git bisect chased it to:
      
      547a92e0 is the first bad commit
      commit 547a92e0
      Author: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Date:   Mon Jan 30 13:42:57 2012 +0900
      
          perf script: Unify the expressions indicating "unknown"
      
          The perf script command uses various expressions to indicate "unknown".
      
          It is unfriendly for user scripts to parse it. So, this patch unifies
          the expressions to "[unknown]".
      
      This looks like a copy-paste error: the other references use al.map, but
      this one should use node->map.
      
      With this patch you get:
      
      $ perf script -i /tmp/perf.data
      ...
      gcc 13623 544315.062858: context-switches:
          ffffffff815f65c9 __schedule ([kernel.kallsyms])
          ffffffff81087cea __cond_resched ([kernel.kallsyms])
          ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
          ffffffff815fb87a do_page_fault ([kernel.kallsyms])
          ffffffff815f8465 page_fault ([kernel.kallsyms])
              2b7a71ea0303 _dl_lookup_symbol_x (/lib64/ld-2.14.90.so)
              2b7a71ea1eb5 _dl_relocate_object (/lib64/ld-2.14.90.so)
              2b7a71e99b2e dl_main (/lib64/ld-2.14.90.so)
              2b7a71eab7f4 _dl_sysdep_start (/lib64/ld-2.14.90.so)
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Cc: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1338353906-60706-1-git-send-email-dsahern@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
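The difference between using the sample's map and each callchain entry's own map can be sketched like this. The structures and function names are hypothetical simplifications; only the idea of reading the map per entry (node->map) and the "[unknown]" placeholder come from the commit text.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's map and callchain node. */
struct demo_map {
    const char *dso_name;
};

struct demo_node {
    unsigned long long ip;      /* address of this callchain entry */
    const struct demo_map *map; /* map this particular entry resolved to */
};

/* Returns the DSO name, or the unified "[unknown]" placeholder. */
const char *demo_dso_name(const struct demo_map *map)
{
    return (map && map->dso_name) ? map->dso_name : "[unknown]";
}

/*
 * Fills names[i] with the DSO name of entry i and returns the entry count.
 * The important detail is that each name comes from the entry's own map,
 * not from the map of the sample the chain belongs to.
 */
size_t demo_callchain_dsos(const struct demo_node *nodes, size_t nr,
                           const char **names)
{
    size_t i;

    for (i = 0; i < nr; i++)
        names[i] = demo_dso_name(nodes[i].map);
    return nr;
}
```

Had the loop used one map for every entry, all names would come out identical, which matches the symptom shown in the first listing.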
    • perf stat: Initialize default events wrt exclude_{guest,host} · 79695e1b
      Committed by Arnaldo Carvalho de Melo
      When no event is specified, the tools use perf_evlist__add_default(), which
      calls event_attr_init() to initialize the KVM exclusion bits.
      
      When the tools were changed to exclude guest samples by default, the changes
      were made only to the parsing routines and to perf_evlist__add_default(), not
      to perf_evlist__add_attrs(), which so far is used only by perf stat to add
      multiple events according to the level of detail specified.
      
      Recently the tools were changed to reconstruct the event name from all the
      details in perf_event_attr, not just from .type and .config, but taking into
      account all the feature bits (.exclude_{guest,host,user,kernel,etc},
      .precise_ip, etc).
      
      That is when we noticed that the default for perf stat differed from the rest
      of the tools, i.e. the .exclude_guest bit wasn't being set.
      
      That is, the default path, which doesn't call event_attr_init(), was showing
      the :HG modifier:
      
        $ perf stat usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  0.942119 task-clock                #    0.454 CPUs utilized
                         1 context-switches          #    0.001 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       126 page-faults               #    0.134 M/sec
                   693,193 cycles:HG                 #    0.736 GHz                     [40.11%]
                   407,461 stalled-cycles-frontend:HG #   58.78% frontend cycles idle    [72.29%]
                   365,403 stalled-cycles-backend:HG #   52.71% backend  cycles idle
                   465,982 instructions:HG           #    0.67  insns per cycle
                                                     #    0.87  stalled cycles per insn
                    89,760 branches:HG               #   95.275 M/sec
                     6,178 branch-misses:HG          #    6.88% of all branches
      
               0.002077228 seconds time elapsed
      
      Whereas if one explicitly specifies the same events, the parsing code is
      called and thus event_attr_init() is called:
      
        $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  1.040349 task-clock                #    0.500 CPUs utilized
                         2 context-switches          #    0.002 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       127 page-faults               #    0.122 M/sec
                   587,966 cycles                    #    0.565 GHz                     [13.18%]
                   459,167 stalled-cycles-frontend   #   78.09% frontend cycles idle
                   390,249 stalled-cycles-backend    #   66.37% backend  cycles idle
                   504,006 instructions              #    0.86  insns per cycle
                                                     #    0.91  stalled cycles per insn
                    96,455 branches                  #   92.714 M/sec
                     6,522 branch-misses             #    6.76% of all branches         [96.12%]
      
               0.002078681 seconds time elapsed
      
      Fix it by introducing a perf_evlist__add_default_attrs() method that calls
      event_attr_init() on all the perf_event_attr entries before adding the events.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
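The shape of the fix (initialize every attribute entry before adding it) can be sketched as below. The structure, the two default flags and the `demo_*` function names are hypothetical; the sketch assumes that by default host samples are counted and guest samples are excluded, which is how the commit describes the behaviour of the other tools.

```c
#include <stddef.h>

/* Hypothetical, reduced version of an event attribute. */
struct demo_stat_attr {
    unsigned int config;            /* which event to count */
    unsigned int exclude_host : 1;  /* 1: do not count host samples */
    unsigned int exclude_guest : 1; /* 1: do not count guest samples */
};

/* Assumed tool-wide defaults: count the host, skip the guest. */
static int demo_count_host = 1;
static int demo_count_guest = 0;

/* Applies the default exclusion bits to a single attribute. */
void demo_attr_init(struct demo_stat_attr *attr)
{
    if (!demo_count_host)
        attr->exclude_host = 1;
    if (!demo_count_guest)
        attr->exclude_guest = 1;
}

/*
 * Initializes every entry before it would be added to an event list,
 * so default events get the same bits as explicitly parsed ones.
 * Returns the number of entries processed.
 */
size_t demo_add_default_attrs(struct demo_stat_attr *attrs, size_t nr)
{
    size_t i;

    for (i = 0; i < nr; i++)
        demo_attr_init(&attrs[i]);
    /* A real implementation would now create one event per entry. */
    return nr;
}
```

Routing the default table through one helper keeps the default path and the parsed path from drifting apart again.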
  2. 30 May, 2012 9 commits
  3. 29 May, 2012 2 commits
  4. 27 May, 2012 1 commit
  5. 26 May, 2012 3 commits
    • perf record: Fix branch_stack type in perf_record_opts · a00dc319
      Committed by Stephane Eranian
      The attr.branch_sample_type field is defined as u64 by the API.  As
      such, we need to ensure the variable holding the value of the branch
      stack filters is also u64; otherwise we may lose bits in the future.
      
      Note also that the bogus definition of the field in perf_record_opts
      caused problems on big-endian PPC systems.  Thanks to Anshuman Khandual
      for tracking the problem on PPC.
      Reported-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120525211344.GA7729@quad
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
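The commit does not spell out the mechanism of the big-endian failure. One plausible way a too-narrow field misbehaves is when a 64-bit value is stored at an address that is later read as a 32-bit variable. The sketch below demonstrates that effect in a well-defined way with `memcpy`; the function names are hypothetical and the scenario is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian machine, 0 on a little-endian one. */
int demo_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/*
 * Reads only the first four bytes of a 64-bit value, which is what a
 * 32-bit variable located at the same address would contain.
 */
uint32_t demo_narrow_view(uint64_t value)
{
    uint32_t narrow;

    memcpy(&narrow, &value, sizeof(narrow));
    return narrow;
}
```

On little-endian hardware the narrow view of a small filter mask still holds the low bits, so the mismatch goes unnoticed; on big-endian hardware the same read yields the high half, which is zero for small masks.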
    • perf tools: Reconstruct event with modifiers from perf_event_attr · c410431c
      Committed by Arnaldo Carvalho de Melo
      The modifiers:
      
        k		kernel space
        u		user space
        h		hypervisor
        G		guest
        H		host
        p, pp, ppp    precision level (PEBS)
      
      that can be suffixed to an event were lost when tools used event_name()
      to reconstruct them from the perf_event_attr entries in a perf.data
      file.
      
      Fix it by following the defaults used for these modifiers in the current
      codebase, so:
      
       $ perf record -e instructions:u usleep 1 2> /dev/null
       $ perf evlist
       instructions:u
       $ perf record -e cycles:k usleep 1 2> /dev/null
       $ perf evlist
       cycles:k
       $ perf record -e cycles:kh usleep 1 2> /dev/null
       $ perf evlist
       cycles:kh
       $ perf record -e cache-misses:G usleep 1 2> /dev/null
       $ perf evlist
       cache-misses:G
       $ perf record -e cycles:ppk usleep 1 2> /dev/null
       $ perf evlist
       cycles:kpp
       $
      
      Also works with 'top', 'report', etc.
      
      More work is needed to cover tracepoints and software events without
      dragging lots of baggage into the python binding. This is a minimal fix
      for v3.5.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4hl5glle0hxlklw4usva1mkt@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
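Rebuilding the modifier suffix from attribute bits could look roughly like the sketch below. It covers only the k, u, h and p modifiers and deliberately leaves out G and H, whose defaults are more involved. The structure, the function name and the exact rules are assumptions chosen to reproduce the `instructions:u`, `cycles:kh` and `cycles:kpp` outputs above, not the code of the patch.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical subset of the attribute bits relevant to modifiers. */
struct demo_mod_attr {
    unsigned int exclude_user : 1;
    unsigned int exclude_kernel : 1;
    unsigned int exclude_hv : 1;
    unsigned int precise_ip : 2; /* 0..3 levels of precision */
};

/*
 * Writes "base" or "base:modifiers" into buf and returns the snprintf
 * result. Privilege modifiers are listed only when at least one level
 * is excluded, and always in the fixed order k, u, h, followed by one
 * 'p' per precision level.
 */
int demo_format_event(const struct demo_mod_attr *attr, const char *base,
                      char *buf, size_t size)
{
    char mods[8]; /* at most "kuh" + "ppp" + terminator */
    size_t n = 0;
    unsigned int i;

    if (attr->exclude_user || attr->exclude_kernel || attr->exclude_hv) {
        if (!attr->exclude_kernel)
            mods[n++] = 'k';
        if (!attr->exclude_user)
            mods[n++] = 'u';
        if (!attr->exclude_hv)
            mods[n++] = 'h';
    }
    for (i = 0; i < attr->precise_ip; i++)
        mods[n++] = 'p';
    mods[n] = '\0';

    if (n == 0)
        return snprintf(buf, size, "%s", base);
    return snprintf(buf, size, "%s:%s", base, mods);
}
```

Because the suffix is rebuilt from bits rather than stored as text, the order the user typed is lost, which is consistent with `cycles:ppk` being reported as `cycles:kpp`.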
    • perf top: Fix counter name fixup when fallbacking to cpu-clock · 895d9766
      Committed by Arnaldo Carvalho de Melo
      In 40491eaa "perf top: Update event name when falling back to cpu-clock"
      we freed counter->name but didn't reset it to NULL. Then, when setting it
      to the result of event_name(), event_name() would use the cached value,
      which by then had been overwritten, and thus we got garbage or a
      zero-length string.
      
      Fix it by just freeing counter->name and setting it to NULL. This way
      event_name(), when called afterwards, will find the right counter name
      and cache it again.
      
      Found while trying 'cycles:pp' on a machine where :pp couldn't be
      honoured. Probably the best fallback here is to tell the user that that
      level of precision is not available on the PMU, then remove 'p' levels
      of precision one at a time until plain 'cycles' is tried, and only if
      even that fails, _then_ fall back to 'cpu-clock'.
      
      But that is a matter for another patch. This one just fixes the caching
      issue, so that tools asking for the event name in use are shown
      'cpu-clock'. That makes it clear to the user that 'cycles:pp', or
      whatever unsupported event was requested, is not being used and that
      some sort of fallback happened.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-w1neie2dqli89we1bzwkf4id@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
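The caching problem and its fix follow a common pattern: a lazily computed, cached string must be both freed and reset to NULL when its source changes. A minimal sketch, with hypothetical structure and function names standing in for the counter and event_name():

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter with a lazily cached display name. */
struct demo_counter {
    const char *event; /* event currently in use */
    char *name;        /* cached copy of the name, or NULL if not cached */
};

/* Portable replacement for strdup(). */
static char *demo_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Returns the cached name, computing and caching it on first use. */
const char *demo_counter_name(struct demo_counter *c)
{
    if (!c->name)
        c->name = demo_strdup(c->event);
    return c->name;
}

/* Switches to another event and invalidates the cached name. */
void demo_fallback(struct demo_counter *c, const char *new_event)
{
    c->event = new_event;
    free(c->name);
    /* Without this reset, the next lookup would return freed memory. */
    c->name = NULL;
}
```

After `demo_fallback(&c, "cpu-clock")`, the next call to `demo_counter_name(&c)` recomputes the name from the new event instead of handing back the stale pointer.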
  6. 25 May, 2012 2 commits
  7. 24 May, 2012 7 commits
  8. 23 May, 2012 4 commits
  9. 22 May, 2012 7 commits