1. 06 6月, 2018 6 次提交
    • J
      perf stat: Display user and system time · 0ce2da14
      Jiri Olsa 提交于
      Adding the support to read rusage data once the workload is finished and
      display the system/user time values:
      
        $ perf stat --null perf bench sched pipe
        ...
      
         Performance counter stats for 'perf bench sched pipe':
      
             5.342599256 seconds time elapsed
      
             2.544434000 seconds user
             4.549691000 seconds sys
      
      It works only in non -r mode and only for workload target.
      
      So as of now, for workload targets, we display 3 types of timings. The
      time we meassure in perf stat from enable to disable+period:
      
             5.342599256 seconds time elapsed
      
      The time spent in user and system lands, displayed only for workload
      session/target:
      
             2.544434000 seconds user
             4.549691000 seconds sys
      
      Those times are the very same displayed by 'time' tool.  They are
      returned by wait4 call via the getrusage struct interface.
      
      Committer notes:
      
      Had to rename some variables to avoid this on older systems such as
      centos:6:
      
        builtin-stat.c: In function 'print_footer':
        builtin-stat.c:1831: warning: declaration of 'stime' shadows a global declaration
        /usr/include/time.h:297: warning: shadowed declaration is here
      
      Committer testing:
      
        # perf stat --null time perf bench sched pipe
        # Running 'sched/pipe' benchmark:
        # Executed 1000000 pipe operations between two processes
      
             Total time: 5.526 [sec]
      
               5.526534 usecs/op
                 180945 ops/sec
        1.00user 6.25system 0:05.52elapsed 131%CPU (0avgtext+0avgdata 8056maxresident)k
        0inputs+0outputs (0major+606minor)pagefaults 0swaps
      
         Performance counter stats for 'time perf bench sched pipe':
      
               5.530978744 seconds time elapsed
      
               1.004037000 seconds user
               6.259937000 seconds sys
      
        #
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180605121313.31337-1-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0ce2da14
    • A
      perf record: Enable arbitrary event names thru name= modifier · f92da712
      Alexey Budankov 提交于
      Enable complex event names containing [.:=,] symbols to be encoded into Perf
      trace using name= modifier e.g. like this:
      
        perf record -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',\
      		  period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
      
      Below is how it looks like in the report output. Please note explicit escaped
      quoting at cmdline string in the header so that thestring can be directly reused
      for another collection in shell:
      
      perf report --header
      
        # ========
        ...
        # cmdline : /root/abudanko/kernel/tip/tools/perf/perf record -v -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
        # event : name = OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM, , type = 4, size = 112, config = 0x100003c, { sample_period, sample_freq } = 3500000, sample_type = IP|TID|TIME, disabled = 1, inh
        ...
        # ========
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 24K of event 'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM'
        # Event count (approx.): 86492000000
        #
        # Overhead  Command  Shared Object     Symbol
        # ........  .......  ................  ..............................................
        #
            14.75%  futex    [kernel.vmlinux]  [k] __entry_trampoline_start
      ...
      
        perf stat -e cpu/name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
      
        10000000 process context switches in 16678890291ns (1667.9ns/ctxsw)
      
         Performance counter stats for './futex':
      
            88,095,770,571      CPU_CLK_UNHALTED.THREAD:cmask=0x1
      
              16.679542407 seconds time elapsed
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Acked-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/c194b060-761d-0d50-3b21-bb4ed680002d@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f92da712
    • A
      perf tools: Fix symbol and object code resolution for vdso32 and vdsox32 · aef4feac
      Adrian Hunter 提交于
      Fix __kmod_path__parse() so that perf tools does not treat vdso32 and
      vdsox32 as kernel modules and fail to find the object.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: stable@vger.kernel.org
      Fixes: 1f121b03 ("perf tools: Deal with kernel module names in '[]' correctly")
      Link: http://lkml.kernel.org/r/1528117014-30032-3-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      aef4feac
    • A
      perf tests kmod-path: Add tests for vdso32 and vdsox32 · dcaeae4e
      Adrian Hunter 提交于
      Add tests for vdso32 and vdsox32. This will cause the overall test to
      fail because __kmod_path__parse() does not handle vdso32 or vdsox32.
      
      Fixes: 1f121b03 ("perf tools: Deal with kernel module names in '[]' correctly")
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1528117014-30032-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dcaeae4e
    • A
      perf hists: Check if a hist_entry has callchains before using them · fabd37b8
      Arnaldo Carvalho de Melo 提交于
      So far if we use 'perf record -g' this will make
      symbol_conf.use_callchain 'true' and logic will assume that all events
      have callchains enabled, but ever since we added the possibility of
      setting up callchains for some events (e.g.: -e
      cycles/call-graph=dwarf/) while not for others, we limit usage scenarios
      by looking at that symbol_conf.use_callchain global boolean, we better
      look at each event attributes.
      
      On the road to that we need to look if a hist_entry has callchains, that
      is, to go from hist_entry->hists to the evsel that contains it, to then
      look at evsel->sample_type for PERF_SAMPLE_CALLCHAIN.
      
      The next step is to add a symbol_conf.ignore_callchains global, to use
      in the places where what we really want to know is if callchains should
      be ignored, even if present.
      
      Then -g will mean just to select a callchain mode to be applied to all
      events not explicitely setting some other callchain mode, i.e. a default
      callchain mode, and --no-call-graph will set
      symbol_conf.ignore_callchains with that clear intention.
      
      That too will at some point become a per evsel thing, that tools can set
      for all or just a few of its evsels.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-0sas5cm4dsw2obn75g7ruz69@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fabd37b8
    • A
      perf hists: Introduce hist_entry__has_callchain() method · 0b5d6ece
      Arnaldo Carvalho de Melo 提交于
      We'll use this helper more frequently when reworking
      symbol_conf.use_callchain logic, where knowing if a hist_entry has
      callchains is the important bit, so make going from hist_entry to hists
      to evsel easier, compact.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-p6gioxkzpkpz71dtt4wcs36o@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0b5d6ece
  2. 05 6月, 2018 3 次提交
  3. 04 6月, 2018 30 次提交
  4. 03 6月, 2018 1 次提交
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 26bdace7
      Linus Torvalds 提交于
      Pull perf tooling fixes from Thomas Gleixner:
      
       - fix 'perf test Session topology' segfault on s390 (Thomas Richter)
      
       - fix NULL return handling in bpf__prepare_load() (YueHaibing)
      
       - fix indexing on Coresight ETM packet queue decoder (Mathieu Poirier)
      
       - fix perf.data format description of NRCPUS header (Arnaldo Carvalho
         de Melo)
      
       - update perf.data documentation section on cpu topology
      
       - handle uncore event aliases in small groups properly (Kan Liang)
      
       - add missing perf_sample.addr into python sample dictionary (Leo Yan)
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf tools: Fix perf.data format description of NRCPUS header
        perf script python: Add addr into perf sample dict
        perf data: Update documentation section on cpu topology
        perf cs-etm: Fix indexing for decoder packet queue
        perf bpf: Fix NULL return handling in bpf__prepare_load()
        perf test: "Session topology" dumps core on s390
        perf parse-events: Handle uncore event aliases in small groups properly
      26bdace7