  1. 18 Dec, 2015 4 commits
    • perf stat record: Synthesize stat record data · 8b99b1a4
      Authored by Jiri Olsa
      Synthesizing needed stat record data for report/script:
        - cpu/thread maps
        - stat config
      
      Committer note:
      
      New records generated on a perf.data file with this patch:
      
        $ perf report -D | grep PERF_RECORD_
        0x568 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 29097
        0x590 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
        0x5a2 [0x40]: PERF_RECORD_STAT_CONFIG
        $
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1446734469-11352-5-git-send-email-jolsa@kernel.org
      [ Adjusted wrt kernel PERF_RECORD_MMAP added when introducing 'perf stat record' ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
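The three records in the `perf report -D` excerpt above sit back to back in the file: each offset equals the previous offset plus the previous size (0x568 + 0x28 = 0x590, 0x590 + 0x12 = 0x5a2). A minimal sketch of checking that, assuming the dump lines have exactly the `offset [size]: PERF_RECORD_NAME` shape shown in the excerpt; the helper names `parse_records` and `is_contiguous` are hypothetical and not part of perf:

```python
import re

# Matches lines such as "0x568 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 29097".
# The line shape is taken from the excerpt above; real dumps may carry more fields.
RECORD_RE = re.compile(r"^\s*(0x[0-9a-f]+) \[(0x[0-9a-f]+)\]: (PERF_RECORD_\w+)")

SAMPLE = """\
0x568 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 29097
0x590 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x5a2 [0x40]: PERF_RECORD_STAT_CONFIG
"""


def parse_records(dump_text):
    """Return a list of (offset, size, name) tuples found in the dump text."""
    records = []
    for line in dump_text.splitlines():
        match = RECORD_RE.match(line)
        if match:
            offset = int(match.group(1), 16)
            size = int(match.group(2), 16)
            records.append((offset, size, match.group(3)))
    return records


def is_contiguous(records):
    """True when every record starts where the previous one ends."""
    return all(
        offset + size == next_offset
        for (offset, size, _), (next_offset, _, _) in zip(records, records[1:])
    )


if __name__ == "__main__":
    recs = parse_records(SAMPLE)
    for offset, size, name in recs:
        print(f"{name}: bytes {offset:#x}..{offset + size:#x}")
    print("contiguous:", is_contiguous(recs))
```

Piping real `perf report -D` output into `parse_records` should work the same way as long as the line prefix keeps this shape.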
    • perf stat record: Initialize record features · 3ba78bd0
      Authored by Jiri Olsa
      Disable all features not related to stat.
      
      Also, since the STAT feature is now enabled in the data file, add code
      that instructs session open to skip sample type checking for stat data
      files.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1446734469-11352-4-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat record: Add record command · 4979d0c7
      Authored by Jiri Olsa
      Add 'perf stat record' command support. For now it creates a simple
      (header-only) perf.data file.
      
      The record command can be specified anywhere among the stat options.
      All stat command options are valid for the stat record command, with the
      exception of '-o': when given to the record command, it names the perf
      data file.
      
      Committer note:
      
      Set sample_type to PERF_SAMPLE_IDENTIFIER, which should be harmless and
      keeps older tools from showing confusing messages. For instance, with
      sample_type = 0, we get:
      
        $ perf stat record usleep 1
      
         Performance counter stats for 'usleep 1':
      
                0.630237      task-clock (msec)         #    0.528 CPUs utilized
                       1      context-switches          #    0.002 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      52      page-faults               #    0.083 M/sec
                 978,312      cycles                    #    1.552 GHz
                 671,931      stalled-cycles-frontend   #   68.68% frontend cycles idle
         <not supported>      stalled-cycles-backend
                 646,379      instructions              #    0.66  insns per cycle
                                                        #    1.04  stalled cycles per insn
                 131,046      branches                  #  207.931 M/sec
                   7,073      branch-misses             #    5.40% of all branches
      
             0.001193240 seconds time elapsed
      
        $ oldperf evlist
        WARNING: The perf.data file's data size field is 0 which is unexpected.
        Was the 'perf record' command properly terminated?
        non matching sample_type
        $
      
      While with sample_type set to PERF_SAMPLE_IDENTIFIER, after we re-run 'perf
      stat record usleep' we get:
      
        $ oldperf evlist
        WARNING: The perf.data file's data size field is 0 which is unexpected.
        Was the 'perf record' command properly terminated?
        task-clock
        context-switches
        cpu-migrations
        page-faults
        cycles
        stalled-cycles-frontend
        stalled-cycles-backend
        instructions
        branches
        branch-misses
        $
      
      Which at least shows the names of the events in the perf.data file.
      
      Additionally, such files, when passed to 'perf report' will produce:
      
        $ oldperf report --stdio
        WARNING: The perf.data file's data size field is 0 which is unexpected.
        Was the 'perf record' command properly terminated?
        Warning:
        Kernel address maps (/proc/{kallsyms,modules}) were restricted.
      
        Check /proc/sys/kernel/kptr_restrict before running 'perf record'.
      
        As no suitable kallsyms nor vmlinux was found, kernel samples
        can't be resolved.
      
        Samples in kernel modules can't be resolved as well.
      
        Error:
        The perf.data file has no samples!
        # To display the perf.data header info, please use --header/--header-only options.
        #
        $
      
      This is confusing and can be solved by just adding the kernel mmap record,
      which also removes the warning about the data size field being zero. After
      generating the mmap record:
      
        $ perf stat record usleep 1
      
         Performance counter stats for 'usleep 1':
      
                0.600796      task-clock (msec)         #    0.478 CPUs utilized
                       1      context-switches          #    0.002 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      54      page-faults               #    0.090 M/sec
                 886,844      cycles                    #    1.476 GHz
                 582,169      stalled-cycles-frontend   #   65.65% frontend cycles idle
         <not supported>      stalled-cycles-backend
                 638,344      instructions              #    0.72  insns per cycle
                                                        #    0.91  stalled cycles per insn
                 130,204      branches                  #  216.719 M/sec
                   7,500      branch-misses             #    5.76% of all branches
      
             0.001255897 seconds time elapsed
      
        $ oldperf evlist
        task-clock
        context-switches
        cpu-migrations
        page-faults
        cycles
        stalled-cycles-frontend
        stalled-cycles-backend
        instructions
        branches
        branch-misses
        $ oldperf report --stdio
        Error:
        The perf.data file has no samples!
        # To display the perf.data header info, please use --header/--header-only options.
        #
        $
      
      No warnings, sensible output listing the events in the perf.data file, and
      a "file has no samples" message, which is accurate: the file has none.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1446734469-11352-3-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
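The right-hand column of the `perf stat` output quoted in this commit is derived from the raw counts. The sketch below uses plain ratios that reproduce the figures printed for the first `usleep 1` run; the function name `derived_metrics` and the exact formulas are assumptions chosen to match that output, not a copy of perf's implementation:

```python
def derived_metrics(task_clock_msec, elapsed_sec, cycles, instructions,
                    stalled_frontend, branches, branch_misses):
    """Recompute the '#' column of perf stat from raw counter values.

    The formulas are simple ratios assumed for illustration; they are
    chosen because they reproduce the numbers shown in the commit message.
    """
    return {
        # task-clock time divided by wall-clock time
        "cpus_utilized": task_clock_msec / (elapsed_sec * 1000.0),
        # cycles per nanosecond of task-clock time is the same as GHz
        "ghz": cycles / (task_clock_msec * 1e6),
        "insns_per_cycle": instructions / cycles,
        "frontend_idle_pct": 100.0 * stalled_frontend / cycles,
        "stalled_cycles_per_insn": stalled_frontend / instructions,
        "branch_miss_pct": 100.0 * branch_misses / branches,
    }


if __name__ == "__main__":
    # Raw values copied from the first run quoted above.
    metrics = derived_metrics(
        task_clock_msec=0.630237,
        elapsed_sec=0.001193240,
        cycles=978_312,
        instructions=646_379,
        stalled_frontend=671_931,
        branches=131_046,
        branch_misses=7_073,
    )
    for name, value in metrics.items():
        print(f"{name:>24}: {value:.3f}")
```

Rounded to the precision perf prints, these ratios agree with the quoted 0.528 CPUs utilized, 1.552 GHz, 0.66 insns per cycle, 68.68% frontend cycles idle, 1.04 stalled cycles per insn and 5.40% branch misses.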
    • perf subcmd: Create subcmd library · 4b6ab94e
      Authored by Josh Poimboeuf
      Move the subcommand-related files from perf to a new library named
      libsubcmd.a.
      
      Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to
      'exec-cmd.*' to be consistent with the naming of all the other files.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 10 Dec, 2015 1 commit
    • perf stat: Fix cmd_stat to release cpu_map · 544c2ae7
      Authored by Masami Hiramatsu
      Fix cmd_stat() to release cpu_map objects (aggr_map and
      cpus_aggr_map) afterwards.
      
      The refcnt debugger shows that cmd_stat() initializes a cpu_map
      but never puts it.
        ----
        # ./perf stat -v ls
        ....
        REFCNT: BUG: Unreclaimed objects found.
        ==== [0] ====
        Unreclaimed cpu_map@0x29339c0
        Refcount +1 => 1 at
          ./perf(cpu_map__empty_new+0x6d) [0x4e64bd]
          ./perf(cmd_stat+0x5fe) [0x43594e]
          ./perf() [0x47b785]
          ./perf(main+0x617) [0x422587]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2dff420af5]
          ./perf() [0x4226fd]
        REFCNT: Total 1 objects are not reclaimed.
          "cpu_map" leaks 1 objects
        ----
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151209021127.10245.93697.stgit@localhost.localdomain
      [ Remove NULL checks before calling the put operation, it checks it already ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
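The leak fixed here follows the usual get/put reference-counting pattern: an object starts with one reference, and whoever created it must put it when done. The toy model below illustrates that pattern and the committer's point that the put operation already tolerates NULL, so callers need no check of their own. Every name in it (`CpuMap`, `cpu_map_put`, `run_stat`, `live_maps`) is hypothetical and stands in for the C code, which is not shown in this log:

```python
# Registry of objects that still hold references; a toy stand-in for the
# refcnt debugger that reported "Unreclaimed objects found".
live_maps = set()


class CpuMap:
    """Toy reference-counted object; starts with one reference."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1
        live_maps.add(self)


def cpu_map_put(cpu_map):
    """Drop one reference. Passing None is a no-op, so callers need no check."""
    if cpu_map is None:
        return
    cpu_map.refcnt -= 1
    if cpu_map.refcnt == 0:
        live_maps.discard(cpu_map)


def run_stat(release):
    """Model of a command that allocates maps and, optionally, releases them."""
    aggr_map = CpuMap("aggr_map")
    cpus_aggr_map = None  # assumed here to be unallocated in this mode
    try:
        pass  # the counting work would happen here
    finally:
        if release:
            # No "is not None" guards: the put operation checks for itself.
            cpu_map_put(aggr_map)
            cpu_map_put(cpus_aggr_map)
    return sorted(m.name for m in live_maps)


if __name__ == "__main__":
    print("without put:", run_stat(release=False))
    live_maps.clear()
    print("with put:   ", run_stat(release=True))
```

Running it without the put leaves `aggr_map` in the registry, which mirrors the "cpu_map leaks 1 objects" report; with the put, the registry ends up empty.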
  3. 08 Dec, 2015 4 commits
  4. 27 Nov, 2015 1 commit
  5. 06 Nov, 2015 1 commit
  6. 05 Nov, 2015 2 commits
  7. 28 Oct, 2015 1 commit
  8. 20 Oct, 2015 3 commits
  9. 03 Oct, 2015 1 commit
    • perf stat: Reduce min --interval-print to 10ms · 19afd104
      Authored by Kan Liang
      The --interval-print parameter was limited to 100ms. However, for
      example, 10ms is required to do sophisticated bandwidth analysis using
      uncore events.
      
      The test shows that the overhead of system-wide uncore monitoring
      with a 10ms interval is only ~2%. So this patch reduces the minimum
      interval-print allowed to 10ms.
      
      But 10ms may not work well for all cases. For example, when the
      cpus/threads number is very large, for system-wide core event monitoring
      the overhead could be high.
      
      To handle this issue, a warning is displayed when the interval-print
      is set between 10ms and 100ms, so users can make a decision according
      to their specific cases.
      
       # perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1
      
       print interval < 100ms. The overhead percentage could be high in some
       cases. Please proceed with caution.
       #           time             counts unit events
            0.010200451               0.10 MiB  uncore_imc_1/cas_count_read/
            0.020475117               0.02 MiB  uncore_imc_1/cas_count_read/
            0.030692800               0.01 MiB  uncore_imc_1/cas_count_read/
            0.040948161               0.02 MiB  uncore_imc_1/cas_count_read/
            0.051159564               0.00 MiB  uncore_imc_1/cas_count_read/
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com
      [ Added warning about overhead when using sub 100ms intervals to the man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
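The policy this commit describes has three outcomes: reject intervals below 10ms, accept but warn for intervals from 10ms up to (but not including) 100ms, and accept silently otherwise. A minimal sketch of that decision, with the warning text copied from the output above; the function name `check_interval` and the rejection message are hypothetical, and the real check lives in perf's C sources:

```python
MIN_INTERVAL_MS = 10    # new lower bound from this commit
WARN_BELOW_MS = 100     # the previous lower bound, now only a warning threshold

OVERHEAD_WARNING = (
    "print interval < 100ms. The overhead percentage could be high in some "
    "cases. Please proceed with caution."
)


def check_interval(interval_ms):
    """Return (accepted, message) for a requested print interval in ms.

    message is None when the interval is accepted without comment.
    """
    if interval_ms < MIN_INTERVAL_MS:
        # Hypothetical wording; the actual error text is not shown in the log.
        return False, f"print interval must be >= {MIN_INTERVAL_MS}ms"
    if interval_ms < WARN_BELOW_MS:
        return True, OVERHEAD_WARNING
    return True, None


if __name__ == "__main__":
    for requested in (5, 10, 50, 100, 1000):
        accepted, message = check_interval(requested)
        print(requested, accepted, message)
```

Keeping the warning separate from the hard limit leaves the trade-off with the user, which is the stated intent of the patch.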
  10. 03 Sep, 2015 1 commit
  11. 28 Aug, 2015 1 commit
    • perf stat: Get correct cpu id for print_aggr · 601083cf
      Authored by Kan Liang
      print_aggr() fails to print per-core/per-socket statistics after commit
      582ec082 ("perf stat: Fix per-socket output bug for uncore events")
      if events have different cpus. This is because, in print_aggr(),
      aggr_get_id needs an index (not a cpu id) to find the core/pkg id. Also,
      evsel cpu maps should be used to get the aggregated id.
      
      Here is an example:
      
      Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
      cpumask 0,18)
      
        $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2
      
      Without this patch, it fails to get the CPU 18 result.
      
         Performance counter stats for 'CPU(s) 0,18':
      
        S0-C0           1            7526851      cycles
        S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
        S1-C0           0      <not counted>      cycles
        S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/
      
      With this patch, it gets both the CPU 0 and CPU 18 results.
      
         Performance counter stats for 'CPU(s) 0,18':
      
        S0-C0           1            6327768      cycles
        S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
        S1-C0           1             330228      cycles
        S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Stephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Fixes: 582ec082 ("perf stat: Fix per-socket output bug for uncore events")
      Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
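The bug described above comes down to confusing a position in a cpu map with the cpu number stored at that position. The toy model below shows how a lookup that expects an index fails for CPU 18 when it is handed the cpu id instead. The two-entry map and the S0-C0/S1-C0 labels come from the example output; the function `aggr_id_by_index` and the `TOPOLOGY` table are hypothetical simplifications, not perf's data structures:

```python
# cpu map of the event in the example: two entries, holding cpu ids 0 and 18.
EVSEL_CPUS = [0, 18]

# Assumed socket/core label per cpu id, taken from the example output.
TOPOLOGY = {0: "S0-C0", 18: "S1-C0"}


def aggr_id_by_index(cpu_map, idx):
    """Look up the aggregation id for the cpu stored at position idx.

    Returns None when idx is outside the map, modelling a failed lookup
    (shown as "<not counted>" in the broken output).
    """
    if idx < 0 or idx >= len(cpu_map):
        return None
    return TOPOLOGY[cpu_map[idx]]


def ids_passing_cpu_id(cpu_map):
    """Buggy pattern: hands the cpu id to a function that expects an index."""
    return [aggr_id_by_index(cpu_map, cpu) for cpu in cpu_map]


def ids_passing_index(cpu_map):
    """Fixed pattern: iterates over positions in the event's own cpu map."""
    return [aggr_id_by_index(cpu_map, idx) for idx in range(len(cpu_map))]


if __name__ == "__main__":
    print("cpu id as index:", ids_passing_cpu_id(EVSEL_CPUS))
    print("index as index: ", ids_passing_index(EVSEL_CPUS))
```

CPU 0 happens to work in both variants because its id and its index coincide, which is why only the CPU 18 row was missing in the broken output.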
  12. 09 Aug, 2015 1 commit
  13. 07 Aug, 2015 6 commits
  14. 08 Jul, 2015 1 commit
  15. 26 Jun, 2015 12 commits