1. 18 9月, 2015 3 次提交
    • A
      tools build: Add test for presence of __get_cpuid() gcc builtin · b0063dbf
      Arnaldo Carvalho de Melo 提交于
      The auxtrace code needed by Intel PT uses the __get_cpuid() gcc builtin,
      that is not present in old systems, breaking the build.
      
      Add a test to check for that builtin and disable AUXTRACE in those
      systems.
      
        [acme@rhel5 linux]$  make NO_LIBPERL=1 -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory `/home/acme/git/linux/tools/perf'
          BUILD:   Doing 'make -j2' parallel build
      
        Auto-detecting system features:
        <SNIP>
        ...                          lzma: [ on  ]
        ...                     get_cpuid: [ OFF ]
        <SNIP>
        config/Makefile:630: Your gcc lacks the __get_cpuid() builtin, disables support for auxtrace/Intel PT, please install a newer gcc
          MKDIR    /tmp/build/perf/util/
        <SNIP>
      
      This fixes the build on old systems such as RHEL/CentOS 5.11.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Vinson Lee <vlee@twopensource.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-d4puslul0jltoodzpx9r4sje@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b0063dbf
    • A
      tools build: Add test for presence of numa_num_possible_cpus() in libnuma · f8ac8606
      Arnaldo Carvalho de Melo 提交于
      The existing numa test checks only if numa.h and numa_available() are
      present, but that can be satisfied with an old libnuma that is not
      enough for the 'perf bench numa' entry, so add a test to check for that:
      
        [acme@rhel5 linux]$  make NO_AUXTRACE=1 NO_LIBPERL=1 -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory `/home/acme/git/linux/tools/perf'
          BUILD:   Doing 'make -j2' parallel build
      
        Auto-detecting system features:
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ OFF ]
        ...                       libperl: [ on  ]
      
        <SNIP>
        config/Makefile:577: Old numa library found, disables 'perf bench numa mem' benchmark, please install numactl-devel/libnuma-devel/libnuma-dev >= 2.0.8
          INSTALL  binaries
        <SNIP>
      
      This fixes the build on old systems such as RHEL/CentOS 5.11.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Vinson Lee <vlee@twopensource.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-zqriqkezppi2de2iyjin1tnc@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f8ac8606
    • A
      Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum" · 179f36dd
      Arnaldo Carvalho de Melo 提交于
      This reverts commit f785f235.
      
      We have a test to check if elf_getphdrnum() is present, so, if it fails,
      we'll get:
      
        [acme@rhel5 linux]$ cat /tmp/build/perf/feature/test-libelf-getphdrnum.make.output
        cc1: warnings being treated as errors
        test-libelf-getphdrnum.c: In function ‘main’:
        test-libelf-getphdrnum.c:7: warning: implicit declaration of function ‘elf_getphdrnum’
        [acme@rhel5 linux]$
      
      And this block will not be compiled:
      
        #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
        static int elf_getphdrnum(Elf *elf, size_t *dst)
        ...
        #endif
      
      So, if elf_getphdrnum() is being defined somewhere, there is a problem
      with the test that is not detecting that function, go fix it.
      Reported-by: NVinson Lee <vlee@twopensource.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-qn459fal6acvcvm50i8zxx9k@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      179f36dd
  2. 17 9月, 2015 1 次提交
    • S
      perf stat: Fix per-pkg event reporting bug · 02d8dabc
      Stephane Eranian 提交于
      Per-pkg events need to be captured once per processor socket. The code
      in check_per_pkg() ensures only one value per processor package is used.
      However there is a problem with this function in case the first CPU of
      the package does not measure anything for the per-pkg event, but other
      CPUs do.
      
      Consider the following:
      
        $ create cgroup FOO; echo $$ >FOO/tasks; taskset -c 1 noploop &
        $ perf stat -a -I 1000 -e intel_cqm/llc_occupancy/ -G FOO sleep 100
          1.00000 <not counted> Bytes intel_cqm/llc_occupancy/  FOO
      
      The reason for this is that CPU0 in the cgroup has nothing running on it.
      Yet check_per_plg() will mark socket0 as processed and no other event
      value will be considered for the socket.
      
      This patch fixes the problem by having check_per_pkg() only consider
      events which actually ran.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1441286620-10117-1-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      02d8dabc
  3. 15 9月, 2015 15 次提交
  4. 13 9月, 2015 1 次提交
    • A
      perf header: Fixup reading of HEADER_NRCPUS feature · caa47047
      Arnaldo Carvalho de Melo 提交于
      The original patch introducing this header wrote the number of CPUs available
      and online in one order and then swapped those values when reading, fix it.
      
      Before:
      
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 4
        # echo 0 > /sys/devices/system/cpu/cpu2/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 3
        # echo 0 > /sys/devices/system/cpu/cpu1/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 2
      
      After the fix, bringing back the CPUs online:
      
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 2
        # nrcpus avail : 4
        # echo 1 > /sys/devices/system/cpu/cpu2/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 3
        # nrcpus avail : 4
        # echo 1 > /sys/devices/system/cpu/cpu1/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 4
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: fbe96f29 ("perf tools: Make perf.data more self-descriptive (v8)")
      Link: http://lkml.kernel.org/r/20150911153323.GP23511@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      caa47047
  5. 03 9月, 2015 1 次提交
    • A
      perf tools: Fix use of wrong event when processing exit events · 53ff6bc3
      Adrian Hunter 提交于
      In a couple of cases the 'comm' member of 'union event' has been used
      instead of the correct member ('fork') when processing exit events.
      
      In the cases where it has been used incorrectly, only the 'pid' and
      'tid' are affected.  The 'pid' value would be correct anyway because it
      is in the same position in 'comm' and 'fork' events, but the 'tid' would
      have been incorrectly assigned from 'ppid'.
      
      However, for exit events, the kernel puts the current task in the 'ppid'
      and 'ttid' which is the same as the exiting task.  That is 'ppid' ==
      'pid' and if the task is not multi-threaded, 'pid' == 'tid' i.e. the
      data goes wrong only when tracing multi-threaded programs.
      
      It is hard to find an example of how this would produce an error in
      practice.  There are 3 occurences of the fix:
      
      1. perf script is only affected if !sample_id_all which only happens on
        old kernels.
      
      2. intel_pt is only affected when decoding without timestamps
         and would probably still decode correctly - the exit event is
         only used to flush out data which anyway gets flushed at the
         end of the session
      
      3. intel_bts also uses the exit event to flush data which
         would probably not cause errors as it would get flushed at
         the end of the session instead
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1439888825-27708-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      53ff6bc3
  6. 02 9月, 2015 3 次提交
  7. 01 9月, 2015 7 次提交
  8. 29 8月, 2015 8 次提交
  9. 28 8月, 2015 1 次提交
    • K
      perf stat: Get correct cpu id for print_aggr · 601083cf
      Kan Liang 提交于
      print_aggr() fails to print per-core/per-socket statistics after commit
      582ec082 ("perf stat: Fix per-socket output bug for uncore events")
      if events have differnt cpus. Because in print_aggr(), aggr_get_id needs
      index (not cpu id) to find core/pkg id. Also, evsel cpu maps should be
      used to get aggregated id.
      
      Here is an example:
      
      Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
      cpumask 0,18)
      
        $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2
      
      Without this patch, it failes to get CPU 18 result.
      
         Performance counter stats for 'CPU(s) 0,18':
      
        S0-C0           1            7526851      cycles
        S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
        S1-C0           0      <not counted>      cycles
        S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/
      
      With this patch, it can get both CPU0 and CPU18 result.
      
         Performance counter stats for 'CPU(s) 0,18':
      
        S0-C0           1            6327768      cycles
        S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
        S1-C0           1             330228      cycles
        S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NStephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Fixes: 582ec082 ("perf stat: Fix per-socket output bug for uncore events")
      Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      601083cf