1. 30 5月, 2018 1 次提交
    • K
      perf parse-events: Handle uncore event aliases in small groups properly · 369b2308
      Kan Liang 提交于
      Perf stat doesn't count the uncore event aliases from the same uncore
      block in a group, for example:
      
        perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000
        #           time             counts unit events
             1.000447342      <not counted>      unc_m_cas_count.all
             1.000447342      <not counted>      unc_m_clockticks
             2.000740654      <not counted>      unc_m_cas_count.all
             2.000740654      <not counted>      unc_m_clockticks
      
      The output is very misleading. It gives a wrong impression that the
      uncore event doesn't work.
      
      An uncore block could be composed by several PMUs. An uncore event alias
      is a joint name which means the same event runs on all PMUs of a block.
      Perf doesn't support mixed events from different PMUs in the same group.
      It is wrong to put uncore event aliases in a big group.
      
      The right way is to split the big group into multiple small groups which
      only include the events from the same PMU.
      
      Only uncore event aliases from the same uncore block should be specially
      handled here. It doesn't make sense to mix the uncore events with other
      uncore events from different blocks or even core events in a group.
      
      With the patch:
        #           time             counts unit events
           1.001557653            140,833      unc_m_cas_count.all
           1.001557653      1,330,231,332      unc_m_clockticks
           2.002709483             85,007      unc_m_cas_count.all
           2.002709483      1,429,494,563      unc_m_clockticks
      Reported-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      369b2308
  2. 11 5月, 2018 5 次提交
    • A
      perf tools: Add missing newline when parsing empty BPF proggie · c23080a6
      Arnaldo Carvalho de Melo 提交于
      This is not specific to BPF but was found when parsing a .c BPF proggie
      that while valid, had no events attached to tracepoints, kprobes, etc:
      
      Very minimal file that perf's BPF code can compile:
      
        # cat empty.c
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = LINUX_VERSION_CODE;
        #
      
      Before this patch:
      
        # perf trace -e empty.c
        WARNING: event parser found nothinginvalid or unsupported event: 'empty.c'
        Run 'perf list' for a list of valid events
      
         Usage: perf trace [<options>] [<command>]
            or: perf trace [<options>] -- <command> [<options>]
            or: perf trace record [<options>] [<command>]
            or: perf trace record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event/syscall selector. use 'perf list' to list available events
          #
      
      After:
      
        # perf trace -e empty.c
        WARNING: event parser found nothing
        invalid or unsupported event: 'empty.c'
        Run 'perf list' for a list of valid events
      
         Usage: perf trace [<options>] [<command>]
            or: perf trace [<options>] -- <command> [<options>]
            or: perf trace record [<options>] [<command>]
            or: perf trace record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event/syscall selector. use 'perf list' to list available events
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-8ysughiz00h6mjpcot04qyjj@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c23080a6
    • L
      perf cs-etm: Remove redundant space · 3a088799
      Leo Yan 提交于
      There have two spaces ahead function name cs_etm__set_pid_tid_cpu(), so
      remove one space and correct indentation.
      Signed-off-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NMathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1525924920-4381-2-git-send-email-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3a088799
    • L
      perf cs-etm: Support unknown_thread in cs_etm_auxtrace · 46d53620
      Leo Yan 提交于
      CoreSight doesn't allocate thread structure for unknown_thread in ETM
      auxtrace, so unknown_thread is NULL pointer.  If the perf data doesn't
      contain valid tid and then cs_etm__mem_access() uses unknown_thread
      instead as thread handler, this results in a segmentation fault when
      thread__find_addr_map() accesses the thread handler.
      
      This commit creates a new thread data which is used by unknown_thread, so
      CoreSight tracing can roll back to use unknown_thread if perf data
      doesn't include valid thread info.  This commit also releases thread
      data for initialization failure case and for normal auxtrace free flow.
      Signed-off-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NMathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1525924920-4381-1-git-send-email-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      46d53620
    • J
      perf annotate: Display all available events on --stdio · 04d2600a
      Jin Yao 提交于
      When we perform the following command lines:
      
        $ perf record -e "{cycles,branches}" ./div
        $ perf annotate main --stdio
      
      The output shows only the first event, "cycles" and the displaying
      format is not correct.
      
         Percent         |      Source code & Disassembly of div for cycles (44550 samples)
        -----------------------------------------------------------------------------------
                         :
                         :
                         :
                         :            Disassembly of section .text:
                         :
                         :            00000000004004b0 <main>:
                         :            main():
                         :
                         :                    return i;
                         :            }
                         :
                         :            int main(void)
                         :            {
            0.00 :   4004b0:       push   %rbx
                         :                    int i;
                         :                    int flag;
                         :                    volatile double x = 1212121212, y = 121212;
                         :
                         :                    s_randseed = time(0);
            0.00 :   4004b1:       xor    %edi,%edi
                         :                    srand(s_randseed);
            0.00 :   4004b3:       mov    $0x77359400,%ebx
                         :
                         :                    return i;
                         :            }
      
      The issue is that the value of the 'nr_percent' variable is hardcoded to
      1.  This patch fixes it.
      
      With this patch, the output is:
      
         Percent         |      Source code & Disassembly of div for cycles (44550 samples)
        -----------------------------------------------------------------------------------
                         :
                         :
                         :
                         :            Disassembly of section .text:
                         :
                         :            00000000004004b0 <main>:
                         :            main():
                         :
                         :                    return i;
                         :            }
                         :
                         :            int main(void)
                         :            {
            0.00    0.00 :   4004b0:       push   %rbx
                         :                    int i;
                         :                    int flag;
                         :                    volatile double x = 1212121212, y = 121212;
                         :
                         :                    s_randseed = time(0);
            0.00    0.00 :   4004b1:       xor    %edi,%edi
                         :                    srand(s_randseed);
            0.00    0.00 :   4004b3:       mov    $0x77359400,%ebx
                         :
                         :                    return i;
                         :            }
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: f681d593 ("perf annotate: Remove disasm__calc_percent() from disasm_line__print()")
      Link: http://lkml.kernel.org/r/1525881435-4092-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      04d2600a
    • T
      perf test: "probe libc's inet_pton" fails on s390 due to missing inline · f8207b98
      Thomas Richter 提交于
      perf test "probe libc's inet_pton & backtrace it with ping" fails on
      4.17.0rc3 on s/390. It turned out that function __inet_pton is reported
      as inline:
      
        [root@s8360047 perf]# ./perf script -i /tmp/perf.data.111
        ping 12457 [000]  1584.478959: probe_libc:inet_pton: (3ffb5a347e8)
                          1347e8 __inet_pton (inlined)
                           f19d7 gaih_inet.constprop.5 (/usr/lib64/libc-2.24.so)
                           f4c3f __GI_getaddrinfo (inlined)
                            410b main (/usr/bin/ping)
      
      Allow __inet_pton listed as inline.
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180503065837.71043-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f8207b98
  3. 08 5月, 2018 2 次提交
  4. 07 5月, 2018 1 次提交
  5. 25 4月, 2018 4 次提交
    • K
      perf stat: Fix duplicate PMU name for interval print · 80ee8c58
      Kan Liang 提交于
      PMU name is printed repeatedly for interval print, for example:
      
        perf stat --no-merge -e 'unc_m_clockticks' -a -I 1000
        #           time             counts unit events
           1.001053069        243,702,144      unc_m_clockticks [uncore_imc_4]
           1.001053069        244,268,304      unc_m_clockticks [uncore_imc_2]
           1.001053069        244,427,386      unc_m_clockticks [uncore_imc_0]
           1.001053069        244,583,760      unc_m_clockticks [uncore_imc_5]
           1.001053069        244,738,971      unc_m_clockticks [uncore_imc_3]
           1.001053069        244,880,309      unc_m_clockticks [uncore_imc_1]
           2.002024821        240,818,200      unc_m_clockticks [uncore_imc_4] [uncore_imc_4]
           2.002024821        240,767,812      unc_m_clockticks [uncore_imc_2] [uncore_imc_2]
           2.002024821        240,764,215      unc_m_clockticks [uncore_imc_0] [uncore_imc_0]
           2.002024821        240,759,504      unc_m_clockticks [uncore_imc_5] [uncore_imc_5]
           2.002024821        240,755,992      unc_m_clockticks [uncore_imc_3] [uncore_imc_3]
           2.002024821        240,750,403      unc_m_clockticks [uncore_imc_1] [uncore_imc_1]
      
      For each print, the PMU name is unconditionally appended to the
      counter->name.
      
      Need to check the counter->name first. If the PMU name is already
      appended, do nothing.
      
      Committer notes:
      
      Add and use perf_evsel->uniquified_name bool instead of doing the more
      expensive strstr(event->name, pmu->name).
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 8c5421c0 ("perf pmu: Display pmu name when printing unmerged events in stat")
      Link: http://lkml.kernel.org/r/1524594014-79243-5-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      80ee8c58
    • K
      perf evsel: Only fall back group read for leader · 121f325f
      Kan Liang 提交于
      Perf doesn't support mixed events from different PMUs (except software
      event) in a group. The perf stat should output <not counted>/<not
      supported> for all events, but it doesn't. For example,
      
        perf stat -e '{cycles,uncore_imc_5/umask=0xF,event=0x4/,instructions}'
             <not counted>      cycles
             <not supported>    uncore_imc_5/umask=0xF,event=0x4/
                 1,024,300      instructions
      
      If perf fails to open an event, it doesn't error out directly. It will
      disable some features and retry, until the event is opened or all
      features are disabled. The disabled features will not be re-enabled. The
      group read is one of these features.
      
      For the example as above, the IMC event and the leader event "cycles"
      are from different PMUs. Opening the IMC event must fail. The group read
      feature must be disabled for IMC event and the followed event
      "instructions". The "instructions" event has the same PMU as the leader
      "cycles". It can be opened successfully. Since the group read feature
      has been disabled, the "instructions" event will be read as a single
      event, which definitely has a value.
      
      The group read fallback is still useful for the case which kernel
      doesn't support group read. It is good enough to be handled only by the
      leader.
      
      For the fallback request from members, it must be caused by an error.
      The fallback only breaks the semantics of group.  Limit the group read
      fallback only for the leader.
      
      Committer testing:
      
      On a broadwell t450s notebook:
      
      Before:
      
        # perf stat -e '{cycles,unc_cbo_cache_lookup.read_i,instructions}' sleep 1
      
        Performance counter stats for 'sleep 1':
      
           <not counted>      cycles
         <not supported>      unc_cbo_cache_lookup.read_i
                 818,206      instructions
      
             1.003170887 seconds time elapsed
      
        Some events weren't counted. Try disabling the NMI watchdog:
      	echo 0 > /proc/sys/kernel/nmi_watchdog
      	perf stat ...
      	echo 1 > /proc/sys/kernel/nmi_watchdog
      
      After:
      
        # perf stat -e '{cycles,unc_cbo_cache_lookup.read_i,instructions}' sleep 1
      
        Performance counter stats for 'sleep 1':
      
           <not counted>      cycles
         <not supported>      unc_cbo_cache_lookup.read_i
           <not counted>      instructions
      
             1.001380511 seconds time elapsed
      
        Some events weren't counted. Try disabling the NMI watchdog:
      	echo 0 > /proc/sys/kernel/nmi_watchdog
      	perf stat ...
      	echo 1 > /proc/sys/kernel/nmi_watchdog
        #
      Reported-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes:  82bf311e ("perf stat: Use group read for event groups")
      Link: http://lkml.kernel.org/r/1524594014-79243-3-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      121f325f
    • K
      perf stat: Print out hint for mixed PMU group error · 30060eae
      Kan Liang 提交于
      Perf doesn't support mixed events from different PMUs (except software
      event) in a group. For this case, only "<not counted>" or "<not
      supported>" are printed out. There is no hint which guides users to fix
      the issue.
      
      Checking the PMU type of events to determine if they are from the same
      PMU. There may be false alarm for the checking. E.g. the core PMU has
      different PMU type. But it should not happen often.
      
      The false alarm can also be tolerated, because:
      
      - It only happens on error path.
      - It just provides a possible solution for the issue.
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1524594014-79243-2-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      30060eae
    • K
      perf pmu: Fix core PMU alias list for X86 platform · 292c34c1
      Kan Liang 提交于
      When counting uncore event with alias, core event is mistakenly
      involved, for example:
      
        perf stat --no-merge -e "unc_m_cas_count.all" -C0  sleep 1
      
        Performance counter stats for 'CPU(s) 0':
      
                       0      unc_m_cas_count.all [uncore_imc_4]
                       0      unc_m_cas_count.all [uncore_imc_2]
                       0      unc_m_cas_count.all [uncore_imc_0]
                 153,640      unc_m_cas_count.all [cpu]
                       0      unc_m_cas_count.all [uncore_imc_5]
                  25,026      unc_m_cas_count.all [uncore_imc_3]
                       0      unc_m_cas_count.all [uncore_imc_1]
      
             1.001447890 seconds time elapsed
      
      The reason is that current implementation doesn't check PMU name of a
      event when adding its alias into the alias list for core PMU. The
      uncore event aliases are mistakenly added.
      
      This bug was introduced in:
        commit 14b22ae0 ("perf pmu: Add helper function is_pmu_core to
        detect PMU CORE devices")
      
      Checking the PMU name for all PMUs on X86 and other architectures except
      ARM.
      There is no behavior change for ARM.
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 14b22ae0 ("perf pmu: Add helper function is_pmu_core to detect PMU CORE devices")
      Link: http://lkml.kernel.org/r/1524594014-79243-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      292c34c1
  6. 23 4月, 2018 8 次提交
  7. 19 4月, 2018 8 次提交
  8. 17 4月, 2018 4 次提交
    • T
      perf list: Add s390 support for detailed/verbose PMU event description · 038586c3
      Thomas Richter 提交于
      'perf list' with flags -d and -v print a description (-d) or a very
      verbose explanation (-v) of CPU specific counter events.  These
      descriptions are provided with the json files in directory
      pmu-events/arch/s390/*.json.
      
      Display of these descriptions on s390 requires the corresponding json
      files.
      
      On s390 this does not work because function is_pmu_core() does not
      detect the s390 directory name where the CPU specific events are listed.
      On x86 it is:
      
        /sys/bus/event_source/devices/cpu
      
      whereas on s390 it is:
      
        /sys/bus/event_source/devices/cpum_cf
        /sys/bus/event_source/devices/cpum_sf
      
      Fix this by adding s390 directory name testing to function
      is_pmu_core(). This is the same approach as taken for the ARM platform.
      
      Output before:
      
      [root@s35lp76 perf]# ./perf list -d pmu
      List of pre-defined events (to be used in -e):
      
        cpum_cf/AES_BLOCKED_CYCLES/      [Kernel PMU event]
        cpum_cf/AES_BLOCKED_FUNCTIONS/   [Kernel PMU event]
        cpum_cf/AES_CYCLES/              [Kernel PMU event]
        cpum_cf/AES_FUNCTIONS/           [Kernel PMU event]
        ....
        cpum_cf/TX_NC_TEND/              [Kernel PMU event]
        cpum_cf/VX_BCD_EXECUTION_SLOTS/  [Kernel PMU event]
        cpum_sf/SF_CYCLES_BASIC/         [Kernel PMU event]
      
      Output after:
      
      [root@s35lp76 perf]# ./perf list -d pmu
      List of pre-defined events (to be used in -e):
      
        cpum_cf/AES_BLOCKED_CYCLES/      [Kernel PMU event]
        cpum_cf/AES_BLOCKED_FUNCTIONS/   [Kernel PMU event]
        cpum_cf/AES_CYCLES/              [Kernel PMU event]
        cpum_cf/AES_FUNCTIONS/           [Kernel PMU event]
        ....
        cpum_cf/TX_NC_TEND/              [Kernel PMU event]
        cpum_cf/VX_BCD_EXECUTION_SLOTS/  [Kernel PMU event]
        cpum_sf/SF_CYCLES_BASIC/         [Kernel PMU event]
      
      3906:
        bcd_dfp_execution_slots
             [BCD DFP Execution Slots]
        decimal_instructions
             [Decimal Instructions]
        dtlb2_gpage_writes
             [DTLB2 GPAGE Writes]
        dtlb2_hpage_writes
             [DTLB2 HPAGE Writes]
        dtlb2_misses
             [DTLB2 Misses]
        dtlb2_writes
             [DTLB2 Writes]
        itlb2_misses
             [ITLB2 Misses]
        itlb2_writes
             [ITLB2 Writes]
        l1c_tlb2_misses
             [L1C TLB2 Misses]
        .....
      
      cfvn 3:
        cpu_cycles
             [CPU Cycles]
        instructions
             [Instructions]
        l1d_dir_writes
             [L1D Directory Writes]
        l1d_penalty_cycles
             [L1D Penalty Cycles]
        l1i_dir_writes
             [L1I Directory Writes]
        l1i_penalty_cycles
             [L1I Penalty Cycles]
        problem_state_cpu_cycles
             [Problem State CPU Cycles]
        problem_state_instructions
             [Problem State Instructions]
        ....
      
      csvn generic:
        aes_blocked_cycles
             [AES Blocked Cycles]
        aes_blocked_functions
             [AES Blocked Functions]
        aes_cycles
             [AES Cycles]
        aes_functions
             [AES Functions]
        dea_blocked_cycles
             [DEA Blocked Cycles]
        dea_blocked_functions
             [DEA Blocked Functions]
        ....
      Signed-off-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Reviewed-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180416132314.33249-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      038586c3
    • A
      perf script: Extend misc field decoding with switch out event type · bf30cc18
      Alexey Budankov 提交于
      Append 'p' sign to 'S' tag designating the type of context switch out event so
      'Sp' means preemption context switch. Documentation is extended to cover
      new presentation changes.
      
        $ perf script --show-switch-events -F +misc -I -i perf.data:
      
                hdparm 4073 [004] U  762.198265:     380194 cycles:ppp:      7faf727f5a23 strchr (/usr/lib64/ld-2.26.so)
                hdparm 4073 [004] K  762.198366:     441572 cycles:ppp:  ffffffffb9218435 alloc_set_pte (/lib/modules/4.16.0-rc6+/build/vmlinux)
                hdparm 4073 [004] S  762.198391: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    0/0
               swapper    0 [004]    762.198392: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 4073/4073
               swapper    0 [004] Sp 762.198477: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 4073/4073
                hdparm 4073 [004]    762.198478: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    0/0
               swapper    0 [007] K  762.198514:    2303073 cycles:ppp:  ffffffffb98b0c66 intel_idle (/lib/modules/4.16.0-rc6+/build/vmlinux)
               swapper    0 [007] Sp 762.198561: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 1134/1134
        kworker/u16:18 1134 [007]    762.198562: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    0/0
        kworker/u16:18 1134 [007] S  762.198567: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    0/0
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/5fc65ce7-8ca5-53ae-8858-8ddd27290575@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bf30cc18
    • A
      perf report: Extend raw dump (-D) out with switch out event type · b3f35b5d
      Alexey Budankov 提交于
      Print additional 'preempt' tag for PERF_RECORD_SWITCH[_CPU_WIDE] OUT records when
      event header misc field contains PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit set
      designating preemption context switch out event:
      
      tools/perf/perf report -D -i perf.data | grep _SWITCH
      
      0 768361415226 0x27f076 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     8/8
      4 768362216813 0x28f45e [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
      4 768362217824 0x28f486 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:  4073/4073
      0 768362414027 0x27f0ce [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:     8/8
      0 768362414367 0x27f0f6 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/6f5aebb9-b96c-f304-f08f-8f046d38de4f@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b3f35b5d
    • I
      tools/headers: Synchronize kernel ABI headers, v4.17-rc1 · e2f73a18
      Ingo Molnar 提交于
      Sync the following tooling headers with the latest kernel version:
      
        tools/arch/arm/include/uapi/asm/kvm.h
          - New ABI: KVM_REG_ARM_*
      
        tools/arch/x86/include/asm/required-features.h
          - Removal of NEED_LA57 dependency
      
        tools/arch/x86/include/uapi/asm/kvm.h
          - New KVM ABI: KVM_SYNC_X86_*
      
        tools/include/uapi/asm-generic/mman-common.h
          - New ABI: MAP_FIXED_NOREPLACE flag
      
        tools/include/uapi/linux/bpf.h
          - New ABI: BPF_F_SEQ_NUMBER functions
      
        tools/include/uapi/linux/if_link.h
          - New ABI: IFLA tun and rmnet support
      
        tools/include/uapi/linux/kvm.h
          - New ABI: hyperv eventfd and CONN_ID_MASK support plus header cleanups
      
        tools/include/uapi/sound/asound.h
          - New ABI: SNDRV_PCM_FORMAT_FIRST PCM format specifier
      
        tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
          - The x86 system call table description changed due to the ptregs changes and the renames, in:
      
      	d5a00528: syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()
      	5ac9efa3: syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention
      	ebeb8c82: syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
      
      Also fix the x86 syscall table warning:
      
        -Warning: Kernel ABI header at 'tools/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        +Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
      
      None of these changes impact existing tooling code, so we only have to copy the kernel version.
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Brian Robbins <brianrob@microsoft.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com> <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Li Zhijian <lizhijian@cn.fujitsu.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Takuya Yamamoto <tkydevel@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: http://lkml.kernel.org/r/20180416064024.ofjtrz5yuu3ykhvl@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e2f73a18
  9. 13 4月, 2018 5 次提交
    • A
      perf annotate: Handle variables in 'sub', 'or' and many other instructions · b0d5c81e
      Arnaldo Carvalho de Melo 提交于
      Just like is done for 'mov' and others that can have as source or
      targets variables resolved by objdump, to make them more compact:
      
      -               orb    $0x4,0x224d71(%rip)        # 226ca4 <_rtld_global+0xca4>
      +               orb    $0x4,_rtld_global+0xca4
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-efex7746id4w4wa03nqxvh3m@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b0d5c81e
    • A
      perf annotate: Allow setting the offset level in .perfconfig · 43c40231
      Arnaldo Carvalho de Melo 提交于
      The default is 1 (jump_target):
      
        # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
        Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574
        _raw_spin_lock_irqsave() /proc/kcore
          0.26        nop
          4.61        push   %rbx
         19.33        pushfq
          7.97        pop    %rax
          0.32        nop
          0.06        mov    %rax,%rbx
         14.63        cli
          0.06        nop
                      xor    %eax,%eax
                      mov    $0x1,%edx
         49.94        lock   cmpxchg %edx,(%rdi)
          0.16        test   %eax,%eax
                    ↓ jne    2b
          2.66        mov    %rbx,%rax
                      pop    %rbx
                    ← retq
                2b:   mov    %eax,%esi
                    → callq  *ffffffffb30eaed0
                      mov    %rbx,%rax
                      pop    %rbx
                    ← retq
        #
      
      But one can ask for showing offsets for call instructions by setting
      this:
      
        # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
        Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574
        _raw_spin_lock_irqsave() /proc/kcore
          0.26        nop
          4.61        push   %rbx
         19.33        pushfq
          7.97        pop    %rax
          0.32        nop
          0.06        mov    %rax,%rbx
         14.63        cli
          0.06        nop
                      xor    %eax,%eax
                      mov    $0x1,%edx
         49.94        lock   cmpxchg %edx,(%rdi)
          0.16        test   %eax,%eax
                    ↓ jne    2b
          2.66        mov    %rbx,%rax
                      pop    %rbx
                    ← retq
                2b:   mov    %eax,%esi
                2d: → callq  *ffffffffb30eaed0
                      mov    %rbx,%rax
                      pop    %rbx
                    ← retq
        #
      
      Or using a big value to ask for all offsets to be shown:
      
        # cat ~/.perfconfig
        [annotate]
      
      	offset_level = 100
      
      	hide_src_code = true
        # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
        Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574
        _raw_spin_lock_irqsave() /proc/kcore
          0.26   0:   nop
          4.61   5:   push   %rbx
         19.33   6:   pushfq
          7.97   7:   pop    %rax
          0.32   8:   nop
          0.06   d:   mov    %rax,%rbx
         14.63  10:   cli
          0.06  11:   nop
                17:   xor    %eax,%eax
                19:   mov    $0x1,%edx
         49.94  1e:   lock   cmpxchg %edx,(%rdi)
          0.16  22:   test   %eax,%eax
                24: ↓ jne    2b
          2.66  26:   mov    %rbx,%rax
                29:   pop    %rbx
                2a: ← retq
                2b:   mov    %eax,%esi
                2d: → callq  *ffffffffb30eaed0
                32:   mov    %rbx,%rax
                35:   pop    %rbx
                36: ← retq
         #
      
      This also affects the TUI, i.e. the default 'perf annotate' and 'perf
      top/report' -> A hotkey -> annotate interfaces, when slang-devel is present
      in the build, i.e.:
      
        # perf version --build-options | grep slang
                    libslang: [ on  ]  # HAVE_SLANG_SUPPORT
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-venm6x5zrt40eu8hxdsmqxz6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      43c40231
    • A
      perf report: Fix switching to another perf.data file · 7b366142
      Arnaldo Carvalho de Melo 提交于
      In the TUI the 's' hotkey can be used to switch to another perf.data
      file in the current directory, but that got broken in Fixes:
      b01141f4 ("perf annotate: Initialize the priv are in symbol__new()"),
      that would show this once another file was chosen:
      
          ┌─Fatal Error─────────────────────────────────────┐
          │Annotation needs to be init before symbol__init()│
          │                                                 │
          │                                                 │
          │Press any key...                                 │
          └─────────────────────────────────────────────────┘
      
      Fix it by just silently bailing out if symbol__annotation_init() was already
      called, just like is done with symbol__init(), i.e. they are done just once at
      session start, not when switching to a new perf.data file.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: b01141f4 ("perf annotate: Initialize the priv are in symbol__new()")
      Link: https://lkml.kernel.org/n/tip-ogppdtpzfax7y1h6gjdv5s6u@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7b366142
    • T
      perf record: Change warning for missing sysfs entry to debug · 4f75f1cb
      Thomas Richter 提交于
      Using perf on 4.16.0 kernel on s390 shows this warning:
      
         failed: can't open node sysfs data
      
      each time I run command perf record ... for example:
      
        [root@s35lp76 perf]# ./perf record -e rB0000 -- sleep 1
        [ perf record: Woken up 1 times to write data ]
        failed: can't open node sysfs data
        [ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ]
        [root@s35lp76 perf]#
      
      It turns out commit e2091ced ("perf tools: Add MEM_TOPOLOGY feature
      to perf data file") tries to open directory named /sys/devices/system/node/
      which does not exist on s390.
      
      This is the call stack:
       __cmd_record
       +---> perf_session__write_header
             +---> perf_header__adds_write
                   +---> do_write_feat
      	           +---> write_mem_topology
      		         +---> build_mem_topology
      			       prints warning
      
      The issue starts in do_write_feat() which unconditionally loops over all
      features and now includes HEADER_MEM_TOPOLOGY and calls write_mem_topology().
      
      Function record__init_features() at the beginning of __cmd_record() sets
      all features and then turns off some of them.
      
      Fix this by changing the warning to a level 2 debug output statement.
      
      So it is only shown when debug level 2 or higher is set.
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180412133246.92801-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4f75f1cb
    • S
      perf tests: Disable breakpoint accounting test for powerpc · 4b163ca3
      Sandipan Das 提交于
      We disable this test as instruction breakpoints (HW_BREAKPOINT_X) are
      not available for powerpc.
      
      Before applying patch:
      
        21: Breakpoint accounting                                 :
        --- start ---
        test child forked, pid 3635
        failed opening event 0
        failed opening event 0
        watchpoints count 1, breakpoints count 0, has_ioctl 1, share 0
        test child finished with -2
        ---- end ----
        Breakpoint accounting: Skip
      
      After applying patch:
      
        21: Breakpoint accounting                                 : Disabled
      Signed-off-by: NSandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20180412162140.2992-1-sandipan@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4b163ca3
  10. 12 4月, 2018 2 次提交