1. 29 Nov, 2019 8 commits
  2. 28 Nov, 2019 7 commits
    • perf script: Fix invalid LBR/binary mismatch error · 5172672d
      Committed by Adrian Hunter
      The 'len' returned by grab_bb() includes an extra MAXINSN bytes to allow
      for the last instruction, so the final 'offs' will not be 'len'.
      Fix the error condition logic accordingly.
      
      Before:
      
        $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
        [ perf record: Woken up 19 times to write data ]
        [ perf record: Captured and wrote 2.274 MB perf.data ]
        $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
                  grep 13759 [002]  8091.310257:       1862                                        instructions:uH:      5641d58069eb bmexec+0x86b (/bin/grep)
              bmexec+2485:
              00005641d5806b35                        jnz 0x5641d5806bd0              # MISPRED
              00005641d5806bd0                        movzxb  (%r13,%rdx,1), %eax
              00005641d5806bd6                        add %rdi, %rax
              00005641d5806bd9                        movzxb  -0x1(%rax), %edx
              00005641d5806bdd                        cmp %rax, %r14
              00005641d5806be0                        jnb 0x5641d58069c0              # MISPRED
              mismatch of LBR data and executable
              00005641d58069c0                        movzxb  (%r13,%rdx,1), %edi
      
      After:
      
        $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
                  grep 13759 [002]  8091.310257:       1862                                        instructions:uH:      5641d58069eb bmexec+0x86b (/bin/grep)
              bmexec+2485:
              00005641d5806b35                        jnz 0x5641d5806bd0              # MISPRED
              00005641d5806bd0                        movzxb  (%r13,%rdx,1), %eax
              00005641d5806bd6                        add %rdi, %rax
              00005641d5806bd9                        movzxb  -0x1(%rax), %edx
              00005641d5806bdd                        cmp %rax, %r14
              00005641d5806be0                        jnb 0x5641d58069c0              # MISPRED
              00005641d58069c0                        movzxb  (%r13,%rdx,1), %edi
              00005641d58069c6                        add %rax, %rdi
      
      Fixes: e98df280 ("perf script brstackinsn: Fix recovery from LBR/binary mismatch")
      Reported-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20191127095631.15663-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
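
The off-by-MAXINSN reasoning in the commit above can be illustrated with a small model. This is only a sketch and not the perf source: the value of `MAXINSN`, the helpers `walk_block()` and `block_mismatch()`, and the `insn_len` table that stands in for a real instruction decoder are all assumptions made for illustration. It shows why, when the buffer length includes slack for the last instruction, a clean walk stops at `end - start` and never at `len`.

```c
#include <stddef.h>
#include <stdint.h>

#define MAXINSN 15 /* assumed: longest possible x86 instruction, in bytes */

/*
 * Hypothetical model of walking a basic block [start, end], where 'end' is
 * the address of the final branch instruction and 'len' is the number of
 * bytes read: (end - start) plus MAXINSN bytes of slack for that branch.
 * insn_len[] stands in for a real instruction decoder.
 * Returns the offset at which the walk stopped.
 */
static size_t walk_block(uint64_t start, uint64_t end, size_t len,
			 const unsigned char *insn_len, size_t ninsn)
{
	size_t off = 0, i = 0;

	while (off < len && i < ninsn) {
		if (start + off == end)
			break;		/* landed on the branch: clean finish */
		off += insn_len[i++];
	}
	return off;
}

/* A mismatch means the walk did not land exactly on the branch address. */
static int block_mismatch(uint64_t start, uint64_t end, size_t off)
{
	return off != (size_t)(end - start);
}
```

With the addresses from the sample output (block starting at 0x5641d5806bd0, branch at 0x5641d5806be0, instruction lengths 6, 3, 4 and 3 bytes), the walk stops at offset 16, which equals `end - start` but not `len`. Comparing the final offset against `len` would therefore flag a mismatch for a block that decoded correctly.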
    • perf script: Fix brstackinsn for AUXTRACE · 0cd032d3
      Committed by Adrian Hunter
      brstackinsn must be allowed to be set by the user when AUX area data has
      been captured because, in that case, the branch stack might be
      synthesized on the fly. This fixes the following error:
      
      Before:
      
        $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
        [ perf record: Woken up 19 times to write data ]
        [ perf record: Captured and wrote 2.274 MB perf.data ]
        $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
        Display of branch stack assembler requested, but non all-branch filter set
        Hint: run 'perf record -b ...'
      
      After:
      
        $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
        [ perf record: Woken up 19 times to write data ]
        [ perf record: Captured and wrote 2.274 MB perf.data ]
        $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
                  grep 13759 [002]  8091.310257:       1862                                        instructions:uH:      5641d58069eb bmexec+0x86b (/bin/grep)
              bmexec+2485:
              00005641d5806b35                        jnz 0x5641d5806bd0              # MISPRED
              00005641d5806bd0                        movzxb  (%r13,%rdx,1), %eax
              00005641d5806bd6                        add %rdi, %rax
              00005641d5806bd9                        movzxb  -0x1(%rax), %edx
              00005641d5806bdd                        cmp %rax, %r14
              00005641d5806be0                        jnb 0x5641d58069c0              # MISPRED
              mismatch of LBR data and executable
              00005641d58069c0                        movzxb  (%r13,%rdx,1), %edi
      
      Fixes: 48d02a1d ("perf script: Add 'brstackinsn' for branch stacks")
      Reported-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20191127095322.15417-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf affinity: Add infrastructure to save/restore affinity · 267ed5d8
      Committed by Andi Kleen
      The kernel perf subsystem has to IPI to the target CPU for many
      operations. On systems with many CPUs and when managing many events the
      overhead can be dominated by lots of IPIs.
      
      An alternative is to set up CPU affinity in the perf tool, then set up
      all the events for that CPU, and then move on to the next CPU.
      
      Add some affinity management infrastructure to enable such a model.
      It is used in follow-on patches.
      
      Committer notes:
      
      Use zfree() in some places, add missing stdbool.h header, some minor
      coding style changes.
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lore.kernel.org/lkml/20191121001522.180827-3-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
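
The per-CPU model described in the commit above (pin to one CPU, do all the work for that CPU, move on, then restore) can be sketched with the glibc affinity calls. This is a minimal illustration and not the infrastructure the commit adds: `struct saved_affinity` and every function name below are hypothetical, and the fixed-size `cpu_set_t` assumes a system with at most `CPU_SETSIZE` CPUs.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for cpu_set_t, CPU_* macros and sched_setaffinity() */
#endif
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper type: remembers the mask to restore afterwards. */
struct saved_affinity {
	cpu_set_t orig;
};

/* Save the calling thread's current CPU affinity. Returns 0 on success. */
static int saved_affinity_save(struct saved_affinity *s)
{
	return sched_getaffinity(0, sizeof(s->orig), &s->orig);
}

/* Pin the calling thread to a single CPU. Returns 0 on success. */
static int saved_affinity_pin(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/* Put the original affinity mask back. Returns 0 on success. */
static int saved_affinity_restore(const struct saved_affinity *s)
{
	return sched_setaffinity(0, sizeof(s->orig), &s->orig);
}

/*
 * Visit every CPU in the original mask, running fn (which may be NULL)
 * while pinned there, so that per-CPU work is issued locally.
 * Returns the number of CPUs visited, or -1 on save/restore failure.
 */
static int for_each_allowed_cpu(void (*fn)(int cpu, void *arg), void *arg)
{
	struct saved_affinity s;
	int cpu, visited = 0;

	if (saved_affinity_save(&s))
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &s.orig))
			continue;
		if (saved_affinity_pin(cpu))
			continue;	/* e.g. the CPU went offline */
		if (fn)
			fn(cpu, arg);
		visited++;
	}
	if (saved_affinity_restore(&s))
		return -1;
	return visited;
}
```

The point of the design is that all operations for one CPU are batched while the thread is already running there, rather than switching targets per event.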
    • perf pmu: Use file system cache to optimize sysfs access · d9664582
      Committed by Andi Kleen
      pmu.c does a lot of redundant /sys accesses while parsing aliases
      and probing for PMUs. On large systems with a lot of PMUs this
      can get expensive (>2s):
      
        % time     seconds  usecs/call     calls    errors syscall
        ------ ----------- ----------- --------- --------- ----------------
         27.25    1.227847           8    160888     16976 openat
         26.42    1.190481           7    164224    164077 stat
      
      Add a cache to remember if specific file names exist or don't
      exist, which eliminates most of this overhead.
      
      Also convert some stat() calls to the slightly cheaper access().
      
      Resulting in:
      
          0.18    0.004166           2      1851       305 open
          0.08    0.001970           2       829       622 access
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lore.kernel.org/lkml/20191121001522.180827-2-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
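
The idea of remembering whether a file name exists can be sketched as a small hash table in front of access(). This is an illustrative sketch only, not the cache the commit adds: `cached_file_exists()`, the bucket count, the hash function and the `cache_misses` counter are all assumptions. It caches negative results too, since most of the calls in the trace above are failed lookups.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CACHE_BUCKETS 1024 /* assumed size, chosen for illustration */

struct cache_entry {
	struct cache_entry *next;
	bool exists;
	char name[];		/* path, stored inline after the header */
};

static struct cache_entry *cache[CACHE_BUCKETS];
static unsigned long cache_misses;	/* number of real access() calls */

/* Simple string hash (djb2 style), reduced to a bucket index. */
static unsigned int hash_name(const char *s)
{
	unsigned int h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h % CACHE_BUCKETS;
}

/* Return whether 'name' exists, asking the kernel only on first lookup. */
static bool cached_file_exists(const char *name)
{
	unsigned int b = hash_name(name);
	struct cache_entry *e;
	bool exists;

	for (e = cache[b]; e; e = e->next)
		if (!strcmp(e->name, name))
			return e->exists;	/* hit: no syscall */

	exists = access(name, F_OK) == 0;
	cache_misses++;

	e = malloc(sizeof(*e) + strlen(name) + 1);
	if (e) {			/* on allocation failure, just skip caching */
		e->exists = exists;
		strcpy(e->name, name);
		e->next = cache[b];
		cache[b] = e;
	}
	return exists;
}
```

Such a cache is only safe when the set of files is not expected to change during the run, which is a reasonable assumption for a short-lived tool scanning /sys.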
    • perf regs: Make perf_reg_name() return "unknown" instead of NULL · 5b596e0f
      Committed by Arnaldo Carvalho de Melo
      Return "unknown" rather than NULL so that the build does not break on
      arches where this is not wired up and all the other features remain
      available. When this specific routine is used on such an arch, the
      "unknown" string should point the user/developer to the need to wire it
      up for that particular hardware architecture.
      
      Detected in a containerized Debian mipsel cross-build environment, where
      it shows up as:
      
        In file included from /usr/mipsel-linux-gnu/include/stdio.h:867,
                         from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
                         from util/session.c:13:
        In function 'printf',
            inlined from 'regs_dump__printf' at util/session.c:1103:3,
            inlined from 'regs__printf' at util/session.c:1131:2:
        /usr/mipsel-linux-gnu/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
          107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      cross compiler details:
      
        mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909
      
      Also on mips64:
      
        In file included from /usr/mips64-linux-gnuabi64/include/stdio.h:867,
                         from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
                         from util/session.c:13:
        In function 'printf',
            inlined from 'regs_dump__printf' at util/session.c:1103:3,
            inlined from 'regs__printf' at util/session.c:1131:2,
            inlined from 'regs_user__printf' at util/session.c:1139:3,
            inlined from 'dump_sample' at util/session.c:1246:3,
            inlined from 'machines__deliver_event' at util/session.c:1421:3:
        /usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
          107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In function 'printf',
            inlined from 'regs_dump__printf' at util/session.c:1103:3,
            inlined from 'regs__printf' at util/session.c:1131:2,
            inlined from 'regs_intr__printf' at util/session.c:1147:3,
            inlined from 'dump_sample' at util/session.c:1249:3,
            inlined from 'machines__deliver_event' at util/session.c:1421:3:
        /usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
          107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      cross compiler details:
      
        mips64-linux-gnuabi64-gcc (Debian 9.2.1-8) 9.2.1 20190909
      
      Fixes: 2bcd355b ("perf tools: Add interface to arch registers sets")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-95wjyv4o65nuaeweq31t7l1s@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
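
The pattern behind the fix above, never handing NULL to a `%s` directive, can be sketched as follows. The names `arch_reg_name()`, `reg_name()` and `format_reg()` and the register table are hypothetical and only loosely modeled on the situation the commit describes; they are not the perf functions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-arch lookup: returns NULL for ids it does not know. */
static const char *arch_reg_name(int id)
{
	static const char *const names[] = { "AX", "BX", "CX", "DX" };

	if (id < 0 || (size_t)id >= sizeof(names) / sizeof(names[0]))
		return NULL;
	return names[id];
}

/* Wrapper that always returns a printable string. */
static const char *reg_name(int id)
{
	const char *name = arch_reg_name(id);

	return name ? name : "unknown";
}

/* Format one register line; safe because reg_name() never returns NULL. */
static int format_reg(char *buf, size_t size, int id, unsigned long long val)
{
	return snprintf(buf, size, "%-5s 0x%llx", reg_name(id), val);
}
```

Because the fallback is a non-NULL string literal, the compiler can no longer prove that the `%-5s` argument is null, and the output makes the missing register mapping visible to whoever reads it.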
    • perf diff: Use llabs() with 64-bit values · 2b1ac640
      Committed by Arnaldo Carvalho de Melo
      To fix this build error on a Debian mipsel cross-build environment:
      
        builtin-diff.c: In function 'compute_cycles_diff':
        builtin-diff.c:649:10: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
          649 |    val = labs(pair->block_info->cycles_spark[i] -
              |          ^~~~
      
      Fixes: cebf7d51 ("perf diff: Report noisy for cycles diff")
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-pn7szy5uw384ntjgk6zckh6a@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
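
The compiler complaint above comes from `long` being only 32 bits wide on targets such as mipsel, so labs() can truncate a 64-bit argument. A minimal sketch of the safe form follows; the `s64` typedef mirrors the type named in the error message, and `cycles_abs_diff()` is a hypothetical helper, not a function from builtin-diff.c.

```c
#include <stdlib.h>

typedef long long s64; /* matches "s64 {aka long long int}" in the error */

/*
 * Absolute difference of two 64-bit cycle counts. llabs() takes a
 * long long, which is at least 64 bits on every target, whereas labs()
 * takes a long, which is 32 bits on ILP32 targets such as mipsel.
 */
static s64 cycles_abs_diff(s64 a, s64 b)
{
	return llabs(a - b);
}
```

For example, `cycles_abs_diff(0, 5000000000LL)` yields 5000000000, a value that does not fit in a 32-bit `long`.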
    • perf diff: Use llabs() with 64-bit values · 98e93245
      Committed by Arnaldo Carvalho de Melo
      To fix these build errors on a Debian mipsel cross-build environment:
      
        builtin-diff.c: In function 'block_cycles_diff_cmp':
        builtin-diff.c:550:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
          550 |  l = labs(left->diff.cycles);
              |      ^~~~
        builtin-diff.c:551:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
          551 |  r = labs(right->diff.cycles);
              |      ^~~~
      
      Fixes: 99150a1f ("perf diff: Use hists to manage basic blocks per symbol")
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-pn7szy5uw384ntjgk6zckh6a@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 26 Nov, 2019 14 commits
  4. 22 Nov, 2019 11 commits