1. 23 Sep, 2016 1 commit
  2. 22 Sep, 2016 2 commits
  3. 12 Sep, 2016 1 commit
    • A
      perf probe: Fix dwarf regs table for x86_64 · 7a023fd2
      Committed by Arnaldo Carvalho de Melo
      In 293d5b43 ("perf probe: Support probing on offline cross-arch binary")
      DWARF register tables were introduced for many architectures, but the
      x86_64 entry for the "dx" register was broken. The 'perf test bpf'
      testcase noticed this: with the aforementioned patch applied, a failing
      run (-) differs from a successful one (+) as follows:
      
        -Writing event: p:perf_bpf_probe/func _text+5197232 f_mode=+68(%di):x32 offset=%si:s64 orig=dx:s32
        -Failed to write event: Invalid argument
        -bpf_probe: failed to apply perf probe eventsFailed to add events selected by BPF
        +Writing event: p:perf_bpf_probe/func _text+5197232 f_mode=+68(%di):x32 offset=%si:s64 orig=%dx:s32
      
      Add the missing '%' to '%dx' to fix this.
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 293d5b43 ("perf probe: Support probing on offline cross-arch binary")
      Link: https://lkml.kernel.org/r/20160909145955.GC32585@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
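The fix above concerns a table that maps DWARF register numbers to the register names used in kprobe event definitions. As a minimal sketch (written in Python for brevity, whereas perf's real table is C), the lookup and a sanity check that would flag a missing '%' might look like the following. The names `X86_64_DWARF_REGS`, `dwarf_reg_name`, `fetch_arg`, and `missing_prefix` are hypothetical and are not perf identifiers.

```python
# Hypothetical, simplified model of an x86_64 DWARF-register-number -> name
# table. Indices follow the x86_64 psABI DWARF numbering; names use the
# '%'-prefixed form seen in the kprobe event strings above.
X86_64_DWARF_REGS = [
    "%ax", "%dx", "%cx", "%bx", "%si", "%di", "%bp", "%sp",
    "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15",
]


def dwarf_reg_name(regnum):
    """Return the register name for a DWARF register number, or None."""
    if 0 <= regnum < len(X86_64_DWARF_REGS):
        return X86_64_DWARF_REGS[regnum]
    return None


def fetch_arg(name, regnum, type_suffix):
    """Build one 'name=%reg:type' argument of a probe definition."""
    reg = dwarf_reg_name(regnum)
    if reg is None:
        raise ValueError(f"no register name for DWARF register {regnum}")
    return f"{name}={reg}:{type_suffix}"


def missing_prefix(table):
    """List table entries that lack the leading '%'."""
    return [reg for reg in table if not reg.startswith("%")]


print(fetch_arg("orig", 1, "s32"))
print(missing_prefix(["%ax", "dx", "%cx"]))
```

A check like `missing_prefix` is only an illustration of how such a typo could be caught early; the commit itself relied on 'perf test bpf' to expose it.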
  4. 05 Sep, 2016 1 commit
  5. 01 Sep, 2016 1 commit
  6. 16 Aug, 2016 1 commit
  7. 13 Aug, 2016 1 commit
  8. 09 Aug, 2016 1 commit
    • R
      perf probe ppc64le: Fix probe location when using DWARF · 99e608b5
      Committed by Ravi Bangoria
      Powerpc has a Global Entry Point (GEP) and a Local Entry Point (LEP) for
      functions. The LEP catches calls from both the GEP and the LEP. The ELF
      symbol table contains the GEP and an offset from which we can calculate
      the LEP, but debuginfo does not have LEP info.
      
      Currently, perf prioritizes the symbol table over DWARF in order to probe
      on the LEP for ppc64le. But when the user tries to probe with a function
      parameter, we fall back to using DWARF (i.e. the GEP), and when the
      function is called via the LEP, the probe never hits.
      
      For example:
      
        $ objdump -d vmlinux
          ...
          do_sys_open():
          c0000000002eb4a0:       e8 00 4c 3c     addis   r2,r12,232
          c0000000002eb4a4:       60 00 42 38     addi    r2,r2,96
          c0000000002eb4a8:       a6 02 08 7c     mflr    r0
          c0000000002eb4ac:       d0 ff 41 fb     std     r26,-48(r1)
      
        $ sudo ./perf probe do_sys_open
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
          p:probe/do_sys_open _text+3060904
      
        $ sudo ./perf probe 'do_sys_open filename:string'
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
          p:probe/do_sys_open _text+3060896 filename_string=+0(%gpr4):string
      
      In the second case, perf probed on the GEP. So when the function is
      called via the LEP, the probe won't hit.
      
        $ sudo ./perf record -a -e probe:do_sys_open ls
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.195 MB perf.data ]
      
      To resolve this issue, let's not prioritize the symbol table and let perf
      decide what it wants to use. Perf already converts the GEP to the LEP when
      it uses the symbol table. When perf uses debuginfo, let it find the LEP
      offset from the symbol table. This way we fall back to probing on the LEP
      in all cases.
      
      After patch:
      
        $ sudo ./perf probe 'do_sys_open filename:string'
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
          p:probe/do_sys_open _text+3060904 filename_string=+0(%gpr4):string
      
        $ sudo ./perf record -a -e probe:do_sys_open ls
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.197 MB perf.data (11 samples) ]
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1470723805-5081-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
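The GEP-to-LEP conversion described above can be sketched as follows. This is a minimal illustration, assuming the ELFv2 ABI encoding in which three bits of a symbol's `st_other` field carry the local entry offset; the constants mirror the `STO_PPC64_LOCAL_*` definitions from `elf.h`. The function names and the `st_other` value `0x60` are assumptions chosen to match the 8-byte difference visible in the example output (`_text+3060896` versus `_text+3060904`).

```python
# Bit position and mask of the local-entry field inside st_other (ELFv2 ABI).
STO_PPC64_LOCAL_BIT = 5
STO_PPC64_LOCAL_MASK = 0xE0


def local_entry_offset(st_other):
    """Decode the GEP -> LEP byte offset from a symbol's st_other field."""
    field = (st_other & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT
    # Field values 0 and 1 mean "no separate local entry" (offset 0);
    # larger values encode offsets of 4, 8, 16, ... bytes.
    return ((1 << field) >> 2) << 2


def lep_offset(gep_offset, st_other):
    """Return the probe offset on the LEP, given the GEP offset."""
    return gep_offset + local_entry_offset(st_other)


# Assumed st_other value: field == 3, i.e. two 4-byte prologue instructions
# (the addis/addi pair in the objdump listing) precede the LEP.
print(lep_offset(3060896, 0x60))
```

With this decoding, a probe resolved through debuginfo (which only knows the GEP) can still be placed on the LEP by consulting the symbol table for `st_other`, which is the behaviour the commit message describes.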
  9. 27 Jul, 2016 1 commit
  10. 24 Jul, 2016 1 commit
    • D
      x86/insn: remove pcommit · fd1d961d
      Committed by Dan Williams
      The pcommit instruction is being deprecated in favor of either ADR
      (asynchronous DRAM refresh: flush-on-power-fail) at the platform level, or
      posted-write-queue flush addresses as defined by the ACPI 6.x NFIT (NVDIMM
      Firmware Interface Table).
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  11. 21 Jul, 2016 1 commit
  12. 20 Jul, 2016 1 commit
  13. 13 Jul, 2016 3 commits
  14. 05 Jul, 2016 1 commit
    • A
      perf tools: Sync copy of syscall_64.tbl with the kernel · f3d082ce
      Committed by Arnaldo Carvalho de Melo
      Noticed by the build system, which emitted this warning:
      
        Warning: x86_64's syscall_64.tbl differs from kernel
      
      This was due to the wiring up of the recently added preadv2 & pwritev2
      syscalls to the compat code, which hadn't been done by the patch
      introducing those syscalls: 4babf2c5 ("x86: wire up preadv2 and
      pwritev2").
      
      The patch doing the compat wiring was:
      
        482dd2ef ("x86/syscalls: Wire up compat readv2/writev2 syscalls")
      
      This just silences the perf build warning, as compat syscalls still
      can't be supported in 'perf trace' due to limitations in the
      raw_syscalls:sys_{enter,exit} tracepoints it relies on.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-4dm8eoy0wslgtwqdhz64ods0@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
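The warning quoted above comes from a build-time comparison between the copy of the table kept under tools/ and the kernel's own file. The sketch below only models that idea in Python; the real check lives in perf's build system, and the function name `sync_warning` and its parameters are hypothetical.

```python
def sync_warning(tools_copy, kernel_copy, arch="x86_64", name="syscall_64.tbl"):
    """Return a warning string if the two table texts differ, else None."""
    if tools_copy.splitlines() != kernel_copy.splitlines():
        return f"Warning: {arch}'s {name} differs from kernel"
    return None


# Illustrative inputs: the kernel copy gained a line that tools/ lacks.
tools_text = "0 common read sys_read\n"
kernel_text = "0 common read sys_read\n1 common write sys_write\n"
print(sync_warning(tools_text, kernel_text))
```

Syncing the copy, as this commit does, makes the two inputs identical again and the warning disappears.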
  15. 28 Jun, 2016 1 commit
  16. 23 Jun, 2016 1 commit
  17. 22 Jun, 2016 1 commit
  18. 08 Jun, 2016 2 commits
  19. 07 Jun, 2016 3 commits
    • H
      perf tools: Export normalize_arch() function · 940e6987
      Committed by He Kuang
      Export the normalize_arch() function, so that other parts of perf can
      get the normalized form of an arch string.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1464924803-22214-10-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
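To illustrate what "normalized form of an arch string" means, here is a minimal Python sketch of such a function. The mapping below is an assumption loosely modeled on common `uname -m` values and is not copied from perf's C implementation; only the function name `normalize_arch` comes from the commit.

```python
import re


def normalize_arch(arch):
    """Collapse machine strings into one canonical name per architecture."""
    if arch == "x86_64" or re.fullmatch(r"i[3-6]86", arch):
        return "x86"
    if arch == "sun4u" or arch.startswith("sparc"):
        return "sparc"
    # Check the 64-bit ARM names before the generic "arm" prefix.
    if arch in ("aarch64", "arm64"):
        return "arm64"
    if arch.startswith("arm") or arch.startswith("sa110"):
        return "arm"
    if arch.startswith("s390"):
        return "s390"
    if arch.startswith("parisc"):
        return "parisc"
    if arch.startswith("powerpc") or arch.startswith("ppc"):
        return "powerpc"
    if arch.startswith("mips"):
        return "mips"
    if re.match(r"sh[0-9]", arch):
        return "sh"
    # Unknown strings are returned unchanged.
    return arch


for machine in ("x86_64", "i686", "ppc64le", "aarch64"):
    print(machine, "->", normalize_arch(machine))
```

Exporting one such helper lets callers like the unwinder and the probe code agree on a single name per architecture instead of each re-parsing the raw string.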
    • H
      perf unwind: Separate local/remote libunwind config · 9d8e14d3
      Committed by He Kuang
      CONFIG_LIBUNWIND/NO_LIBUNWIND are renamed to CONFIG_LOCAL_LIBUNWIND/
      NO_LOCAL_LIBUNWIND to retain the local unwind features. The new
      CONFIG_LIBUNWIND means that local unwind, remote unwind, or both are
      supported, and NO_LIBUNWIND means that neither local nor remote unwind
      is supported.
      
      LIBUNWIND_LIBS is dropped from LDFLAGS if local libunwind is not
      supported.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1464924803-22214-7-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
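The relationship between the renamed options can be summarized with a small truth-table sketch. This is only a Python model of the logic stated in the commit message; the real settings are make variables, and the function name `libunwind_config` and the key `link_LIBUNWIND_LIBS` are hypothetical.

```python
def libunwind_config(local_supported, remote_supported):
    """Derive the option values described above from two capability flags."""
    any_supported = local_supported or remote_supported
    return {
        "CONFIG_LOCAL_LIBUNWIND": local_supported,
        "NO_LOCAL_LIBUNWIND": not local_supported,
        "CONFIG_LIBUNWIND": any_supported,
        "NO_LIBUNWIND": not any_supported,
        # LIBUNWIND_LIBS only stays in LDFLAGS when local unwind is supported.
        "link_LIBUNWIND_LIBS": local_supported,
    }


# Remote-only build: unwinding is available, but nothing local is linked.
print(libunwind_config(local_supported=False, remote_supported=True))
```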
    • A
      perf stat: Basic support for TopDown in perf stat · 44b1e60a
      Committed by Andi Kleen
      Add basic plumbing for TopDown in perf stat.
      
      TopDown is intended to replace the frontend cycles idle / backend cycles
      idle metrics in standard perf stat output.  These metrics are not
      reliable in many workloads, due to out-of-order effects.
      
      This implements a new --topdown mode in perf stat (similar to
      --transaction) that measures the pipeline bottlenecks using
      standardized formulas. The measurement can all be done with 5 counters
      (one fixed counter).
      
      The result is four metrics:
      
      FrontendBound, BackendBound, BadSpeculation, Retiring
      
      that describe the CPU pipeline behavior on a high level.
      
      The full top down methodology has many hierarchical metrics.  This
      implementation only supports level 1, which can be collected without
      multiplexing. A full implementation of top down on top of perf is
      available in pmu-tools toplev (http://github.com/andikleen/pmu-tools).
      
      The current version works on Intel Core CPUs starting with Sandy Bridge,
      and Atom CPUs starting with Silvermont.  In principle the generic
      metrics should also be implementable on other out-of-order CPUs.
      
      TopDown level 1 uses a set of abstracted metrics which are generic to
      out of order CPU cores (although some CPUs may not implement all of
      them):
      
        topdown-total-slots       Available slots in the pipeline
        topdown-slots-issued      Slots issued into the pipeline
        topdown-slots-retired     Slots successfully retired
        topdown-fetch-bubbles     Pipeline gaps in the frontend
        topdown-recovery-bubbles  Pipeline gaps during recovery
                                  from misspeculation
      
      These metrics then allow computing four useful metrics:
      
      FrontendBound, BackendBound, Retiring, BadSpeculation.
      
      Add a new --topdown option to enable events.  When --topdown is
      specified, set up events for all topdown events supported by the kernel.
      Add topdown-* as a special case to the event parser, as is needed for
      all events containing '-'.
      
      The actual code to compute the metrics is in follow-on patches.
      
      v2: Use standard sysctl read function.
      v3: Move x86 specific code to arch/
      v4: Enable --metric-only implicitly for topdown.
      v5: Add --single-thread option to not force per core mode
      v6: Fix output order of topdown metrics
      v7: Allow combining with -d
      v8: Remove --single-thread again
      v9: Rename functions, adding arch_ and topdown_.
      v10: Expand man page and describe TopDown better
      Paste intro into commit description.
      Print error when malloc fails.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1464119559-17203-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
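The commit defers the actual metric computation to follow-on patches, so the following is only a sketch of how the four level 1 metrics could be derived from the five abstracted events listed above. The formulas are the commonly cited level 1 TopDown definitions and are stated here as an assumption, not as the code that landed in perf; the function name `topdown_level1` and the sample counter values are hypothetical.

```python
def topdown_level1(total_slots, slots_issued, slots_retired,
                   fetch_bubbles, recovery_bubbles):
    """Compute the four level 1 TopDown fractions from raw event counts."""
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    frontend_bound = fetch_bubbles / total_slots
    # Slots that were issued but never retired, plus recovery gaps.
    bad_speculation = (slots_issued - slots_retired + recovery_bubbles) / total_slots
    retiring = slots_retired / total_slots
    # Whatever is left over is attributed to the backend.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "FrontendBound": frontend_bound,
        "BadSpeculation": bad_speculation,
        "Retiring": retiring,
        "BackendBound": backend_bound,
    }


# Made-up counter values, for illustration only.
metrics = topdown_level1(total_slots=1000, slots_issued=500, slots_retired=400,
                         fetch_bubbles=200, recovery_bubbles=50)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because BackendBound is computed as the remainder, the four fractions sum to one by construction, which is what makes them readable as a breakdown of pipeline slots.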
  20. 30 May, 2016 1 commit
  21. 12 May, 2016 1 commit
  22. 11 May, 2016 1 commit
  23. 06 May, 2016 3 commits
  24. 27 Apr, 2016 1 commit
  25. 26 Apr, 2016 1 commit
  26. 21 Apr, 2016 2 commits
    • M
      tool/perf: Add sample_reg_mask to include all perf_regs · bb62bad6
      Committed by Madhavan Srinivasan
      Add sample_reg_mask array with pt_regs registers.
      This is needed for printing the supported regs (-I? option).
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • A
      tools/perf: Map the ID values with register names · dc642e83
      Committed by Anju T
      Map ID values with corresponding register names. These names are then
      displayed when user issues perf record with the -I option
      followed by perf report/script with -D option.
      
      To test this patchset, e.g.:
      
        $ perf record -I ls   # record machine state at interrupt
        $ perf script -D      # read the perf.data file
      
      Sample output obtained with this patch looks as follows:
      
        496768515470 0x1988 [0x188]: PERF_RECORD_SAMPLE(IP, 0x1): 4522/4522:
        0xc0000000001e538c period: 1 addr: 0
        ... intr regs: mask 0x7ffffffffff ABI 64-bit
        .... r0    0xc0000000001e5e34
        .... r1    0xc000000fe733f9a0
        .... r2    0xc000000001523100
        .... r3    0xc000000ffaadeb60
        .... r4    0xc000000003456800
        .... r5    0x73a9b5e000
        .... r6    0x1e000000
        .... r7    0x0
        .... r8    0x0
        .... r9    0x0
        .... r10   0x1
        .... r11   0x0
        .... r12   0x24022822
        .... r13   0xc00000000feec180
        .... r14   0x0
        .... r15   0xc000001e4be18800
        .... r16   0x0
        .... r17   0xc000000ffaac5000
        .... r18   0xc000000fe733f8a0
        .... r19   0xc000000001523100
        .... r20   0xc00000000009fd1c
        .... r21   0xc000000fcaa69000
        .... r22   0xc0000000001e4968
        .... r23   0xc000000001523100
        .... r24   0xc000000fe733f850
        .... r25   0xc000000fcaa69000
        .... r26   0xc000000003b8fcf0
        .... r27   0xfffffffffffffead
        .... r28   0x0
        .... r29   0xc000000fcaa69000
        .... r30   0x1
        .... r31   0x0
        .... nip   0xc0000000001dd320
        .... msr   0x9000000000009032
        .... orig_r3 0xc0000000001e538c
        .... ctr   0xc00000000009d550
        .... link  0xc0000000001e5e34
        .... xer   0x0
        .... ccr   0x84022882
        .... softe 0x0
        .... trap  0xf01
        .... dar   0x0
        .... dsisr 0xf00040060000004
         ... thread: :4522:4522
         ...... dso: /root/.debug/.build-id/b0/ef11b1a1629e62ac9de75199117ee5ef9469e9
                   :4522 4522 496.768515: 1 cycles: c0000000001e538c
                   .perf_event_context_sched_in (/boot/vmlinux)
      Signed-off-by: Anju T <anju@linux.vnet.ibm.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
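The two powerpc commits above pair a register mask with a table of register names. As a minimal sketch, assuming bit N of the mask corresponds to the N-th register in the order the sample output prints them, decoding a mask such as `0x7ffffffffff` could look like this. The names `PPC_REG_NAMES` and `regs_in_mask` are hypothetical, and perf's own tables are C arrays.

```python
# Register names in the order shown in the sample output above:
# r0..r31 followed by the special-purpose entries.
PPC_REG_NAMES = [f"r{i}" for i in range(32)] + [
    "nip", "msr", "orig_r3", "ctr", "link", "xer",
    "ccr", "softe", "trap", "dar", "dsisr",
]


def regs_in_mask(mask):
    """Return the names of all registers whose bit is set in the mask."""
    return [name for bit, name in enumerate(PPC_REG_NAMES) if mask & (1 << bit)]


selected = regs_in_mask(0x7ffffffffff)
print(len(selected), selected[0], selected[-1])
```

The mask from the sample has 43 bits set, which matches the 43 registers listed in the output.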
  27. 12 Apr, 2016 1 commit
  28. 08 Apr, 2016 2 commits
    • A
      perf dwarf: Guard !x86_64 definitions under #ifdef else clause · f9383452
      Committed by Arnaldo Carvalho de Melo
      To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17)):
      
          CC       /tmp/build/perf/arch/x86/util/dwarf-regs.o
        arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=]
         static const struct pt_regs_offset x86_32_regoffset_table[] = {
                                            ^~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf tools: Build syscall table .c header from kernel's syscall_64.tbl · 1b700c99
      Committed by Arnaldo Carvalho de Melo
      We used libaudit to map ids to syscall names and vice-versa, but that
      imposes a delay in supporting new syscalls, having to wait for libaudit
      to get those new syscalls on its tables.
      
      To remove that delay, for x86_64 initially, grab a copy of
      arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those
      tables.
      
      Syscalls currently not available in audit-libs:
      
        # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
        Error:	Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd
        Hint:	try 'perf list syscalls:sys_enter_*'
        Hint:	and: 'man syscalls'
        #
      
      With this patch:
      
        # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
          8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36
          8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40
         30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096
         31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096
         31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org
      [ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallel build ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
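Generating id/name tables from syscall_64.tbl amounts to parsing its whitespace-separated columns (number, ABI, name, entry point). The sketch below shows that idea in Python; perf's actual generator produces a C table during the build, and the function name `parse_syscall_tbl` and the abbreviated `SAMPLE_TBL` excerpt are assumptions made for illustration.

```python
# A few illustrative lines in the syscall_64.tbl column layout.
SAMPLE_TBL = """\
# <number> <abi> <name> <entry point>
0   common  read             sys_read
17  common  pread64          sys_pread64
18  common  pwrite64         sys_pwrite64
326 common  copy_file_range  sys_copy_file_range
512 x32     rt_sigaction     compat_sys_rt_sigaction
"""


def parse_syscall_tbl(text, abis=("common", "64")):
    """Map syscall numbers to names for the selected ABIs."""
    by_id = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        number, abi, name = int(fields[0]), fields[1], fields[2]
        if abi in abis:
            by_id[number] = name
    return by_id


id_to_name = parse_syscall_tbl(SAMPLE_TBL)
name_to_id = {name: number for number, name in id_to_name.items()}
print(id_to_name[17], name_to_id["copy_file_range"])
```

Because the table is read from the kernel sources at build time, a newly added syscall becomes resolvable as soon as the copied file is synced, without waiting for a library update.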
  29. 02 Apr, 2016 2 commits