1. 16 5月, 2018 8 次提交
    • K
      perf parse-events: Handle uncore event aliases in small groups properly · 3cdc5c2c
      Kan Liang 提交于
      Perf stat doesn't count the uncore event aliases from the same uncore
      block in a group, for example:
      
        perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000
        #           time             counts unit events
             1.000447342      <not counted>      unc_m_cas_count.all
             1.000447342      <not counted>      unc_m_clockticks
             2.000740654      <not counted>      unc_m_cas_count.all
             2.000740654      <not counted>      unc_m_clockticks
      
      The output is very misleading. It gives a wrong impression that the
      uncore event doesn't work.
      
      An uncore block could be composed by several PMUs. An uncore event alias
      is a joint name which means the same event runs on all PMUs of a block.
      Perf doesn't support mixed events from different PMUs in the same group.
      It is wrong to put uncore event aliases in a big group.
      
      The right way is to split the big group into multiple small groups which
      only include the events from the same PMU.
      
      Only uncore event aliases from the same uncore block should be specially
      handled here. It doesn't make sense to mix the uncore events with other
      uncore events from different blocks or even core events in a group.
      
      With the patch:
        #           time             counts unit events
           1.001557653            140,833      unc_m_cas_count.all
           1.001557653      1,330,231,332      unc_m_clockticks
           2.002709483             85,007      unc_m_cas_count.all
           2.002709483      1,429,494,563      unc_m_clockticks
      Reported-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3cdc5c2c
    • A
      perf tools: Use the "_stest" symbol to identify the kernel map when loading kcore · 56549978
      Adrian Hunter 提交于
      The first symbol is not necessarily in the kernel text.  Instead of
      using the first symbol, use the _stest symbol to identify the kernel map
      when loading kcore.
      
      This allows for the introduction of symbols to identify the x86_64 PTI
      entry trampolines.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/1525866228-30321-6-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      56549978
    • A
      perf bpf: Add probe() helper to reduce kprobes boilerplate · d8fc764d
      Arnaldo Carvalho de Melo 提交于
      So that kprobe definitions become:
      
        int probe(function, variables)(void *ctx, int err, var1, var2, ...)
      
      The existing 5sec.c, got converted and goes from:
      
        SEC("func=hrtimer_nanosleep rqtp->tv_sec")
        int func(void *ctx, int err, long sec)
        {
        }
      
      To:
      
        int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec)
        {
        }
      
      If we decide to add tv_nsec as well, then it becomes:
      
        $ cat tools/perf/examples/bpf/5sec.c
        #include <bpf.h>
      
        int probe(hrtimer_nanosleep, rqtp->tv_sec rqtp->tv_nsec)(void *ctx, int err, long sec, long nsec)
        {
      	  return sec == 5;
        }
      
        license(GPL);
        $
      
      And if we run it, system wide as before and run some 'sleep' with values
      for the tv_nsec field, we get:
      
        # perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c
           0.000 perf_bpf_probe:hrtimer_nanosleep:(ffffffff9811b5f0) tv_sec=5 tv_nsec=100000000
        9641.650 perf_bpf_probe:hrtimer_nanosleep:(ffffffff9811b5f0) tv_sec=5 tv_nsec=123450001
        ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-1v9r8f6ds5av0w9pcwpeknyl@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d8fc764d
    • A
      perf bpf: Add license(NAME) helper · 1f477305
      Arnaldo Carvalho de Melo 提交于
      To further reduce boilerplate.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-vst6hj335s0ebxzqltes3nsc@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1f477305
    • A
      perf bpf: Add kprobe example to catch 5s naps · 7542b767
      Arnaldo Carvalho de Melo 提交于
      Description:
      
      . Disable strace like syscall tracing (--no-syscalls), or try tracing
        just some (-e *sleep).
      
      . Attach a filter function to a kernel function, returning when it should
        be considered, i.e. appear on the output:
      
        $ cat tools/perf/examples/bpf/5sec.c
        #include <bpf.h>
      
        SEC("func=hrtimer_nanosleep rqtp->tv_sec")
        int func(void *ctx, int err, long sec)
        {
      	  return sec == 5;
        }
      
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        $
      
      . Run it system wide, so that any sleep of >= 5 seconds and < than 6
        seconds gets caught.
      
      . Ask for callgraphs using DWARF info, so that userspace can be unwound
      
      . While this is running, run something like "sleep 5s".
      
        # perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c/call-graph=dwarf/
           0.000 perf_bpf_probe:func:(ffffffff9811b5f0) tv_sec=5
                                             hrtimer_nanosleep ([kernel.kallsyms])
                                             __x64_sys_nanosleep ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             entry_SYSCALL_64 ([kernel.kallsyms])
                                             __GI___nanosleep (/usr/lib64/libc-2.26.so)
                                             rpl_nanosleep (/usr/bin/sleep)
                                             xnanosleep (/usr/bin/sleep)
                                             main (/usr/bin/sleep)
                                             __libc_start_main (/usr/lib64/libc-2.26.so)
                                             _start (/usr/bin/sleep)
        ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-2nmxth2l2h09f9gy85lyexcq@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7542b767
    • A
      perf bpf: Add bpf.h to be used in eBPF proggies · dd8e4ead
      Arnaldo Carvalho de Melo 提交于
      So, the first helper is the one shortening a variable/function section
      attribute, from, for instance:
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
      
      to:
      
        char _license[] SEC("license") = "GPL";
      
      Convert empty.c to that and it becomes:
      
        # cat ~acme/lib/examples/perf/bpf/empty.c
        #include <bpf.h>
      
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-zmeg52dlvy51rdlhyumfl5yf@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dd8e4ead
    • A
      perf bpf: Add 'examples' directories · 8f12a2ff
      Arnaldo Carvalho de Melo 提交于
      The first one is the bare minimum that bpf infrastructure accepts before
      it expects actual events to be set up:
      
        $ cat tools/perf/examples/bpf/empty.c
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = LINUX_VERSION_CODE;
        $
      
      If you remove that "version" line, then it will be refused with:
      
        # perf trace -e tools/perf/examples/bpf/empty.c
        event syntax error: 'tools/perf/examples/bpf/empty.c'
                             \___ Failed to load tools/perf/examples/bpf/empty.c from source: 'version' section incorrect or lost
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf trace [<options>] [<command>]
            or: perf trace [<options>] -- <command> [<options>]
            or: perf trace record [<options>] [<command>]
            or: perf trace record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event/syscall selector. use 'perf list' to list available events
        #
      
      The next ones will, step by step, show simple filters, then the needs
      for headers will be made clear, it will be put in place and tested with
      new examples, rinse, repeat.
      
      Back to using this first one to test the perf+bpf infrastructure:
      
      If we run it will fail, as no functions are present connecting with,
      say, a tracepoint or a function using the kprobes or uprobes
      infrastructure:
      
        # perf trace -e tools/perf/examples/bpf/empty.c
        WARNING: event parser found nothing
        invalid or unsupported event: 'tools/perf/examples/bpf/empty.c'
        Run 'perf list' for a list of valid events
      
         Usage: perf trace [<options>] [<command>]
            or: perf trace [<options>] -- <command> [<options>]
            or: perf trace record [<options>] [<command>]
            or: perf trace record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event/syscall selector. use 'perf list' to list available events
        #
      
      But, if we set things up to dump the generated object file to a file,
      and do this after having run 'make install', still on the developer's
      $HOME directory:
      
        # cat ~/.perfconfig
        [llvm]
      
      	dump-obj = true
        #
        # perf trace -e ~acme/lib/examples/perf/bpf/empty.c
        LLVM: dumping /home/acme/lib/examples/perf/bpf/empty.o
        WARNING: event parser found nothing
        invalid or unsupported event: '/home/acme/lib/examples/perf/bpf/empty.c'
        <SNIP>
        #
      
      We can look at the dumped object file:
      
        # ls -la ~acme/lib/examples/perf/bpf/empty.o
        -rw-r--r--. 1 root root 576 May  4 12:10 /home/acme/lib/examples/perf/bpf/empty.o
        # file ~acme/lib/examples/perf/bpf/empty.o
        /home/acme/lib/examples/perf/bpf/empty.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), not stripped
        # readelf -sw ~acme/lib/examples/perf/bpf/empty.o
      
        Symbol table '.symtab' contains 3 entries:
           Num:    Value          Size Type    Bind   Vis      Ndx Name
             0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
             1: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    3 _license
             2: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    4 _version
        #
        # tools/bpf/bpftool/bpftool --pretty ~acme/lib/examples/perf/bpf/empty.o
        null
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-y7dkhakejz3013o0w21n98xd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8f12a2ff
    • A
      perf llvm-utils: Add bpf include path to clang command line · 1b16fffa
      Arnaldo Carvalho de Melo 提交于
      We'll start putting headers for helpers to be used in eBPF proggies in
      there:
      
        # perf trace -v --no-syscalls -e empty.c |& grep "llvm compiling command : "
        llvm compiling command : /usr/lib64/ccache/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41100   -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h  -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.17.0-rc3-00034-gf4ef6a43/build -c /home/acme/bpf/empty.c -target bpf -O2 -o -
        #
      
      Notice the "-I/home/acme/lib/include/perf/bpf"
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-6xq94xro8xlb5s9urznh3f9k@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1b16fffa
  2. 15 5月, 2018 4 次提交
  3. 14 5月, 2018 5 次提交
    • L
      Linux 4.17-rc5 · 67b8d5c7
      Linus Torvalds 提交于
      67b8d5c7
    • L
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 66e1c94d
      Linus Torvalds 提交于
      Pull x86/pti updates from Thomas Gleixner:
       "A mixed bag of fixes and updates for the ghosts which are hunting us.
      
        The scheduler fixes have been pulled into that branch to avoid
        conflicts.
      
         - A set of fixes to address a khread_parkme() race which caused lost
           wakeups and loss of state.
      
         - A deadlock fix for stop_machine() solved by moving the wakeups
           outside of the stopper_lock held region.
      
         - A set of Spectre V1 array access restrictions. The possible
           problematic spots were discuvered by Dan Carpenters new checks in
           smatch.
      
         - Removal of an unused file which was forgotten when the rest of that
           functionality was removed"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/vdso: Remove unused file
        perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr
        perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver
        perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map()
        perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_*
        perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]
        sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
        sched/core: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
        sched/core: Introduce set_special_state()
        kthread, sched/wait: Fix kthread_parkme() completion issue
        kthread, sched/wait: Fix kthread_parkme() wait-loop
        sched/fair: Fix the update of blocked load when newly idle
        stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
      66e1c94d
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 86a4ac43
      Linus Torvalds 提交于
      Pull scheduler fix from Thomas Gleixner:
       "Revert the new NUMA aware placement approach which turned out to
        create more problems than it solved"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"
      86a4ac43
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · baeda713
      Linus Torvalds 提交于
      Pull perf tooling fixes from Thomas Gleixner:
       "Another small set of perf tooling fixes and updates:
      
         - Revert "perf pmu: Fix pmu events parsing rule", as it broke Intel
           PT event description parsing (Arnaldo Carvalho de Melo)
      
         - Sync x86's cpufeatures.h and kvm UAPI headers with the kernel
           sources, suppressing the ABI drift warnings (Arnaldo Carvalho de
           Melo)
      
         - Remove duplicated entry for westmereep-dp in Intel's mapfile.csv
           (William Cohen)
      
         - Fix typo in 'perf bench numa' options description (Yisheng Xie)"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Revert "perf pmu: Fix pmu events parsing rule"
        tools headers kvm: Sync ARM UAPI headers with the kernel sources
        tools headers kvm: Sync uapi/linux/kvm.h with the kernel sources
        tools headers: Sync x86 cpufeatures.h with the kernel sources
        perf vendor events intel: Remove duplicated entry for westmereep-dp in mapfile.csv
        perf bench numa: Fix typo in options
      baeda713
    • L
      Merge tag 'dma-mapping-4.17-5' of git://git.infradead.org/users/hch/dma-mapping · 0503fd65
      Linus Torvalds 提交于
      Pull dma-mapping fix from Christoph Hellwig:
       "Just one little fix from Jean to avoid a harmless but very annoying
        warning, especially for the drm code"
      
      * tag 'dma-mapping-4.17-5' of git://git.infradead.org/users/hch/dma-mapping:
        swiotlb: silent unwanted warning "buffer is full"
      0503fd65
  4. 13 5月, 2018 3 次提交
  5. 12 5月, 2018 20 次提交