1. 03 Apr, 2015 12 commits
    • perf script: Support using -f to override perf.data file ownership · 06af0f2c
      Yunlong Song committed
      Enable perf script to use perf.data when it is not owned by current user
      or root. Change the short option name of --fields to -F to avoid confusion
      with --force.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28360 Apr  2 14:53 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf script
       File perf.data not owned by current user or root (use -f to override)
       # perf script -f
         Error: switch `f' requires a value
      
        usage: perf script [<options>]
           or: perf script [<options>] record <script> [<record-options>] <command>
           or: perf script [<options>] report <script> [script-args]
           or: perf script [<options>] <script> [<record-options>] <command>
           or: perf script [<options>] <top-script> [script-args]
      
           -f, --fields <str>    comma separated output fields prepend with
           'type:'. Valid types: hw,sw,trace,raw. Fields:
           comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,period
      
      As shown above, the -f option does not work at all: -f is already
      taken by --fields, which conflicts with --force. So change the short
      option name of --fields to -F, as other perf commands do (e.g.
      perf report -F), and use -f as the short option name of --force.
      
      After this patch:
      
       # perf script
       File perf.data not owned by current user or root (use -f to override)
       # perf script -f
       :41298 41298 2590086.564226:          1 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
       :41298 41298 2590086.564244:          1 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
       :41298 41298 2590086.564249:          7 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
       :41298 41298 2590086.564255:        176 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
           ls 41298 2590086.567346:       4059 cycles:  ffffffff8105a592
           raise_softirq ([kernel.kallsyms])
           ls 41298 2590086.567353:       3717 cycles:  ffffffff8105a592
           raise_softirq ([kernel.kallsyms])
           ls 41298 2590086.567358:      63058 cycles:  ffffffff8105a592
           raise_softirq ([kernel.kallsyms])
           ls 41298 2590086.567448:    1706255 cycles:            406ae0
           [unknown] (/usr/bin/ls)
      
      As shown above, the -f option really works now.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-8-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
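The short-option clash in the perf script commit can be reproduced outside perf. The following is a minimal sketch using Python's argparse, not perf's actual C option table; the parser name, function names and help strings are hypothetical. It only illustrates that once --fields moves to -F, a bare -f can be a flag without an argument.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the 'perf script' options after the patch."""
    parser = argparse.ArgumentParser(prog="perf-script-sketch")
    # -f is a plain flag, so a bare '-f' no longer "requires a value".
    parser.add_argument("-f", "--force", action="store_true",
                        help="override the perf.data ownership check")
    # --fields keeps its argument but uses the short name -F.
    parser.add_argument("-F", "--fields", metavar="str",
                        help="comma separated output fields")
    return parser


def parse(argv: list) -> argparse.Namespace:
    """Parse an explicit argument list instead of sys.argv."""
    return build_parser().parse_args(argv)
```

With this layout, `parse(["-f"])` sets `force` and leaves `fields` unset, while `parse(["-F", "comm,tid"])` still passes a field list.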
    • perf mem: Support using -f to override perf.data file ownership · 62a1a63a
      Yunlong Song committed
      Enable perf mem to use perf.data when it is not owned by current user or
      root.
      
      Example:
      
       # perf mem -t load record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 16392 Apr  2 14:34 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf mem -D report
       File perf.data not owned by current user or root (use -f to override)
       # perf mem -D -f report
         Error: unknown switch `f'
      
        usage: perf mem [<options>] {record|report}
      
           -t, --type <type>     memory operations(load,store) Default load,store
           -D, --dump-raw-samples
                                 dump raw samples in ASCII
           -U, --hide-unresolved
                                 Only display entries resolved to a symbol
           -i, --input <file>    input file name
           -C, --cpu <cpu>       list of cpus to profile
           -x, --field-separator <separator>
                                 separator for columns, no spaces will be added
                                 between columns '.' is reserved.
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf mem -D report
       File perf.data not owned by current user or root (use -f to override)
       # perf mem -D -f report
       # PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL
       39095 39095 0xffffffff81127e40 0x016ffff887f45148338 8 0x68100142
       /proc/kcore:perf_event_aux
       39095 39095 0xffffffff8100a3fe 0xffff89007f8cb7d0 6 0x68100142
       /proc/kcore:native_sched_clock
       39095 39095 0xffffffff81309139 0xffff88bf44c9ded8 6 0x68100142
       /proc/kcore:acpi_map_lookup
       39095 39095 0xffffffff810f8c4c 0xffff89007f8ccd88 6 0x68100142
       /proc/kcore:rcu_nmi_exit
       39095 39095 0xffffffff81136346 0xffff88fea995dd50 6 0x68100142
       /proc/kcore:unlock_page
       39095 39095 0xffffffff812a64a2 0xffff88fea995dcc8 6 0x68100142
       /proc/kcore:half_md4_transform
       39095 39095 0x7f0cf877c7e9 0x25dfb94 6 0x68100142
       /lib64/libc-2.19.so:__readdir64
       39095 39095 0x7f0cf87575a3 0x7f0cf9163731 6 0x68100142
       /lib64/libc-2.19.so:__strcoll_l
       39095 39095 0xffffffff8116910e 0xffffea01c1bfbd50 23 0x68100242
       /proc/kcore:page_remove_rmap
      
      As shown above, the -f option really works now.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-7-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
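The ownership rule behind the "not owned by current user or root (use -f to override)" message can be sketched as follows. This is an illustration in Python under the assumption that the check compares the file owner's UID with the effective UID and always accepts root-owned files, as the message wording suggests; the function names are hypothetical and this is not perf's C code.

```python
import os


def owner_allowed(owner_uid: int, effective_uid: int, force: bool) -> bool:
    """Apply the ownership rule to already-known UIDs."""
    if force:
        # The -f/--force override skips the ownership check entirely.
        return True
    # Files owned by root (uid 0) or by the invoking user are accepted.
    return owner_uid == 0 or owner_uid == effective_uid


def may_read_perf_data(path: str, force: bool = False) -> bool:
    """Apply the same rule to a file on disk (Unix only)."""
    return owner_allowed(os.stat(path).st_uid, os.geteuid(), force)
```

In the transcripts above the caller is root (uid 0) and the file belongs to another user, so the check fails unless `force` is set.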
    • perf lock: Support using -f to override perf.data file ownership · c4ac732a
      Yunlong Song committed
      Enable perf lock to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf lock record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 4880686 Apr  2 14:14 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf lock report
       File perf.data not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf lock report -f
         Error: unknown switch `f'
      
        usage: perf lock report [<options>]
      
           -k, --key <acquired>  key for sorting (acquired / contended /
           avg_wait / wait_total / wait_max / wait_min)
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf lock report
       File perf.data not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf lock report -f
                      Name   acquired  contended   avg wait (ns) total wait (ns) ...
      
       &ldata->output_l...        128          0               0               0 ...
                &ctx->lock        114          0               0               0 ...
               &p->pi_lock        112          0               0               0 ...
       &(&pool->lock)->...        112          0               0               0 ...
       &(&dentry->d_loc...         70          0               0               0 ...
       &(&newf->file_lo...         62          0               0               0 ...
       &(&fs->lock)->rl...         43          0               0               0 ...
       ...
      
      As shown above, the -f option really works now.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-6-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Support using -f to override perf.data.guest file ownership · 8cc5ec1f
      Yunlong Song committed
      Enable perf kvm to use perf.data.guest when it is not owned by current
      user or root.
      
      Example:
      
       # perf kvm stat record ls
       # chown Yunlong.Song:Yunlong.Song perf.data.guest
       # ls -al perf.data.guest
       -rw------- 1 Yunlong.Song Yunlong.Song 4128937 Apr  2 11:05 perf.data.guest
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf kvm stat report
       File perf.data.guest not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf kvm stat report -f
         Error: unknown switch `f'
      
        usage: perf kvm stat report [<options>]
      
               --event <report event>
                                 event for reporting: vmexit, mmio (x86 only),
                                 ioport (x86 only)
               --vcpu <n>        vcpu id to report
           -k, --key <sort-key>  key for sorting: sample(sort by samples
                                 number) time (sort by avg time)
           -p, --pid <pid>       analyze events only for given process id(s)
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf kvm stat report
       File perf.data.guest not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf kvm stat report -f
       Analyze events for all VMs, all VCPUs:
      
         VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time   Avg time
      
       Total Samples:0, Total events handled time:0.00us.
      
      As shown above, the -f option really works now. Since we have not
      launched any KVM related process, the result shows 0 samples here.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-5-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmem: Support using -f to override perf.data file ownership · d1eeb77c
      Yunlong Song committed
      Enable perf kmem to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf kmem record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 5315665 Apr  2 10:54 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
         Error: unknown switch `f'
      
        usage: perf kmem [<options>] {record|stat}
      
           -i, --input <file>    input file name
           -v, --verbose         be more verbose (show symbol address, etc)
               --caller          show per-callsite statistics
               --alloc           show per-allocation statistics
           -s, --sort <key[,key2...]>
                                 sort by keys: ptr, call_site, bytes, hit,
                                 pingpong, frag
           -l, --line <num>      show n lines
               --raw-ip          show raw ip instead of symbol
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
       SUMMARY
       =======
       Total bytes requested: 437599
       Total bytes allocated: 615472
       Total bytes wasted on internal fragmentation: 177873
       Internal fragmentation: 28.900259%
       Cross CPU allocations: 6/1192
      
      As shown above, the -f option really works now.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-4-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
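The SUMMARY figures in the perf kmem example are internally consistent, which a short calculation can confirm. The sketch below assumes that the fragmentation percentage is wasted bytes divided by allocated bytes; that assumption matches the numbers shown but is not taken from the perf source, and the function name is hypothetical.

```python
def internal_fragmentation(requested: int, allocated: int) -> tuple:
    """Return (wasted_bytes, wasted_percent_of_allocated)."""
    wasted = allocated - requested
    # Guard against an empty trace where nothing was allocated.
    percent = 100.0 * wasted / allocated if allocated else 0.0
    return wasted, percent


# Figures taken from the 'perf kmem stat -f' output in the commit message.
wasted, percent = internal_fragmentation(437599, 615472)
print(f"Total bytes wasted on internal fragmentation: {wasted}")  # 177873
print(f"Internal fragmentation: {percent:.6f}%")                  # 28.900259%
```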
    • perf inject: Support using -f to override perf.data file ownership · ccaa474c
      Yunlong Song committed
      Enable perf inject to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28260 Apr  2 10:37 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf inject -v -b -i perf.data -o perf.data.new
       File perf.data not owned by current user or root (use -f to override)
       # perf inject -v -b -i perf.data -o perf.data.new -f
         Error: unknown switch `f'
      
        usage: perf inject [<options>]
      
           -b, --build-ids       Inject build-ids into the output stream
           -i, --input <file>    input file name
           -o, --output <file>   output file name
           -s, --sched-stat      Merge sched-stat and sched-switch for getting
           events where and how long tasks slept
           -v, --verbose         be more verbose (show build ids, etc)
               --kallsyms <file>
                                 kallsyms pathname
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf inject -v -b -i perf.data -o perf.data.new
       File perf.data not owned by current user or root (use -f to override)
       # perf inject -v -b -i perf.data -o perf.data.new -f
       build id event received for [kernel.kallsyms]:
       f6dcb66d8b98f1c0d9eb87bf043444b69f91d30c
       symsrc__init: cannot get elf header.
       Looking at the vmlinux_path (7 entries long)
       Using /proc/kcore for kernel object code
       Using /proc/kallsyms for symbols
      
      As shown above, the -f option really works now.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-3-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Support using -f to override perf.data file ownership · 9e3b6ec1
      Yunlong Song committed
      Enable perf evlist to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28260 Apr  2 10:18 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf evlist
       File perf.data not owned by current user or root (use -f to override)
       # perf evlist -f
         Error: unknown switch `f'
      
        usage: perf evlist [<options>]
      
           -i, --input <file>    Input file name
           -F, --freq            Show the sample frequency
           -v, --verbose         Show all event attr details
           -g, --group           Show event group information
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf evlist
       File perf.data not owned by current user or root (use -f to override)
       # perf evlist -f
       cycles
      
      As shown above, the -f option really works now.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-2-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Fix to track down unnamed union/structure members · c7273835
      Masami Hiramatsu committed
      Fix 'perf probe' to track down unnamed union/structure members.
      
      perf probe did not track down the tree of unnamed union/structure
      members, since it just failed to find given "name" in a parent
      structure/union.  To solve this issue, I've introduced 2 changes.
      
      - Fix die_find_member() to track down the type-DIE if it is
        unnamed, and if it contains the specified member, returns the
        unnamed member.
        (note that we don't return found member, since unnamed member
         has the offset in the parent structure)
      - Fix convert_variable_fields() to track down the unnamed union/
        structure (one-by-one).
      
      With this patch, perf probe can access unnamed fields:
        -----
        #./perf probe -nfx ./perf lock__delete ops 'locked_ops=ops->locked.ops'
        Added new event:
          probe_perf:lock__delete (on lock__delete in /home/mhiramat/ksrc/linux-3/tools/perf/perf with ops locked_ops=ops->locked.ops)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe_perf:lock__delete -aR sleep 1
        -----
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Report-Link: https://lkml.org/lkml/2015/3/5/431
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150402073312.14482.37942.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
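The two changes in the perf probe commit (return the unnamed member that contains the requested field, then descend one level at a time) can be modelled without DWARF. The sketch below is a simplified Python model, not the die_find_member() or convert_variable_fields() code; the `Member` type and both function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    name: Optional[str]  # None models an unnamed union/structure member
    offset: int          # offset relative to the enclosing parent
    children: List["Member"] = field(default_factory=list)


def find_member(parent: Member, name: str) -> Optional[Member]:
    """Return the direct child that is `name`, or the unnamed child containing it."""
    for child in parent.children:
        if child.name == name:
            return child
        if child.name is None and find_member(child, name) is not None:
            # Return the unnamed container rather than the inner hit,
            # because its offset is the one relative to `parent`.
            return child
    return None


def resolve_offset(parent: Member, name: str) -> Optional[int]:
    """Walk through unnamed members one by one, summing their offsets."""
    total = 0
    node = parent
    while True:
        hit = find_member(node, name)
        if hit is None:
            return None
        total += hit.offset
        if hit.name == name:
            return total
        node = hit  # unnamed member: descend one level and search again
```

Returning the unnamed container keeps every offset relative to its direct parent, so the caller can accumulate them level by level.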
    • perf db-export: No need to have ->thread twice in struct export_sample · b83e868d
      Arnaldo Carvalho de Melo committed
      As it comes from address_location->thread, that is already stored as
      export_sample->al, where the thread can be obtained.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20150402141542.GA9630@kernel.org
      Link: http://lkml.kernel.org/n/tip-bzotbl4epoztw0jd6sm2stpf@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf db-export: No need to pass thread twice to db_export__sample · 7327259d
      Arnaldo Carvalho de Melo committed
      As it is available via another parameter, address_location->thread.
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: lkml.kernel.org/r/551D08F8.3040706@intel.com
      Link: http://lkml.kernel.org/n/tip-6dbn0tcm9hyv92g7h3zj2dbt@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf scripting: No need to pass thread twice to the scripting callbacks · f9d5d549
      Arnaldo Carvalho de Melo committed
      It is already in the addr_location, so remove the redundant 'thread'
      parameter from the callback signatures.
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1427906210-10519-3-git-send-email-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: No need to lookup thread twice · 79628f2c
      Arnaldo Carvalho de Melo committed
      We get the thread when we call perf_event__preprocess_sample(), no need
      to do it before that.
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1427906210-10519-2-git-send-email-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 02 Apr, 2015 11 commits
    • bpf: Fix the build on BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable · e1abf2cc
      Ingo Molnar committed
      So bpf_tracing.o depends on CONFIG_BPF_SYSCALL - but that's not its only
      dependency, it also depends on the tracing infrastructure and on kprobes,
      without which it will fail to build with:
      
        In file included from kernel/trace/bpf_trace.c:14:0:
        kernel/trace/trace.h: In function ‘trace_test_and_set_recursion’:
        kernel/trace/trace.h:491:28: error: ‘struct task_struct’ has no member named ‘trace_recursion’
          unsigned int val = current->trace_recursion;
        [...]
      
      It took quite some time to trigger this build failure, because right now
      BPF_SYSCALL is very obscure: it depends on CONFIG_EXPERT. So also make
      BPF_SYSCALL more configurable, not just under CONFIG_EXPERT.
      
      If BPF_SYSCALL, tracing and kprobes are enabled then enable the bpf_tracing
      gateway as well.
      
      We might want to make this an interactive option later on, although
      I'd not complicate it unnecessarily: enabling BPF_SYSCALL is enough of
      an indicator that the user wants BPF support.
      
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • samples/bpf: Add kmem_alloc()/free() tracker tool · 9811e353
      Alexei Starovoitov committed
      One BPF program attaches to kmem_cache_alloc_node() and
      remembers all allocated objects in the map.
      Another program attaches to kmem_cache_free() and deletes
      the corresponding object from the map.
      
      User space walks the map every second and prints any objects
      which are older than 1 second.
      
      Usage:
      
      	$ sudo tracex4
      
      Then start a few long-lived processes. 'tracex4' will print
      something like this:
      
      	obj 0xffff880465928000 is 13sec old was allocated at ip ffffffff8105dc32
      	obj 0xffff88043181c280 is 13sec old was allocated at ip ffffffff8105dc32
      	obj 0xffff880465848000 is  8sec old was allocated at ip ffffffff8105dc32
      	obj 0xffff8804338bc280 is 15sec old was allocated at ip ffffffff8105dc32
      
      	$ addr2line -fispe vmlinux ffffffff8105dc32
      	do_fork at fork.c:1665
      
      As soon as processes exit the memory is reclaimed and 'tracex4'
      prints nothing.
      
      A similar experiment can be done with the __kmalloc()/kfree() pair.
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-10-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
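The bookkeeping described for tracex4 (one probe records allocations in a map, another deletes them on free, and user space reports entries older than one second) can be modelled in plain Python. This is a user-space simulation only; the class and method names are hypothetical, and no BPF map or kprobe is involved.

```python
class AllocTracker:
    """Toy model of the map shared by the alloc and free probes."""

    def __init__(self):
        # object address -> (allocation time in seconds, caller ip)
        self.live = {}

    def on_alloc(self, addr, now, ip):
        """What the kmem_cache_alloc_node() probe would record."""
        self.live[addr] = (now, ip)

    def on_free(self, addr):
        """What the kmem_cache_free() probe would delete (unknown keys are ignored)."""
        self.live.pop(addr, None)

    def older_than(self, now, min_age=1.0):
        """Return (addr, age, ip) for every live object older than min_age."""
        return sorted(
            (addr, now - ts, ip)
            for addr, (ts, ip) in self.live.items()
            if now - ts > min_age
        )
```

Calling `older_than()` once per second mirrors the user-space loop: objects freed in the meantime have already left the map and are never printed.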
    • samples/bpf: Add IO latency analysis (iosnoop/heatmap) tool · 5c7fc2d2
      Alexei Starovoitov committed
      BPF C program attaches to
      blk_mq_start_request()/blk_update_request() kprobe events to
      calculate IO latency.
      
      For every completed block IO event it computes the time delta
      in nsec and records in a histogram map:
      
      	map[log10(delta)*10]++
      
      User space reads this histogram map every 2 seconds and prints
      it as a 'heatmap' using gray shades of the text terminal. Black
      spaces have many events and white spaces have very few events.
      The leftmost space is the smallest latency, the rightmost space is
      the largest latency in the range.
      
      Usage:
      
      	$ sudo ./tracex3
      	and do 'sudo dd if=/dev/sda of=/dev/null' in another terminal.
      
      Observe IO latencies and how different activity (like 'make
      kernel') affects them.
      
      Similar experiments can be done for network transmit latencies,
      syscalls, etc.
      
      '-t' flag prints the heatmap using normal ascii characters:
      
      $ sudo ./tracex3 -t
        heatmap of IO latency
        # - many events with this latency
          - few events
      	|1us      |10us     |100us    |1ms      |10ms     |100ms    |1s |10s
      				 *ooo. *O.#.                                    # 221
      			      .  *#     .                                       # 125
      				 ..   .o#*..                                    # 55
      			    .  . .  .  .#O                                      # 37
      				 .#                                             # 175
      				       .#*.                                     # 37
      				  #                                             # 199
      		      .              . *#*.                                     # 55
      				       *#..*                                    # 42
      				  #                                             # 266
      			      ...***Oo#*OO**o#* .                               # 629
      				  #                                             # 271
      				      . .#o* o.*o*                              # 221
      				. . o* *#O..                                    # 50
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-9-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
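The histogram indexing `map[log10(delta)*10]++` from the tracex3 commit gives ten slots per decade of latency. The sketch below illustrates that indexing in Python with floating-point `math.log10`; it makes no claim about how the BPF program itself computes the logarithm. The function names and the clamping of non-positive deltas are assumptions made for this example.

```python
import math


def latency_slot(delta_ns: int) -> int:
    """Histogram index for one completed IO: log10(delta) * 10, truncated."""
    if delta_ns < 1:
        return 0  # assumption: clamp zero or negative deltas to slot 0
    return int(math.log10(delta_ns) * 10)


def build_histogram(deltas_ns):
    """Count completed IOs per slot, i.e. map[log10(delta)*10]++."""
    hist = {}
    for delta in deltas_ns:
        slot = latency_slot(delta)
        hist[slot] = hist.get(slot, 0) + 1
    return hist
```

With ten slots per decade, a 2 us and a 2.5 us completion land in the same slot (33), while a 150 us completion lands in slot 51.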
    • samples/bpf: Add counting example for kfree_skb() function calls and the write() syscall · d822a192
      Alexei Starovoitov committed
      This example has two probes in one C file that attach to
      different kprobe events and use two different maps.
      
      The 1st probe is an x64-specific equivalent of dropmon. It attaches to
      kfree_skb, retrieves the 'ip' address of the kfree_skb() caller and
      counts the number of packet drops at that 'ip' address. User space
      prints the 'location - count' map every second.
      
      The 2nd probe attaches to kprobe:sys_write and computes a histogram
      of different write sizes.
      
      Usage:
      	$ sudo tracex2
      	location 0xffffffff81695995 count 1
      	location 0xffffffff816d0da9 count 2
      
      	location 0xffffffff81695995 count 2
      	location 0xffffffff816d0da9 count 2
      
      	location 0xffffffff81695995 count 3
      	location 0xffffffff816d0da9 count 2
      
      	557145+0 records in
      	557145+0 records out
      	285258240 bytes (285 MB) copied, 1.02379 s, 279 MB/s
      		   syscall write() stats
      	     byte_size       : count     distribution
      	       1 -> 1        : 3        |                                      |
      	       2 -> 3        : 0        |                                      |
      	       4 -> 7        : 0        |                                      |
      	       8 -> 15       : 0        |                                      |
      	      16 -> 31       : 2        |                                      |
      	      32 -> 63       : 3        |                                      |
      	      64 -> 127      : 1        |                                      |
      	     128 -> 255      : 1        |                                      |
      	     256 -> 511      : 0        |                                      |
      	     512 -> 1023     : 1118968  |************************************* |
      
      Press Ctrl-C at any time; the kernel will automatically clean up maps and programs.
      
      	$ addr2line -ape ./bld_x64/vmlinux 0xffffffff81695995 0xffffffff816d0da9
      	0xffffffff81695995: ./bld_x64/../net/ipv4/icmp.c:1038
      	0xffffffff816d0da9: ./bld_x64/../net/unix/af_unix.c:1231
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-8-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d822a192
    • A
      samples/bpf: Add simple non-portable kprobe filter example · b896c4f9
      Alexei Starovoitov committed
      tracex1_kern.c - C program compiled into BPF.
      
      It attaches to kprobe:netif_receive_skb()
      
      When skb->dev->name == "lo", it prints a sample debug message into
      trace_pipe via the bpf_trace_printk() helper function.
      
      tracex1_user.c - corresponding user space component that:
        - loads BPF program via bpf() syscall
        - opens kprobes:netif_receive_skb event via perf_event_open()
          syscall
        - attaches the program to event via ioctl(event_fd,
          PERF_EVENT_IOC_SET_BPF, prog_fd);
        - prints from trace_pipe
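To open the kprobe event with perf_event_open(), user space first needs the numeric id of that event. A minimal sketch of reading it, assuming the kprobe already exists and its id is exposed in a tracing "id" file such as /sys/kernel/debug/tracing/events/kprobes/<event>/id (the path layout and the helper name are assumptions, not taken from this commit):

```c
#include <stdio.h>

/* Read the numeric event id from a tracing "id" file.
 * Returns the id, or -1 if the file cannot be opened or parsed.
 * Hypothetical helper for illustration. */
static int read_event_id(const char *id_path)
{
	FILE *f = fopen(id_path, "r");
	int id = -1;

	if (!f)
		return -1;
	/* The file is expected to hold a single decimal number. */
	if (fscanf(f, "%d", &id) != 1 || id < 0)
		id = -1;
	fclose(f);
	return id;
}
```

The returned value would then be used as the perf event's config field.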
      
      Note, this BPF program is non-portable. It must be recompiled
      with current kernel headers. kprobe is not a stable ABI and
      BPF+kprobe scripts may no longer be meaningful when kernel
      internals change.
      
      No matter how the kernel changes, neither the kprobe
      nor the BPF program can ever crash or corrupt the kernel,
      assuming the kprobes, perf and BPF subsystems have no bugs.
      
      The verifier will detect that the program is using
      bpf_trace_printk() and the kernel will print 'this is a DEBUG
      kernel' warning banner, which means that bpf_trace_printk()
      should be used for debugging of the BPF program only.
      
      Usage:
      $ sudo tracex1
                  ping-19826 [000] d.s2 63103.382648: : skb ffff880466b1ca00 len 84
                  ping-19826 [000] d.s2 63103.382684: : skb ffff880466b1d300 len 84
      
                  ping-19826 [000] d.s2 63104.382533: : skb ffff880466b1ca00 len 84
                  ping-19826 [000] d.s2 63104.382594: : skb ffff880466b1d300 len 84
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-7-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b896c4f9
    • A
      tracing: Allow BPF programs to call bpf_trace_printk() · 9c959c86
      Alexei Starovoitov committed
      Debugging of BPF programs needs some form of printk from the
      program, so let programs call a limited trace_printk() with the
      %d %u %x %p modifiers only.
      
      Similar to kernel modules, during program load the verifier checks
      whether the program calls bpf_trace_printk() and, if so, the kernel
      allocates trace_printk buffers and emits a big 'this is debug
      only' banner.
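The restriction to %d %u %x %p can be pictured as a scan over the format string. The sketch below is a strict user-space illustration under that assumption (it also rejects "%%" and any length modifier); the function name is hypothetical and this is not the kernel's actual check:

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if fmt uses only the %d, %u, %x and %p conversions.
 * Hypothetical helper; anything else after '%' is rejected. */
static bool fmt_is_allowed(const char *fmt)
{
	for (size_t i = 0; fmt[i] != '\0'; i++) {
		if (fmt[i] != '%')
			continue;
		i++;			/* look at the conversion character */
		switch (fmt[i]) {
		case 'd':
		case 'u':
		case 'x':
		case 'p':
			break;
		default:		/* also catches a trailing lone '%' */
			return false;
		}
	}
	return true;
}
```

A format such as "skb %p len %d\n" passes, while one containing "%s" does not.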
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-6-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9c959c86
    • A
      tracing: Allow BPF programs to call bpf_ktime_get_ns() · d9847d31
      Alexei Starovoitov committed
      bpf_ktime_get_ns() is used by programs to compute the time delta
      between events or as a timestamp.
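The delta computation can be sketched in user space with a nanosecond timestamp taken at each event. The example below uses clock_gettime(CLOCK_MONOTONIC) as a stand-in for the helper; the choice of a monotonic clock and both function names are assumptions made for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Monotonic timestamp in nanoseconds (user-space stand-in for the helper). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Nanoseconds elapsed between two timestamps taken with now_ns(). */
static uint64_t delta_ns(uint64_t start, uint64_t end)
{
	return end - start;
}
```

A program would record now_ns() at the first event and pass it together with the second timestamp to delta_ns().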
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-5-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d9847d31
    • A
      tracing, perf: Implement BPF programs attached to kprobes · 2541517c
      Alexei Starovoitov committed
      BPF programs, attached to kprobes, provide a safe way to execute
      user-defined BPF byte-code programs without being able to crash or
      hang the kernel in any way. The BPF engine makes sure that such
      programs have a finite execution time and that they cannot break
      out of their sandbox.
      
      The user interface is to attach to a kprobe via the perf syscall:
      
      	struct perf_event_attr attr = {
      		.type	= PERF_TYPE_TRACEPOINT,
      		.config	= event_id,
      		...
      	};
      
      	event_fd = perf_event_open(&attr,...);
      	ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
      
      'prog_fd' is a file descriptor associated with a previously
      loaded BPF program.
      
      'event_id' is the ID of the created kprobe.
      
      Closing 'event_fd':
      
      	close(event_fd);
      
      ... automatically detaches the BPF program from it.
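The attach sequence above can be fleshed out as a small user-space helper. This is a sketch under stated assumptions: the function names are hypothetical, the sample_type and sample_period settings and the pid/cpu arguments are illustrative choices not given in the commit text, and PERF_EVENT_IOC_SET_BPF requires kernel headers that already define it:

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill attr for the kprobe event identified by event_id. */
static void init_kprobe_attr(struct perf_event_attr *attr, int event_id)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_TRACEPOINT;
	attr->config = event_id;
	attr->sample_type = PERF_SAMPLE_RAW;	/* assumption: raw samples */
	attr->sample_period = 1;		/* assumption: every hit */
}

/* Open the event and attach the BPF program referred to by prog_fd.
 * Returns the event fd, or -1 on failure. Closing the returned fd
 * detaches the program. */
static int attach_bpf_to_kprobe(int event_id, int prog_fd)
{
	struct perf_event_attr attr;
	int event_fd;

	init_kprobe_attr(&attr, event_id);
	/* glibc has no perf_event_open() wrapper, so go through syscall().
	 * Assumption: all tasks (pid -1) on CPU 0, no group, no flags. */
	event_fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
	if (event_fd < 0)
		return -1;
	if (ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd) < 0) {
		close(event_fd);
		return -1;
	}
	return event_fd;
}
```

Opening an event this way typically needs elevated privileges, so callers should expect -1 when running unprivileged.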
      
      BPF programs can call in-kernel helper functions to:
      
        - lookup/update/delete elements in maps
      
        - probe_read - wrapper of probe_kernel_read() used to access any
          kernel data structures
      
      BPF programs receive 'struct pt_regs *' as an input ('struct pt_regs' is
      architecture dependent) and return 0 to ignore the event and 1 to store
      kprobe event into the ring buffer.
      
      Note, kprobes are fundamentally _not_ a stable kernel ABI,
      so BPF programs attached to kprobes must be recompiled for
      every kernel version, and the user must supply the correct
      LINUX_VERSION_CODE in attr.kern_version during the
      bpf_prog_load() call.
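For reference, a version code packs major.minor.patch into one integer in the style of the kernel's KERNEL_VERSION() macro. The sketch below computes such a value from a release string like the one uname reports; the function names are hypothetical, and it assumes the patch level fits in 8 bits:

```c
#include <stdio.h>

/* Pack major.minor.patch the way KERNEL_VERSION(a, b, c) does.
 * Assumption: patch is below 256. */
static unsigned int version_code(unsigned int major, unsigned int minor,
				 unsigned int patch)
{
	return (major << 16) + (minor << 8) + patch;
}

/* Parse a release string such as "4.1.0-rc2" into a version code.
 * Returns 0 if the string does not start with three dotted numbers. */
static unsigned int version_code_from_release(const char *release)
{
	unsigned int major, minor, patch;

	if (sscanf(release, "%u.%u.%u", &major, &minor, &patch) != 3)
		return 0;
	return version_code(major, minor, patch);
}
```

In practice the value usually comes straight from LINUX_VERSION_CODE in the headers the program was compiled against, which is why a recompile per kernel version is needed.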
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-4-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2541517c
    • A
      tracing: Add kprobe flag · 72cbbc89
      Alexei Starovoitov committed
      Add the TRACE_EVENT_FL_KPROBE flag to differentiate kprobe-type
      tracepoints, since BPF programs can only be attached to
      kprobe-type PERF_TYPE_TRACEPOINT perf events.
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-3-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      72cbbc89
    • D
      bpf: Make internal bpf API independent of CONFIG_BPF_SYSCALL #ifdefs · 4e537f7f
      Daniel Borkmann committed
      Socket filter code and other subsystems with upcoming eBPF
      support should not need to care whether CONFIG_BPF_SYSCALL
      is defined or not.
      
      Having the bpf syscall as a config option is a nice thing and
      I'd expect it to stay that way for expert users (I presume one
      day the default setting of it might change, though), but code
      making use of it should not care if it's actually enabled or
      not.
      
      Instead, hide this via header files and let the rest deal with it.
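A common way to hide a config option behind a header is to declare the real function when the option is set and to provide a static inline stub otherwise, so callers need no #ifdef of their own. The sketch below illustrates the pattern with hypothetical names (CONFIG_DEMO_FEATURE, demo_feature_get, attach_demo); it is not the kernel's actual header, and the -EOPNOTSUPP return value is an illustrative choice:

```c
#include <errno.h>

/* Header part: the only place that knows about the config option. */
#ifdef CONFIG_DEMO_FEATURE
int demo_feature_get(int fd);	/* real implementation lives elsewhere */
#else
static inline int demo_feature_get(int fd)
{
	(void)fd;
	return -EOPNOTSUPP;	/* feature compiled out */
}
#endif

/* Caller part: compiles unchanged whether or not the option is set. */
static int attach_demo(int fd)
{
	int ret = demo_feature_get(fd);

	return ret < 0 ? ret : 0;
}
```

When built without CONFIG_DEMO_FEATURE, attach_demo() simply propagates the stub's error code.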
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-2-git-send-email-ast@plumgrid.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4e537f7f
    • I
      Merge branch 'perf/timer' into perf/core · 223aa646
      Ingo Molnar committed
      This WIP branch is now ready to be merged.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      223aa646
  3. 01 Apr, 2015 6 commits
  4. 30 Mar, 2015 1 commit
  5. 27 Mar, 2015 10 commits