1. 24 11月, 2009 8 次提交
    • A
      perf symbols: Rename find_symbol routines to find_function · fcf1203a
      Arnaldo Carvalho de Melo 提交于
      Paving the way for supporting variable in adition to function
      symbols.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259074912-5924-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fcf1203a
    • A
      perf symbols: Simplify symbol machinery setup · b32d133a
      Arnaldo Carvalho de Melo 提交于
      And also express its configuration toggles via a struct.
      
      Now all one has to do is to call symbol__init(NULL) if the
      defaults are OK, or pass a struct symbol_conf pointer with the
      desired configuration.
      
      If a tool uses kernel_maps__find_symbol() to look at the kernel
      and modules mappings for a symbol but didn't call symbol__init()
      first, that will generate a one time warning too, alerting the
      subcommand developer that symbol__init() must be called.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259071517-3242-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b32d133a
    • L
      perf kmem: Measure kmalloc/kfree CPU ping-pong call-sites · 079d3f65
      Li Zefan 提交于
      Show statistics for allocations and frees on different cpus:
      
      ------------------------------------------------------------------------------------------------------
      Callsite                           | Total_alloc/Per | Total_req/Per   | Hit   | Ping-pong | Frag
      ------------------------------------------------------------------------------------------------------
       perf_event_alloc.clone.0+0         |      7504/682   |      7128/648   |     11 |        0 |  5.011%
       alloc_buffer_head+16               |       288/57    |       280/56    |      5 |        0 |  2.778%
       radix_tree_preload+51              |       296/296   |       288/288   |      1 |        0 |  2.703%
       tracepoint_add_probe+32e           |       157/31    |       154/30    |      5 |        0 |  1.911%
       do_maps_open+0                     |       796/12    |       792/12    |     66 |        0 |  0.503%
       sock_alloc_send_pskb+16e           |     23780/495   |     23744/494   |     48 |       38 |  0.151%
       anon_vma_prepare+9a                |      3744/44    |      3740/44    |     85 |        0 |  0.107%
       d_alloc+21                         |     64948/164   |     64944/164   |    396 |        0 |  0.006%
       proc_alloc_inode+23                |    262292/676   |    262288/676   |    388 |        0 |  0.002%
       create_object+28                   |    459600/200   |    459600/200   |   2298 |       71 |  0.000%
       journal_start+67                   |     14440/40    |     14440/40    |    361 |        0 |  0.000%
       get_empty_filp+df                  |     53504/256   |     53504/256   |    209 |        0 |  0.000%
       getname+2a                         |    823296/4096  |    823296/4096  |    201 |        0 |  0.000%
       seq_read+2b0                       |    544768/4096  |    544768/4096  |    133 |        0 |  0.000%
       seq_open+6d                        |     17024/128   |     17024/128   |    133 |        0 |  0.000%
       mmap_region+2e6                    |     11704/88    |     11704/88    |    133 |        0 |  0.000%
       single_open+0                      |      1072/16    |      1072/16    |     67 |        0 |  0.000%
       __alloc_skb+2e                     |     12544/256   |     12544/256   |     49 |       38 |  0.000%
       __sigqueue_alloc+4a                |      1296/144   |      1296/144   |      9 |        8 |  0.000%
       tracepoint_add_probe+6f            |        80/16    |        80/16    |      5 |        0 |  0.000%
      ------------------------------------------------------------------------------------------------------
      ...
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E9F.6020309@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      079d3f65
    • L
      perf kmem: Collect cross node allocation statistics · 7d0d3945
      Li Zefan 提交于
      Show cross node memory allocations:
      
       # ./perf kmem
      
       SUMMARY
       =======
       ...
       Cross node allocations: 0/3633
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E87.10906@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7d0d3945
    • L
      perf kmem: Default to sort by fragmentation · 29b3e152
      Li Zefan 提交于
      Make the output sort by fragmentation by default.
      
      Also make the usage of "--sort" option consistent with other
      perf tools. That is, we support multi keys: "--sort
      key1[,key2]...".
      
       # ./perf kmem --stat caller
       ------------------------------------------------------------------------------
       Callsite                    |Total_alloc/Per | Total_req/Per | Hit  | Frag
       ------------------------------------------------------------------------------
       __netdev_alloc_skb+23       |    5048/1682   |    4564/1521  |     3|   9.588%
       perf_event_alloc.clone.0+0  |    7504/682    |    7128/648   |    11|   5.011%
       tracepoint_add_probe+32e    |     157/31     |     154/30    |     5|   1.911%
       alloc_buffer_head+16        |     456/57     |     448/56    |     8|   1.754%
       radix_tree_preload+51       |     584/292    |     576/288   |     2|   1.370%
       ...
      
      TODO:
      - Extract duplicate code in builtin-kmem.c and builtin-sched.c
        into util/sort.c.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E72.7010200@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      29b3e152
    • L
      perf kmem: Add new option to show raw ip · 7707b6b6
      Li Zefan 提交于
      Add option "--raw-ip" to show raw ip instead of symbols:
      
       # ./perf kmem --stat caller --raw-ip
       ------------------------------------------------------------------------------
       Callsite                    |Total_alloc/Per | Total_req/Per | Hit  | Frag
       ------------------------------------------------------------------------------
       0xc05301aa                  |  733184/4096   |  733184/4096  |   179|   0.000%
       0xc0542ba0                  |  483328/4096   |  483328/4096  |   118|   0.000%
       ...
      
      Also show symbols with format sym+offset instead of sym/offset.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E5C.4080900@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7707b6b6
    • A
      perf kmem: Resolve symbols · 1b145ae5
      Arnaldo Carvalho de Melo 提交于
      E.g.:
      
        [root@doppio linux-2.6-tip]# perf kmem record sleep 3s
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.804 MB perf.data (~35105 samples) ]
      
        [root@doppio linux-2.6-tip]# perf kmem --stat caller | head -10
        ------------------------------------------------------------------------------
        Callsite                    |Total_alloc/Per | Total_req/Per | Hit  | Frag
        ------------------------------------------------------------------------------
        getname/40                  | 1519616/4096   | 1519616/4096  |   371|   0.000%
        seq_read/a2                 |  987136/4096   |  987136/4096  |   241|   0.000%
        __netdev_alloc_skb/43       |  260368/1049   |  259968/1048  |   248|   0.154%
        __alloc_skb/5a              |   77312/256    |   77312/256   |   302|   0.000%
        proc_alloc_inode/33         |   76480/632    |   76472/632   |   121|   0.010%
        get_empty_filp/8d           |   70272/192    |   70272/192   |   366|   0.000%
        split_vma/8e                |   42064/176    |   42064/176   |   239|   0.000%
        [root@doppio linux-2.6-tip]#
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1259005869-13487-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1b145ae5
    • A
      perf symbols: Look for vmlinux in more places · cc612d81
      Arnaldo Carvalho de Melo 提交于
      Now that we can check the buildid to see if it really matches,
      this can be done safely:
      
        vmlinux
        /boot/vmlinux
        /boot/vmlinux-<uts.release>
        /lib/modules/<uts.release>/build/vmlinux
        /usr/lib/debug/lib/modules/%s/vmlinux
      
      More can be added - if you know about distros that put the
      vmlinux somewhere else please let us know.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259001550-8194-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cc612d81
  2. 22 11月, 2009 1 次提交
    • P
      perf kmem: Add --sort hit and --sort frag · f3ced7cd
      Pekka Enberg 提交于
      This patch adds support for "--sort hit" and "--sort frag" to
      the "perf kmem" tool. The former was already mentioned in the
      help text and the latter is useful for finding call-sites that
      exhibit worst case behavior for SLAB allocators.
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <1258883880-7149-1-git-send-email-penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f3ced7cd
  3. 20 11月, 2009 1 次提交
    • L
      perf: Add 'perf kmem' tool · ba77c9e1
      Li Zefan 提交于
      This tool is mostly a perf version of kmemtrace-user.
      
      The following information is provided by this tool:
      
       - the total amount of memory allocated and fragmentation per
         call-site
      
       - the total amount of memory allocated and fragmentation per
         allocation
      
       - total memory allocated and fragmentation in the collected
         dataset - ...
      
      Sample output:
      
       # ./perf kmem record
       ^C
       # ./perf kmem --stat caller --stat alloc -l 10
      
       ------------------------------------------------------------------------------
       Callsite          | Total_alloc/Per |  Total_req/Per  |  Hit   | Fragmentation
       ------------------------------------------------------------------------------
       0xc052f37a        |   790528/4096   |   790528/4096   |    193 |    0.000%
       0xc0541d70        |   524288/4096   |   524288/4096   |    128 |    0.000%
       0xc051cc68        |   481600/200    |   481600/200    |   2408 |    0.000%
       0xc0572623        |   297444/676    |   297440/676    |    440 |    0.001%
       0xc05399f1        |    73476/164    |    73472/164    |    448 |    0.005%
       0xc05243bf        |    51456/256    |    51456/256    |    201 |    0.000%
       0xc0730d0e        |    31844/497    |    31808/497    |     64 |    0.113%
       0xc0734c4e        |    17152/256    |    17152/256    |     67 |    0.000%
       0xc0541a6d        |    16384/128    |    16384/128    |    128 |    0.000%
       0xc059c217        |    13120/40     |    13120/40     |    328 |    0.000%
       0xc0501ee6        |    11264/88     |    11264/88     |    128 |    0.000%
       0xc04daef0        |     7504/682    |     7128/648    |     11 |    5.011%
       0xc04e14a3        |     4216/191    |     4216/191    |     22 |    0.000%
       0xc05041ca        |     3524/44     |     3520/44     |     80 |    0.114%
       0xc0734fa3        |     2104/701    |     1620/540    |      3 |   23.004%
       0xc05ec9f1        |     2024/289    |     2016/288    |      7 |    0.395%
       0xc06a1999        |     1792/256    |     1792/256    |      7 |    0.000%
       0xc0463b9a        |     1584/144    |     1584/144    |     11 |    0.000%
       0xc0541eb0        |     1024/16     |     1024/16     |     64 |    0.000%
       0xc06a19ac        |      896/128    |      896/128    |      7 |    0.000%
       0xc05721c0        |      772/12     |      768/12     |     64 |    0.518%
       0xc054d1e6        |      288/57     |      280/56     |      5 |    2.778%
       0xc04b562e        |      157/31     |      154/30     |      5 |    1.911%
       0xc04b536f        |       80/16     |       80/16     |      5 |    0.000%
       0xc05855a0        |       64/64     |       36/36     |      1 |   43.750%
       ------------------------------------------------------------------------------
      
       ------------------------------------------------------------------------------
       Alloc Ptr         | Total_alloc/Per |  Total_req/Per  |  Hit   | Fragmentation
       ------------------------------------------------------------------------------
       0xda884000        |  1052672/4096   |  1052672/4096   |    257 |    0.000%
       0xda886000        |   262144/4096   |   262144/4096   |     64 |    0.000%
       0xf60c7c00        |    16512/128    |    16512/128    |    129 |    0.000%
       0xf59a4118        |    13120/40     |    13120/40     |    328 |    0.000%
       0xdfd4b2c0        |    11264/88     |    11264/88     |    128 |    0.000%
       0xf5274600        |     7680/256    |     7680/256    |     30 |    0.000%
       0xe8395000        |     5948/594    |     5464/546    |     10 |    8.137%
       0xe59c3c00        |     5748/479    |     5712/476    |     12 |    0.626%
       0xf4cd1a80        |     3524/44     |     3520/44     |     80 |    0.114%
       0xe5bd1600        |     2892/482    |     2856/476    |      6 |    1.245%
       ...               | ...             | ...             | ...    | ...
       ------------------------------------------------------------------------------
      
      SUMMARY
      =======
      Total bytes requested: 2333626
      Total bytes allocated: 2353712
      Total bytes wasted on internal fragmentation: 20086
      Internal fragmentation: 0.853375%
      
      TODO:
      - show sym+offset in 'callsite' column
      - show cross node allocation stats
      - collect more useful stats?
      - ...
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B064AF5.9060208@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ba77c9e1