1. 28 4月, 2010 1 次提交
    • A
      perf tools: Rename "kernel_info" to "machine" · 23346f21
      Arnaldo Carvalho de Melo 提交于
      struct kernel_info and kerninfo__ are too vague, what they really
      describe are machines, virtual ones or hosts.
      
      There are more changes to introduce helpers to shorten function calls
      and to make more clear what is really being done, but I left that for
      subsequent patches.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      23346f21
  2. 24 4月, 2010 1 次提交
    • F
      perf: Generalize perf lock's sample event reordering to the session layer · c61e52ee
      Frederic Weisbecker 提交于
      The sample events recorded by perf record are not time ordered
      because we have one buffer per cpu for each event (even demultiplexed
      per task/per cpu for task bound events). But when we read trace events
      we want them to be ordered by time because many state machines are
      involved.
      
      There are currently two ways perf tools deal with that:
      
      - use -M to multiplex every buffers (perf sched, perf kmem)
        But this creates a lot of contention in SMP machines on
        record time.
      
      - use a post-processing time reordering (perf timechart, perf lock)
        The reordering used by timechart is simple but doesn't scale well
        with huge flow of events, in terms of performance and memory use
        (unusable with perf lock for example).
        Perf lock has its own samples reordering that flushes its memory
        use in a regular basis and that uses a sorting based on the
        previous event queued (a new event to be queued is close to the
        previous one most of the time).
      
      This patch proposes to export perf lock's samples reordering facility
      to the session layer that reads the events. So if a tool wants to
      get ordered sample events, it needs to set its
      struct perf_event_ops::ordered_samples to true and that's it.
      
      This prepares tracing based perf tools to get rid of the need to
      use buffers multiplexing (-M) or to implement their own
      reordering.
      
      Also lower the flush period to 2 as it's sufficient already.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      c61e52ee
  3. 19 4月, 2010 1 次提交
  4. 14 4月, 2010 5 次提交
    • T
      perf: Convert perf header build_ids into build_id events · c7929e47
      Tom Zanussi 提交于
      Bypasses the build_id perf header code and replaces it with a
      synthesized event and processing function that accomplishes the
      same thing, used when reading/writing perf data to/from a pipe.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      Cc: k-keiichi@bx.jp.nec.com
      Cc: acme@ghostprotocols.net
      LKML-Reference: <1270184365-8281-9-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c7929e47
    • T
      perf: Convert perf tracing data into a tracing_data event · 9215545e
      Tom Zanussi 提交于
      Bypasses the tracing_data perf header code and replaces it with
      a synthesized event and processing function that accomplishes
      the same thing, used when reading/writing perf data to/from a
      pipe.
      
      The tracing data is pretty large, and this patch doesn't attempt
      to break it down into component events.  The tracing_data event
      itself doesn't actually contain the tracing data, rather it
      arranges for the event processing code to skip over it after
      it's read, using the skip return value added to the event
      processing loop in a previous patch.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      Cc: k-keiichi@bx.jp.nec.com
      Cc: acme@ghostprotocols.net
      LKML-Reference: <1270184365-8281-8-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9215545e
    • T
      perf: Convert perf event types into event type events · cd19a035
      Tom Zanussi 提交于
      Bypasses the event type perf header code and replaces it with a
      synthesized event and processing function that accomplishes the
      same thing, used when reading/writing perf data to/from a pipe.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      Cc: k-keiichi@bx.jp.nec.com
      Cc: acme@ghostprotocols.net
      LKML-Reference: <1270184365-8281-7-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cd19a035
    • T
      perf: Convert perf header attrs into attr events · 2c46dbb5
      Tom Zanussi 提交于
      Bypasses the attr perf header code and replaces it with a
      synthesized event and processing function that accomplishes the
      same thing, used when reading/writing perf data to/from a pipe.
      
      Making the attrs into events allows them to be streamed over a
      pipe along with the rest of the header data (in later patches).
      It also paves the way to allowing events to be added and removed
      from perf sessions dynamically.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      Cc: k-keiichi@bx.jp.nec.com
      Cc: acme@ghostprotocols.net
      LKML-Reference: <1270184365-8281-6-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2c46dbb5
    • T
      perf: Add pipe-specific header read/write and event processing code · 8dc58101
      Tom Zanussi 提交于
      This patch makes several changes to allow the perf event stream
      to be sent and received over a pipe:
      
      - adds pipe-specific versions of the header read/write code
      
      - adds pipe-specific version of the event processing code
      
      - adds a range of event types to be used for header or other
        pseudo events, above the range used by the kernel
      
      - checks the return value of event handlers, which they can use
        to skip over large events during event processing rather than actually
        reading them into event objects.
      
      - unifies the multiple do_read() functions and updates its
        users.
      
      Note that none of these changes affect the existing perf data
      file format or processing - this code only comes into play if
      perf output is sent to stdout (or is read from stdin).
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      Cc: k-keiichi@bx.jp.nec.com
      Cc: acme@ghostprotocols.net
      LKML-Reference: <1270184365-8281-2-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8dc58101
  5. 03 4月, 2010 3 次提交
    • A
      perf session: Remove one more exit() call from library code · ad5b217b
      Arnaldo Carvalho de Melo 提交于
      Return NULL instead and make the caller propagate the error.
      
      LKML-Reference: <new-submission>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ad5b217b
    • A
      perf kmem: Resolve kernel symbols again · e727ca73
      Arnaldo Carvalho de Melo 提交于
      Due to the assumption in perf_session__new that the kernel maps would be
      created using the fake PERF_RECORD_MMAP event in a perf.data file 'perf
      kmem --stat caller', that doesn't have such event, ends up not being
      able to resolve the kernel addresses.
      
      Fix it by calling perf_session__create_kernel_maps() in __cmd_kmem().
      
      LKML-Reference: <new-submission>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e727ca73
    • A
      perf report: Add progress bars · 5f4d3f88
      Arnaldo Carvalho de Melo 提交于
      For when we are processing the events and inserting the entries in the
      browser.
      
      Experimentation here: naming "ui_something" we may be treading into
      creating a TUI/GUI set of routines that can then be implemented in terms
      of multiple backends.
      
      Also the time it takes for adding things to the "browser" takes, visually
      (I guess I should do some profiling here ;-) ), more time than for
      processing the events...
      
      That means we probably need to create a custom hist_entry browser, so
      that we reuse the structures we have in place instead of duplicating
      them in newt.
      
      But progress was made and at least we can see something while long files
      are being loaded, that must be one of UI 101 bullet points :-)
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5f4d3f88
  6. 26 3月, 2010 2 次提交
  7. 10 3月, 2010 1 次提交
  8. 04 2月, 2010 3 次提交
    • X
      perf tools: Clean up O_LARGEFILE et al usage · f887f301
      Xiao Guangrong 提交于
      Setting _FILE_OFFSET_BITS and using O_LARGEFILE, lseek64, etc,
      is redundant. Thanks H. Peter Anvin for pointing it out.
      
      So, this patch removes O_LARGEFILE, lseek64, etc.
      Suggested-by: N"H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4B6A8972.3070605@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f887f301
    • A
      perf record: Stop intercepting events, use postprocessing to get build-ids · 6122e4e4
      Arnaldo Carvalho de Melo 提交于
      We want to stream events as fast as possible to perf.data, and
      also in the future we want to have splice working, when no
      interception will be possible.
      
      Using build_id__mark_dso_hit_ops to create the list of DSOs that
      back MMAPs we also optimize disk usage in the build-id cache by
      only caching DSOs that had hits.
      Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-6-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6122e4e4
    • A
      perf symbols: Remove perf_session usage in symbols layer · 9de89fe7
      Arnaldo Carvalho de Melo 提交于
      I noticed while writing the first test in 'perf regtest' that to
      just test the symbol handling routines one needs to create a
      perf session, that is a layer centered on a perf.data file,
      events, etc, so I untied these layers.
      
      This reduces the complexity for the users as the number of
      parameters to most of the symbols and session APIs now was
      reduced while not adding more state to all the map instances by
      only having data that is needed to split the kernel (kallsyms
      and ELF symtab sections) maps and do vmlinux relocation on the
      main kernel map.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9de89fe7
  9. 03 2月, 2010 1 次提交
    • X
      perf tools: Use O_LARGEFILE to open perf data file · b8f46c5a
      Xiao Guangrong 提交于
      Open perf data file with O_LARGEFILE flag since its size is
      easily larger that 2G.
      
      For example:
      
       # rm -rf perf.data
       # ./perf kmem record sleep 300
      
       [ perf record: Woken up 0 times to write data ]
       [ perf record: Captured and wrote 3142.147 MB perf.data
       (~137282513 samples) ]
      
       # ll -h perf.data
       -rw------- 1 root root 3.1G .....
      Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4B68F32A.9040203@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b8f46c5a
  10. 29 1月, 2010 1 次提交
  11. 16 1月, 2010 3 次提交
    • A
      perf tools: Convert getpagesize() uses to sysconf(_SC_GETPAGESIZE) · 1b75962e
      Arnaldo Carvalho de Melo 提交于
      Using the more portable and equivalent sysconf call.
      Reported-by: NAristeu Rozanski <aris@redhat.com>
      Reported-by: NUlrich Drepper <drepper@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ulrich Drepper <drepper@redhat.com>
      LKML-Reference: <1263501006-14185-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1b75962e
    • A
      perf tools: Cross platform perf.data analysis support · ba21594c
      Arnaldo Carvalho de Melo 提交于
      There are still some problems related to loading vmlinux files,
      but those are unrelated to the feature implemented in this
      patch, so will get fixed in the next patches, but here are some
      results:
      
      1. collect perf.data file on a Fedora 12 machine, x86_64, 64-bit
      userland
      
      2. transfer it to a Debian Testing machine, PARISC64, 32-bit
      userland
      
        acme@parisc:~/git/linux-2.6-tip$ perf buildid-list | head -5
        74f9930ee94475b6b3238caf3725a50d59cb994b [kernel.kallsyms]
        55fdd56670453ea66c011158c4b9d30179c1d049 /lib/modules/2.6.33-rc4-tip+/kernel/net/ipv4/netfilter/ipt_MASQUERADE.ko
        41adff63c730890480980d5d8ba513f1c216a858 /lib/modules/2.6.33-rc4-tip+/kernel/net/ipv4/netfilter/iptable_nat.ko
        90a33def1077bb8e97b8a78546dc96c2de62df46 /lib/modules/2.6.33-rc4-tip+/kernel/net/ipv4/netfilter/nf_nat.ko
        984c7bea90ce1376d5c8e7ef43a781801286e62d /lib/modules/2.6.33-rc4-tip+/kernel/drivers/net/tun.ko
      
        acme@parisc:~/git/linux-2.6-tip$ perf buildid-list | tail -5
        22492f3753c6a67de5c7ccbd6b863390c92c0723 /usr/lib64/libXt.so.6.0.0
        353802bb7e1b895ba43507cc678f951e778e4c6f /usr/lib64/libMagickCore.so.2.0.0
        d10c2897558595efe7be8b0584cf7e6398bc776c /usr/lib64/libfprint.so.0.0.0
        a83ecfb519a788774a84d5ddde633c9ba56c03ab /home/acme/bin/perf
        d3ca765a8ecf257d263801d7ad8c49c189082317 /usr/lib64/libdwarf.so.0.0
        acme@parisc:~/git/linux-2.6-tip$
      
        acme@parisc:~/git/linux-2.6-tip$ perf report --sort comm
        The file [kernel.kallsyms] cannot be used, trying to use /proc/kallsyms...
      
        ^^^^ The problem related to vmlinux handling, it shouldn't be trying this
        ^^^^ rather alien /proc/kallsyms at all...
      
        /lib64/libpthread-2.10.2.so with build id 5c68f7afeb33309c78037e374b0deee84dd441f6 not found, continuing without symbols
        /lib64/libc-2.10.2.so with build id eb4ec8fa8b2a5eb18cad173c92f27ed8887ed1c1 not found, continuing without symbols
        /home/acme/bin/perf with build id a83ecfb519a788774a84d5ddde633c9ba56c03ab not found, continuing without symbols
        /usr/sbin/openvpn with build id f2037a091ef36b591187a858d75e203690ea9409 not found, continuing without symbols
        Failed to open /lib/modules/2.6.33-rc4-tip+/kernel/drivers/net/e1000e/e1000e.ko, continuing without symbols
        Failed to open /lib/modules/2.6.33-rc4-tip+/kernel/drivers/net/wireless/iwlwifi/iwlcore.ko, continuing without symbols
      
        <SNIP more complaints about not finding the right build-ids,
              those will have to wait for 'perf archive' or plain
              copying what was collected by 'perf record' on the x86_64,
              source machine, see further below for an example of this >
      
        # Samples: 293085637
        #
        # Overhead          Command
        # ........  ...............
        #
            61.70%             find
            23.50%             perf
             5.86%          swapper
             3.12%             sshd
             2.39%             init
             0.87%             bash
             0.86%            sleep
             0.59%      dbus-daemon
             0.25%             hald
             0.24%   NetworkManager
             0.19%  hald-addon-rfki
             0.15%          openvpn
             0.07%             phy0
             0.07%         events/0
             0.05%          iwl3945
             0.05%         events/1
             0.03%      kondemand/0
        acme@parisc:~/git/linux-2.6-tip$
      
      Which matches what we get when running the same command for the
      same perf.data file on the F12, x86_64, source machine:
      
        [root@doppio linux-2.6-tip]# perf report --sort comm
        # Samples: 293085637
        #
        # Overhead          Command
        # ........  ...............
        #
            61.70%             find
            23.50%             perf
             5.86%          swapper
             3.12%             sshd
             2.39%             init
             0.87%             bash
             0.86%            sleep
             0.59%      dbus-daemon
             0.25%             hald
             0.24%   NetworkManager
             0.19%  hald-addon-rfki
             0.15%          openvpn
             0.07%             phy0
             0.07%         events/0
             0.05%          iwl3945
             0.05%         events/1
             0.03%      kondemand/0
        [root@doppio linux-2.6-tip]#
      
      The other modes work as well, modulo the problem with vmlinux:
      
        acme@parisc:~/git/linux-2.6-tip$ perf report --sort comm,dso 2> /dev/null | head -15
        # Samples: 293085637
        #
        # Overhead          Command                      Shared Object
        # ........  ...............  .................................
        #
            35.11%             find                   ffffffff81002b5a
            18.25%             perf                   ffffffff8102235f
            16.17%             find  libc-2.10.2.so
             9.07%             find  find
             5.80%          swapper                   ffffffff8102235f
             3.95%             perf  libc-2.10.2.so
             2.33%             init                   ffffffff810091b9
             1.65%             sshd  libcrypto.so.0.9.8k
             1.35%             find  [e1000e]
             0.68%            sleep  libc-2.10.2.so
        acme@parisc:~/git/linux-2.6-tip$
      
      And the lack of the right buildids:
      
        acme@parisc:~/git/linux-2.6-tip$ perf report --sort comm,dso,symbol 2> /dev/null | head -15
        # Samples: 293085637
        #
        # Overhead          Command                      Shared Object  Symbol
        # ........  ...............  .................................  ......
        #
            35.11%             find                   ffffffff81002b5a  [k] 0xffffffff81002b5a
            18.25%             perf                   ffffffff8102235f  [k] 0xffffffff8102235f
            16.17%             find  libc-2.10.2.so                     [.] 0x00000000045782
             9.07%             find  find                               [.] 0x0000000000fb0e
             5.80%          swapper                   ffffffff8102235f  [k] 0xffffffff8102235f
             3.95%             perf  libc-2.10.2.so                     [.] 0x0000000007f398
             2.33%             init                   ffffffff810091b9  [k] 0xffffffff810091b9
             1.65%             sshd  libcrypto.so.0.9.8k                [.] 0x00000000105440
             1.35%             find  [e1000e]                           [k] 0x00000000010948
             0.68%            sleep  libc-2.10.2.so                     [.] 0x0000000011ad5b
        acme@parisc:~/git/linux-2.6-tip$
      
      But if we:
      
        acme@parisc:~/git/linux-2.6-tip$ ls ~/.debug
        ls: cannot access /home/acme/.debug: No such file or directory
        acme@parisc:~/git/linux-2.6-tip$ mkdir -p ~/.debug/lib64/libc-2.10.2.so/
        acme@parisc:~/git/linux-2.6-tip$ scp doppio:.debug/lib64/libc-2.10.2.so/* ~/.debug/lib64/libc-2.10.2.so/
        acme@doppio's password:
        eb4ec8fa8b2a5eb18cad173c92f27ed8887ed1c1	             100% 1783KB 714.7KB/s   00:02
        acme@parisc:~/git/linux-2.6-tip$ mkdir -p ~/.debug/.build-id/eb
        acme@parisc:~/git/linux-2.6-tip$ ln -s ../../lib64/libc-2.10.2.so/eb4ec8fa8b2a5eb18cad173c92f27ed8887ed1c1 ~/.debug/.build-id/eb/4ec8fa8b2a5eb18cad173c92f27ed8887ed1c1
        acme@parisc:~/git/linux-2.6-tip$ perf report --dsos libc-2.10.2.so 2> /dev/null
        # dso: libc-2.10.2.so
        # Samples: 64281170
        #
        # Overhead          Command  Symbol
        # ........  ...............  ......
        #
            14.98%             perf  [.] __GI_strcmp
            12.30%             find  [.] __GI_memmove
             9.25%             find  [.] _int_malloc
             7.60%             find  [.] _IO_vfprintf_internal
             6.10%             find  [.] _IO_new_file_xsputn
             6.02%             find  [.] __GI_close
             3.08%             find  [.] _IO_file_overflow_internal
             3.08%             find  [.] malloc_consolidate
             3.08%             find  [.] _int_free
             3.08%             find  [.] __strchrnul
             3.08%             find  [.] __getdents64
             3.08%             find  [.] __write_nocancel
             3.08%            sleep  [.] __GI__dl_addr
             3.08%             sshd  [.] __libc_select
             3.08%             find  [.] _IO_new_file_write
             3.07%             find  [.] _IO_new_do_write
             3.06%             find  [.] __GI___errno_location
             3.05%             find  [.] __GI___libc_malloc
             3.04%             perf  [.] __GI_memcpy
             1.71%             find  [.] __fprintf_chk
             1.29%             bash  [.] __gconv_transform_utf8_internal
             0.79%      dbus-daemon  [.] __GI_strlen
        #
        # (For a higher level overview, try: perf report --sort comm,dso)
        #
        acme@parisc:~/git/linux-2.6-tip$
      
      Which matches what we get on the source, F12, x86_64 machine:
      
        [root@doppio linux-2.6-tip]# perf report --dsos libc-2.10.2.so
        # dso: libc-2.10.2.so
        # Samples: 64281170
        #
        # Overhead          Command  Symbol
        # ........  ...............  ......
        #
            14.98%             perf  [.] __GI_strcmp
            12.30%             find  [.] __GI_memmove
             9.25%             find  [.] _int_malloc
             7.60%             find  [.] _IO_vfprintf_internal
             6.10%             find  [.] _IO_new_file_xsputn
             6.02%             find  [.] __GI_close
             3.08%             find  [.] _IO_file_overflow_internal
             3.08%             find  [.] malloc_consolidate
             3.08%             find  [.] _int_free
             3.08%             find  [.] __strchrnul
             3.08%             find  [.] __getdents64
             3.08%             find  [.] __write_nocancel
             3.08%            sleep  [.] __GI__dl_addr
             3.08%             sshd  [.] __libc_select
             3.08%             find  [.] _IO_new_file_write
             3.07%             find  [.] _IO_new_do_write
             3.06%             find  [.] __GI___errno_location
             3.05%             find  [.] __GI___libc_malloc
             3.04%             perf  [.] __GI_memcpy
             1.71%             find  [.] __fprintf_chk
             1.29%             bash  [.] __gconv_transform_utf8_internal
             0.79%      dbus-daemon  [.] __GI_strlen
        #
        # (For a higher level overview, try: perf report --sort comm,dso)
        #
        [root@doppio linux-2.6-tip]#
      
      So I think this is really, really nice in that it demonstrates
      the portability of perf.data files and the use of build-ids
      accross such aliens worlds :-)
      
      There are some things to fix tho, like the bitmap on the header,
      but things are looking good.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1263478990-8200-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ba21594c
    • A
      perf tools: Don't cast RIP to pointers · 0d755034
      Arnaldo Carvalho de Melo 提交于
      Since they can come from another architecture with bigger
      pointers, i.e. processing a 64-bit perf.data on a 32-bit arch.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1263478990-8200-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0d755034
  12. 14 1月, 2010 1 次提交
    • A
      perf tools: Encode kernel module mappings in perf.data · b7cece76
      Arnaldo Carvalho de Melo 提交于
      We were always looking at the running machine /proc/modules,
      even when processing a perf.data file, which only makes sense
      when we're doing 'perf record' and 'perf report' on the same
      machine, and in close sucession, or if we don't use modules at
      all, right Peter? ;-)
      
      Now, at 'perf record' time we read /proc/modules, find the long
      path for modules, and put them as PERF_MMAP events, just like we
      did to encode the reloc reference symbol for vmlinux. Talking
      about that now it is encoded in .pgoff, so that we can use
      .{start,len} to store the address boundaries for the kernel so
      that when we reconstruct the kmaps tree we can do lookups right
      away, without having to fixup the end of the kernel maps like we
      did in the past (and now only in perf record).
      
      One more step in the 'perf archive' direction when we'll finally
      be able to collect data in one machine and analyse in another.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1263396139-4798-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b7cece76
  13. 13 1月, 2010 2 次提交
    • A
      perf symbols: Record the domain of DSOs in HEADER_BUILD_ID header table · a89e5abe
      Arnaldo Carvalho de Melo 提交于
      So that we can restore them to the right DSO list (either
      dsos__kernel or dsos__user).
      
      We do that just like the kernel does for the other events,
      encoding PERF_RECORD_MISC_{KERNEL,USER} in perf_event_header.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1262901583-8074-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a89e5abe
    • A
      perf tools: Handle relocatable kernels · 56b03f3c
      Arnaldo Carvalho de Melo 提交于
      DSOs don't have this problem because the kernel emits a
      PERF_MMAP for each new executable mapping it performs on
      monitored threads.
      
      To fix the kernel case we simulate the same behaviour, by having
      'perf record' to synthesize a PERF_MMAP for the kernel, encoded
      like this:
      
      [root@doppio ~]# perf record -a -f sleep 1
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.344 MB perf.data (~15038 samples) ]
      [root@doppio ~]# perf report -D | head -10
      
      0xd0 [0x40]: event: 1
      .
      . ... raw event: size 64 bytes
      .  0000:  01 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 ......@........
      .  0010:  00 00 00 81 ff ff ff ff 00 00 00 00 00 00 00 00 ...............
      .  0020:  00 00 00 00 00 00 00 00 5b 6b 65 72 6e 65 6c 2e ........  [kernel
      .  0030:  6b 61 6c 6c 73 79 6d 73 2e 5f 74 65 78 74 5d 00  kallsyms._text]
      .  0xd0
      [0x40]: PERF_RECORD_MMAP 0/0: [0xffffffff81000000((nil)) @ (nil)]: [kernel.kallsyms._text]
      
      I.e. we identify such event as having:
      
       .pid      = 0
       .filename = [kernel.kallsyms.REFNAME]
       .start    = REFNAME addr in /proc/kallsyms at 'perf record' time
      
      and use now a hardcoded value of '.text' for REFNAME.
      
      Then, later, in 'perf report', if there are any kernel hits and
      thus we need to resolve kernel symbols, we search for REFNAME
      and if its address changed, relocation happened and we thus must
      change the kernel mapping routines to one that uses .pgoff as
      the relocation to apply.
      
      This way we use the same mechanism used for the other DSOs and
      don't have to do a two pass in all the kernel symbols.
      Reported-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: <1262717431-1246-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      56b03f3c
  14. 28 12月, 2009 6 次提交
  15. 16 12月, 2009 2 次提交
  16. 15 12月, 2009 1 次提交
  17. 14 12月, 2009 4 次提交
    • A
      perf session: Move kmaps to perf_session · 4aa65636
      Arnaldo Carvalho de Melo 提交于
      There is still some more work to do to disentangle map creation
      from DSO loading, but this happens only for the kernel, and for
      the early adopters of perf diff, where this disentanglement
      matters most, we'll be testing different kernels, so no problem
      here.
      
      Further clarification: right now we create the kernel maps for
      the various modules and discontiguous kernel text maps when
      loading the DSO, we should do it as a two step process, first
      creating the maps, for multiple mappings with the same DSO
      store, then doing the dso load just once, for the first hit on
      one of the maps sharing this DSO backing store.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260741029-4430-6-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4aa65636
    • A
      perf session: Move the global threads list to perf_session · b3165f41
      Arnaldo Carvalho de Melo 提交于
      So that we can process two perf.data files.
      
      We still need to add a O_MMAP mode for perf_session so that we
      can do all the mmap stuff in it.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260741029-4430-5-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b3165f41
    • A
      perf session: Reduce the number of parms to perf_session__process_events · ec913369
      Arnaldo Carvalho de Melo 提交于
      By having the cwd/cwdlen in the perf_session struct and
      full_paths in perf_event_ops.
      
      Now its just a matter of passing the ops.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260741029-4430-4-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ec913369
    • A
      perf session: Ditch register_perf_file_handler · 301a0b02
      Arnaldo Carvalho de Melo 提交于
      Pass the event_ops to perf_session__process_events instead.
      
      Also move the event_ops definition to session.h, starting to
      move things around to their right place, trimming the many
      unneeded headers we have.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260741029-4430-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      301a0b02
  18. 12 12月, 2009 1 次提交
    • A
      perf tools: Introduce perf_session class · 94c744b6
      Arnaldo Carvalho de Melo 提交于
      That does all the initialization boilerplate, opening the file,
      reading the header, checking if it is valid, etc.
      
      And that will as well have the threads list, kmap (now) global
      variable, etc, so that we can handle two (or more) perf.data files
      describing sessions to compare.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260573842-19720-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      94c744b6