1. 20 2月, 2010 1 次提交
    • D
      nmi_watchdog: Fix undefined 'apic' build bug · 2cc4452b
      Don Zickus 提交于
      Ingo provided me a config that fails to compile with:
      
        arch/x86/built-in.o: In function
        `arch_trigger_all_cpu_backtrace': (.text+0x17e78): undefined
        reference to `apic' make: *** [.tmp_vmlinux1] Error 1
      
      I realized I changed the compile behaviour of the nmi code by
      not wrapping it with CONFIG_LOCAL_APIC.  To fix this I add a
      compile check for ARCH_HAS_NMI_WATCHDOG around
      arch_trigger_all_cpu_backtrace.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: a.p.zijlstra@chello.nl
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      LKML-Reference: <1266548212-24243-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2cc4452b
  2. 17 2月, 2010 2 次提交
    • D
      nmi_watchdog: Properly configure for software events · 96ca4028
      Don Zickus 提交于
      Paul Mackerras brought up a good point that when fallbacking to
      software events, I may have been lucky in my configuration.
      
      Modified the code to explicit provide a new configuration for
      software events.
      Suggested-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1266357745-26671-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      96ca4028
    • D
      nmi_watchdog: support for oprofile · 6081b6cd
      Don Zickus 提交于
      Re-arrange the code so that when someone disables nmi_watchdog
      with:
      
        echo 0 > /proc/sys/kernel/nmi_watchdog
      
      it releases the hardware reservation on the PMUs.  This allows
      the oprofile module to grab those PMUs and do its thing.
      Otherwise oprofile fails to load because the hardware is
      reserved by the perf_events subsystem.
      
      Tested using:
      
        oprofile --vm-linux --start
      
      and watched it failed when nmi_watchdog is enabled and succeed
      when:
      
        oprofile --deinit && echo 0 > /proc/sys/kernel/nmi_watchdog
      
      is run.
      
      Note:  this has the side quirk of having the nmi_watchdog latch
      onto the software events instead of hardware events if oprofile
      has already reserved the hardware first.  User beware! :-)
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: eranian@google.com
      LKML-Reference: <1266357892-30504-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6081b6cd
  3. 14 2月, 2010 3 次提交
    • D
      nmi_watchdog: Fallback to software events when no hardware pmu detected · cf454aec
      Don Zickus 提交于
      Not all arches have a PMU or have perf_event support for their
      PMU.  The nmi_watchdog will fail in those cases.  Fallback to
      using software events to generate nmi_watchdog traffic with
      local apic interrupts.
      
      Tested on a Pentium4 and it worked as expected, excepting for
      detecting cpu lockups.
      
      The problem with using software events as a cpu lock up detector
      is the nmi_watchdog uses the logic that if local apic interrupts
      stop incrementing then the cpu is probably locked up.  But with
      software events we use the local apic to trigger the
      nmi_watchdog callback to see if local apic interrupts are still
      firing, which obviously they are otherwise we wouldn't have been
      triggered.
      
      The algorithm to detect cpu lock ups is the same as the old
      nmi_watchdog. Perhaps we need to find a better way to detect
      lock ups?
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: peterz@infradead.org
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      LKML-Reference: <1266013161-31197-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cf454aec
    • D
      nmi_watchdog: Compile and portability fixes · 504d7cf1
      Don Zickus 提交于
      The original patch was x86_64 centric.  Changed the code to make
      it less so.
      
      ested by building and running on a powerpc.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: peterz@infradead.org
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      LKML-Reference: <1266013161-31197-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      504d7cf1
    • D
      nmi_watchdog: Use a boolean config flag for compiling · c3128fb6
      Don Zickus 提交于
      Determines if an arch has setup arch specific perf_events and
      nmi_watchdog code.  This should restrict compiles to only those
      arches ready.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: peterz@infradead.org
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      LKML-Reference: <1266013161-31197-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c3128fb6
  4. 09 2月, 2010 1 次提交
    • I
      nmi_watchdog: Only enable on x86 for now · 8e7672cd
      Ingo Molnar 提交于
      It wont even build on other platforms just yet - so restrict it
      to x86 for now.
      
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8e7672cd
  5. 08 2月, 2010 6 次提交
    • D
      nmi_watchdog: Config option to enable new nmi_watchdog · 84e478c6
      Don Zickus 提交于
      These are the bits that enable the new nmi_watchdog and safely
      isolate the old nmi_watchdog.  Only one or the other can run,
      not both at the same time.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      84e478c6
    • D
      nmi_watchdog: Add new, generic implementation, using perf events · 1fb9d6ad
      Don Zickus 提交于
      This is a new generic nmi_watchdog implementation using the perf
      events infrastructure as suggested by Ingo.
      
      The implementation is simple, just create an in-kernel perf
      event and register an overflow handler to check for cpu lockups.
      
      I created a generic implementation that lives in kernel/ and
      the hardware specific part that for now lives in arch/x86.
      
      This approach has a number of advantages:
      
       - It simplifies the x86 PMU implementation in the long run,
         in that it removes the hardcoded low-level PMU implementation
         that was the NMI watchdog before.
      
       - It allows new NMI watchdog features to be added in a central
         place.
      
       - It allows other architectures to enable the NMI watchdog,
         as long as they have perf events (that provide NMIs)
         implemented.
      
       - It also allows for more graceful co-existence of existing
         perf events apps and the NMI watchdog - before these changes
         the relationship was exclusive. (The NMI watchdog will 'spend'
         a perf event when enabled. In later iterations we might be
         able to piggyback from an existing NMI event without having
         to allocate a hardware event for the NMI watchdog - turning
         this into a no-hardware-cost feature.)
      
      As for compatibility, we'll keep the old NMI watchdog code as
      well until the new one can 100% replace it on all CPUs, old and
      new alike.  That might take some time as the NMI watchdog has
      been ported to many CPU models.
      
      I have done light testing to make sure the framework works
      correctly and it does.
      
       v2: Set the correct timeout values based on the old nmi
           watchdog
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1fb9d6ad
    • D
      x86: Move notify_die from nmi.c to traps.c · e40b1720
      Don Zickus 提交于
      In order to handle a new nmi_watchdog approach, I need to move
      the notify_die() routine out of nmi_watchdog_tick() and into
      default_do_nmi(). This lets me easily swap out the old
      nmi_watchdog with the new one with just a config change.
      
      The change probably makes sense from a high level perspective
      because the nmi_watchdog shouldn't be handling notify_die
      routines anyway.  However, this move does change the semantics a
      little bit.  Instead of checking on every nmi interrupt if the
      cpus are stuck, only check them on the nmi_watchdog interrupts.
      
       v2: Move notify_die call into #idef block
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e40b1720
    • M
      x86/alternatives: Fix build warning · 076dc4a6
      Masami Hiramatsu 提交于
      Fixes these warnings:
      
       arch/x86/kernel/alternative.c: In function 'alternatives_text_reserved':
       arch/x86/kernel/alternative.c:402: warning: comparison of distinct pointer types lacks a cast
       arch/x86/kernel/alternative.c:402: warning: comparison of distinct pointer types lacks a cast
       arch/x86/kernel/alternative.c:405: warning: comparison of distinct pointer types lacks a cast
       arch/x86/kernel/alternative.c:405: warning: comparison of distinct pointer types lacks a cast
      
      Caused by:
      
        2cfa1978: ftrace/alternatives: Introducing *_text_reserved functions
      
      Changes in v2:
        - Use local variables to compare, instead of type casts.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      LKML-Reference: <20100205171647.15750.37221.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      076dc4a6
    • A
      perf top: Use address pattern in lookup_sym_source · 5f485364
      Arnaldo Carvalho de Melo 提交于
      Because we may have aliases, like __GI___strcoll_l in
      /lib64/libc-2.10.2.so that appears in objdump as:
      
      $ objdump --start-address=0x0000003715a86420 \
                 --stop-address=0x0000003715a872dc -dS /lib64/libc-2.10.2.so
      
      0000003715a86420 <__strcoll_l>:
        3715a86420:	55                   	push   %rbp
        3715a86421:	48 89 e5             	mov    %rsp,%rbp
        3715a86424:	41 57                	push   %r15
      [root@doppio linux-2.6-tip]#
      
      So look for the address exactly at the start of the line instead
      so that annotation can work for in these cases.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265550376-12665-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5f485364
    • K
      perf top: Fix annotate for userspace · ee11b90b
      Kirill Smelkov 提交于
      First, for programs and prelinked libraries, annotate code was
      fooled by objdump output IPs (src->eip in the code) being
      wrongly converted to absolute IPs. In such case there were no
      conversion needed, but in
      
         src->eip = strtoull(src->line, NULL, 16);
         src->eip = map->unmap_ip(map, src->eip); // = eip + map->start - map->pgoff
      
      we were reading absolute address from objdump (e.g. 8048604) and
      then almost doubling it, because eip & map->start are
      approximately close for small programs.
      
      Needless to say, that later, in record_precise_ip() there was no
      matching with real runtime IPs.
      
      And second, like with `perf annotate` the problem with
      non-prelinked *.so was that we were doing rip -> objdump address
      conversion wrong.
      
      Also, because unlike `perf annotate`, `perf top` code does
      annotation based on absolute IPs for performance reasons(*), new
      helper for mapping objdump addresse to IP is introduced.
      
      (*) we get samples info in absolute IPs, and since we do lots of
          hit-testing on absolute IPs at runtime in record_precise_ip(), it's
          better to convert objdump addresses to IPs once and do no conversion
          at runtime.
      
      I also had to fix how objdump output is parsed (with hardcoded
      8/16 characters format, which was inappropriate for ET_DYN dsos
      with small addresses like '4ac')
      
      Also note, that not all objdump output lines has associtated
      IPs, e.g. look at source lines here:
      
          000004ac <my_strlen>:
          extern "C"
          int my_strlen(const char *s)
           4ac:   55                      push   %ebp
           4ad:   89 e5                   mov    %esp,%ebp
           4af:   83 ec 10                sub    $0x10,%esp
          {
              int len = 0;
           4b2:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%ebp)
           4b9:   eb 08                   jmp    4c3 <my_strlen+0x17>
      
              while (*s) {
                  ++len;
           4bb:   83 45 fc 01             addl   $0x1,-0x4(%ebp)
                  ++s;
           4bf:   83 45 08 01             addl   $0x1,0x8(%ebp)
      
      So we mark them with eip=0, and ignore such lines in annotate
      lookup code.
      Signed-off-by: NKirill Smelkov <kirr@landau.phys.spbu.ru>
      [ Note: one hunk of this patch was applied by Mike in 57d81889 ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1265550376-12665-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ee11b90b
  6. 05 2月, 2010 1 次提交
    • M
      kprobes: Add mcount to the kprobes blacklist · 5ecaafdb
      Masami Hiramatsu 提交于
      Since mcount function can be called from everywhere,
      it should be blacklisted. Moreover, the "mcount" symbol
      is a special symbol name. So, it is better to put it in
      the generic blacklist.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100205062433.3745.36726.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5ecaafdb
  7. 04 2月, 2010 20 次提交
    • I
      perf tools: Fix session init on non-modular kernels · 2161db96
      Ingo Molnar 提交于
      perf top and perf record refuses to initialize on non-modular kernels:
      refuse to initialize:
      
       $ perf top -v
        map_groups__set_modules_path_dir: cannot open /lib/modules/2.6.33-rc6-tip-00586-g398dde3-dirty/
      
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2161db96
    • X
      perf tools: Clean up O_LARGEFILE et al usage · f887f301
      Xiao Guangrong 提交于
      Setting _FILE_OFFSET_BITS and using O_LARGEFILE, lseek64, etc,
      is redundant. Thanks H. Peter Anvin for pointing it out.
      
      So, this patch removes O_LARGEFILE, lseek64, etc.
      Suggested-by: N"H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4B6A8972.3070605@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f887f301
    • S
      perf_events, x86: Fix bug in hw_perf_enable() · 447a194b
      Stephane Eranian 提交于
      We cannot assume that because hwc->idx == assign[i], we can avoid
      reprogramming the counter in hw_perf_enable().
      
      The event may have been scheduled out and another event may have been
      programmed into this counter. Thus, we need a more robust way of
      verifying if the counter still contains config/data related to an event.
      
      This patch adds a generation number to each counter on each cpu. Using
      this mechanism we can verify reliabilty whether the content of a counter
      corresponds to an event.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4b66dc67.0b38560a.1635.ffffae18@mx.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      447a194b
    • P
      bitops: Ensure the compile time HWEIGHT is only used for such · fce877e3
      Peter Zijlstra 提交于
      Avoid accidental misuse by failing to compile things
      Suggested-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fce877e3
    • P
      perf_events, x86: Implement intel core solo/duo support · 8c48e444
      Peter Zijlstra 提交于
      Implement Intel Core Solo/Duo, aka.
      Intel Architectural Performance Monitoring Version 1.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8c48e444
    • P
      perf_events: Optimize perf_event_task_tick() · 9717e6cd
      Peter Zijlstra 提交于
      Pretty much all of the calls do perf_disable/perf_enable cycles, pull
      that out to cut back on hardware programming.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9717e6cd
    • M
      ftrace: Remove record freezing · f24bb999
      Masami Hiramatsu 提交于
      Remove record freezing. Because kprobes never puts probe on
      ftrace's mcount call anymore, it doesn't need ftrace to check
      whether kprobes on it.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100202214925.4694.73469.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f24bb999
    • M
      kprobes: Check probe address is reserved · 4554dbcb
      Masami Hiramatsu 提交于
      Check whether the address of new probe is already reserved by
      ftrace or alternatives (on x86) when registering new probe.
      If reserved, it returns an error and not register the probe.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <20100202214918.4694.94179.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4554dbcb
    • M
      ftrace/alternatives: Introducing *_text_reserved functions · 2cfa1978
      Masami Hiramatsu 提交于
      Introducing *_text_reserved functions for checking the text
      address range is partially reserved or not. This patch provides
      checking routines for x86 smp alternatives and dynamic ftrace.
      Since both functions modify fixed pieces of kernel text, they
      should reserve and protect those from other dynamic text
      modifier, like kprobes.
      
      This will also be extended when introducing other subsystems
      which modify fixed pieces of kernel text. Dynamic text modifiers
      should avoid those.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <20100202214911.4694.16587.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2cfa1978
    • M
      kprobes: Disable booster when CONFIG_PREEMPT=y · 615d0ebb
      Masami Hiramatsu 提交于
      Disable kprobe booster when CONFIG_PREEMPT=y at this time,
      because it can't ensure that all kernel threads preempted on
      kprobe's boosted slot run out from the slot even using
      freeze_processes().
      
      The booster on preemptive kernel will be resumed if
      synchronize_tasks() or something like that is introduced.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100202214904.4694.24330.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      615d0ebb
    • M
      perf annotate: Fix perf top module symbol annotation · 57d81889
      Mike Galbraith 提交于
      Signed-off-by: NMike Galbraith <efault@gmx.de>
      Cc: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1265265106.6364.5.camel@marge.simson.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      57d81889
    • K
      perf top: Teach it to autolocate vmlinux · 6cff0e8d
      Kirill Smelkov 提交于
      By relying on logic in dso__load_kernel_sym(), we can
      automatically load vmlinux.
      
      The only thing which needs to be adjusted, is how --sym-annotate
      option is handled - now we can't rely on vmlinux been loaded
      until full successful pass of dso__load_vmlinux(), but that's
      not the case if we'll do sym_filter_entry setup in
      symbol_filter().
      
      So move this step right after event__process_sample() where we
      know the whole dso__load_kernel_sym() pass is done.
      
      By the way, though conceptually similar `perf top` still can't
      annotate userspace - see next patches with fixes.
      Signed-off-by: NKirill Smelkov <kirr@landau.phys.spbu.ru>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1265223128-11786-9-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6cff0e8d
    • K
      perf annotate: Fix it for non-prelinked *.so · 7a2b6209
      Kirill Smelkov 提交于
      The problem was we were incorrectly calculating objdump
      addresses for sym->start and sym->end, look:
      
      For simple ET_DYN type DSO (*.so) with one function, objdump -dS
      output is something like this:
      
          000004ac <my_strlen>:
          int my_strlen(const char *s)
           4ac:   55                      push   %ebp
           4ad:   89 e5                   mov    %esp,%ebp
           4af:   83 ec 10                sub    $0x10,%esp
          {
      
      i.e. we have relative-to-dso-mapping IPs (=RIP) there.
      
      For ET_EXEC type and probably for prelinked libs as well (sorry
      can't test - I don't use prelink) objdump outputs absolute IPs,
      e.g.
      
          08048604 <zz_strlen>:
          extern "C"
          int zz_strlen(const char *s)
           8048604:       55                      push   %ebp
           8048605:       89 e5                   mov    %esp,%ebp
           8048607:       83 ec 10                sub    $0x10,%esp
          {
      
      So, if sym->start is always relative to dso mapping(*), we'll
      have to unmap it for ET_EXEC like cases, and leave as is for
      ET_DYN cases.
      
      (*) and it is - we've explicitely made it relative. Look for
          adjust_symbols handling in dso__load_sym()
      
      Previously we were always unmapping sym->start and for ET_DYN
      dsos resulting addresses were wrong, and so objdump output was
      empty.
      
      The end result was that perf annotate output for symbols from
      non-prelinked *.so had always 0.00% percents only, which is
      wrong.
      
      To fix it, let's introduce a helper for converting rip to
      objdump address, and also let's document what map_ip() and
      unmap_ip() do -- I had to study sources for several hours to
      understand it.
      Signed-off-by: NKirill Smelkov <kirr@landau.phys.spbu.ru>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1265223128-11786-8-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7a2b6209
    • A
      perf tools: Adjust some verbosity levels · 29a9f66d
      Arnaldo Carvalho de Melo 提交于
      Not to pollute too much 'perf annotate' debugging sessions.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-7-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      29a9f66d
    • A
      perf record: Stop intercepting events, use postprocessing to get build-ids · 6122e4e4
      Arnaldo Carvalho de Melo 提交于
      We want to stream events as fast as possible to perf.data, and
      also in the future we want to have splice working, when no
      interception will be possible.
      
      Using build_id__mark_dso_hit_ops to create the list of DSOs that
      back MMAPs we also optimize disk usage in the build-id cache by
      only caching DSOs that had hits.
      Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-6-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6122e4e4
    • A
      perf build-id: Move the routine to find DSOs with hits to the lib · 7b2567c1
      Arnaldo Carvalho de Melo 提交于
      Because 'perf record' will have to find the build-ids in after
      we stop recording, so as to reduce even more the impact in the
      workload while we do the measurement.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-5-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7b2567c1
    • A
      perf probe: Don't use a perf_session instance just to resolve symbols · 8ad94c60
      Arnaldo Carvalho de Melo 提交于
      With the recent modifications done to untie the session and
      symbol layers, 'perf probe' now can use just the symbols layer.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8ad94c60
    • A
      perf symbols: Ditch vdso global variable · 8d92c02a
      Arnaldo Carvalho de Melo 提交于
      We can check using strcmp, most DSOs don't start with '[' so the
      test is cheap enough and we had to test it there anyway since
      when reading perf.data files we weren't calling the routine that
      created this global variable and thus weren't setting it as
      "loaded", which was causing a bogus:
      
        Failed to open [vdso], continuing without symbols
      
      Message as the first line of 'perf report'.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-3-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8d92c02a
    • A
      perf symbols: Fixup vsyscall maps · 6275ce2d
      Arnaldo Carvalho de Melo 提交于
      While debugging a problem reported by Pekka Enberg by printing
      the IP and all the maps for a thread when we don't find a map
      for an IP I noticed that dso__load_sym needs to fixup these
      extra maps it creates to hold symbols in different ELF sections
      than the main kernel one.
      
      Now we're back showing things like:
      
      [root@doppio linux-2.6-tip]# perf report | grep vsyscall
           0.02%             mutt  [kernel.kallsyms].vsyscall_fn  [.] vread_hpet
           0.01%            named  [kernel.kallsyms].vsyscall_fn  [.] vread_hpet
           0.01%   NetworkManager  [kernel.kallsyms].vsyscall_fn  [.] vread_hpet
           0.01%         gconfd-2  [kernel.kallsyms].vsyscall_0   [.] vgettimeofday
           0.01%  hald-addon-rfki  [kernel.kallsyms].vsyscall_fn  [.] vread_hpet
           0.00%      dbus-daemon  [kernel.kallsyms].vsyscall_fn  [.] vread_hpet
      [root@doppio linux-2.6-tip]#
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6275ce2d
    • A
      perf symbols: Remove perf_session usage in symbols layer · 9de89fe7
      Arnaldo Carvalho de Melo 提交于
      I noticed while writing the first test in 'perf regtest' that to
      just test the symbol handling routines one needs to create a
      perf session, that is a layer centered on a perf.data file,
      events, etc, so I untied these layers.
      
      This reduces the complexity for the users as the number of
      parameters to most of the symbols and session APIs now was
      reduced while not adding more state to all the map instances by
      only having data that is needed to split the kernel (kallsyms
      and ELF symtab sections) maps and do vmlinux relocation on the
      main kernel map.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9de89fe7
  8. 03 2月, 2010 1 次提交
    • X
      perf tools: Use O_LARGEFILE to open perf data file · b8f46c5a
      Xiao Guangrong 提交于
      Open perf data file with O_LARGEFILE flag since its size is
      easily larger that 2G.
      
      For example:
      
       # rm -rf perf.data
       # ./perf kmem record sleep 300
      
       [ perf record: Woken up 0 times to write data ]
       [ perf record: Captured and wrote 3142.147 MB perf.data
       (~137282513 samples) ]
      
       # ll -h perf.data
       -rw------- 1 root root 3.1G .....
      Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4B68F32A.9040203@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b8f46c5a
  9. 31 1月, 2010 5 次提交