1. 09 7月, 2010 2 次提交
  2. 08 7月, 2010 1 次提交
  3. 15 6月, 2010 1 次提交
  4. 12 5月, 2010 1 次提交
    • P
      powerpc/perf_event: Fix oops due to perf_event_do_pending call · 0fe1ac48
      Paul Mackerras 提交于
      Anton Blanchard found that large POWER systems would occasionally
      crash in the exception exit path when profiling with perf_events.
      The symptom was that an interrupt would occur late in the exit path
      when the MSR[RI] (recoverable interrupt) bit was clear.  Interrupts
      should be hard-disabled at this point but they were enabled.  Because
      the interrupt was not recoverable the system panicked.
      
      The reason is that the exception exit path was calling
      perf_event_do_pending after hard-disabling interrupts, and
      perf_event_do_pending will re-enable interrupts.
      
      The simplest and cleanest fix for this is to use the same mechanism
      that 32-bit powerpc does, namely to cause a self-IPI by setting the
      decrementer to 1.  This means we can remove the tests in the exception
      exit path and raw_local_irq_restore.
      
      This also makes sure that the call to perf_event_do_pending from
      timer_interrupt() happens within irq_enter/irq_exit.  (Note that
      calling perf_event_do_pending from timer_interrupt does not mean that
      there is a possible 1/HZ latency; setting the decrementer to 1 ensures
      that the timer interrupt will happen immediately, i.e. within one
      timebase tick, which is a few nanoseconds or 10s of nanoseconds.)
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: stable@kernel.org
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0fe1ac48
  5. 06 5月, 2010 1 次提交
  6. 19 2月, 2010 1 次提交
  7. 17 2月, 2010 4 次提交
  8. 15 12月, 2009 1 次提交
  9. 09 12月, 2009 1 次提交
  10. 24 11月, 2009 1 次提交
  11. 30 10月, 2009 4 次提交
  12. 28 10月, 2009 1 次提交
    • A
      powerpc: tracing: Add powerpc tracepoints for interrupt entry and exit · 1bf4af16
      Anton Blanchard 提交于
      This adds powerpc-specific tracepoints for interrupt entry and exit.
      
      While we already have generic irq_handler_entry and irq_handler_exit
      tracepoints there are cases on our virtualised powerpc machines where an
      interrupt is presented to the OS, but subsequently handled by the hypervisor.
      This means no OS interrupt handler is invoked.
      
      Here is an example on a POWER6 machine with the patch below applied:
      
      <idle>-0     [006]  3243.949840744: irq_entry: pt_regs=c0000000ce31fb10
      <idle>-0     [006]  3243.949850520: irq_exit: pt_regs=c0000000ce31fb10
      
      <idle>-0     [007]  3243.950218208: irq_entry: pt_regs=c0000000ce323b10
      <idle>-0     [007]  3243.950224080: irq_exit: pt_regs=c0000000ce323b10
      
      <idle>-0     [000]  3244.021879320: irq_entry: pt_regs=c000000000a63aa0
      <idle>-0     [000]  3244.021883616: irq_handler_entry: irq=87 handler=eth0
      <idle>-0     [000]  3244.021887328: irq_handler_exit: irq=87 return=handled
      <idle>-0     [000]  3244.021897408: irq_exit: pt_regs=c000000000a63aa0
      
      Here we see two phantom interrupts (no handler was invoked), followed
      by a real interrupt for eth0. Without the tracepoints in this patch we
      would have missed the phantom interrupts.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      1bf4af16
  13. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  14. 12 6月, 2009 1 次提交
  15. 09 6月, 2009 1 次提交
  16. 21 5月, 2009 4 次提交
  17. 07 4月, 2009 1 次提交
  18. 06 4月, 2009 2 次提交
    • P
      perf_counter: unify and fix delayed counter wakeup · 925d519a
      Peter Zijlstra 提交于
      While going over the wakeup code I noticed delayed wakeups only work
      for hardware counters but basically all software counters rely on
      them.
      
      This patch unifies and generalizes the delayed wakeup to fix this
      issue.
      
      Since we're dealing with NMI context bits here, use a cmpxchg() based
      single link list implementation to track counters that have pending
      wakeups.
      
      [ This should really be generic code for delayed wakeups, but since we
        cannot use cmpxchg()/xchg() in generic code, I've let it live in the
        perf_counter code. -- Eric Dumazet could use it to aggregate the
        network wakeups. ]
      
      Furthermore, the x86 method of using TIF flags was flawed in that its
      quite possible to end up setting the bit on the idle task, loosing the
      wakeup.
      
      The powerpc method uses per-cpu storage and does appear to be
      sufficient.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      925d519a
    • P
      perf_counter: abstract wakeup flag setting in core to fix powerpc build · b6c5a71d
      Paul Mackerras 提交于
      Impact: build fix for powerpc
      
      Commit bd753921015e7905 ("perf_counter: software counter event
      infrastructure") introduced a use of TIF_PERF_COUNTERS into the core
      perfcounter code.  This breaks the build on powerpc because we use
      a flag in a per-cpu area to signal wakeups on powerpc rather than
      a thread_info flag, because the thread_info flags have to be
      manipulated with atomic operations and are thus slower than per-cpu
      flags.
      
      This fixes the by changing the core to use an abstracted
      set_perf_counter_pending() function, which is defined on x86 to set
      the TIF_PERF_COUNTERS flag and on powerpc to set the per-cpu flag
      (paca->perf_counter_pending).  It changes the previous powerpc
      definition of set_perf_counter_pending to not take an argument and
      adds a clear_perf_counter_pending, so as to simplify the definition
      on x86.
      
      On x86, set_perf_counter_pending() is defined as a macro.  Defining
      it as a static inline in arch/x86/include/asm/perf_counters.h causes
      compile failures because <asm/perf_counters.h> gets included early in
      <linux/sched.h>, and the definitions of set_tsk_thread_flag etc. are
      therefore not available in <asm/perf_counters.h>.  (On powerpc this
      problem is avoided by defining set_perf_counter_pending etc. in
      <asm/hw_irq.h>.)
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      b6c5a71d
  19. 11 3月, 2009 1 次提交
  20. 13 1月, 2009 1 次提交
  21. 11 1月, 2009 1 次提交
    • Y
      sparseirq: use kstat_irqs_cpu instead · dee4102a
      Yinghai Lu 提交于
      Impact: build fix
      
      Ingo Molnar wrote:
      
      > tip/arch/blackfin/kernel/irqchip.c: In function 'show_interrupts':
      > tip/arch/blackfin/kernel/irqchip.c:85: error: 'struct kernel_stat' has no member named 'irqs'
      > make[2]: *** [arch/blackfin/kernel/irqchip.o] Error 1
      > make[2]: *** Waiting for unfinished jobs....
      >
      
      So could move kstat_irqs array to irq_desc struct.
      
      (s390, m68k, sparc) are not touched yet, because they don't support genirq
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dee4102a
  22. 09 1月, 2009 1 次提交
    • P
      powerpc: Provide a way to defer perf counter work until interrupts are enabled · 93a6d3ce
      Paul Mackerras 提交于
      Because 64-bit powerpc uses lazy (soft) interrupt disabling, it is
      possible for a performance monitor exception to come in when the
      kernel thinks interrupts are disabled (i.e. when they are
      soft-disabled but hard-enabled).  In such a situation the performance
      monitor exception handler might have some processing to do (such as
      process wakeups) which can't be done in what is effectively an NMI
      handler.
      
      This provides a way to defer that work until interrupts get enabled,
      either in raw_local_irq_restore() or by returning from an interrupt
      handler to code that had interrupts enabled.  We have a per-processor
      flag that indicates that there is work pending to do when interrupts
      subsequently get re-enabled.  This flag is checked in the interrupt
      return path and in raw_local_irq_restore(), and if it is set,
      perf_counter_do_pending() is called to do the pending work.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      93a6d3ce
  23. 13 12月, 2008 1 次提交
  24. 16 9月, 2008 2 次提交
    • S
      powerpc: Make the irq reverse mapping radix tree lockless · 150c6c8f
      Sebastien Dugue 提交于
      The radix trees used by interrupt controllers for their irq reverse
      mapping (currently only the XICS found on pSeries) have a complex
      locking scheme dating back to before the advent of the lockless radix
      tree.
      
      This takes advantage of the lockless radix tree and of the fact that
      the items of the tree are pointers to a static array (irq_map)
      elements which can never go under us to simplify the locking.
      
      Concurrency between readers and writers is handled by the intrinsic
      properties of the lockless radix tree.  Concurrency between writers is
      handled with a global mutex.
      Signed-off-by: NSebastien Dugue <sebastien.dugue@bull.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      150c6c8f
    • S
      powerpc: Separate the irq radix tree insertion and lookup · 967e012e
      Sebastien Dugue 提交于
      irq_radix_revmap() currently serves 2 purposes, irq mapping lookup
      and insertion which happen in interrupt and process context respectively.
      
      Separate the function into its 2 components, one for lookup only and one
      for insertion only.
      
      Fix the only user of the revmap tree (XICS) to use the new functions.
      
      Also, move the insertion into the radix tree of those irqs that were
      requested before it was initialized at said tree initialization.
      
      Mutual exclusion between the tree initialization and readers/writers is
      handled via a state variable (revmap_trees_allocated) set to 1 when the tree
      has been initialized and set to 2 after the already requested irqs have been
      inserted in the tree by the init path. This state is checked before any reader
      or writer access just like we used to check for tree.gfp_mask != 0 before.
      
      Finally, now that we're not any longer inserting nodes into the radix-tree
      in interrupt context, turn the GFP_ATOMIC allocations into GFP_KERNEL ones.
      Signed-off-by: NSebastien Dugue <sebastien.dugue@bull.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      967e012e
  25. 04 8月, 2008 1 次提交
  26. 16 6月, 2008 1 次提交
  27. 09 6月, 2008 1 次提交
  28. 03 6月, 2008 1 次提交