1. 19 Oct, 2010 1 commit
    • irq_work: Add generic hardirq context callbacks · e360adbe
      Peter Zijlstra committed
      Provide a mechanism that allows running code in IRQ context. It is
      most useful for NMI code that needs to interact with the rest of the
      system, for example to wake up a task to drain buffers.
      
      Perf currently has such a mechanism, so extract that and provide it as
      a generic feature, independent of perf so that others may also
      benefit.
      
      The IRQ context callback is generated through self-IPIs where
      possible, or on architectures like powerpc the decrementer (the
      built-in timer facility) is set to generate an interrupt immediately.
      
      Architectures that don't have anything like this have to make do with
      a callback from the timer tick. These architectures can call
      irq_work_run() at the tail of any IRQ handlers that might enqueue such
      work (like the perf IRQ handler) to avoid undue latencies in
      processing the work.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      [ various fixes ]
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
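The queue-then-run pattern that the irq_work commit describes can be illustrated with a toy, single-CPU model. This is only a sketch: `IrqWorkQueue`, `queue`, `run` and `interrupt_raised` are hypothetical names made up for illustration, not the kernel API, and the "interrupt" is just a flag standing in for the self-IPI or decrementer.

```python
from collections import deque


class IrqWorkQueue:
    """Toy, single-CPU model of deferred hard-IRQ callbacks (not kernel code)."""

    def __init__(self):
        self._pending = deque()        # work waiting for IRQ context
        self._queued = set()           # names of work items already pending
        self.interrupt_raised = False  # stands in for the self-IPI / decrementer

    def queue(self, name, callback):
        """Called from 'NMI context': record the work and request an interrupt."""
        if name in self._queued:
            return False  # already pending, do not enqueue twice
        self._queued.add(name)
        self._pending.append((name, callback))
        self.interrupt_raised = True
        return True

    def run(self):
        """Called from 'IRQ context': drain and run everything that was queued."""
        self.interrupt_raised = False
        while self._pending:
            name, callback = self._pending.popleft()
            self._queued.discard(name)
            callback()


log = []
work = IrqWorkQueue()
first = work.queue("wakeup", lambda: log.append("woke drain task"))
second = work.queue("wakeup", lambda: log.append("woke drain task"))
work.run()
```

In this model the second `queue` call is a no-op because the same item is still pending, and `run` plays the role of the handler that an architecture would invoke from its interrupt path or, lacking one, from the timer tick.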
  2. 03 Aug, 2010 1 commit
  3. 31 Jul, 2010 1 commit
  4. 27 Jul, 2010 1 commit
  5. 14 Jul, 2010 1 commit
  6. 09 Jul, 2010 1 commit
  7. 08 Jul, 2010 1 commit
  8. 06 Jul, 2010 4 commits
  9. 22 Jun, 2010 1 commit
    • powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors · 5aae8a53
      K.Prasad committed
      Implement perf-events based hw-breakpoint interfaces for PowerPC
      64-bit server (Book III S) processors.  This allows access to a
      given location to be used as an event that can be counted or
      profiled by the perf_events subsystem.
      
      This is done using the DABR (data breakpoint register), which can
      also be used for process debugging via ptrace.  When perf_event
      hw_breakpoint support is configured in, the perf_event subsystem
      manages the DABR and arbitrates access to it, and ptrace then
      creates a perf_event when it is requested to set a data breakpoint.
      
      [Adopted suggestions from Paul Mackerras <paulus@samba.org> to
      - emulate_step() all system-wide breakpoints and single-step only the
        per-task breakpoints
      - perform arch-specific cleanup before unregistration through
        arch_unregister_hw_breakpoint()
      ]
      Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  10. 28 May, 2010 1 commit
  11. 25 May, 2010 1 commit
    • powerpc/kexec: Add support for FSL-BookE · b3df895a
      Sebastian Andrzej Siewior committed
      This adds kexec support on FSL-BookE, where the MMU cannot simply be
      switched off. The code borrows the initial MMU-setup code to create the
      identical mapping. The only difference from the original boot code
      is the size of the mapping(s) and the executable address.
      The kexec code maps the first 2 GiB of memory in 256 MiB steps. This
      should also work on e500v1 boxes.
      SMP support is still not available.
      
      (Kumar: Added minor change to build to ifdef CONFIG_PPC_STD_MMU_64 some
      code that was PPC64 specific)
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
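As a quick check of the numbers in the kexec commit, mapping the first 2 GiB in 256 MiB steps takes eight mappings. The sketch below assumes that "identical mapping" means virtual address equals physical address; `identity_map_steps` is a hypothetical helper used only for this arithmetic, not code from the patch.

```python
MIB = 1 << 20
GIB = 1 << 30


def identity_map_steps(total=2 * GIB, step=256 * MIB):
    """Return (virtual, physical, size) triples covering [0, total) one step at a time."""
    return [(addr, addr, step) for addr in range(0, total, step)]


steps = identity_map_steps()
for virt, phys, size in steps:
    print(f"map 0x{virt:08x} -> 0x{phys:08x} ({size // MIB} MiB)")
```

The last mapping starts at 0x70000000 and ends exactly at the 2 GiB boundary.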
  12. 07 Apr, 2010 1 commit
  13. 19 Mar, 2010 1 commit
    • powerpc: Remove IOMMU_VMERGE config option · 191aee58
      FUJITA Tomonori committed
      The description says:
      
       Cause IO segments sent to a device for DMA to be merged virtually
       by the IOMMU when they happen to have been allocated contiguously.
       This doesn't add pressure to the IOMMU allocator. However, some
       drivers don't support getting large merged segments coming back
       from *_map_sg().
      
       Most drivers don't have this problem; it is safe to say Y here.
      
      It's out of date. Long ago, drivers didn't have a way to tell IOMMUs
      about their segment length limit (that is, the maximum segment length
      that they can handle). So IOMMUs merged as many segments as possible
      and handed drivers segments that were too large.
      
      dma_get_max_seg_size() was introduced to solve the above
      problem. Device drivers can use the API to tell the IOMMU about the
      maximum segment length that they can handle. In addition, the default
      limit (64K) should be safe for everyone.
      
      So this config option seems to be unnecessary.
      
      Note that this config option just enables users to disable virtual
      merging by default. Users can still disable virtual merging with the
      boot parameter.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
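The effect of a per-device segment length limit on virtual merging can be sketched as follows. This is an illustrative model only, assuming segments are simple `(address, length)` pairs; `merge_segments` is a hypothetical function, not the IOMMU code, and only the 64K default comes from the commit message.

```python
DEFAULT_MAX_SEG_SIZE = 64 * 1024  # the default 64K limit mentioned in the commit


def merge_segments(segments, max_seg_size=DEFAULT_MAX_SEG_SIZE):
    """Merge address-contiguous (addr, length) segments without exceeding max_seg_size."""
    merged = []
    for addr, length in segments:
        if merged:
            prev_addr, prev_len = merged[-1]
            contiguous = prev_addr + prev_len == addr
            if contiguous and prev_len + length <= max_seg_size:
                # Extend the previous segment instead of starting a new one.
                merged[-1] = (prev_addr, prev_len + length)
                continue
        merged.append((addr, length))
    return merged


# Three contiguous 32K segments (sample data for illustration).
sample = [(0x1000, 0x8000), (0x9000, 0x8000), (0x11000, 0x8000)]
limited = merge_segments(sample)
unlimited = merge_segments(sample, max_seg_size=1 << 30)
```

With the 64K limit the first two segments merge and the third stays separate; with an effectively unlimited size all three collapse into one 96K segment, which is the situation that older drivers could not handle.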
  14. 13 Mar, 2010 1 commit
  15. 17 Feb, 2010 1 commit
    • powerpc/booke: Introduce new CONFIG options for advanced debug registers · 172ae2e7
      Dave Kleikamp committed
      powerpc/booke: Introduce new CONFIG options for advanced debug registers
      
      From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      
      Introduce new config options to simplify the ifdefs pertaining to the
      advanced debug registers for booke and 40x processors:
      
      CONFIG_PPC_ADV_DEBUG_REGS - boolean: true for dac-based processors
      CONFIG_PPC_ADV_DEBUG_IACS - number of IAC registers
      CONFIG_PPC_ADV_DEBUG_DACS - number of DAC registers
      CONFIG_PPC_ADV_DEBUG_DVCS - number of DVC registers
      CONFIG_PPC_ADV_DEBUG_DAC_RANGE - DAC ranges supported
      
      Beginning conservatively, since I only have the facilities to test 440
      hardware.  I believe all 40x and booke platforms support at least 2 IAC
      and 2 DAC registers.  For 440, 4 IAC and 2 DVC registers are enabled, as
      well as the DAC ranges.
      Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Acked-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  16. 03 Feb, 2010 1 commit
    • powerpc: Increase NR_IRQS Kconfig maximum to 32768 · 859aefc5
      Anton Blanchard committed
      With dynamic irq descriptors the overhead of a large NR_IRQS is much lower
      than it used to be. With more MSI-X capable adapters and drivers exploiting
      multiple vectors we may as well allow the user to increase it beyond the
      current maximum of 512.
      
      32768 seems large enough that we'd never have to bump it again (although I bet
      my prediction is horribly wrong). It boot tests OK and the vmlinux footprint
      increase is only around 500kB due to:
      
      struct irq_map_entry irq_map[NR_IRQS];
      
      We format /proc/interrupts correctly with the previous changes:
      
                   CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
        286:          0          0          0          0          0          0
        516:          0          0          0          0          0          0
      16689:       1833          0          0          0          0          0
      17157:          0          0          0          0          0          0
      17158:        319          0          0          0          0          0
      25092:          0          0          0          0          0          0
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
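The "around 500kB" figure in the NR_IRQS commit is consistent with a static `irq_map[NR_IRQS]` array growing from 512 to 32768 entries. The check below assumes each `struct irq_map_entry` occupies 16 bytes on a 64-bit build; that size is an assumption for illustration and is not stated in the commit message.

```python
ENTRY_SIZE = 16  # assumed sizeof(struct irq_map_entry) on a 64-bit build


def irq_map_bytes(nr_irqs, entry_size=ENTRY_SIZE):
    """Size in bytes of a static irq_map[nr_irqs] array."""
    return nr_irqs * entry_size


growth = irq_map_bytes(32768) - irq_map_bytes(512)
print(f"{growth} bytes = {growth / 1024:.0f} KiB")
```

Under that assumption the array grows by 516096 bytes, or 504 KiB.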
  17. 13 Dec, 2009 1 commit
  18. 09 Dec, 2009 1 commit
    • sysfs/cpu: Add probe/release files · 12633e80
      Nathan Fontenot committed
      Version 3 of this patch is updated with documentation added to
      Documentation/ABI.  There are no changes to any of the C code from v2
      of the patch.
      
      In order to support kernel DLPAR of CPU resources we need to provide an
      interface to add (probe) and remove (release) the resource from the system.
      This patch creates new generic probe and release sysfs files to facilitate
      cpu probe/release.  The probe/release interface lets each arch supply its
      own routines for implementing the backend of adding and removing cpus
      to/from the system.
      
      This also creates the powerpc specific stubs to handle the arch callouts
      from writes to the sysfs files.
      
      The creation and use of these files is regulated by the
      CONFIG_ARCH_CPU_PROBE_RELEASE option so that only architectures that need the
      capability will have the files created.
      Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  19. 24 Nov, 2009 1 commit
  20. 12 Nov, 2009 1 commit
  21. 30 Oct, 2009 4 commits
    • powerpc: Enable sparse irq_descs on powerpc · cd015707
      Michael Ellerman committed
      Defining CONFIG_SPARSE_IRQ enables generic code that gets rid of the
      static irq_desc array, and replaces it with an array of pointers to
      irq_descs.
      
      It also allows node local allocation of irq_descs, however we
      currently don't have the information available to do that, so we just
      allocate them all on node 0.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Make NR_IRQS a CONFIG option · 551b81f2
      Michael Ellerman committed
      The irq_desc array consumes quite a lot of space, and for systems
      that don't need or can't have 512 irqs it's just wasted space.
      
      The first 16 are reserved for ISA, so the minimum of 32 is really
      16 - and no one has asked for more than 512 so leave that as the
      maximum.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Make it possible to select hibernation on all PowerPCs · 64eb38a6
      Anton Vorontsov committed
      Just as with kexec, hibernation may fail even on well-tested platforms:
      some PCI device, a driver of which doesn't play well with hibernation,
      is enough to break resuming.
      
      Hibernation code is not very platform dependent, and hiding features only
      because they were not verified on particular hardware is
      counterproductive: we just prevent the features from being widely tested.
      
      For example, with this patch I just tested hibernation on a MPC83xx
      board, and it works quite well, modulo a few drivers that need some
      fixing.
      
      So, let's make it possible to select hibernation support for all
      PowerPCs, then let's wait for any possible bug reports, and actually fix
      (or just collect ;-) the bugs instead of hiding them. If some platforms
      really can't stand hibernation, we can make a blacklist, with proper
      comments why exactly hibernation doesn't work, whether it is possible to
      fix, and what needs to be done to fix it.
      
      CONFIG_HIBERNATION is still =n by default, so the commit doesn't change
      anything apart from the ability to set it to =y.
      
      I'm not sure if EXPERIMENTAL dependency is needed, I'd rather not add it
      for a few reasons:
      
      1) It doesn't matter much: with distro kernels the user has no clue that
         some feature is experimental. The majority of defconfigs enable
         EXPERIMENTAL anyway (90 vs. 4, which, btw, means that EXPERIMENTAL is
         overused in Kconfigs);
      
      2) EXPERIMENTAL is a good thing for features that change the default
         behaviour of a kernel, while for hibernation the user has to explicitly
         issue 'echo disk > /sys/power/state' to trigger any hibernation bugs;
      
      3) Per init/Kconfig, EXPERIMENTAL is a good thing to scare and discourage
         users from 'widespread use of a feature', while we want to encourage
         that use.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  22. 24 Sep, 2009 1 commit
    • powerpc: Increase NODES_SHIFT on 64bit from 4 to 8 · ea55bf29
      Anton Blanchard committed
      Some System p configurations can already have more than 16 nodes so we
      need to increase NODES_SHIFT. I chose 256 to give us some room to grow in the
      future, although we can look at something smaller if the memory bloat is
      considered too much.
      
      Unless we clamp MAX_ACTIVE_REGIONS we end up with 300kB of extra bloat in
      early_node_map in mm/page_alloc.c:
      
      < 6144   early_node_map
      > 307200 early_node_map
      
      due to:
      
          #if MAX_NUMNODES >= 32
            /* If there can be many nodes, allow up to 50 holes per node */
            #define MAX_ACTIVE_REGIONS (MAX_NUMNODES*50)
          #else
            /* By default, allow up to 256 distinct regions */
            #define MAX_ACTIVE_REGIONS 256
          #endif
      
      Since our memory is mostly contiguous it seems reasonable to keep this
      at 256 for now. I also set 32bit to 32 to save space (is there any chance
      a 32bit system will have more than 32 discontiguous memory ranges?).
      
      Even with that fixed we have a few data structures that grow:
      
      < 896   bootmem_node_data
      > 14336 bootmem_node_data
      
      < 1280  node_devices
      > 20480 node_devices
      
      < 25088 kmalloc_caches
      > 59648 kmalloc_caches
      
      < 1632  hstates
      > 21792 hstates
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
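The early_node_map figures in the NODES_SHIFT commit follow directly from the MAX_ACTIVE_REGIONS rule quoted in the message. The sketch below reproduces them, assuming `MAX_NUMNODES` is `1 << NODES_SHIFT` and inferring a 24-byte entry size from the reported 6144 bytes for 256 regions; the function names are hypothetical.

```python
ENTRY_SIZE = 6144 // 256  # bytes per early_node_map entry, inferred from the figures


def max_active_regions(nodes_shift):
    """Apply the MAX_ACTIVE_REGIONS rule quoted in the commit message."""
    max_numnodes = 1 << nodes_shift  # assumption: MAX_NUMNODES = 1 << NODES_SHIFT
    return max_numnodes * 50 if max_numnodes >= 32 else 256


def early_node_map_bytes(nodes_shift):
    return max_active_regions(nodes_shift) * ENTRY_SIZE


before = early_node_map_bytes(4)  # NODES_SHIFT = 4, 16 nodes
after = early_node_map_bytes(8)   # NODES_SHIFT = 8, 256 nodes
print(before, after)
```

With 256 nodes the rule yields 12800 regions, which accounts for the jump from 6144 to 307200 bytes unless MAX_ACTIVE_REGIONS is clamped.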
  23. 21 Sep, 2009 1 commit
    • perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar committed
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has outgrown its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in all, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks go to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
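The order of the substitutions in the first sed invocation of the rename script matters: the old `PERF_EVENT_` prefix has to become `PERF_RECORD_` before `PERF_COUNTER` is turned into `PERF_EVENT`, otherwise freshly renamed identifiers would be rewritten a second time. A minimal sketch of the same ordered rewrite, where `rename` is a hypothetical helper and the input strings are sample identifiers chosen for illustration:

```python
import re

# Ordered (pattern, replacement) pairs mirroring the first sed invocation.
RULES = [
    (r"PERF_EVENT_", "PERF_RECORD_"),
    (r"PERF_COUNTER", "PERF_EVENT"),
    (r"perf_counter", "perf_event"),
    (r"nb_counters", "nb_events"),
    (r"swcounter", "swevent"),
    (r"tpcounter_event", "tp_event"),
]


def rename(text, rules=RULES):
    """Apply each substitution in order, like successive sed -e expressions."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


in_order = rename("PERF_COUNTER_IOC_ENABLE")
wrong_order = rename("PERF_COUNTER_IOC_ENABLE", list(reversed(RULES)))
print(in_order, wrong_order)
```

Applying the rules in the script's order gives `PERF_EVENT_IOC_ENABLE`, while the reversed order rewrites the result again to `PERF_RECORD_IOC_ENABLE`.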
  24. 28 Aug, 2009 2 commits
  25. 20 Aug, 2009 2 commits
  26. 14 Aug, 2009 1 commit
    • powerpc64: convert to dynamic percpu allocator · c2a7e818
      Tejun Heo committed
      Now that percpu allows arbitrary embedding of the first chunk,
      powerpc64 can easily be converted to dynamic percpu allocator.
      Convert it.  powerpc supports several large page sizes.  Cap atom_size
      at 1M.  There isn't much to gain by going above that anyway.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  27. 26 Jun, 2009 1 commit
    • powerpc: Add irqtrace support for 32-bit powerpc · 5d38902c
      Benjamin Herrenschmidt committed
      Based on initial work from: Dale Farnsworth <dale@farnsworth.org>
      
      Add the low level irq tracing hooks for 32-bit powerpc needed
      to enable full lockdep functionality.
      
      The approach taken to deal with the code in entry_32.S is that
      we don't trace all the transitions of MSR:EE when we just turn
      it off to peek at TI_FLAGS without races. We trace only when we are
      calling into C code or returning from exceptions with a state
      that has changed from what lockdep thinks.
      
      There's a little bugger though: If we take an exception that
      keeps interrupts enabled (such as an alignment exception) while
      interrupts are enabled, we will call trace_hardirqs_on() on the
      way back spuriously. Not a big deal, but to get rid of it would
      require remembering in pt_regs that the exception was one of the
      type that kept interrupts enabled which we don't know at this
      stage. (Well, we could test all cases for regs->trap but that
      sucks too much).
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Tested-by: Kumar Gala <galak@kernel.crashing.org>
  28. 24 Jun, 2009 1 commit
    • percpu: use dynamic percpu allocator as the default percpu allocator · e74e3962
      Tejun Heo committed
      This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use
      dynamic percpu allocator.  The first chunk is allocated using
      embedding helper and 8k is reserved for modules.  This ensures that
      the new allocator behaves almost identically to the original allocator
      as far as static percpu variables are concerned, so it shouldn't
      introduce much breakage.
      
      s390 and alpha use a custom SHIFT_PERCPU_PTR() to work around the
      addressing range limit that the addressing model imposes.  Unfortunately,
      this breaks if the address is specified using a variable, so for now,
      the two archs aren't converted.
      
      The following architectures are affected by this change.
      
      * sh
      * arm
      * cris
      * mips
      * sparc(32)
      * blackfin
      * avr32
      * parisc (broken, under investigation)
      * m32r
      * powerpc(32)
      
      As this change makes the dynamic allocator the default one,
      CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its inverse -
      CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted
      archs.  These archs implement their own setup_per_cpu_areas() and the
      conversion is not trivial.
      
      * powerpc(64)
      * sparc(64)
      * ia64
      * alpha
      * s390
      
      Boot and batch alloc/free tests on x86_32 with debug code (x86_32
      doesn't use default first chunk initialization).  Compile tested on
      sparc(32), powerpc(32), arm and alpha.
      
      Kyle McMartin reported that this change breaks parisc.  The problem is
      still under investigation and he is okay with pushing this patch
      forward and fixing parisc later.
      
      [ Impact: use dynamic allocator for most archs w/o custom percpu setup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Bryan Wu <cooloney@kernel.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
  29. 18 Jun, 2009 1 commit
    • perf_counter: powerpc: Enable use of software counters on 32-bit powerpc · 105988c0
      Paul Mackerras committed
      This enables the perf_counter subsystem on 32-bit powerpc.  Since we
      don't have any support for hardware counters on 32-bit powerpc yet,
      only software counters can be used.
      
      Besides selecting HAVE_PERF_COUNTERS for 32-bit powerpc as well as
      64-bit, the main thing this does is add an implementation of
      set_perf_counter_pending().  This needs to arrange for
      perf_counter_do_pending() to be called when interrupts are enabled.
      Rather than add code to local_irq_restore as 64-bit does, the 32-bit
      set_perf_counter_pending() generates an interrupt by setting the
      decrementer to 1 so that a decrementer interrupt will become pending
      in 1 or 2 timebase ticks (if a decrementer interrupt isn't already
      pending).  When interrupts are enabled, timer_interrupt() will be
      called, and some new code in there calls perf_counter_do_pending().
      We use a per-cpu array of flags to indicate whether we need to call
      perf_counter_do_pending() or not.
      
      This introduces a couple of new Kconfig symbols: PPC_HAVE_PMU_SUPPORT,
      which is selected by processor families for which we have hardware PMU
      support (currently only PPC64), and PPC_PERF_CTRS, which enables the
      powerpc-specific perf_counter back-end.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linuxppc-dev@ozlabs.org
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19000.55404.103840.393470@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  30. 15 Jun, 2009 2 commits
  31. 09 Jun, 2009 1 commit