1. 30 10月, 2009 1 次提交
  2. 24 9月, 2009 1 次提交
    • A
      powerpc: Increase NODES_SHIFT on 64bit from 4 to 8 · ea55bf29
      Anton Blanchard 提交于
      Some System p configurations can already have more than 16 nodes so we
      need to increase NODES_SHIFT. I chose 256 to give us some room to grow in the
      future, although we can look at something smaller if the memory bloat is
      considered too much.
      
      Unless we clamp MAX_ACTIVE_REGIONS we end up with 300kB of extra bloat in
      early_node_map in mm/page_alloc.c:
      
      < 6144   early_node_map
      > 307200 early_node_map
      
      due to:
      
          #if MAX_NUMNODES >= 32
            /* If there can be many nodes, allow up to 50 holes per node */
            #define MAX_ACTIVE_REGIONS (MAX_NUMNODES*50)
          #else
            /* By default, allow up to 256 distinct regions */
          #define MAX_ACTIVE_REGIONS 256
      
      Since our memory is mostly contiguous it seems reasonable to keep this
      at 256 for now. I also set 32bit to 32 to save space (is there any chance
      a 32bit system will have more than 32 discontiguous memory ranges?).
      
      Even with that fixed we have a few data structures that grow:
      
      < 896   bootmem_node_data
      > 14336 bootmem_node_data
      
      < 1280  node_devices
      > 20480 node_devices
      
      < 25088 kmalloc_caches
      > 59648 kmalloc_caches
      
      < 1632  hstates
      > 21792 hstates
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ea55bf29
  3. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  4. 28 8月, 2009 2 次提交
  5. 20 8月, 2009 2 次提交
  6. 14 8月, 2009 1 次提交
    • T
      powerpc64: convert to dynamic percpu allocator · c2a7e818
      Tejun Heo 提交于
      Now that percpu allows arbitrary embedding of the first chunk,
      powerpc64 can easily be converted to dynamic percpu allocator.
      Convert it.  powerpc supports several large page sizes.  Cap atom_size
      at 1M.  There isn't much to gain by going above that anyway.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      c2a7e818
  7. 26 6月, 2009 1 次提交
    • B
      powerpc: Add irqtrace support for 32-bit powerpc · 5d38902c
      Benjamin Herrenschmidt 提交于
      Based on initial work from: Dale Farnsworth <dale@farnsworth.org>
      
      Add the low level irq tracing hooks for 32-bit powerpc needed
      to enable full lockdep functionality.
      
      The approach taken to deal with the code in entry_32.S is that
      we don't trace all the transitions of MSR:EE when we just turn
      it off to peek at TI_FLAGS without races. Only when we are
      calling into C code or returning from exceptions with a state
      that have changed from what lockdep thinks.
      
      There's a little bugger though: If we take an exception that
      keeps interrupts enabled (such as an alignment exception) while
      interrupts are enabled, we will call trace_hardirqs_on() on the
      way back spurriously. Not a big deal, but to get rid of it would
      require remembering in pt_regs that the exception was one of the
      type that kept interrupts enabled which we don't know at this
      stage. (Well, we could test all cases for regs->trap but that
      sucks too much).
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Tested-by: NKumar Gala <galak@kernel.crashing.org>
      5d38902c
  8. 24 6月, 2009 1 次提交
    • T
      percpu: use dynamic percpu allocator as the default percpu allocator · e74e3962
      Tejun Heo 提交于
      This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use
      dynamic percpu allocator.  The first chunk is allocated using
      embedding helper and 8k is reserved for modules.  This ensures that
      the new allocator behaves almost identically to the original allocator
      as long as static percpu variables are concerned, so it shouldn't
      introduce much breakage.
      
      s390 and alpha use custom SHIFT_PERCPU_PTR() to work around addressing
      range limit the addressing model imposes.  Unfortunately, this breaks
      if the address is specified using a variable, so for now, the two
      archs aren't converted.
      
      The following architectures are affected by this change.
      
      * sh
      * arm
      * cris
      * mips
      * sparc(32)
      * blackfin
      * avr32
      * parisc (broken, under investigation)
      * m32r
      * powerpc(32)
      
      As this change makes the dynamic allocator the default one,
      CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its invert -
      CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted
      archs.  These archs implement their own setup_per_cpu_areas() and the
      conversion is not trivial.
      
      * powerpc(64)
      * sparc(64)
      * ia64
      * alpha
      * s390
      
      Boot and batch alloc/free tests on x86_32 with debug code (x86_32
      doesn't use default first chunk initialization).  Compile tested on
      sparc(32), powerpc(32), arm and alpha.
      
      Kyle McMartin reported that this change breaks parisc.  The problem is
      still under investigation and he is okay with pushing this patch
      forward and fixing parisc later.
      
      [ Impact: use dynamic allocator for most archs w/o custom percpu setup ]
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Reviewed-by: NChristoph Lameter <cl@linux.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Bryan Wu <cooloney@kernel.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      e74e3962
  9. 18 6月, 2009 1 次提交
    • P
      perf_counter: powerpc: Enable use of software counters on 32-bit powerpc · 105988c0
      Paul Mackerras 提交于
      This enables the perf_counter subsystem on 32-bit powerpc.  Since we
      don't have any support for hardware counters on 32-bit powerpc yet,
      only software counters can be used.
      
      Besides selecting HAVE_PERF_COUNTERS for 32-bit powerpc as well as
      64-bit, the main thing this does is add an implementation of
      set_perf_counter_pending().  This needs to arrange for
      perf_counter_do_pending() to be called when interrupts are enabled.
      Rather than add code to local_irq_restore as 64-bit does, the 32-bit
      set_perf_counter_pending() generates an interrupt by setting the
      decrementer to 1 so that a decrementer interrupt will become pending
      in 1 or 2 timebase ticks (if a decrementer interrupt isn't already
      pending).  When interrupts are enabled, timer_interrupt() will be
      called, and some new code in there calls perf_counter_do_pending().
      We use a per-cpu array of flags to indicate whether we need to call
      perf_counter_do_pending() or not.
      
      This introduces a couple of new Kconfig symbols: PPC_HAVE_PMU_SUPPORT,
      which is selected by processor families for which we have hardware PMU
      support (currently only PPC64), and PPC_PERF_CTRS, which enables the
      powerpc-specific perf_counter back-end.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linuxppc-dev@ozlabs.org
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19000.55404.103840.393470@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      105988c0
  10. 15 6月, 2009 2 次提交
  11. 09 6月, 2009 1 次提交
  12. 27 5月, 2009 2 次提交
    • B
      powerpc: Fix up dma_alloc_coherent() on platforms without cache coherency. · 8b31e49d
      Benjamin Herrenschmidt 提交于
      The implementation we just revived has issues, such as using a
      Kconfig-defined virtual address area in kernel space that nothing
      actually carves out (and thus will overlap whatever is there),
      or having some dependencies on being self contained in a single
      PTE page which adds unnecessary constraints on the kernel virtual
      address space.
      
      This fixes it by using more classic PTE accessors and automatically
      locating the area for consistent memory, carving an appropriate hole
      in the kernel virtual address space, leaving only the size of that
      area as a Kconfig option. It also brings some dma-mask related fixes
      from the ARM implementation which was almost identical initially but
      grew its own fixes.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8b31e49d
    • B
      Revert "powerpc: Rework dma-noncoherent to use generic vmalloc layer" · 84532a0f
      Benjamin Herrenschmidt 提交于
      This reverts commit 33f00dce.
      
          While it was a good idea to try to use the mm/vmalloc.c allocator instead
          of our own (in fact, ours is itself a dup on an old variant of the vmalloc
          one), unfortunately, the approach is terminally busted since
          dma_alloc_coherent() can be called at interrupt time or in atomic contexts
          and there's little chances we'll make the code in mm/vmalloc.c cope with\       that :-(
      
          Until we can get the generic code to forbid that idiocy and fix all
          drivers abusing it, we pretty much have no choice but revert to
          our custom virtual space allocator.
      
          There's also a problem with SMP safety since freeing such mapping
          would require an IPI which cannot be done at interrupt time.
      
          However, right now, I don't think we support any platform that is
          both SMP and has non-coherent DMA (don't laugh, I know such things
          do exist !) so we can sort that out later.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      84532a0f
  13. 21 5月, 2009 1 次提交
  14. 03 5月, 2009 1 次提交
    • D
      Move dtc and libfdt sources from arch/powerpc/boot to scripts/dtc · 9fffb55f
      David Gibson 提交于
      The powerpc kernel always requires an Open Firmware like device tree
      to supply device information.  On systems without OF, this comes from
      a flattened device tree blob.  This blob is usually generated by dtc,
      a tool which compiles a text description of the device tree into the
      flattened format used by the kernel.  Sometimes, the bootwrapper makes
      small changes to the pre-compiled device tree blob (e.g. filling in
      the size of RAM).  To do this it uses the libfdt library.
      
      Because these are only used on powerpc, the code for both these tools
      is included under arch/powerpc/boot (these were imported and are
      periodically updated from the upstream dtc tree).
      
      However, the microblaze architecture, currently being prepared for
      merging to mainline also uses dtc to produce device tree blobs.  A few
      other archs have also mentioned some interest in using dtc.
      Therefore, this patch moves dtc and libfdt from arch/powerpc into
      scripts, where it can be used by any architecture.
      
      The vast bulk of this patch is a literal move, the rest is adjusting
      the various Makefiles to use dtc and libfdt correctly from their new
      locations.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9fffb55f
  15. 15 4月, 2009 1 次提交
  16. 07 4月, 2009 1 次提交
  17. 01 4月, 2009 1 次提交
  18. 31 3月, 2009 1 次提交
  19. 30 3月, 2009 1 次提交
  20. 11 3月, 2009 1 次提交
    • B
      powerpc/kconfig: Kill PPC_MULTIPLATFORM · 28794d34
      Benjamin Herrenschmidt 提交于
      CONFIG_PPC_MULTIPLATFORM is a remain of the pre-powerpc days and isn't
      really meaningful anymore. It was basically equivalent to PPC64 || 6xx.
      
      This removes it along with the following changes:
      
       - 32-bit platforms that relied on PPC32 && PPC_MULTIPLATFORM now rely
         on 6xx which is what they want anyway.
      
       - A new symbol, PPC_BOOK3S, is defined that represent compliance with
         the "Server" variant of the architecture. This is set when either 6xx
         or PPC64 is set and open the door for future BOOK3E 64-bit.
      
       - 64-bit platforms that relied on PPC64 && PPC_MULTIPLATFORM now use
         PPC64 && PPC_BOOK3S
      
       - A separate and selectable CONFIG_PPC_OF_BOOT_TRAMPOLINE option is now
         used to control the use of prom_init.c
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      28794d34
  21. 23 2月, 2009 5 次提交
  22. 15 2月, 2009 1 次提交
    • Y
      powerpc/44x: Support for 256KB PAGE_SIZE · e1240122
      Yuri Tikhonov 提交于
      This patch adds support for 256KB pages on ppc44x-based boards.
      
      For simplification of implementation with 256KB pages we still assume
      2-level paging. As a side effect this leads to wasting extra memory space
      reserved for PTE tables: only 1/4 of pages allocated for PTEs are
      actually used. But this may be an acceptable trade-off to achieve the
      high performance we have with big PAGE_SIZEs in some applications (e.g.
      RAID).
      
      Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
      the risk of stack overflows in the cases of on-stack arrays, which size
      depends on the page size (e.g. multipage BIOs, NTFS, etc.).
      
      With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
      to 9, otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) we'll be
      occupied by PKMAP addresses leaving no place for vmalloc. We do not
      separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
      value of 10 in support for 16K/64K had been selected rather intuitively.
      Thus now for all cases of PAGE_SIZE on ppc44x (including the default, 4KB,
      one) we have 512 pages for PKMAP.
      
      Because ELF standard supports only page sizes up to 64K, then you should
      use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
      for building applications, which are to be run with the 256KB-page sized
      kernel. If using the older binutils, then you should patch them like follows:
      
      	--- binutils/bfd/elf32-ppc.c.orig
      	+++ binutils/bfd/elf32-ppc.c
      
      	-#define ELF_MAXPAGESIZE                0x10000
      	+#define ELF_MAXPAGESIZE                0x40000
      
      One more restriction we currently have with 256KB page sizes is inability
      to use shmem safely, so, for now, the 256KB is available only if you turn
      the CONFIG_SHMEM option off (another variant is to use BROKEN).
      Though, if you need shmem with 256KB pages, you can always remove the !SHMEM
      dependency in 'config PPC_256K_PAGES', and use the workaround available here:
       http://lkml.org/lkml/2008/12/19/20Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
      Signed-off-by: NIlya Yanok <yanok@emcraft.com>
      Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
      e1240122
  23. 29 1月, 2009 3 次提交
    • K
      powerpc/fsl: Ensure PCI_QUIRKS are enabled for FSL_PCI · d0839118
      Kumar Gala 提交于
      The FSL PCI code depends on PCI quirks being enabled to function
      properly.  We can ensure this by doing a select in Kconfig of
      PCI_QUIRKS.
      Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
      d0839118
    • T
      powerpc/fsl-booke: Make CAM entries used for lowmem configurable · 96051465
      Trent Piepho 提交于
      On booke processors, the code that maps low memory only uses up to three
      CAM entries, even though there are sixteen and nothing else uses them.
      
      Make this number configurable in the advanced options menu along with max
      low memory size.  If one wants 1 GB of lowmem, then it's typically
      necessary to have four CAM entries.
      Signed-off-by: NTrent Piepho <tpiepho@freescale.com>
      Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
      96051465
    • T
      powerpc/fsl-booke: Allow larger CAM sizes than 256 MB · c8f3570b
      Trent Piepho 提交于
      The code that maps kernel low memory would only use page sizes up to 256
      MB.  On E500v2 pages up to 4 GB are supported.
      
      However, a page must be aligned to a multiple of the page's size.  I.e.
      256 MB pages must aligned to a 256 MB boundary.  This was enforced by a
      requirement that the physical and virtual addresses of the start of lowmem
      be aligned to 256 MB.  Clearly requiring 1GB or 4GB alignment to allow
      pages of that size isn't acceptable.
      
      To solve this, I simply have adjust_total_lowmem() take alignment into
      account when it decides what size pages to use.  Give it PAGE_OFFSET =
      0x7000_0000, PHYSICAL_START = 0x3000_0000, and 2GB of RAM, and it will map
      pages like this:
      PA 0x3000_0000 VA 0x7000_0000 Size 256 MB
      PA 0x4000_0000 VA 0x8000_0000 Size 1 GB
      PA 0x8000_0000 VA 0xC000_0000 Size 256 MB
      PA 0x9000_0000 VA 0xD000_0000 Size 256 MB
      PA 0xA000_0000 VA 0xE000_0000 Size 256 MB
      
      Because the lowmem mapping code now takes alignment into account,
      PHYSICAL_ALIGN can be lowered from 256 MB to 64 MB.  Even lower might be
      possible.  The lowmem code will work down to 4 kB but it's possible some of
      the boot code will fail before then.  Poor alignment will force small pages
      to be used, which combined with the limited number of TLB1 pages available,
      will result in very little memory getting mapped.  So alignments less than
      64 MB probably aren't very useful anyway.
      Signed-off-by: NTrent Piepho <tpiepho@freescale.com>
      Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
      c8f3570b
  24. 28 1月, 2009 1 次提交
  25. 14 1月, 2009 1 次提交
  26. 08 1月, 2009 2 次提交
  27. 29 12月, 2008 1 次提交
  28. 23 12月, 2008 1 次提交
  29. 03 12月, 2008 1 次提交
    • B
      powerpc: Add sync_*_for_* to dma_ops · 15e09c0e
      Becky Bruce 提交于
      We need to swap these out once we start using swiotlb, so add
      them to dma_ops.  Create CONFIG_PPC_NEED_DMA_SYNC_OPS Kconfig
      option; this is currently enabled automatically if we're
      CONFIG_NOT_COHERENT_CACHE.  In the future, this will also
      be enabled for builds that need swiotlb.  If PPC_NEED_DMA_SYNC_OPS
      is not defined, the dma_sync_*_for_* ops compile to nothing.
      Otherwise, they access the dma_ops pointers for the sync ops.
      
      This patch also changes dma_sync_single_range_* to actually
      sync the range - previously it was using a generous
      dma_sync_single.  dma_sync_single_* is now implemented
      as a dma_sync_single_range with an offset of 0.
      Signed-off-by: NBecky Bruce <becky.bruce@freescale.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      15e09c0e