1. 21 May, 2010 4 commits
    • powerpc/e500mc: Implement machine check handler. · fe04b112
      Scott Wood committed
      Most of the MCSR bit assignments are different in e500mc versus
      e500, and they are now write-one-to-clear.
      
      Some e500mc machine check conditions are made recoverable (as long as
      they aren't stuck on), most notably L1 instruction cache parity errors.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
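The write-one-to-clear behaviour mentioned in the commit message above can be sketched in user-space C. This is only an illustration of the register semantics, not the kernel's handler: `struct w1c_reg`, `w1c_raise`, `w1c_write` and `w1c_ack` are hypothetical names, and the bit values in the usage note are arbitrary.

```c
#include <stdint.h>

/* Hypothetical stand-in for a write-one-to-clear status register. */
struct w1c_reg {
    uint32_t bits;
};

/* Hardware side of the model: latch one or more error conditions. */
static void w1c_raise(struct w1c_reg *reg, uint32_t mask)
{
    reg->bits |= mask;
}

/* Software side: a 1 written to a bit clears it, a 0 leaves it untouched. */
static void w1c_write(struct w1c_reg *reg, uint32_t value)
{
    reg->bits &= ~value;
}

/*
 * Read the latched conditions and write the same value back, which
 * acknowledges exactly the bits that were observed. Returns the value read.
 */
static uint32_t w1c_ack(struct w1c_reg *reg)
{
    uint32_t seen = reg->bits;

    w1c_write(reg, seen);
    return seen;
}
```

In this model, a condition that is "stuck on" would be one that `w1c_raise` latches again immediately after `w1c_ack`, so the bit never stays clear.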
    • powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim · 56608209
      Anton Blanchard committed
      I noticed /proc/sys/vm/zone_reclaim_mode was 0 on a ppc64 NUMA box. It gets
      enabled via this:
      
              /*
               * If another node is sufficiently far away then it is better
               * to reclaim pages in a zone before going off node.
               */
              if (distance > RECLAIM_DISTANCE)
                      zone_reclaim_mode = 1;
      
      Since we use the default value of 20 for REMOTE_DISTANCE and 20 for
      RECLAIM_DISTANCE it never kicks in.
      
      The local to remote bandwidth ratios can be quite large on System p
      machines so it makes sense for us to reclaim clean pagecache locally before
      going off node.
      
      The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables
      zone reclaim.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
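The reason the check never fires with the defaults can be seen in a small sketch. The comparison is strict, so a node distance of 20 is not greater than a RECLAIM_DISTANCE of 20. The function name `zone_reclaim_enabled` and the macro `SKETCH_REMOTE_DISTANCE` are hypothetical; the value 10 used in the usage note is an assumed example of "a smaller value", since the commit message does not state the number chosen.

```c
#include <stdbool.h>

/* Default node distance quoted in the commit message. */
#define SKETCH_REMOTE_DISTANCE 20

/*
 * Mirrors the quoted condition: zone reclaim is turned on only when the
 * distance to another node is strictly greater than the reclaim threshold.
 */
static bool zone_reclaim_enabled(int distance, int reclaim_distance)
{
    return distance > reclaim_distance;
}
```

With both values at 20 the function returns false; lowering the threshold to, say, 10 makes it return true for the same node distance.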
    • powerpc/kexec: Fix race in kexec shutdown · 1fc711f7
      Michael Neuling committed
      In kexec_prepare_cpus, the primary CPU IPIs the secondary CPUs to
      kexec_smp_down().  kexec_smp_down() calls kexec_smp_wait() which sets
      the hw_cpu_id() to -1.  The primary does this while leaving IRQs on,
      which means the primary can take a timer interrupt that leads to it
      IPIing one of the secondary CPUs (say, for a scheduler re-balance).
      Since the secondary CPU now has hw_cpu_id = -1, we IPI CPU
      -1... Kaboom!
      
      We are hitting this case regularly on POWER7 machines.
      
      There is also a second race, where the primary will tear down the MMU
      mappings before knowing the secondaries have entered real mode.
      
      Also, the secondaries are clearing out any pending IPIs before
      guaranteeing that no more will be received.
      
      This changes kexec_prepare_cpus() so that we turn off IRQs in the
      primary CPU much earlier.  It adds a paca flag to say that the
      secondaries have entered the kexec_smp_down() IPI and turned off IRQs,
      rather than overloading hw_cpu_id with -1.  This new paca flag is
      also used to indicate when the secondaries have entered real mode.
      
      It also ensures that all CPUs have their IRQs off before we clear out
      any pending IPI requests (in kexec_cpu_down()) to ensure there are no
      trailing IPIs left unacknowledged.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
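The handshake described above, a dedicated per-CPU flag instead of overloading hw_cpu_id, can be modelled roughly as follows. This is a single-threaded sketch under stated assumptions: `struct sketch_paca`, the `SKETCH_KEXEC_*` state names and `secondaries_reached` are illustrative and are not taken from the patch, and real code would also need the appropriate memory barriers.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Ordered states a secondary CPU passes through during shutdown. */
enum sketch_kexec_state {
    SKETCH_KEXEC_NONE,      /* still running normally */
    SKETCH_KEXEC_IRQS_OFF,  /* took the shutdown IPI and disabled IRQs */
    SKETCH_KEXEC_REAL_MODE  /* has entered real mode */
};

/* Hypothetical per-CPU area: hw_cpu_id stays valid the whole time. */
struct sketch_paca {
    int hw_cpu_id;
    enum sketch_kexec_state kexec_state;
};

/*
 * Returns true once every CPU other than the primary has reached at
 * least the wanted state. The primary would poll this before moving on.
 */
static bool secondaries_reached(const struct sketch_paca *paca, int ncpus,
                                int primary, enum sketch_kexec_state wanted)
{
    for (int i = 0; i < ncpus; i++) {
        if (i == primary)
            continue;
        if (paca[i].kexec_state < wanted)
            return false;
    }
    return true;
}
```

Because the flag is separate from hw_cpu_id, an IPI sent late by the primary in this model still targets a valid CPU number.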
    • powerpc/pseries: Add hcall to read 4 ptes at a time in real mode · f90ece28
      Michael Neuling committed
      This adds plpar_pte_read_4_raw() which can be used to read 4 PTEs from
      PHYP at a time, while in real mode.
      
      It also creates a new hcall9 which can be used in real mode.  It's the
      same as plpar_hcall9 but minus the tracing hcall statistics which may
      require variables outside the RMO.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  2. 06 May, 2010 6 commits
  3. 05 May, 2010 3 commits
  4. 27 April, 2010 1 commit
    • powerpc/fsl-booke: Fix CONFIG_RELOCATABLE support on FSL Book-E ppc32 · dbc9632a
      Kumar Gala committed
      The following commit broke CONFIG_RELOCATABLE support on FSL Book-E
      parts:
      
      commit 549e8152
      Author: Paul Mackerras <paulus@samba.org>
      Date:   Sat Aug 30 11:43:47 2008 +1000
      
          powerpc: Make the 64-bit kernel as a position-independent executable
      
      The change to __va and __pa to use PAGE_OFFSET & MEMORY_START causes
      problems on the Book-E parts because we don't know MEMORY_START until
      after we parse the device tree.  We need __va to work properly to even
      parse the device tree so we have a chicken-and-egg problem.  So go back
      to using the other definition of __va/__pa on CONFIG_BOOKE and use the
      PAGE_OFFSET/MEMORY_START version on "Classic" PPC64.
      
      Also updated casts to handle phys_addr_t being a different size from
      unsigned long (ie 36-bit physical on PPC32).
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
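A rough user-space sketch of the PAGE_OFFSET/MEMORY_START style of translation shows both points made above: the conversion cannot be used until the memory start is known, and the arithmetic has to be done in a type wide enough for a 36-bit physical address before narrowing to a 32-bit virtual address. The names `sketch_va`, `sketch_pa`, the two typedefs and the `SKETCH_PAGE_OFFSET` value are assumptions for illustration, not the kernel's definitions.

```c
#include <stdint.h>

/* Physical addresses may need more than 32 bits (e.g. 36-bit physical). */
typedef uint64_t sketch_phys_addr_t;
/* Stand-in for a 32-bit "unsigned long" virtual address. */
typedef uint32_t sketch_virt_addr_t;

/* Assumed kernel virtual base, chosen only for this example. */
#define SKETCH_PAGE_OFFSET 0xc0000000u

/*
 * Physical to virtual: do the arithmetic in the wide physical type and
 * narrow only the final result. memory_start must already be known.
 */
static sketch_virt_addr_t sketch_va(sketch_phys_addr_t pa,
                                    sketch_phys_addr_t memory_start)
{
    return (sketch_virt_addr_t)(pa + SKETCH_PAGE_OFFSET - memory_start);
}

/* Virtual to physical: widen first so the result is not truncated. */
static sketch_phys_addr_t sketch_pa(sketch_virt_addr_t va,
                                    sketch_phys_addr_t memory_start)
{
    return (sketch_phys_addr_t)va - SKETCH_PAGE_OFFSET + memory_start;
}
```

Both helpers take `memory_start` as a parameter, which makes the dependency explicit: neither can be evaluated before that value has been read from somewhere.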
  5. 13 April, 2010 1 commit
  6. 07 April, 2010 5 commits
  7. 19 March, 2010 1 commit
  8. 18 March, 2010 1 commit
    • powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regs · 9eff26ea
      Paul Mackerras committed
      This implements a powerpc version of perf_arch_fetch_caller_regs
      to get correct call-graphs.
      
      It's implemented in assembly because that way we can be sure there isn't
      a stack frame for perf_arch_fetch_caller_regs.  If it was in C, gcc might
      or might not create a stack frame for it, which would affect the number
      of levels we have to skip.
      
      With this, we see results from perf record -e lock:lock_acquire like
      this:
      
       # Samples: 24878
       #
       # Overhead         Command      Shared Object  Symbol
       # ........  ..............  .................  ......
       #
          14.99%            perf  [kernel.kallsyms]  [k] ._raw_spin_lock
                            |
                            --- ._raw_spin_lock
                               |
                               |--25.00%-- .alloc_fd
                               |          (nil)
                               |          |
                               |          |--50.00%-- .anon_inode_getfd
                               |          |          .sys_perf_event_open
                               |          |          syscall_exit
                               |          |          syscall
                               |          |          create_counter
                               |          |          __cmd_record
                               |          |          run_builtin
                               |          |          main
                               |          |          0xfd2e704
                               |          |          0xfd2e8c0
                               |          |          (nil)
      
      ... etc.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: anton@samba.org
      Cc: linuxppc-dev@ozlabs.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100318050513.GA6575@drongo>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 17 March, 2010 1 commit
  10. 13 March, 2010 7 commits
    • dma-mapping: powerpc: use generic pci_set_dma_mask and pci_set_consistent_dma_mask · 6e6c70e6
      FUJITA Tomonori committed
      This converts powerpc to use the generic pci_set_dma_mask and
      pci_set_consistent_dma_mask (drivers/pci/pci.c).
      
      The generic pci_set_dma_mask does what powerpc's pci_set_dma_mask does.
      
      Unlike powerpc's pci_set_consistent_dma_mask, the generic
      pci_set_consistent_dma_mask sets only coherent_dma_mask.  It doesn't work
      for powerpc?  pci_set_consistent_dma_mask API should set only
      coherent_dma_mask?
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pci-dma: add linux/pci-dma.h to linux/pci.h · f41b1771
      FUJITA Tomonori committed
      All the architectures properly set NEED_DMA_MAP_STATE now so we can safely
      add linux/pci-dma.h to linux/pci.h and remove the linux/pci-dma.h
      inclusion in arch's asm/pci.h
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pci-dma: powerpc: use include/linux/pci-dma.h · af407c6d
      FUJITA Tomonori committed
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ptrace: move user_enable_single_step & co prototypes to linux/ptrace.h · dacbe41f
      Christoph Hellwig committed
      While in theory user_enable_single_step/user_disable_single_step/
      user_enable_blockstep could also be provided as an inline or macro there's
      no good reason to do so, and having the prototype in one place keeps code
      size and confusion down.
      
      Roland said:
      
        The original thought there was that user_enable_single_step() et al
        might well be only an instruction or three on a sane machine (as if we
        have any of those!), and since there is only one call site inlining
        would be beneficial.  But I agree that there is no strong reason to care
        about inlining it.
      
        As to the arch changes, there is only one thought I'd add to the
        record.  It was always my thinking that for an arch where
        PTRACE_SINGLESTEP does text-modifying breakpoint insertion,
        user_enable_single_step() should not be provided.  That is,
        arch_has_single_step()=>true means that there is an arch facility with
        "pure" semantics that does not have any unexpected side effects.
        Inserting a breakpoint might do very unexpected strange things in
        multi-threaded situations.  Aside from that, it is a peculiar side
        effect that user_{enable,disable}_single_step() should cause COW
        de-sharing of text pages and so forth.  For PTRACE_SINGLESTEP, all these
        peculiarities are the status quo ante for that arch, so having
        arch_ptrace() itself do those is one thing.  But for building other
        things in the future, it is nicer to have a uniform "pure" semantics
        that arch-independent code can expect.
      
        OTOH, all such arch issues are really up to the arch maintainer.  As
        of today, there is nothing but ptrace using user_enable_single_step() et
        al so it's a distinction without a practical difference.  If/when there
        are other facilities that use user_enable_single_step() and might care,
        the affected arch's can revisit the question when someone cares about
        the quality of the arch support for said new facility.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Add generic sys_olduname() · 5cacdb4a
      Christoph Hellwig committed
      Add generic implementations of the old and really old uname system calls.
      Note that sh only implements sys_olduname but not sys_oldolduname, but I'm
      not going to bother with another ifdef for that special case.
      
      m32r implemented an old uname but never wired it up, so kill it, too.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • improve sys_newuname() for compat architectures · e28cbf22
      Christoph Hellwig committed
      On an architecture that supports 32-bit compat we need to override the
      reported machine in uname with the 32-bit value.  Instead of doing this
      separately in every architecture introduce a COMPAT_UTS_MACHINE define in
      <asm/compat.h> and apply it directly in sys_newuname().
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
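The idea of applying the compat machine name in one common place can be sketched as below. This is a user-space model, not the kernel's sys_newuname(): `struct sketch_utsname`, `sketch_newuname` and the `task_is_compat` parameter are hypothetical, and the strings in the usage note are example values only.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_MACHINE_LEN 65

/* Cut-down stand-in for the utsname structure: only the machine field. */
struct sketch_utsname {
    char machine[SKETCH_MACHINE_LEN];
};

/*
 * Fill in the machine name. If the caller is a 32-bit compat task and the
 * architecture supplied a compat name, report that instead of the native one.
 */
static void sketch_newuname(struct sketch_utsname *out,
                            const char *native_machine,
                            const char *compat_machine,
                            bool task_is_compat)
{
    const char *name = native_machine;

    if (task_is_compat && compat_machine)
        name = compat_machine;

    strncpy(out->machine, name, sizeof(out->machine) - 1);
    out->machine[sizeof(out->machine) - 1] = '\0';
}
```

For example, calling it with a native name of "ppc64" and a compat name of "ppc" yields "ppc" for a compat task and "ppc64" otherwise. An architecture without a compat name would pass NULL and always get the native value.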
    • Add generic sys_ipc wrapper · baed7fc9
      Christoph Hellwig committed
      Add a generic implementation of the ipc demultiplexer syscall.  Except for
      s390 and sparc64 all implementations of the sys_ipc are nearly identical.
      
      There are slight differences in the types of the parameters, where mips
      and powerpc as the only 64-bit architectures with sys_ipc use unsigned
      long for the "third" argument as it gets cast to a pointer later, while
      it traditionally is an "int" like most other parameters.  frv goes even
      further and uses unsigned long for all parameters except for "ptr" which
      is a pointer type everywhere.  The change from int to unsigned long for
      "third" and back to "int" for the others on frv should be fine due to the
      in-register calling conventions for syscalls (we already had a similar
      issue with the generic sys_ptrace), but I'd prefer to have the arch
      maintainers look over this in detail.
      
      Except for that h8300, m68k and m68knommu lack an implementation of the
      semtimedop sub call which this patch adds, and various architectures have
      gets used - at least on i386 it seems superfluous as the compat code on
      x86-64 and ia64 doesn't even bother to implement it.
      
      [akpm@linux-foundation.org: add sys_ipc to sys_ni.c]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Reviewed-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: David Howells <dhowells@redhat.com>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
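What "demultiplexer syscall" means here can be shown with a small sketch: one entry point receives a call number and dispatches to the semaphore, message-queue or shared-memory operation it names. The sub-call numbers below match the values in the Linux `ipc.h` header, and the split of the call word into a version (upper 16 bits) and a sub-call (lower 16 bits) follows the usual sys_ipc convention. The `SKETCH_` names, the enum and the function itself are hypothetical, and the sketch only classifies the call rather than performing it.

```c
/* Sub-call numbers as used by the ipc() multiplexer on Linux. */
#define SKETCH_SEMOP        1
#define SKETCH_SEMGET       2
#define SKETCH_SEMCTL       3
#define SKETCH_SEMTIMEDOP   4
#define SKETCH_MSGSND      11
#define SKETCH_MSGRCV      12
#define SKETCH_MSGGET      13
#define SKETCH_MSGCTL      14
#define SKETCH_SHMAT       21
#define SKETCH_SHMDT       22
#define SKETCH_SHMGET      23
#define SKETCH_SHMCTL      24

enum sketch_ipc_family {
    SKETCH_IPC_SEM,
    SKETCH_IPC_MSG,
    SKETCH_IPC_SHM,
    SKETCH_IPC_NOSYS   /* unknown sub-call */
};

/*
 * Split the call word into version and sub-call, then report which family
 * of IPC operations would handle it. A real implementation would call the
 * matching sys_sem*, sys_msg* or sys_shm* function in each case.
 */
static enum sketch_ipc_family sketch_ipc_demux(unsigned int call,
                                               unsigned int *version)
{
    *version = call >> 16;
    call &= 0xffff;

    switch (call) {
    case SKETCH_SEMOP:
    case SKETCH_SEMGET:
    case SKETCH_SEMCTL:
    case SKETCH_SEMTIMEDOP:
        return SKETCH_IPC_SEM;
    case SKETCH_MSGSND:
    case SKETCH_MSGRCV:
    case SKETCH_MSGGET:
    case SKETCH_MSGCTL:
        return SKETCH_IPC_MSG;
    case SKETCH_SHMAT:
    case SKETCH_SHMDT:
    case SKETCH_SHMGET:
    case SKETCH_SHMCTL:
        return SKETCH_IPC_SHM;
    default:
        return SKETCH_IPC_NOSYS;
    }
}
```

Since the dispatch table is the same almost everywhere, a single generic copy of this switch can replace the per-architecture ones, which is the point of the commit.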
  11. 09 March, 2010 2 commits
  12. 05 March, 2010 1 commit
    • powerpc/perf: e500 support · a1110654
      Scott Wood committed
      This implements perf_event support for the Freescale embedded performance
      monitor, based on the existing perf_event.c that supports server/classic
      chips.
      
      Some limitations:
      - Performance monitor interrupts are regular EE interrupts, and thus you
        can't profile places with interrupts disabled.  We may want to implement
        soft IRQ-disabling, with perfmon interrupts exempted and treated as NMIs.
      - When trying to schedule multiple event groups at once, and using
        restricted events, situations could arise where scheduling fails even
        though it would be possible.  Consider three groups, each with two events.
        One group has restricted events, the others don't.  The two non-restricted
        groups are scheduled, then one is removed, which happens to occupy the two
        counters that can't do restricted events.  The remaining non-restricted
        group will not be moved to the non-restricted-capable counters to make
        room if the restricted group tries to be scheduled.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
  13. 01 March, 2010 7 commits