1. 13 Apr, 2007 1 commit
  2. 09 Mar, 2007 1 commit
  3. 07 Feb, 2007 1 commit
    • L
      [POWERPC] Fix performance monitor exception · 449d846d
      Livio Soares committed
      To the issue: at some point during 2.6.20 development, Paul Mackerras
      introduced the "lazy IRQ disabling" patch (very cool work, BTW).
      In that patch, the performance monitor unit exception was marked as
      "maskable", in the sense that if interrupts were soft-disabled, that
      exception could be ignored.  This broke my PowerPC profiling code.
      The symptom that I see is that a varying number of interrupts
      (from 0 to $n$, typically closer to 0) get delivered, when, in
      reality, it should always be very close to $n$.
      
      The issue stems from the way masking is being done.  Masking in
      this fashion seems to work well with the decrementer and external
      interrupts, because they are raised again until "really" handled.
      For the PMU, however, this does not apply (at least on my Xserve
      machine with a 970FX processor).  If the PMU exception is not handled,
      it will _not_ be re-raised (at least on my machine).  The documentation
      states that the PMXE bit in MMCR0 is set to 0 when the PMU exception
      is raised.  However, software must re-set the bit to re-enable PMU
      exceptions.  If the exception is ignored (as it is currently), not only is
      that interrupt lost, but because software does not re-set PMXE, the
      PMU registers are "frozen" forever.
      
      [This patch means that performance monitor exceptions are taken and
      handled even if irqs are off, as long as some other interrupt hasn't
      come along and caused interrupts to be hard-disabled.  In this sense
      the PMU exception becomes like an NMI.  The oprofile code for most
      powerpc processors does nothing that is unsafe in an NMI context, but
      the Cell oprofile code does a spin_lock_irqsave.  However, that turns
      out to be OK because Cell doesn't actually use the performance
      monitor exception; performance monitor interrupts come in as a
      regular interrupt on Cell, so will be disabled when irqs are off.
       -- paulus.]
      Signed-off-by: Paul Mackerras <paulus@samba.org>
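The freeze described in this commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct pmu_model`, `pmu_overflow()`, `pmu_exception_entry()` and `run_overflows()` are hypothetical names. The only behaviour taken from the message is that the hardware clears PMXE when it raises the exception and that software must set it again.

```c
#include <stdbool.h>

/* Hypothetical model of the PMU state described above (not kernel code). */
struct pmu_model {
    bool pmxe;      /* models MMCR0[PMXE]: PMU exceptions enabled */
    bool pending;   /* an exception was raised and not yet seen */
    int delivered;  /* exceptions that reached the handler */
};

/* A counter overflows: the exception is raised only if PMXE is set,
 * and PMXE is cleared as it is raised. */
static void pmu_overflow(struct pmu_model *p)
{
    if (!p->pmxe)
        return;             /* PMU is "frozen": nothing is raised */
    p->pmxe = false;
    p->pending = true;
}

/* Exception entry. With masked == true the exception is dropped, which
 * models the pre-fix behaviour while interrupts are soft-disabled. */
static void pmu_exception_entry(struct pmu_model *p, bool masked)
{
    if (!p->pending)
        return;
    p->pending = false;
    if (masked)
        return;             /* ignored: nobody sets PMXE again */
    p->delivered++;
    p->pmxe = true;         /* the handler re-enables PMU exceptions */
}

/* Generate n overflows; if mask_first is set, only the first one
 * arrives while interrupts are soft-disabled. */
static int run_overflows(int n, bool mask_first)
{
    struct pmu_model p = { .pmxe = true, .pending = false, .delivered = 0 };

    for (int i = 0; i < n; i++) {
        pmu_overflow(&p);
        pmu_exception_entry(&p, mask_first && i == 0);
    }
    return p.delivered;
}
```

In this model a single ignored exception is enough to lose every later one: `run_overflows(10, false)` counts all ten, while `run_overflows(10, true)` counts none, which matches the "from 0 to $n$, typically closer to 0" symptom.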
  4. 04 Dec, 2006 1 commit
  5. 13 Nov, 2006 1 commit
  6. 02 Nov, 2006 1 commit
  7. 26 Oct, 2006 1 commit
  8. 25 Oct, 2006 1 commit
  9. 18 Oct, 2006 1 commit
    • P
      [POWERPC] Make sure interrupt enable gets restored properly · b0a779de
      Paul Mackerras committed
      The lazy IRQ disable patch missed a couple of places where the
      interrupt enable flags need to be restored correctly.  First, we
      weren't restoring the paca->hard_enabled flag on interrupt exit.
      Instead of saving it on entry, we compute it from the MSR_EE bit
      in the MSR we are restoring at exit.  Secondly, the MMU hash miss
      code was clearing both paca->soft_enabled and paca->hard_enabled
      but not restoring them in the case where hash_page was able to
      resolve the miss from the Linux page tables.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
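The first fix above, deriving the hard-enabled flag from the MSR image being restored rather than saving it on entry, can be sketched in C. The real code is assembly in the exception exit path; `struct paca_flags`, `hard_enabled_from_msr()` and `interrupt_exit()` are hypothetical stand-ins. `MSR_EE` is the external-interrupt-enable bit (0x8000).

```c
#include <stdint.h>

#define MSR_EE 0x8000ULL   /* external interrupt enable bit in the MSR */

/* Hypothetical stand-in for the two paca flags named in the commit. */
struct paca_flags {
    uint8_t soft_enabled;
    uint8_t hard_enabled;
};

/* hard_enabled is simply the EE bit of the MSR we are about to restore. */
static int hard_enabled_from_msr(uint64_t msr_to_restore)
{
    return (msr_to_restore & MSR_EE) ? 1 : 0;
}

/* Sketch of the interrupt-exit step: no value saved at entry is needed. */
static void interrupt_exit(struct paca_flags *p, uint64_t msr_to_restore)
{
    p->hard_enabled = (uint8_t)hard_enabled_from_msr(msr_to_restore);
}
```

Computing the flag from the MSR keeps the two pieces of state from drifting apart, since the flag is recomputed from the value that actually reaches the processor.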
  10. 16 Oct, 2006 1 commit
    • P
      [POWERPC] Lazy interrupt disabling for 64-bit machines · d04c56f7
      Paul Mackerras committed
      This implements a lazy strategy for disabling interrupts.  This means
      that local_irq_disable() et al. just clear the 'interrupts are
      enabled' flag in the paca.  If an interrupt comes along, the interrupt
      entry code notices that interrupts are supposed to be disabled, and
      clears the EE bit in SRR1, clears the 'interrupts are hard-enabled'
      flag in the paca, and returns.  This means that interrupts only
      actually get disabled in the processor when an interrupt comes along.
      
      When interrupts are enabled by local_irq_enable() et al., the code
      sets the interrupts-enabled flag in the paca, and then checks whether
      interrupts got hard-disabled.  If so, it also sets the EE bit in the
      MSR to hard-enable the interrupts.
      
      This has the potential to improve performance, and also makes it
      easier to make a kernel that can boot on iSeries and on other 64-bit
      machines, since this lazy-disable strategy is very similar to the
      soft-disable strategy that iSeries already uses.
      
      This version renames paca->proc_enabled to paca->soft_enabled, and
      changes a couple of soft-disables in the kexec code to hard-disables,
      which should fix the crash that Michael Ellerman saw.  This doesn't
      yet use a reserved CR field for the soft_enabled and hard_enabled
      flags.  This applies on top of Stephen Rothwell's patches to make it
      possible to build a combined iSeries/other kernel.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
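The lazy strategy described in this commit can be modelled as a small state machine. The sketch below is a user-space illustration under stated assumptions, not the kernel implementation: `struct cpu_model` and the `model_*` functions are hypothetical, `msr_writes` stands for the MSR updates the scheme tries to avoid, and the model does not replay the interrupt that was deferred.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU (not kernel code). */
struct cpu_model {
    bool soft_enabled;  /* models paca->soft_enabled */
    bool hard_enabled;  /* models paca->hard_enabled */
    bool msr_ee;        /* models the EE bit in the processor's MSR */
    int msr_writes;     /* number of times the MSR had to be written */
    int handled;        /* interrupts actually handled */
};

/* local_irq_disable(): only clear the soft flag, leave the MSR alone. */
static void model_irq_disable(struct cpu_model *c)
{
    c->soft_enabled = false;
}

/* Interrupt entry: if soft-disabled, hard-disable and return at once. */
static void model_interrupt(struct cpu_model *c)
{
    if (!c->msr_ee)
        return;                 /* cannot be delivered at all */
    if (!c->soft_enabled) {
        c->msr_ee = false;      /* EE cleared in SRR1, so it stays off */
        c->hard_enabled = false;
        return;
    }
    c->handled++;
}

/* local_irq_enable(): set the soft flag, touch the MSR only if needed. */
static void model_irq_enable(struct cpu_model *c)
{
    c->soft_enabled = true;
    if (!c->hard_enabled) {
        c->hard_enabled = true;
        c->msr_ee = true;
        c->msr_writes++;
    }
}

/* MSR writes needed for one disable/enable pair. */
static int msr_writes_for_section(bool interrupt_arrives)
{
    struct cpu_model c = { true, true, true, 0, 0 };

    model_irq_disable(&c);
    if (interrupt_arrives)
        model_interrupt(&c);
    model_irq_enable(&c);
    return c.msr_writes;
}
```

In the model, a critical section during which no interrupt arrives costs no MSR writes at all, which is where the potential performance gain mentioned above would come from.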
  11. 03 Oct, 2006 1 commit
  12. 13 Sep, 2006 1 commit
  13. 25 Aug, 2006 1 commit
    • O
      [POWERPC] Cleanup CPU inits · f39b7a55
      Olof Johansson committed
      Clean up CPU inits a bit more; Geoff Levand already did some earlier.
      
      * Move CPU state save to cpu_setup, since cpu_setup is only ever done
        on cpu 0 on 64-bit and save is never done more than once.
      * Rename __restore_cpu_setup to __restore_cpu_ppc970 and add
        function pointers to the cputable to use instead. Powermac always
        has 970 so no need to check there.
      * Rename __970_cpu_preinit to __cpu_preinit_ppc970 and check PVR before
        calling it instead of in it, since it's too early to use cputable.
      * Rename pSeries_secondary_smp_init to generic_secondary_smp_init since
        everyone but powermac and iSeries uses it.
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  14. 29 Jul, 2006 1 commit
    • O
      [POWERPC] force 64bit mode in fwnmi handlers to workaround firmware bugs · 9fc0a92c
      Olaf Hering committed
      The firmware of POWER4 and JS20 systems does not switch the cpu to 64bit
      mode when the registered system_reset and machine_check handlers get called.
      If a 32bit process runs on that cpu at the time of the event, the cpu
      remains in 32bit mode. xmon and kdump cannot deal with it; the result is
      an error like 'Bad kernel stack pointer fff2aad0 at 3200'.
      xmon just loses some register info, but booting the kdump kernel usually fails.
      
      Both handlers are not hot paths.  Duplicate the EXCEPTION_PROLOG_PSERIES macro
      and add two instructions to switch to 64bit:
      
       li     r11,5;
       rldimi r10,r11,61,0;
      Signed-off-by: Olaf Hering <olh@suse.de>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
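The effect of the two added instructions can be checked with a C emulation of `rldimi`. This is a sketch that assumes r10 holds the MSR image the prolog is about to load; `rotl64()`, `rldimi()` and `force_64bit_mode()` are hypothetical helper names. Inserting the value 5 (binary 101) into the top three bits sets bits 63 and 61, which Linux's MSR definitions name SF (64-bit mode) and ISF (64-bit mode for interrupts), and clears bit 62.

```c
#include <stdint.h>

/* Rotate a 64-bit value left by n bits. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Emulates "rldimi ra,rs,sh,mb": rotate rs left by sh and insert it into
 * ra under a mask covering IBM-numbered bits mb..63-sh.  This sketch only
 * handles the non-wrapping case (mb <= 63 - sh, sh < 64). */
static uint64_t rldimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb)
{
    uint64_t mask = (~0ULL >> mb) & (~0ULL << sh);

    return (rotl64(rs, sh) & mask) | (ra & ~mask);
}

/* li r11,5 ; rldimi r10,r11,61,0 */
static uint64_t force_64bit_mode(uint64_t msr_image)
{
    return rldimi(msr_image, 5, 61, 0);
}
```

For example, `force_64bit_mode(0x1032)` yields `0xA000000000001032`: the low bits are untouched and only the top three bits are replaced.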
  15. 01 Jul, 2006 1 commit
  16. 29 Jun, 2006 1 commit
    • M
      [POWERPC] Make sure smp_processor_id works very early in boot · 33dbcf72
      Michael Ellerman committed
      There's a small period early in boot where we don't know which cpu we're
      running on. That's ok, except that it means we have no paca, or more
      correctly that our paca pointer points somewhere random.
      
      So that we can safely call things like smp_processor_id(), we need a paca,
      so just assume we're on cpu 0. No code should _write_ to the paca before
      we've set the correct one up.
      
      We set up the proper paca after we've scanned the flat device tree in
      early_setup(), so there's no need to do it again in start_here_common.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  17. 28 Jun, 2006 3 commits
  18. 21 Jun, 2006 1 commit
    • B
      [POWERPC] cell: add RAS support · acf7d768
      Benjamin Herrenschmidt committed
      This is a first version of support for the Cell BE "Reliability,
      Availability and Serviceability" features.
      
      It doesn't yet handle some of the RAS interrupts (the ones described in
      iic_is/iic_irr), I'm still working on a proper way to expose these. They
      are essentially a cascaded controller by themselves (sic !) though I may
      just handle them locally to the iic driver. I need also to sync with
      David Erb on the way he hooked in the performance monitor interrupt.
      
      So that's all for 2.6.17 and I'll do more work on that with my rework of
      the powerpc interrupt layer that I'm hacking on at the moment.
      Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  19. 18 Apr, 2006 1 commit
    • P
      powerpc: Use correct sequence for putting CPU into nap mode · f39224a8
      Paul Mackerras committed
      We weren't using the recommended sequence for putting the CPU into
      nap mode.  When I changed the idle loop, for some reason 7447A cpus
      started hanging when we put them into nap mode.  Changing to the
      recommended sequence fixes that.
      
      The complexity here is that the recommended sequence is a loop that
      keeps putting the cpu back into nap mode.  Clearly we need some way
      to break out of the loop when an interrupt (external interrupt,
      decrementer, performance monitor) occurs.  Here we use a bit in
      the thread_info struct to indicate that we need this, and the exception
      entry code notices this and arranges for the exception to return
      to the value in the link register, thus breaking out of the loop.
      We use a new `local_flags' field in the thread_info which we can
      alter without needing to use an atomic update sequence.
      
      The PPC970 has the same recommended sequence, so we do the same thing
      there too.
      
      This also fixes a bug in the kernel stack overflow handling code on
      32-bit, since it was causing a value that we needed in a register to
      get trashed.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  20. 27 Mar, 2006 2 commits
    • A
      [PATCH] powerpc: Allow non zero boot cpuids · 4df20460
      Anton Blanchard committed
      We currently have a hack to flip the boot cpu and its secondary thread
      to logical cpuid 0 and 1. This means the logical - physical mapping will
      differ depending on which cpu is boot cpu. This is most apparent on
      kexec, where we might kexec on any cpu and therefore change the mapping
      from boot to boot.
      
      The patch below does a first pass early on to work out the logical cpuid
      of the boot thread. We then fix up some paca structures to match.
      
      I've also removed the boot_cpuid_phys variable for ppc64; to be
      consistent we use get_hard_smp_processor_id(boot_cpuid) everywhere.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • O
      [PATCH] correct the comment about stackpointer alignment in __boot_from_prom · 6088857b
      Olaf Hering committed
      The address of variable val in prom_init_stdout is passed to prom_getprop.
      prom_getprop casts the pointer to u32 and passes it to call_prom in the hope
      that OpenFirmware stores something there.
      But the pointer is truncated in the lower bits and the expected value is
      stored somewhere else.
      
      In my testing I had a stackpointer of 0x0023e6b4. val was at offset 120,
      which has address 0x0023e72c. But the value passed to OF was 0x0023e728.
      
      c00000000040b710:       3b 01 00 78     addi    r24,r1,120
      ...
      c00000000040b754:       57 08 00 38     rlwinm  r8,r24,0,0,28
      ...
      c00000000040b784:       80 01 00 78     lwz     r0,120(r1)
      ...
      c00000000040b798:       90 1b 00 0c     stw     r0,12(r27)
      ...
      
      The stackpointer came from 32bit code.
      The chain was yaboot -> zImage -> vmlinux
      
      PowerMac OpenFirmware apparently does not handle the ELF sections
      correctly.  If yaboot was compiled in
      /usr/src/packages/BUILD/lilo-10.1.1/yaboot, then the stackpointer is
      unaligned. But the stackpointer is correct if yaboot is compiled in
      /tmp/yaboot.
      
      This bug has triggered since 2.6.15, now that prom_getprop is an inline
      function. gcc clears the lower bits, instead of just clearing the
      upper 32 bits.
      Signed-off-by: Olaf Hering <olh@suse.de>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
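The addresses quoted in this commit message can be reproduced with a few lines of C. This is an illustrative sketch: `rlwinm_0_0_28()` and `truncate_to_u32()` are hypothetical helpers. `rlwinm r8,r24,0,0,28` rotates by zero and keeps IBM-numbered bits 0..28 of the 32-bit word, so it clears the low three bits, whereas a plain cast to u32 would only drop the upper 32 bits.

```c
#include <stdint.h>

/* Emulates "rlwinm ra,rs,0,0,28": no rotation, keep IBM bits 0..28,
 * i.e. clear the three least significant bits (8-byte alignment). */
static uint32_t rlwinm_0_0_28(uint32_t rs)
{
    return rs & 0xFFFFFFF8u;
}

/* What the cast to u32 was expected to do: drop only the upper 32 bits. */
static uint32_t truncate_to_u32(uint64_t ptr)
{
    return (uint32_t)ptr;
}
```

With the stack pointer 0x0023e6b4 and offset 120, the variable lives at 0x0023e72c; `truncate_to_u32()` preserves that address, while `rlwinm_0_0_28()` turns it into 0x0023e728, the value that was passed to OF.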
  21. 05 Mar, 2006 1 commit
  22. 24 Feb, 2006 3 commits
    • P
      powerpc: Implement accurate task and CPU time accounting · c6622f63
      Paul Mackerras committed
      This implements accurate task and cpu time accounting for 64-bit
      powerpc kernels.  Instead of accounting a whole jiffy of time to a
      task on a timer interrupt because that task happened to be running at
      the time, we now account time in units of timebase ticks according to
      the actual time spent by the task in user mode and kernel mode.  We
      also count the time spent processing hardware and software interrupts
      accurately.  This is conditional on CONFIG_VIRT_CPU_ACCOUNTING.  If
      that is not set, we do tick-based approximate accounting as before.
      
      To get this accurate information, we read either the PURR (processor
      utilization of resources register) on POWER5 machines, or the timebase
      on other machines on
      
      * each entry to the kernel from usermode
      * each exit to usermode
      * transitions between process context, hard irq context and soft irq
        context in kernel mode
      * context switches.
      
      On POWER5 systems with shared-processor logical partitioning we also
      read both the PURR and the timebase at each timer interrupt and
      context switch in order to determine how much time has been taken by
      the hypervisor to run other partitions ("steal" time).  Unfortunately,
      since we need values of the PURR on both threads at the same time to
      accurately calculate the steal time, and since we can only calculate
      steal time on a per-core basis, the apportioning of the steal time
      between idle time (time which we ceded to the hypervisor in the idle
      loop) and actual stolen time is somewhat approximate at the moment.
      
      This is all based quite heavily on what s390 does, and it uses the
      generic interfaces that were added by the s390 developers,
      i.e. account_system_time(), account_user_time(), etc.
      
      This patch doesn't add any new interfaces between the kernel and
      userspace, and doesn't change the units in which time is reported to
      userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(),
      times(), etc.  Internally the various task and cpu times are stored in
      timebase units, but they are converted to USER_HZ units (1/100th of a
      second) when reported to userspace.  Some precision is therefore lost
      but there should not be any accumulating error, since the internal
      accumulation is at full precision.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
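The last point, that precision is lost only at reporting time and does not accumulate, can be shown with a short sketch. Everything here is illustrative: `struct task_times`, `account_interval()` and `tb_to_user_hz()` are hypothetical names, and the timebase frequency is passed in as a parameter because it differs between machines.

```c
#include <stdint.h>

#define USER_HZ 100   /* userspace reporting unit: 1/100th of a second */

/* Hypothetical per-task accumulators, kept in timebase ticks. */
struct task_times {
    uint64_t utime_tb;  /* user-mode time */
    uint64_t stime_tb;  /* kernel-mode time */
};

/* Charge the interval between two timebase readings to one bucket. */
static void account_interval(uint64_t *bucket, uint64_t tb_start,
                             uint64_t tb_end)
{
    *bucket += tb_end - tb_start;
}

/* Convert accumulated ticks to USER_HZ units only when reporting.
 * Whole seconds and the remainder are converted separately to avoid
 * overflowing the multiplication. */
static uint64_t tb_to_user_hz(uint64_t ticks, uint64_t tb_hz)
{
    return (ticks / tb_hz) * USER_HZ + (ticks % tb_hz) * USER_HZ / tb_hz;
}
```

With an assumed timebase of 1000 ticks per second, three intervals of 4 ticks each sum to 12 ticks and are reported as 1 USER_HZ unit. Converting each interval separately would report 0 three times, which is the kind of accumulating error that full-precision internal accumulation avoids.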
    • A
      [PATCH] powerpc64: remove broken/bitrotted HMT support · f1870f77
      Anton Blanchard committed
      HMT support is currently broken and needs to be reworked to play nicely
      with the SMT scheduler. Remove the bit rotten bits for the time being.
      
      I also updated an incorrect comment: we enter __secondary_hold with the
      physical cpu id in r3.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • A
      [PATCH] powerpc: Fix runlatch performance issues · cb2c9b27
      Anton Blanchard committed
      The runlatch SPR can take a lot of time to write. My original runlatch
      code would set it on every exception entry even though most of the time
      this was not required. It would also continually set it in the idle
      loop, which is an issue on an SMT capable processor.
      
      Now we cache the runlatch value in a threadinfo bit, and only check for
      it in decrementer and hardware interrupt exceptions as well as the idle
      loop. Boot tested on POWER3, POWER5 and iSeries, and compile tested on pmac32.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
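The caching idea in this commit can be sketched as follows. This is a user-space model under stated assumptions, not the kernel code: `struct thread_model` and the `model_runlatch_*` functions are hypothetical, the cached flag stands for the threadinfo bit, and `spr_writes` counts the slow SPR writes that the cache avoids.

```c
#include <stdbool.h>

/* Hypothetical model of one thread's runlatch state (not kernel code). */
struct thread_model {
    bool runlatch_cached;  /* models the cached bit in thread_info */
    int spr_writes;        /* number of expensive SPR writes performed */
};

/* Called on decrementer and hardware interrupt entry in this model:
 * write the SPR only if the cached bit says the runlatch is off. */
static void model_runlatch_on(struct thread_model *t)
{
    if (t->runlatch_cached)
        return;
    t->spr_writes++;
    t->runlatch_cached = true;
}

/* Called from the idle loop: write the SPR only if it is currently on. */
static void model_runlatch_off(struct thread_model *t)
{
    if (!t->runlatch_cached)
        return;
    t->spr_writes++;
    t->runlatch_cached = false;
}
```

Repeated calls in the same state become cheap flag checks, so three consecutive `model_runlatch_on()` calls cost one SPR write instead of three.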
  23. 20 Feb, 2006 1 commit
  24. 10 Feb, 2006 1 commit
  25. 07 Feb, 2006 1 commit
  26. 13 Jan, 2006 2 commits
    • D
      [PATCH] powerpc: Remove lppaca structure from the PACA · 3356bb9f
      David Gibson committed
      At present the lppaca - the structure shared with the iSeries
      hypervisor and phyp - is contained within the PACA, our own low-level
      per-cpu structure.  This doesn't have to be so; the patch below
      removes it, making a separate array of lppaca structures.
      
      This saves approximately 500*NR_CPUS bytes of image size and kernel
      memory, because we don't need an aligning gap between the Linux and
      hypervisor portions of every PACA.  On the other hand it means an
      extra level of dereference in many accesses to the lppaca.
      
      The patch also gets rid of several places where we assign the paca
      address to a local variable for no particular reason.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • D
      [PATCH] powerpc: Cleanup LOADADDR etc. asm macros · e58c3495
      David Gibson committed
      This patch consolidates the variety of macros used for loading 32 or
      64-bit constants in assembler (LOADADDR, LOADBASE, SET_REG_TO_*).  The
      idea is to make the set of macros consistent across 32 and 64 bit and
      to make it more obvious which is the appropriate one to use in a given
      situation.  The new macros and their semantics are described in the
      comments in ppc_asm.h.
      
      In the process, we change several places that were unnecessarily using
      immediate loads on ppc64 to use the GOT/TOC.  Likewise we clean up a
      couple of places where we were clumsily subtracting PAGE_OFFSET with
      asm instructions to use assemble-time arithmetic or the toreal() macro
      instead.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  27. 09 Jan, 2006 6 commits
  28. 10 Nov, 2005 1 commit
  29. 07 Nov, 2005 1 commit
    • B
      [PATCH] ppc64: support 64k pages · 3c726f8d
      Benjamin Herrenschmidt committed
      Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel
      base page size to 64K.  The resulting kernel still boots on any
      hardware.  On current machines with 4K pages support only, the kernel
      will maintain 16 "subpages" for each 64K page transparently.
      
      Note that while real 64K capable HW has been tested, the current patch
      will not enable it yet as such hardware is not released yet, and I'm
      still verifying with the firmware architects the proper way to get the
      information from the newer hypervisors.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
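The "16 subpages" figure follows directly from the two page sizes, as the arithmetic sketch below shows. The macro and function names (`HW_PAGE_SHIFT_4K`, `BASE_PAGE_SHIFT_64K`, `subpage_index()`, `base_page_start()`) are hypothetical and chosen for illustration; they are not the identifiers used by the patch.

```c
#include <stdint.h>

#define HW_PAGE_SHIFT_4K    12  /* 4K hardware page */
#define BASE_PAGE_SHIFT_64K 16  /* 64K kernel base page */
#define SUBPAGES_PER_PAGE \
    (1u << (BASE_PAGE_SHIFT_64K - HW_PAGE_SHIFT_4K))  /* 64K / 4K */

/* Which 4K subpage of its 64K page does this address fall into? */
static unsigned subpage_index(uint64_t addr)
{
    return (unsigned)((addr >> HW_PAGE_SHIFT_4K) & (SUBPAGES_PER_PAGE - 1));
}

/* Start address of the 64K base page containing addr. */
static uint64_t base_page_start(uint64_t addr)
{
    return addr & ~((1ULL << BASE_PAGE_SHIFT_64K) - 1);
}
```

For the address 0x12345, `base_page_start()` gives 0x10000 and `subpage_index()` gives 2, so on 4K-only hardware that access would be backed by the third of the 16 subpages of its 64K page.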