1. 13 Dec, 2008 1 commit
  2. 14 Jul, 2008 1 commit
  3. 26 Jun, 2008 1 commit
  4. 14 May, 2008 1 commit
    • M
      [POWERPC] Fix sparse warnings in arch/powerpc/kernel · 1c21a293
      Committed by Michael Ellerman
      Make a few things static in lparcfg.c
      Make init and exit routines static in rtas_flash.c
      Make things static in rtas_pci.c
      Make some functions static in rtas.c
      Make fops static in rtas-proc.c
      Remove unneeded extern for do_gtod in smp.c
      Make clocksource_init() static in time.c
      Make last_tick_len and ticklen_to_xs static in time.c
      Move the declaration of the pvr per-cpu into smp.h
      Make kexec_smp_down() and kexec_stack static in machine_kexec_64.c
      Don't return void in arch_teardown_msi_irqs() in msi.c
      Move declaration of GregorianDay() into asm/time.h
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      1c21a293
  5. 01 May, 2008 2 commits
  6. 07 Feb, 2008 1 commit
    • M
      taskstats scaled time cleanup · 06b8e878
      Committed by Michael Neuling
      This moves the ability to scale cputime into generic code.  This allows us
      to fix the issue in kernel/timer.c (noticed by Balbir) where we could only
      add an unscaled value to the scaled utime/stime.
      
      This adds a cputime_to_scaled function.  As before, the POWERPC version
      does the scaling based on the last SPURR/PURR ratio calculated.  The
      generic and s390 (only other arch to implement asm/cputime.h) versions are
      both NOPs.
      
      Also moves the SPURR and PURR snapshots closer.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      06b8e878
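The scaling idea in 06b8e878 can be sketched in user-space C. This is only an illustration: `ratio_snapshot`, `cputime_to_scaled_sketch` and `cputime_to_scaled_nop` are hypothetical names, and the integer arithmetic is an assumption rather than the kernel's actual code.

```c
#include <stdint.h>

/* Hypothetical snapshot of the most recent SPURR and PURR deltas. */
struct ratio_snapshot {
    uint64_t spurr_delta;
    uint64_t purr_delta;
};

/*
 * POWERPC-style flavour: scale a cputime value by the last SPURR/PURR
 * ratio. Falls back to the unscaled value when no ratio is available.
 * The multiplication can overflow for very large inputs; a real
 * implementation would have to guard against that.
 */
uint64_t cputime_to_scaled_sketch(uint64_t cputime,
                                  const struct ratio_snapshot *snap)
{
    if (snap->purr_delta == 0)
        return cputime;
    return cputime * snap->spurr_delta / snap->purr_delta;
}

/* Generic and s390 flavour as described in the commit message: a NOP. */
uint64_t cputime_to_scaled_nop(uint64_t cputime)
{
    return cputime;
}
```

With a SPURR delta of 500 against a PURR delta of 1000, a cputime of 200 scales to 100, while the NOP flavour returns 200 unchanged.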
  7. 21 Dec, 2007 1 commit
    • S
      [POWERPC] Implement arch disable/enable irq hooks. · 7ac5dde9
      Committed by Scott Wood
      These hooks ensure that a decrementer interrupt is not pending when
      suspending; otherwise, problems may occur on 6xx/7xx/7xxx-based
      systems (except for powermacs, which use a separate suspend path).
      For example, with deep sleep on the 831x, a pending decrementer will
      cause a system freeze because the SoC thinks the decrementer interrupt
      would have woken the system, but the core must have interrupts
      disabled due to the setup required for deep sleep.
      
      Changed via-pmu.c to use the new ppc_md hooks, and made the arch_*
      functions call the generic_* functions unconditionally.  -- paulus
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      7ac5dde9
  8. 20 Dec, 2007 5 commits
  9. 20 Nov, 2007 1 commit
  10. 13 Nov, 2007 1 commit
  11. 10 Nov, 2007 1 commit
    • P
      sched: restore deterministic CPU accounting on powerpc · fa13a5a1
      Committed by Paul Mackerras
      Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
      deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
      broken on powerpc, because we end up counting user time twice: once in
      timer_interrupt() and once in update_process_times().
      
      This fixes the problem by pulling the code in update_process_times
      that updates utime and stime into a separate function called
      account_process_tick.  If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
      there is a version of account_process_tick in kernel/timer.c that
      simply accounts a whole tick to either utime or stime as before.  If
      CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
      implement account_process_tick.
      
      This also lets us simplify the s390 code a bit; it means that the s390
      timer interrupt can now call update_process_times even when
      CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
      suitable account_process_tick().
      
      account_process_tick() now takes the task_struct * as an argument.
      Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      fa13a5a1
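The split described in fa13a5a1 might look roughly like the following user-space sketch of the generic (non-CONFIG_VIRT_CPU_ACCOUNTING) path. The `task_sketch` type, the `_sketch` function names and the tick value of 1 are assumptions made for illustration; the real functions operate on `struct task_struct` and cputime types.

```c
#include <stdint.h>

/* Stand-in for the utime/stime fields of a task (hypothetical type). */
struct task_sketch {
    uint64_t utime;
    uint64_t stime;
};

/* One whole tick, in arbitrary cputime units (assumed value). */
#define TICK_CPUTIME 1

/*
 * Generic flavour: charge a whole tick to either utime or stime.
 * With CONFIG_VIRT_CPU_ACCOUNTING the arch would supply its own version.
 */
void account_process_tick_sketch(struct task_sketch *p, int user_tick)
{
    if (user_tick)
        p->utime += TICK_CPUTIME;
    else
        p->stime += TICK_CPUTIME;
}

/* The caller no longer updates utime/stime itself; it delegates. */
void update_process_times_sketch(struct task_sketch *p, int user_tick)
{
    account_process_tick_sketch(p, user_tick);
    /* timer and scheduler work omitted in this sketch */
}
```

Because the accounting lives in one replaceable function, an architecture that accounts time precisely elsewhere can make its version do nothing extra, which avoids counting user time twice.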
  12. 08 Nov, 2007 1 commit
    • P
      [POWERPC] Fix off-by-one error in setting decrementer on Book E/4xx (v2) · 43875cc0
      Committed by Paul Mackerras
      The decrementer in Book E and 4xx processors interrupts on the
      transition from 1 to 0, rather than on the 0 to -1 transition as on
      64-bit server and 32-bit "classic" (6xx/7xx/7xxx) processors.  At the
      moment we subtract 1 from the count of how many decrementer ticks are
      required before the next interrupt before putting it into the
      decrementer, which is correct for server/classic processors, but could
      possibly cause the interrupt to happen too early on Book E and 4xx if
      the timebase/decrementer frequency is low.
      
      This fixes the problem by making set_dec subtract 1 from the count for
      server and classic processors, instead of having the callers subtract
      1.  Since set_dec already had a bunch of ifdefs to handle different
      processor types, there is no net increase in ugliness. :)
      
      Note that calling set_dec(0) may not generate an interrupt on some
      processors.  To make sure that decrementer_set_next_event always calls
      set_dec with an interval of at least 1 tick, we set min_delta_ns of
      the decrementer_clockevent to correspond to 2 ticks (2 rather than 1
      to compensate for truncations in the conversions between ticks and
      ns).
      
      This also removes a redundant call to set the decrementer to
      0x7fffffff - it was already set to that earlier in timer_interrupt.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      43875cc0
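The off-by-one in 43875cc0 can be modelled in a few lines of C. This is a simplified model, not the kernel's `set_dec`: the enum, `dec_value_for` and `ticks_until_interrupt` are hypothetical helpers, and the requested tick count is assumed to be at least 1.

```c
#include <stdint.h>

/* Two decrementer behaviours described in the commit (hypothetical enum). */
enum cpu_family_sketch {
    CPU_SERVER_CLASSIC, /* interrupts on the 0 -> -1 transition */
    CPU_BOOKE_4XX       /* interrupts on the 1 -> 0 transition  */
};

/* Value to load so that the interrupt arrives after `ticks` ticks. */
uint32_t dec_value_for(enum cpu_family_sketch fam, uint32_t ticks)
{
    if (fam == CPU_BOOKE_4XX)
        return ticks;
    return ticks - 1; /* the subtraction now lives in one place */
}

/* How many ticks elapse before a loaded value raises the interrupt. */
uint32_t ticks_until_interrupt(enum cpu_family_sketch fam, uint32_t loaded)
{
    return fam == CPU_BOOKE_4XX ? loaded : loaded + 1;
}
```

In this model, loading `ticks - 1` on a Book E or 4xx part makes the interrupt arrive one tick early, which is the behaviour the commit removes by moving the subtraction into the processor-specific code.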
  13. 19 Oct, 2007 1 commit
    • M
      powerpc: add scaled time accounting · 4603ac18
      Committed by Michael Neuling
      This adds POWERPC specific hooks for scaled time accounting.
      
      POWER6 includes a SPURR register.  The SPURR is based off the PURR register
      but is scaled based on CPU frequency and issue rates.  This gives a more
      accurate account of the instructions used per task.  The PURR and timebase
      will be constant relative to the wall clock, irrespective of the CPU
      frequency.
      
      This implementation reads the SPURR register in account_system_vtime, which
      is only called on context switch and on hard and soft irq entry and exit.
      The percentage of user and system time is then estimated using the ratio of
      these accounted by the PURR.  If the SPURR is not present, the PURR is read.
      
      An earlier implementation of this patch read the SPURR whenever the PURR
      was read, which included the system call entry and exit path.
      Unfortunately this showed a performance regression on lmbench runs, so was
      re-implemented.
      
      I've included the lmbench results here when run bare metal on POWER6.  1st
      column is the unpatched results.  2nd column is the results using the below
      patch and the 3rd is the % diff of these results from the base.  4th and
      5th columns are the results and % difference from the base using the older
      patch (SPURR read in syscall entry/exit path).
      
                                    Base        Scaled-Acct     SPURR-in-syscall
                                   Result      Result  % diff    Result % diff
      Simple syscall:              0.3086      0.3086  0.0000    0.3452 11.8600
      Simple read:                 0.4591      0.4671  1.7425    0.5044 9.86713
      Simple write:                0.4364      0.4366  0.0458    0.4731 8.40971
      Simple stat:                 2.0055      2.0295  1.1967    2.0669 3.06158
      Simple fstat:                0.5962      0.5876  -1.442    0.6368 6.80979
      Simple open/close:           3.1283      3.1009  -0.875    3.2088 2.57328
      Select on 10 fd's:           0.8554      0.8457  -1.133    0.8667 1.32101
      Select on 100 fd's:          3.5292      3.6329  2.9383    3.6664 3.88756
      Select on 250 fd's:          7.9097      8.1881  3.5197    8.2242 3.97613
      Select on 500 fd's:          15.2659     15.836  3.7357    15.873 3.97814
      Select on 10 tcp fd's:       0.9576      0.9416  -1.670    0.9752 1.83792
      Select on 100 tcp fd's:      7.248       7.2254  -0.311    7.2685 0.28283
      Select on 250 tcp fd's:      17.7742     17.707  -0.375    17.749 -0.1406
      Select on 500 tcp fd's:      35.4258     35.25   -0.496    35.286 -0.3929
      Signal handler installation: 0.6131      0.6075  -0.913    0.647  5.52927
      Signal handler overhead:     2.0919      2.1078  0.7600    2.1831 4.35967
      Protection fault:            0.7345      0.7478  1.8107    0.8031 9.33968
      Pipe latency:                33.006      16.398  -50.31    33.475 1.42368
      AF_UNIX sock stream latency: 14.5093     30.910  113.03    30.715 111.692
      Process fork+exit:           219.8       222.8   1.3648    229.37 4.35623
      Process fork+execve:         876.14      873.28  -0.32     868.66 -0.8533
      Process fork+/bin/sh -c:     2830        2876.5  1.6431    2958   4.52296
      File /var/tmp/XXX write bw:  1193497     1195536 0.1708    118657 -0.5799
      Pagefaults on /var/tmp/XXX:  3.1272      3.2117  2.7020    3.2521 3.99398
      
      Also, kernel compile times show no difference with this patch applied.
      
      [pbadari@us.ibm.com: Avoid unnecessary PURR reading]
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4603ac18
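One way to read the estimation step in 4603ac18 is as a proportional split: the SPURR delta measured at a context switch or irq boundary is divided between user and system time in the same ratio as the PURR-accounted user and system time. The function below is a sketch under that reading; `split_scaled_sketch` and its parameters are hypothetical, and charging everything to system time when no PURR time was accounted is an assumption.

```c
#include <stdint.h>

/*
 * Split a scaled (SPURR) delta between user and system time in
 * proportion to the PURR-accounted user and system time.
 */
void split_scaled_sketch(uint64_t delta_scaled,
                         uint64_t user_purr, uint64_t sys_purr,
                         uint64_t *user_scaled, uint64_t *sys_scaled)
{
    uint64_t total = user_purr + sys_purr;

    if (total == 0) {
        /* Nothing accounted yet: charge it all to system time (assumption). */
        *user_scaled = 0;
        *sys_scaled = delta_scaled;
        return;
    }
    *user_scaled = delta_scaled * user_purr / total;
    /* Give the remainder to system time so the two parts sum exactly. */
    *sys_scaled = delta_scaled - *user_scaled;
}
```

Reading the SPURR only at these boundaries, rather than on every syscall entry and exit, is what the lmbench table above is meant to justify.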
  14. 17 Oct, 2007 1 commit
  15. 11 Oct, 2007 2 commits
    • P
      [POWERPC] Make clockevents work on PPC601 processors · cdec12ae
      Committed by Paul Mackerras
      In testing the new clocksource and clockevent code on a PPC601
      processor, I discovered that the clockevent multiplier value for the
      decrementer clockevent was overflowing.  Because the RTCL register in
      the 601 effectively counts at 1GHz (it doesn't actually, but it
      increases by 128 every 128ns), and the shift value was 32, that meant
      the multiplier value had to be 2^32, which won't fit in an unsigned
      long on 32-bit.  The same problem would arise on any platform where
      the timebase frequency was 1GHz or more (not that we actually have any
      such machines today).
      
      This fixes it by reducing the shift value to 16.  Doing the
      calculations with a resolution of 2^-16 nanoseconds (15 femtoseconds)
      should be quite adequate.  :)
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      cdec12ae
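The overflow in cdec12ae follows from the usual clockevent scaling, where the multiplier is the device frequency shifted left by the shift value and divided by the number of nanoseconds per second. The helper names below are hypothetical and the code runs in user space with 64-bit arithmetic so the overflowing value can be inspected.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/* mult = (freq << shift) / NSEC_PER_SEC, computed in 64 bits. */
uint64_t clockevent_mult_sketch(uint64_t freq_hz, unsigned int shift)
{
    return (freq_hz << shift) / NSEC_PER_SEC_SKETCH;
}

/* Would the multiplier fit in a 32-bit unsigned long? */
int fits_in_u32(uint64_t value)
{
    return value <= UINT32_MAX;
}
```

At 1 GHz with a shift of 32 the multiplier is exactly 2^32, one more than a 32-bit unsigned long can hold; with a shift of 16 it is 65536, and the resolution of 2^-16 ns is about 15 femtoseconds, matching the commit message.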
    • P
      [POWERPC] Prevent decrementer clockevents from firing early · d968014b
      Committed by Paul Mackerras
      On old powermacs, we sometimes set the decrementer to 1 in order to
      trigger a decrementer interrupt, which we use to handle an interrupt
      that was pending at the time when it was re-enabled.  This was causing
      the decrementer clock event device to call the event function for the
      next event early, which was causing problems when high-res timers were
      not enabled.
      
      This fixes the problem by recording the timebase value at which the
      next event should occur, and checking the current timebase against the
      recorded value in timer_interrupt.  If it isn't time for the next
      event, it just reprograms the decrementer and returns.
      
      This also subtracts 1 from the value stored into the decrementer,
      which is appropriate because the decrementer interrupts on the
      transition from 0 to -1, not when the decrementer reaches 0.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      d968014b
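The guard described in d968014b can be sketched as follows. The `dec_sketch` struct, the return convention and the clamp to 0x7fffffff are illustrative assumptions; the real code keeps the recorded timebase value per CPU and programs the hardware decrementer.

```c
#include <stdint.h>

#define DECREMENTER_MAX_SKETCH 0x7fffffffULL

/* Hypothetical per-CPU state for the decrementer clock event device. */
struct dec_sketch {
    uint64_t next_tb;        /* timebase value at which the next event is due */
    uint32_t dec;            /* last value programmed into the decrementer    */
    unsigned int events_run; /* how many times the event handler has run      */
};

/*
 * Returns 1 if the event handler ran, or 0 if the interrupt arrived early
 * and the decrementer was simply reprogrammed for the remaining time.
 */
int timer_interrupt_sketch(struct dec_sketch *s, uint64_t now_tb)
{
    if (now_tb < s->next_tb) {
        uint64_t remaining = s->next_tb - now_tb;

        if (remaining > DECREMENTER_MAX_SKETCH)
            remaining = DECREMENTER_MAX_SKETCH;
        /* Subtract 1: the interrupt fires on the 0 -> -1 transition. */
        s->dec = (uint32_t)remaining - 1;
        return 0;
    }
    s->events_run++;
    return 1;
}
```

An interrupt forced early, for example by setting the decrementer to 1 on an old powermac, therefore only reprograms the decrementer instead of running the next event ahead of time.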
  16. 03 Oct, 2007 3 commits
  17. 19 Sep, 2007 1 commit
  18. 20 Aug, 2007 1 commit
  19. 17 Aug, 2007 1 commit
  20. 21 Jul, 2007 1 commit
  21. 10 Jul, 2007 1 commit
    • T
      [POWERPC] Modify sched_clock() to make CONFIG_PRINTK_TIME more sane · fc9069fe
      Committed by Tony Breeds
      When booting a current kernel with CONFIG_PRINTK_TIME enabled you'll
      see messages like:
      
      [    0.000000] time_init: decrementer frequency = 188.044000 MHz
      [    0.000000] time_init: processor frequency   = 1504.352000 MHz
      [3712914.436297] Console: colour dummy device 80x25
      
      This is caused by the initialisation of tb_to_ns_scale in time_init(); suddenly the
      multiplication in sched_clock() now does something :).  This patch modifies
      sched_clock() to report the offset since the machine booted so the same
      printk's now look like:
      
      [    0.000000] time_init: decrementer frequency = 188.044000 MHz
      [    0.000000] time_init: processor frequency   = 1504.352000 MHz
      [    0.000135] Console: colour dummy device 80x25
      
      Effectively including the uptime in printk()s.
      
      This patch makes tb_to_ns_scale and tb_to_ns_shift static and
      read_mostly for good measure.
      Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      fc9069fe
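The effect of fc9069fe can be illustrated by converting timebase ticks to nanoseconds with and without subtracting the timebase value recorded at boot. This sketch uses plain division instead of the kernel's scale-and-shift multiplication; all function names are hypothetical and `freq_hz` is assumed to be non-zero.

```c
#include <stdint.h>

#define NS_PER_SEC_SKETCH 1000000000ULL

/* Convert ticks to ns; split into seconds and remainder to limit overflow. */
uint64_t tb_ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    uint64_t secs = ticks / freq_hz;
    uint64_t rem = ticks % freq_hz;

    return secs * NS_PER_SEC_SKETCH + rem * NS_PER_SEC_SKETCH / freq_hz;
}

/* Old behaviour: the raw timebase, which need not be zero at boot. */
uint64_t sched_clock_raw_sketch(uint64_t tb, uint64_t freq_hz)
{
    return tb_ticks_to_ns(tb, freq_hz);
}

/* New behaviour: time elapsed since the timebase value sampled at boot. */
uint64_t sched_clock_since_boot_sketch(uint64_t tb, uint64_t boot_tb,
                                       uint64_t freq_hz)
{
    return tb_ticks_to_ns(tb - boot_tb, freq_hz);
}
```

With the 188.044 MHz decrementer frequency from the log above, a timebase that already holds a large value at boot yields a large raw timestamp, while the offset version starts from zero.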
  22. 28 Jun, 2007 1 commit
    • T
      [POWERPC] Move iSeries_tb_recal into its own late_initcall. · 71712b45
      Committed by Tony Breeds
      Currently iSeries will recalibrate the cputime_factors in the first
      settimeofday() call.
      
      It seems the reason for doing this is to ensure a reasonable time delta after
      time_init().  On current kernels (with udev), this call is made 40-60 seconds
      into the boot process; by moving it to a late initcall, it is called
      approximately 5 seconds after time_init() is called.  This is sufficient to
      recalibrate the timebase.
      Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
      CC: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      71712b45
  23. 25 Jun, 2007 2 commits
  24. 12 May, 2007 1 commit
    • W
      [POWERPC] Simplify smp_space_timers · e147ec8f
      Committed by will schmidt
      Greatly simplify the function smp_space_timers.
      
      The stolen time calculation (per comment within the code) doesn't need the
      half-jiffy stagger any more.  There isn't an issue with bouncing off global
      locks, so we really shouldn't need any sort of staggering at all.
      
      However, the last_jiffy value still needs to be set.   This removes the
      extra stagger logic, and just sets the values.
      
      This change should benefit applications that rely on barrier
      synchronization, and will help cut down OS jitter.
      
      Boot tested across the board (G5,power3,power4,power5,970mp blade).
      Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      e147ec8f
  25. 13 Apr, 2007 1 commit
  26. 04 Dec, 2006 1 commit
  27. 22 Nov, 2006 1 commit
  28. 23 Oct, 2006 1 commit
    • S
      [POWERPC] Simplify stolen time calculation · cbcdb93d
      Committed by Stephen Rothwell
      In calculating stolen time, we were trying to actually account for time
      spent in the hypervisor.  We don't really have enough information to do
      that accurately, so don't try.  Instead, we now calculate stolen time as
      time that the current cpu thread is not actually dispatching instructions.
      On chips without a PURR, we cannot do this, so stolen time will always
      be zero.  On chips with a PURR, this is merely the difference between
      the elapsed PURR values and the elapsed TB values.
      
      This gives us much more sane values from tools such as mpstat, even if
      they are still a bit strange e.g. 2 busy threads on one cpu will both
      appear to have 50% user time and 50% stolen time while 1 busy thread on
      a cpu will look like 100% user on one of them and 100% idle on the other.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      cbcdb93d
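The simplified calculation in cbcdb93d amounts to a subtraction of two elapsed counters. The function below is a sketch: `stolen_time_sketch` and its parameters are hypothetical, and clamping to zero when the PURR delta is not smaller than the timebase delta is an added assumption.

```c
#include <stdint.h>

/*
 * Stolen time as the part of the elapsed timebase (TB) during which this
 * cpu thread was not dispatching instructions, as measured by the PURR.
 * Without a PURR, stolen time is always zero.
 */
uint64_t stolen_time_sketch(int has_purr, uint64_t tb_delta,
                            uint64_t purr_delta)
{
    if (!has_purr || purr_delta >= tb_delta)
        return 0;
    return tb_delta - purr_delta;
}
```

This reproduces the mpstat oddity mentioned in the message: two busy threads sharing a cpu each see roughly half of the timebase delta in their PURR, so each reports about 50% stolen time.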
  29. 07 Oct, 2006 1 commit
  30. 05 Oct, 2006 1 commit
    • D
      IRQ: Maintain regs pointer globally rather than passing to IRQ handlers · 7d12e780
      Committed by David Howells
      Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
      of passing regs around manually through all ~1800 interrupt handlers in the
      Linux kernel.
      
      The regs pointer is used in few places, but it potentially costs both stack
      space and code to pass it around.  On the FRV arch, removing the regs parameter
      from all the genirq function results in a 20% speed up of the IRQ exit path
      (ie: from leaving timer_interrupt() to leaving do_IRQ()).
      
      Where appropriate, an arch may override the generic storage facility and do
      something different with the variable.  On FRV, for instance, the address is
      maintained in GR28 at all times inside the kernel as part of general exception
      handling.
      
      Having looked over the code, it appears that the parameter may be handed down
      through up to twenty or so layers of functions.  Consider a USB character
      device attached to a USB hub, attached to a USB controller that posts its
      interrupts through a cascaded auxiliary interrupt controller.  A character
      device driver may want to pass regs to the sysrq handler through the input
      layer which adds another few layers of parameter passing.
      
      I've built this code with allyesconfig for x86_64 and i386.  I've run-tested the
      main part of the code on FRV and i386, though I can't test most of the drivers.
      I've also done partial conversion for powerpc and MIPS - these at least compile
      with minimal configurations.
      
      This will affect all archs.  Mostly the changes should be relatively easy.
      Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
      
      	struct pt_regs *old_regs = set_irq_regs(regs);
      
      And put the old one back at the end:
      
      	set_irq_regs(old_regs);
      
      Don't pass regs through to generic_handle_irq() or __do_IRQ().
      
      In timer_interrupt(), this sort of change will be necessary:
      
      	-	update_process_times(user_mode(regs));
      	-	profile_tick(CPU_PROFILING, regs);
      	+	update_process_times(user_mode(get_irq_regs()));
      	+	profile_tick(CPU_PROFILING);
      
      I'd like to move update_process_times()'s use of get_irq_regs() into itself,
      except that i386, alone of the archs, uses something other than user_mode().
      
      Some notes on the interrupt handling in the drivers:
      
       (*) input_dev() is now gone entirely.  The regs pointer is no longer stored in
           the input_dev struct.
      
       (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking.  It does
           something different depending on whether it's been supplied with a regs
           pointer or not.
      
       (*) Various IRQ handler function pointers have been moved to type
           irq_handler_t.
      Signed-Off-By: David Howells <dhowells@redhat.com>
      (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
      7d12e780
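The save and restore pattern from 7d12e780 can be imitated in user space with a single global pointer standing in for the per-CPU variable. Everything with a `_sketch` suffix is a hypothetical stand-in; only the save, call and restore shape is taken from the commit message.

```c
#include <stddef.h>

/* Minimal stand-in for struct pt_regs (hypothetical fields). */
struct pt_regs_sketch {
    int user_mode;
};

/* One global slot here; the kernel keeps one such pointer per CPU. */
static struct pt_regs_sketch *irq_regs_slot;

struct pt_regs_sketch *get_irq_regs_sketch(void)
{
    return irq_regs_slot;
}

/* Install a new regs pointer and hand back the previous one. */
struct pt_regs_sketch *set_irq_regs_sketch(struct pt_regs_sketch *new_regs)
{
    struct pt_regs_sketch *old_regs = irq_regs_slot;

    irq_regs_slot = new_regs;
    return old_regs;
}

int handler_saw_user_mode = -1;

/* A handler no longer takes regs; it looks them up when it needs them. */
void timer_handler_sketch(void)
{
    handler_saw_user_mode = get_irq_regs_sketch()->user_mode;
}

void do_IRQ_sketch(struct pt_regs_sketch *regs)
{
    struct pt_regs_sketch *old_regs = set_irq_regs_sketch(regs);

    timer_handler_sketch();
    set_irq_regs_sketch(old_regs); /* put the old pointer back for nesting */
}
```

Saving and restoring the previous pointer is what keeps nested interrupts consistent: an inner interrupt installs its own regs and the outer handler sees its original pointer again afterwards.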
  31. 02 Oct, 2006 1 commit