1. 02 May, 2009 1 commit
    • clocksource: setup mult_orig in clocksource_enable() · a25cbd04
      Magnus Damm committed
      Setup clocksource mult_orig in clocksource_enable().
      
      Clocksource drivers can save power by keeping the
      device clock disabled while the clocksource is unused.
      
      In practice this means that the enable() and disable()
      callbacks perform clk_enable() and clk_disable().
      
      The enable() callback may also use clk_get_rate() to get
      the clock rate from the clock framework. This information
      can then be used to calculate the shift and mult variables.
      
      Currently the mult_orig variable is set up from mult at
      registration time only. This conflicts with the above
      case since the clock is disabled and the mult variable is
      not yet calculated at the time of registration.
      
      Moving the mult_orig setup code to clocksource_enable()
      allows us to both handle the common case with no enable()
      callback and the mult-changed-after-enable() case.
      
      [ Impact: allow dynamic clock source usage ]
      Signed-off-by: Magnus Damm <damm@igel.co.jp>
      LKML-Reference: <20090501054546.8193.10688.sendpatchset@rx1.opensource.se>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  2. 22 April, 2009 2 commits
  3. 27 February, 2009 1 commit
  4. 26 February, 2009 14 commits
    • time: ntp: clean up second_overflow() · 39854fe8
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      The 'time_adj' local variable is named in a very confusing
      way because it almost shadows the 'time_adjust' global
      variable - which is used in this same function.
      
      Rename it to 'delta' - to make them stand apart more clearly.
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2545	    114	    144	   2803	    af3	ntp.o.before
         2545	    114	    144	   2803	    af3	ntp.o.after
      
      md5:
         1bf0b3be564512279ba7cee299d1d2be  ntp.o.before.asm
         1bf0b3be564512279ba7cee299d1d2be  ntp.o.after.asm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: simplify ntp_tick_adj calculations · 069569e0
      Ingo Molnar committed
      Impact: micro-optimization
      
      Convert the (internal) ntp_tick_adj value we store from unscaled
      units to scaled units. This is a constant that we never modify,
      so scaling it up once during bootup is enough - we don't have to
      do it for every adjustment step.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: make 64-bit constants more robust · 2b9d1496
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
       - make PPM_SCALE an explicit s64 constant, to
         remove (s64) casts from usage sites.
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2536	    114	    136	   2786	    ae2	ntp.o.before
         2536	    114	    136	   2786	    ae2	ntp.o.after
      
      md5:
         40a7728d1188aa18e83e21a81fa7b150  ntp.o.before.asm
         40a7728d1188aa18e83e21a81fa7b150  ntp.o.after.asm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
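Why an explicitly 64-bit constant removes the casts can be shown with a small sketch. The names and values below are assumptions for illustration: `PPM_SCALE_SKETCH` is built as 1000 shifted left by 16, which is how I understand the kernel constant to be derived, and `scale_freq_sketch()` is a hypothetical usage site.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed shape of the constant: nanoseconds per microsecond, shifted by
 * (32 - 16). Because the constant itself is 64-bit, every product that
 * involves it is computed in 64 bits without a cast at the call site.
 */
#define PPM_SCALE_SKETCH (INT64_C(1000) << (32 - 16))

static_assert(sizeof(PPM_SCALE_SKETCH) == 8, "constant must be 64-bit");

/* Hypothetical usage site: note the absence of an (int64_t) cast. */
static int64_t scale_freq_sketch(int32_t freq)
{
    return freq * PPM_SCALE_SKETCH;
}
```

Had the constant been a plain `int`, the multiplication of two 32-bit operands could overflow before the result was widened, which is the hazard the per-site casts used to guard against.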
    • time: ntp: refactor do_adjtimex() some more · e9629165
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      Further simplify do_adjtimex():
      
       - introduce the ntp_start_leap_timer() helper function
       - eliminate the goto adj_done complication
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: refactor do_adjtimex() · 80f22571
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      do_adjtimex() is currently a monster function with a maze of
      branches. Refactor the txc->modes setting aspects of it into
      two new helper functions:
      
      	process_adj_status()
      	process_adjtimex_modes()
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2512	    114	    136	   2762	    aca	ntp.o.before
         2512	    114	    136	   2762	    aca	ntp.o.after
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: fix bug in ntp_update_offset() & do_adjtimex() · 10dd31a7
      Ingo Molnar committed
      Impact: change (fix) the way the NTP PLL seconds offset is initialized/tracked
      
      Fix a bug and do a micro-optimization:
      
      When PLL is enabled we do not reset time_reftime. If the PLL
      was off for a long time (for example after bootup), this is
      arguably the wrong thing to do.
      
      We already had a hack for the common boot-time case in
      ntp_update_offset(), in form of:
      
      	if (unlikely(time_status & STA_FREQHOLD || time_reftime == 0))
       		secs = 0;
      
      But the update delta should be reset later on too - not just when
      the PLL is enabled for the first time after bootup.
      
      So do it on !STA_PLL -> STA_PLL transitions.
      
      This changes behavior, as previously if ntpd was disabled for
      a long time and we restarted it, we'd run from that last update,
      with a very large delta.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: micro-optimize ntp_update_offset() · c7986acb
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      The time_reftime update in ntp_update_offset() to xtime.tv_sec
      is a convoluted way of saying that we want to freeze the frequency
      and want the 'secs' delta to be 0. Also make this branch unlikely.
      
      This shaves off 8 bytes from the code size:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2496	    114	    136	   2746	    aba	ntp.o.after
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: simplify ntp_update_offset_fll() · 478b7aab
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      Change ntp_update_offset_fll() to delta logic instead of
      absolute value logic. This eliminates 'freq_adj' from the
      function.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: refactor and clean up ntp_update_offset() · f939890b
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      - introduce the ntp_update_offset_fll() helper
      - clean up the flow and variable naming
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2504	    114	    136	   2754	    ac2	ntp.o.after
      
      md5:
         01f7b8e1a5472a3056f9e4ae84d46315  ntp.o.before.asm
         01f7b8e1a5472a3056f9e4ae84d46315  ntp.o.after.asm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: refactor up ntp_update_frequency() · bc26c31d
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      Change ntp_update_frequency() from a hard to follow code
      flow that uses global variables as temporaries, to a clean
      input+output flow.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: clean up ntp_update_frequency() · 9ce616aa
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      Prepare a refactoring of ntp_update_frequency().
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2504	    114	    136	   2754	    ac2	ntp.o.after
      
      md5:
         41f3009debc9b397d7394dd77d912f0a  ntp.o.before.asm
         41f3009debc9b397d7394dd77d912f0a  ntp.o.after.asm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: simplify the MAX_TICKADJ_SCALED definition · bbd12676
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      There's an ugly u64 typecast in the MAX_TICKADJ_SCALED definition;
      this can be eliminated by making the MAX_TICKADJ constant's type
      64-bit (signed).
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2504	    114	    136	   2754	    ac2	ntp.o.before
         2504	    114	    136	   2754	    ac2	ntp.o.after
      
      md5:
         41f3009debc9b397d7394dd77d912f0a  ntp.o.before.asm
         41f3009debc9b397d7394dd77d912f0a  ntp.o.after.asm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: simplify the second_overflow() code flow · 3c972c24
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      Instead of a hierarchy of conditions, transform them to clean
      gradual conditions and returns.
      
      This makes the flow easier to read and makes the purpose of
      the function easier to understand.
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2552	    170	    168	   2890	    b4a	ntp.o.before
         2552	    170	    168	   2890	    b4a	ntp.o.after
      
      md5:
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.before.asm
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.after.asm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • time: ntp: clean up kernel/time/ntp.c · 53bbfa9e
      Ingo Molnar committed
      Impact: cleanup, no functionality changed
      
      Make this file a bit more readable by applying a consistent coding style.
      
      No code changed:
      
      kernel/time/ntp.o:
      
         text	   data	    bss	    dec	    hex	filename
         2552	    170	    168	   2890	    b4a	ntp.o.before
         2552	    170	    168	   2890	    b4a	ntp.o.after
      
      md5:
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.before.asm
         eae1275df0b7d6290c13f6f6f8f05c8c  ntp.o.after.asm
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 19 February, 2009 1 commit
    • time: apply NTP frequency/tick changes immediately · fdcedf7b
      john stultz committed
      Since the GENERIC_TIME changes landed, the adjtimex behavior for
      struct timex.tick and .freq has changed. When the tick or freq value
      is set, we adjust the tick_length_base in ntp_update_frequency().
      However, this new value doesn't get applied to tick_length until the
      next second (via second_overflow).
      
      This means some applications that do quick time tweaking do not see the
      requested change made as quickly as expected.
      
      I've run a few tests with this change, and ntpd still functions fine.
      Signed-off-by: John Stultz <johnstul@us.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 16 February, 2009 2 commits
    • timecompare: generic infrastructure to map between two time bases · a75244c3
      Patrick Ohly committed
      Mapping from a struct timecounter to a time returned by functions like
      ktime_get_real() is implemented. This is sufficient to use this code
      in a network device driver which wants to support hardware time
      stamping and transformation of hardware time stamps to system time.
      
      The interface could have been made more versatile by not depending on
      a time counter, but this wasn't done to avoid writing glue code
      elsewhere.
      
      The method implemented here is the one used and analyzed under the name
      "assisted PTP" in the LCI PTP paper:
      http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf
      Acked-by: John Stultz <johnstul@us.ibm.com>
      Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • clocksource: allow usage independent of timekeeping.c · a038a353
      Patrick Ohly committed
      So far struct clocksource acted as the interface between time/timekeeping.c
      and hardware. This patch generalizes the concept so that a similar
      interface can also be used in other contexts. For that it introduces
      new structures and related functions *without* touching the existing
      struct clocksource.
      
      The reasons for adding these new structures to clocksource.[ch] are
      * the APIs are clearly related
      * struct clocksource could be cleaned up to use the new structs
      * avoids proliferation of files with similar names (timesource.h?
        timecounter.h?)
      
      As outlined in the discussion with John Stultz, this patch adds
      * struct cyclecounter: stateless API to hardware which counts clock cycles
      * struct timecounter: stateful utility code built on a cyclecounter which
        provides a nanosecond counter
      * only the function to read the nanosecond counter; deltas are used internally
        and not exposed to users of timecounter
      
      The code does no locking of the shared state. It must be called at least
      as often as the cycle counter wraps around to detect these wrap arounds.
      Both are the responsibility of the timecounter user.
      Acked-by: John Stultz <johnstul@us.ibm.com>
      Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 31 January, 2009 1 commit
    • hrtimers: allow the hot-unplugging of all cpus · 94df7de0
      Sebastien Dugue committed
      Impact: fix CPU hotplug hang on Power6 testbox
      
      On architectures that support offlining all cpus (at least powerpc/pseries),
      hot-unplugging the tick_do_timer_cpu can result in a system hang.
      
      This comes from the fact that if the cpu going down happens to be the
      cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
      cpu is dead (via the CPU_DEAD notification), we're left without ticks,
      jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
      That's particularly the case for the cpu looping in __cpu_die() waiting
      for the dying cpu to be dead.
      
      This patch addresses this by having the tick_do_timer_cpu handover happen
      earlier during the CPU_DYING notification. For this, a new clockevent
      notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
      in hrtimer_cpu_notify().
      Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 16 January, 2009 1 commit
    • clockevents: let set_mode() setup delta information · 2d68259d
      Magnus Damm committed
      Allow the set_mode() clockevent callback to decide and fill in delta
      details such as shift, mult, max_delta_ns and min_delta_ns.
      
      With this change the clockevent can be registered without delta details
      which allows us to keep the parent clock disabled until the clockevent
      gets setup using set_mode().
      
      Letting set_mode() fill in or update delta details allows us to save
      power by disabling the parent clock while the clockevent is unused.
      This may, however, change the parent clock rate, so the next time the
      clockevent gets enabled we need to let set_mode() update the delta
      details accordingly. Doing it at registration time is not enough.
      
      Furthermore, the delta details seem unused in the case of periodic-only
      clockevent drivers, so this change also allows registration of such
      drivers without the delta details filled in.
      Signed-off-by: Magnus Damm <damm@igel.co.jp>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 15 January, 2009 1 commit
  10. 06 January, 2009 1 commit
  11. 01 January, 2009 2 commits
  12. 31 December, 2008 3 commits
    • [PATCH] idle cputime accounting · 79741dd3
      Martin Schwidefsky committed
      The cpu time spent by the idle process actually doing something is
      currently accounted as idle time. This is plain wrong, the architectures
      that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
      time spent doing nothing and the time spent by idle doing work. The first
      is accounted with account_idle_time and the second with account_system_time.
      The architectures that use the account_xxx_time interface directly and not
      the account_xxx_ticks interface now need to do the check for the idle
      process in their arch code. In particular to improve the system vs true
      idle time accounting the arch code needs to measure the true idle time
      instead of just testing for the idle process.
      To improve the tick based accounting as well we would need an architecture
      primitive that can tell us if the pt_regs of the interrupted context
      points to the magic instruction that halts the cpu.
      
      In addition, idle time is no longer added to the stime of the idle process.
      This field now contains the system time of the idle process as it should
      be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
      every tick that occurs while idle is running will be accounted as idle
      time.
      
      This patch contains the necessary common code changes to be able to
      distinguish idle system time and true idle time. The architectures with
      support for VIRT_CPU_ACCOUNTING need some changes to exploit this.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [PATCH] fix scaled & unscaled cputime accounting · 457533a7
      Martin Schwidefsky committed
      The utimescaled / stimescaled fields in the task structure and the
      global cpustat should be set on all architectures. On s390 the calls
      to account_user_time_scaled and account_system_time_scaled never have
      been added. In addition system time that is accounted as guest time
      to the user time of a process is accounted to the scaled system time
      instead of the scaled user time.
      To fix the bugs and to prevent future forgetfulness this patch merges
      account_system_time_scaled into account_system_time and
      account_user_time_scaled into account_user_time.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • sched_clock: prevent scd->clock from moving backwards, take #2 · 1c5745aa
      Thomas Gleixner committed
      Redo:
      
        5b7dba4f: sched_clock: prevent scd->clock from moving backwards
      
      which had to be reverted due to s2ram hangs:
      
        ca7e716c: Revert "sched_clock: prevent scd->clock from moving backwards"
      
      ... this time with resume restoring GTOD later in the sequence
      taken into account as well.
      
      The "timekeeping_suspended" flag is not very nice but we cannot call into
      GTOD before it has been properly resumed and the scheduler will run very
      early in the resume sequence.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 30 December, 2008 1 commit
    • hrtimers: allow the hot-unplugging of all cpus · 5762ba18
      Sebastien Dugue committed
      Impact: fix CPU hotplug hang on Power6 testbox
      
      On architectures that support offlining all cpus (at least powerpc/pseries),
      hot-unplugging the tick_do_timer_cpu can result in a system hang.
      
      This comes from the fact that if the cpu going down happens to be the
      cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
      cpu is dead (via the CPU_DEAD notification), we're left without ticks,
      jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
      That's particularly the case for the cpu looping in __cpu_die() waiting
      for the dying cpu to be dead.
      
      This patch addresses this by having the tick_do_timer_cpu handover happen
      earlier during the CPU_DYING notification. For this, a new clockevent
      notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
      in hrtimer_cpu_notify().
      Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 13 December, 2008 2 commits
  15. 12 December, 2008 2 commits
    • nohz: suppress needless timer reprogramming · 00147449
      Woodruff, Richard committed
      In my device I get many interrupts from a high speed USB device in a very
      short period of time.  The system spends a lot of time reprogramming the
      hardware timer which is in a slower timing domain as compared to the CPU. 
      This results in the CPU spending a huge amount of time waiting for the
      timer posting to be done.  All of this reprogramming is useless as the
      wake up time has not changed.
      
      As measured using ETM trace this drops my reprogramming penalty from
      almost 60% CPU load down to 15% during high interrupt rate.  I can send
      traces to show this.
      
      Suppress setting of duplicate timer event when timer already stopped. 
      Timer programming can be very costly and can result in long cpu stall/wait
      times.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [tglx@linutronix.de: move the check to the right place and avoid raising
      		     the softirq for nothing]
      Signed-off-by: Richard Woodruff <r-woodruff2@ti.com>
      Cc: johnstul@us.ibm.com
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • nohz: no softirq pending warnings for offline cpus · fa116ea3
      Heiko Carstens committed
      Impact: remove false positive warning
      
      After a cpu was taken down during cpu hotplug (read: disabled for interrupts)
      it still might have pending softirqs. However take_cpu_down makes sure
      that the idle task will run next instead of ksoftirqd on the taken down cpu.
      The idle task will call tick_nohz_stop_sched_tick which might warn about
      pending softirqs just before the cpu kills itself completely.
      
      However the pending softirqs on the dead cpu aren't a problem because they
      will be moved to an online cpu during CPU_DEAD handling.
      
      So make sure we warn only for online cpus.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  16. 04 December, 2008 1 commit
    • time: catch xtime_nsec underflows and fix them · 6c9bacb4
      john stultz committed
      Impact: fix time warp bug
      
      Alex Shi, along with Yanmin Zhang have been noticing occasional time
      inconsistencies recently. Through their great diagnosis, they found that
      the xtime_nsec value used in update_wall_time was occasionally going
      negative. After looking through the code for a while, I realized we have
      the possibility for an underflow when three conditions are met in
      update_wall_time():
      
        1) We have accumulated a second's worth of nanoseconds, so we
           incremented xtime.tv_sec and appropriately decrement xtime_nsec.
           (This doesn't cause xtime_nsec to go negative, but it can cause it
            to be small).
      
        2) The remaining offset value is large, but just slightly less than
           cycle_interval.
      
        3) clocksource_adjust() is speeding up the clock, causing a
           corrective amount (compensating for the increase in the multiplier
           being multiplied against the unaccumulated offset value) to be
           subtracted from xtime_nsec.
      
      This can cause xtime_nsec to underflow.
      
      Unfortunately, since we notify the NTP subsystem via second_overflow()
      whenever we accumulate a full second, and this affects the error
      accumulation that has already occurred, we cannot simply revert the
      accumulated second from xtime nor move the second accumulation to after
      the clocksource_adjust call without a change in behavior.
      
      This leaves us with (at least) two options:
      
      1) Simply return from clocksource_adjust() without making a change if we
         notice the adjustment would cause xtime_nsec to go negative.
      
      This would work, but I'm concerned that if a large adjustment was needed
      (due to the error being large), it may be possible to get stuck with an
      ever increasing error that becomes too large to correct (since it may
      always force xtime_nsec negative). This may just be paranoia on my part.
      
      2) Catch xtime_nsec if it is negative, then add back the amount it's
         negative to both xtime_nsec and the error.
      
      This second method is consistent with how we've handled earlier rounding
      issues, and also has the benefit that the error being added is always in
      the opposite direction and is always equal to or smaller than the correction
      being applied. So the risk of a corner case where things get out of
      control is lessened.
      
      This patch fixes bug 11970, as tested by Yanmin Zhang
      http://bugzilla.kernel.org/show_bug.cgi?id=11970
      
      Reported-by: alex.shi@intel.com
      Signed-off-by: John Stultz <johnstul@us.ibm.com>
      Acked-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 25 November, 2008 2 commits
    • hrtimer: removing all ur callback modes · ca109491
      Peter Zijlstra committed
      Impact: cleanup, move all hrtimer processing into hardirq context
      
      This is an attempt at removing some of the hrtimer complexity by
      reducing the number of callback modes to 1.
      
      This means that all hrtimer callback functions will be run from HARD-irq
      context.
      
      I went through all the 30 odd hrtimer callback functions in the kernel
      and saw only one that I'm not quite sure of, which is the one in
      net/can/bcm.c - hence I'm CC-ing the folks responsible for that code.
      
      Furthermore, the hrtimer core now calls callbacks directly with IRQs
      disabled in case you try to enqueue an expired timer. If this timer is a
      periodic timer (which should use hrtimer_forward() to advance its time)
      then it might be possible to end up in an inf. recursive loop due to the
      fact that hrtimer_forward() doesn't round up to the next timer
      granularity, and therefore keeps on calling the callback - obviously
      this needs a fix.
      
      Aside from that, this seems to compile and actually boot on my dual core
      test box - although I'm sure there are some bugs in, me not hitting any
      makes me certain :-)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • sched: convert nohz_cpu_mask to cpumask_var_t. · 6a7b3dc3
      Rusty Russell committed
      Impact: (future) size reduction for large NR_CPUS.
      
      Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
      space for small nr_cpu_ids but big CONFIG_NR_CPUS.  cpumask_var_t
      is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  18. 11 November, 2008 1 commit
    • nohz: disable tick_nohz_kick_tick() for now · ae99286b
      Thomas Gleixner committed
      Impact: nohz powersavings and wakeup regression
      
      commit fb02fbc1 (NOHZ: restart tick
      device from irq_enter()) causes a serious wakeup regression.
      
      While the patch is correct it does not take into account that spurious
      wakeups happen on x86. A fix for this issue is available, but we just
      revert to the .27 behaviour and let long running softirqs screw
      themself.
      
      Disable it for now.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  19. 22 October, 2008 1 commit
    • NOHZ: fix thinko in the timer restart code path · c4bd822e
      Thomas Gleixner committed
      commit fb02fbc1 (NOHZ: restart tick
      device from irq_enter())
      
      solves the problem of stale jiffies when long running softirqs happen
      in a long idle sleep period, but it has a major thinko in it:
      
      When the interrupt which came in _is_ the timer interrupt which should
      expire ts->sched_timer then we cancel and rearm the timer _before_ it
      gets expired in hrtimer_interrupt() to the next period. That means the
      call back function is not called. This game can go on for ever :(
      
      Prevent this by making sure to only rearm the timer when the expiry
      time is more than one tick_period away. Otherwise keep it running as
      it is either already expired or will expire at the right point to
      update jiffies.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Venkatesch Pallipadi <venkatesh.pallipadi@intel.com>
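The guard described in this message reduces to a single comparison, sketched here as a pure function. The name `should_rearm_tick()` and the nanosecond-integer parameters are assumptions for illustration. The kernel works with its own time types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether the tick timer should be cancelled and rearmed from the
 * interrupt entry path. Rearm only when the programmed expiry lies more
 * than one tick period in the future. A timer that has already expired,
 * or that fires within the next period, is left alone so that its
 * callback still runs and jiffies get updated.
 */
static bool should_rearm_tick(int64_t expires_ns, int64_t now_ns,
                              int64_t tick_period_ns)
{
    assert(tick_period_ns > 0);
    return (expires_ns - now_ns) > tick_period_ns;
}
```

If the interrupt being handled is the very timer interrupt that should expire the tick timer, the expiry is at or before the current time, the function returns false, and the cancel-and-rearm cycle the message warns about cannot start.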