1. 31 Jan, 2009 1 commit
    • hrtimers: allow the hot-unplugging of all cpus · 94df7de0
      Sebastien Dugue committed
      Impact: fix CPU hotplug hang on Power6 testbox
      
      On architectures that support offlining all cpus (at least powerpc/pseries),
      hot-unplugging the tick_do_timer_cpu can result in a system hang.
      
      This comes from the fact that if the cpu going down happens to be the
      cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
      cpu is dead (via the CPU_DEAD notification), we're left without ticks,
      jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
      That's particularly the case for the cpu looping in __cpu_die() waiting
      for the dying cpu to be dead.
      
      This patch addresses this by having the tick_do_timer_cpu handover happen
      earlier during the CPU_DYING notification. For this, a new clockevent
      notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
      in hrtimer_cpu_notify().
      Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
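
The ordering problem described in 94df7de0 can be illustrated with a small user-space model. This is only a sketch under assumptions: `struct sketch_tick_model` and all `sketch_*` helpers are hypothetical names invented for illustration, not kernel code, and the real handover runs through the clockevents notification chain rather than direct calls.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 4
#define SKETCH_NO_CPU  (-1)   /* hypothetical "nobody ticks" marker */

/* Toy model: which CPUs still run, and which one updates jiffies. */
struct sketch_tick_model {
    bool running[SKETCH_NR_CPUS];
    int  do_timer_cpu;        /* stands in for tick_do_timer_cpu */
};

/* Find a running CPU other than 'dying', or SKETCH_NO_CPU if none. */
static int sketch_pick_other_cpu(const struct sketch_tick_model *m, int dying)
{
    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        if (cpu != dying && m->running[cpu])
            return cpu;
    return SKETCH_NO_CPU;
}

/* Models the early (CPU_DYING time) handover of the tick duty. */
static void sketch_handover_do_timer(struct sketch_tick_model *m, int dying)
{
    if (m->do_timer_cpu == dying)
        m->do_timer_cpu = sketch_pick_other_cpu(m, dying);
}

/* Models the dying CPU going away: it no longer takes timer ticks. */
static void sketch_cpu_stops(struct sketch_tick_model *m, int cpu)
{
    m->running[cpu] = false;
}

/* jiffies only advance if the duty holder is a CPU that still runs. */
static bool sketch_jiffies_advance(const struct sketch_tick_model *m)
{
    return m->do_timer_cpu != SKETCH_NO_CPU && m->running[m->do_timer_cpu];
}
```

If the CPU stops before the handover, `sketch_jiffies_advance()` is false for the whole window in which another CPU waits on a timer for the death to complete. With the handover first, a running CPU owns the duty throughout.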
  2. 01 Jan, 2009 1 commit
    • cpumask: convert kernel time functions · 6b954823
      Rusty Russell committed
      Impact: Use new APIs
      
      Convert kernel/time functions to use struct cpumask *.
      
      Note the ugly bitmap declarations in tick-broadcast.c.  These should
      be cpumask_var_t, but there was no obvious initialization function to
      put the alloc_cpumask_var() calls in.  This was safe.
      
      (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK,
      so we use a bitmap here to show we really mean it).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Mike Travis <travis@sgi.com>
  3. 13 Dec, 2008 2 commits
  4. 23 Sep, 2008 2 commits
    • clockevents: prevent mode mismatch on cpu online · 27ce4cb4
      Thomas Gleixner committed
      Impact: timer hang on CPU online observed on AMD C1E systems
      
      When a CPU is brought online then the broadcast machinery can
      be in the one shot state already. Check this and set up the timer
      device of the new CPU in one shot mode so the broadcast code
      can pick up the next_event value correctly.
      
      Another AMD C1E oddity, as we switch to broadcast immediately and
      not after the full bring up via the ACPI cpu idle code.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • clockevents: prevent cpu online to interfere with nohz · 6441402b
      Thomas Gleixner committed
      Impact: rare hang which can be triggered on CPU online.
      
      tick_do_timer_cpu keeps track of the CPU which updates jiffies
      via do_timer. The value -1 is used to signal that currently no
      CPU is doing this. There are two cases where the variable can
      have this state:
      
       boot:
          necessary for systems where the boot cpu id can be != 0
      
       nohz long idle sleep:
          When the CPU which did the jiffies update last goes into
          a long idle sleep it drops the update jiffies duty so
          another CPU which is not idle can pick it up and keep
          jiffies going.
      
      Using the same value for both situations is wrong, as the CPU online
      code can see the -1 state when the timer of the newly onlined CPU is
      set up. The setup for a newly onlined CPU goes through periodic mode
      and can pick up the do_timer duty without being aware of the nohz /
      highres mode of the already running system.
      
      Use two separate states and make them constants to avoid magic
      number confusion.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
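
A minimal sketch of why two separate states matter, as described in 6441402b. The constant names and values below follow what I recall from the kernel's tick code and should be treated as assumptions here. The two `sketch_*` functions are hypothetical and only model the decision taken when the periodic tick is set up for a newly onlined CPU.

```c
/* Assumed names/values; the commit only says "two separate constants". */
#define TICK_DO_TIMER_NONE (-1)  /* nohz: duty dropped, any busy CPU may take it */
#define TICK_DO_TIMER_BOOT (-2)  /* boot: nobody has been assigned yet */

/*
 * Old behaviour (model): a single -1 state means "unassigned", so a CPU
 * coming online grabs the duty even if the system is in nohz/highres mode.
 * Returns the new duty holder.
 */
static int sketch_online_setup_old(int do_timer_cpu, int new_cpu)
{
    if (do_timer_cpu == -1)
        return new_cpu;
    return do_timer_cpu;
}

/*
 * New behaviour (model): only the boot state lets the periodic setup of a
 * CPU take over the duty; the nohz "none" state is left alone.
 */
static int sketch_online_setup_new(int do_timer_cpu, int new_cpu)
{
    if (do_timer_cpu == TICK_DO_TIMER_BOOT)
        return new_cpu;
    return do_timer_cpu;
}
```

In the old model a CPU that comes online while the duty is dropped for a long idle sleep silently becomes the jiffies updater from periodic-mode setup. In the new model it does so only during boot.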
  5. 17 Sep, 2008 1 commit
    • clockevents: make device shutdown robust · 2344abbc
      Thomas Gleixner committed
      The device shutdown does not clean up the next_event variable of the
      clock event device. So when the device is reactivated, the possibly
      stale next_event value can prevent the device from being reprogrammed,
      as it claims to be waiting on an event already.
      
      This is the root cause of the resurfacing suspend/resume problem,
      where systems need a key press to come back to life.
      
      Fix this by setting next_event to KTIME_MAX when the device is shut
      down. Use a separate function for shutdown which takes care of that
      and only keep the direct set mode call in the broadcast code, where we
      cannot touch the next_event value.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
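
The stale next_event effect from 2344abbc can be sketched in user space. Everything here is a simplified model under assumptions: `struct sketch_clock_dev`, `SKETCH_KTIME_MAX` and the `sketch_*` functions are hypothetical stand-ins, and the "skip if an earlier event is already pending" check is only an illustration of how a stale value can block reprogramming.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_KTIME_MAX INT64_MAX   /* stands in for KTIME_MAX */

enum sketch_mode { SKETCH_MODE_SHUTDOWN, SKETCH_MODE_ONESHOT };

struct sketch_clock_dev {
    enum sketch_mode mode;
    int64_t next_event;              /* expiry the device claims to wait on */
};

/* Old shutdown path (model): only the mode changes, next_event goes stale. */
static void sketch_set_mode_only(struct sketch_clock_dev *dev)
{
    dev->mode = SKETCH_MODE_SHUTDOWN;
}

/* Fixed shutdown path (model): also forget the pending expiry. */
static void sketch_shutdown(struct sketch_clock_dev *dev)
{
    dev->mode = SKETCH_MODE_SHUTDOWN;
    dev->next_event = SKETCH_KTIME_MAX;
}

/*
 * Program the device only if the new expiry is earlier than the one it
 * already waits on. Returns true if the device was (re)programmed.
 */
static bool sketch_program_if_earlier(struct sketch_clock_dev *dev,
                                      int64_t expires)
{
    if (expires >= dev->next_event)
        return false;                /* device "already waits" on something */
    dev->mode = SKETCH_MODE_ONESHOT;
    dev->next_event = expires;
    return true;
}
```

After the old shutdown path, a request for a later expiry than the stale value is dropped, so no interrupt ever arrives. After the fixed path, any finite expiry is earlier than `SKETCH_KTIME_MAX` and the device gets programmed.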
  6. 05 Sep, 2008 1 commit
    • clockevents: prevent clockevent event_handler ending up handler_noop · 7c1e7689
      Venkatesh Pallipadi committed
      There is an ordering-related problem in the clockevents code, due to which
      clockevents_register_device() called after the tickless/highres switch
      will not work. The new clockevent ends up with clockevents_handle_noop as
      its event handler, resulting in no timer activity.
      
      The problematic path seems to be:
      
      * old device already has hrtimer_interrupt as the event_handler
      * new clockevent device registers with a higher rating
      * tick_check_new_device() is called
        * clockevents_exchange_device() gets called
          * old->event_handler is set to clockevents_handle_noop
        * tick_setup_device() is called for the new device
          * which sets new->event_handler using the old->event_handler which is noop.
      
      Change the ordering so that the new device inherits the proper handler.
      
      This is not an issue in the normal case, as most likely all the clockevent
      devices are set up before the highres switch. But it can potentially affect
      some corner cases where HPET force detect happens after the highres switch.
      This was a problem with the HPET in MSI mode code that we have been
      experimenting with.
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
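
The handler inheritance problem in 7c1e7689 comes down to reading a value after it has been overwritten. The sketch below models only that ordering. The `sketch_*` names are hypothetical, and the actual kernel fix may be structured differently from `sketch_replace_fixed()`.

```c
#include <stddef.h>

typedef void (*sketch_handler_t)(void);

/* Stand-ins for hrtimer_interrupt and clockevents_handle_noop. */
static void sketch_hrtimer_interrupt(void) { }
static void sketch_handle_noop(void) { }

struct sketch_evt_dev {
    sketch_handler_t event_handler;
};

/*
 * Buggy order (model): the old device is neutralized first, then the new
 * device copies the handler from it and inherits the noop handler.
 */
static void sketch_replace_buggy(struct sketch_evt_dev *old_dev,
                                 struct sketch_evt_dev *new_dev)
{
    old_dev->event_handler = sketch_handle_noop;     /* exchange step */
    new_dev->event_handler = old_dev->event_handler; /* setup step */
}

/*
 * Fixed order (model): remember the live handler before neutralizing the
 * old device, so the new device inherits the working one.
 */
static void sketch_replace_fixed(struct sketch_evt_dev *old_dev,
                                 struct sketch_evt_dev *new_dev)
{
    sketch_handler_t handler = old_dev->event_handler;

    old_dev->event_handler = sketch_handle_noop;
    new_dev->event_handler = handler;
}
```

With the buggy order the higher-rated device ends up with the noop handler and produces no timer activity. With the fixed order it keeps the handler the system had switched to.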
  7. 26 Jul, 2008 1 commit
  8. 19 Jul, 2008 1 commit
  9. 17 Apr, 2008 1 commit
    • [S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h · d7b90689
      Russell King committed
      > Generic code is not supposed to include irq.h. Replace this include
      > by linux/hardirq.h instead and add/replace an include of linux/irq.h
      > in asm header files where necessary.
      > This change should only matter for architectures that make use of
      > GENERIC_CLOCKEVENTS.
      > Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
      >
      > I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
      > This patch fixes also build breakages caused by the include replacement in
      > tick-common.h.
      
      I generally dislike adding optional linux/* includes in asm/* includes -
      I'm nervous about this causing include loops.
      
      However, there's a separate point to be discussed here.
      
      That is, what interfaces are expected of every architecture in the kernel.
      If generic code wants to be able to set the affinity of interrupts, then
      that needs to become part of the interfaces listed in linux/interrupt.h
      rather than linux/irq.h.
      
      So what I suggest is this approach instead (against Linus' tree of a
      couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
      to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
      and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
      a rarely used include since not much touches the stacked parent context
      registers.)
      
      Build tested on ARM PXA family kernels and ARM's Realview platform
      kernels which both use genirq.
      
      [ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  10. 15 Oct, 2007 1 commit
    • clockevents: introduce force broadcast notifier · 1595f452
      Thomas Gleixner committed
      The 64bit SMP bootup is slightly different to the 32bit one. It enables
      the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E
      systems have the C1E feature flag only set in the secondary CPU. Due to
      the early enable of the boot CPU local APIC timer the APIC timer is
      registered as a fully functional device. When we detect the wreckage during
      the bringup of the secondary CPU, we need to force the boot CPU into
      broadcast mode. 
      
      Add a new notifier reason and implement the force broadcast in the clock
      events layer.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  11. 13 Oct, 2007 1 commit
  12. 22 Jul, 2007 1 commit
  13. 09 May, 2007 1 commit
  14. 17 Mar, 2007 1 commit
    • [PATCH] clockevents: Fix suspend/resume to disk hangs · cd05a1f8
      Thomas Gleixner committed
      I finally found a dual core box which survives suspend/resume without
      crashing in the middle of nowhere. Sigh, I never figured out from the
      code and the bug reports what's going on.
      
      The observed hangs are caused by a stale state transition of the clock
      event devices, which keeps the RCU synchronization from completing
      when the non-boot CPU is brought back up.
      
      Suspend/resume in oneshot mode needs the same care as periodic mode
      does during suspend to RAM. My assumption that the state
      transitions during the different shutdown/bringups of s2disk would go
      through the periodic boot phase and then switch over to highres or
      nohz mode was simply wrong.
      
      Add the appropriate suspend / resume handling for the non-periodic
      modes.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 07 Mar, 2007 1 commit
  16. 27 Feb, 2007 1 commit
  17. 17 Feb, 2007 4 commits
    • [PATCH] Add debugging feature /proc/timer_list · 289f480a
      Ingo Molnar committed
      Add /proc/timer_list, which prints all currently pending (high-res) timers,
      all clock-event sources and their parameters in a human-readable form.
      
      Sample output:
      
      Timer List Version: v0.1
      HRTIMER_MAX_CLOCK_BASES: 2
      now at 4246046273872 nsecs
      
      cpu: 0
       clock 0:
        .index:      0
        .resolution: 1 nsecs
        .get_time:   ktime_get_real
        .offset:     1273998312645738432 nsecs
      active timers:
       clock 1:
        .index:      1
        .resolution: 1 nsecs
        .get_time:   ktime_get
        .offset:     0 nsecs
      active timers:
       #0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0
       # expires at 4246432689566 nsecs [in 386415694 nsecs]
       #1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, pcscd/2050
       # expires at 4247018194689 nsecs [in 971920817 nsecs]
       #2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, irqbalance/1909
       # expires at 4247351358392 nsecs [in 1305084520 nsecs]
       #3: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, crond/2157
       # expires at 4249097614968 nsecs [in 3051341096 nsecs]
       #4: <f5a90ec8>, it_real_fn, do_setitimer, syslogd/1888
       # expires at 4251329900926 nsecs [in 5283627054 nsecs]
        .expires_next   : 4246432689566 nsecs
        .hres_active    : 1
        .check_clocks   : 0
        .nr_events      : 31306
        .idle_tick      : 4246020791890 nsecs
        .tick_stopped   : 1
        .idle_jiffies   : 986504
        .idle_calls     : 40700
        .idle_sleeps    : 36014
        .idle_entrytime : 4246019418883 nsecs
        .idle_sleeptime : 4178181972709 nsecs
      
      cpu: 1
       clock 0:
        .index:      0
        .resolution: 1 nsecs
        .get_time:   ktime_get_real
        .offset:     1273998312645738432 nsecs
      active timers:
       clock 1:
        .index:      1
        .resolution: 1 nsecs
        .get_time:   ktime_get
        .offset:     0 nsecs
      active timers:
       #0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0
       # expires at 4246050084568 nsecs [in 3810696 nsecs]
       #1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, atd/2227
       # expires at 4261010635003 nsecs [in 14964361131 nsecs]
       #2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, smartd/2332
       # expires at 5469485798970 nsecs [in 1223439525098 nsecs]
        .expires_next   : 4246050084568 nsecs
        .hres_active    : 1
        .check_clocks   : 0
        .nr_events      : 24043
        .idle_tick      : 4246046084568 nsecs
        .tick_stopped   : 0
        .idle_jiffies   : 986510
        .idle_calls     : 26360
        .idle_sleeps    : 22551
        .idle_entrytime : 4246043874339 nsecs
        .idle_sleeptime : 4170763761184 nsecs
      
      tick_broadcast_mask: 00000003
      event_broadcast_mask: 00000001
      
      CPU#0's local event device:
      
      Clock Event Device: lapic
       capabilities:   0000000e
       max_delta_ns:   807385544
       min_delta_ns:   1443
       mult:           44624025
       shift:          32
       set_next_event: lapic_next_event
       set_mode:       lapic_timer_setup
       event_handler:  hrtimer_interrupt
        .installed:  1
        .expires:    4246432689566 nsecs
      
      CPU#1's local event device:
      
      Clock Event Device: lapic
       capabilities:   0000000e
       max_delta_ns:   807385544
       min_delta_ns:   1443
       mult:           44624025
       shift:          32
       set_next_event: lapic_next_event
       set_mode:       lapic_timer_setup
       event_handler:  hrtimer_interrupt
        .installed:  1
        .expires:    4246050084568 nsecs
      
      Clock Event Device: hpet
       capabilities:   00000007
       max_delta_ns:   2147483647
       min_delta_ns:   3352
       mult:           61496110
       shift:          32
       set_next_event: hpet_next_event
       set_mode:       hpet_set_mode
       event_handler:  handle_nextevt_broadcast
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
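
The mult and shift fields in the sample output of 289f480a encode a fixed-point conversion from nanoseconds to device cycles. The sketch below assumes the usual clockevents scaling, cycles = (ns * mult) >> shift. The helper names are hypothetical and exist only to make the numbers in the sample output easier to read.

```c
#include <stdint.h>

/*
 * Convert a delta in nanoseconds to device cycles using the fixed-point
 * factor mult / 2^shift (assumed clockevents-style scaling).
 */
static uint64_t sketch_ns_to_cycles(uint64_t delta_ns, uint32_t mult,
                                    uint32_t shift)
{
    return (delta_ns * (uint64_t)mult) >> shift;
}

/* Device frequency in Hz implied by mult/shift: cycles per ns times 1e9. */
static double sketch_device_freq_hz(uint32_t mult, uint32_t shift)
{
    return (double)mult / (double)(1ULL << shift) * 1e9;
}
```

Fed with the lapic values from the sample (mult 44624025, shift 32), the implied frequency is about 10.39 MHz and max_delta_ns of 807385544 maps to a little under 2^23 cycles. The hpet values (mult 61496110, shift 32) imply about 14.318 MHz.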
    • [PATCH] tick-management: dyntick / highres functionality · 79bf2bb3
      Thomas Gleixner committed
      With Ingo Molnar <mingo@elte.hu>
      
      Add functions to provide dynamic ticks and high resolution timers.  The code
      which keeps track of jiffies and handles the long idle periods is shared
      between tick based and high resolution timer based dynticks.  The dyntick
      functionality can be disabled on the kernel command line.  Also provide the
      infrastructure to support high resolution timers.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [PATCH] tick-management: broadcast functionality · f8381cba
      Thomas Gleixner committed
      With Ingo Molnar <mingo@elte.hu>
      
      Add broadcast functionality, so per cpu clock event devices can be registered
      as dummy devices or switched from/to broadcast on demand.  The broadcast
      function distributes the events via the broadcast function of the clock event
      device.  This is primarily designed to replace switching the apic timer
      to / from IPI in power states where the apic stops.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [PATCH] tick-management: core functionality · 906568c9
      Thomas Gleixner committed
      With Ingo Molnar <mingo@elte.hu>
      
      The tick-management code is the first user of the clockevents layer.  It takes
      clock event devices from the clock events core and uses them to provide the
      periodic tick.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>