1. 11 Apr 2013, 1 commit
  2. 09 Apr 2013, 2 commits
    • hrtimer: Fix ktime_add_ns() overflow on 32bit architectures · 51fd36f3
      Committed by David Engraf
      One can trigger an overflow when using ktime_add_ns() on a 32bit
      architecture not supporting CONFIG_KTIME_SCALAR.
      
      When passing a very high value for u64 nsec, e.g. 7881299347898368000,
      the do_div() function converts this value to seconds (7881299347), which
      is still too high to pass to ktime_set() as a long. The result
      is a negative value.
      
      The problem on my system occurs in tick-sched.c,
      tick_nohz_stop_sched_tick(), when time_delta is set to
      timekeeping_max_deferment(). The check time_delta < KTIME_MAX passes,
      so ktime_add_ns() is called with a too-large value, resulting in
      a negative expiry value. This leads to an endless loop in the ticker code:
      
      time_delta: 7881299347898368000
      expires = ktime_add_ns(last_update, time_delta)
      expires: negative value
      
      This fix caps the value to KTIME_MAX.
      
      This error doesn't occur on 64bit architectures or on architectures
      supporting CONFIG_KTIME_SCALAR (e.g. ARM, x86-32).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: David Engraf <david.engraf@sysgo.com>
      [jstultz: Minor tweaks to commit message & header]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
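
      A small userspace model of the failure and the cap (a sketch only,
      assuming a 32bit long as on the affected architectures; the helper
      name is illustrative, not the kernel's):

      #include <stdio.h>
      #include <stdint.h>

      #define NSEC_PER_SEC 1000000000ULL

      /* Model of the fix: on 32bit, ktime_set() takes the seconds as a
       * 32bit long, so cap larger values before the truncating conversion
       * can wrap them negative (two's complement assumed). */
      static int32_t secs_capped(uint64_t nsec)
      {
              uint64_t secs = nsec / NSEC_PER_SEC;

              if (secs > INT32_MAX)           /* the added overflow check */
                      secs = INT32_MAX;
              return (int32_t)secs;
      }

      int main(void)
      {
              uint64_t nsec = 7881299347898368000ULL; /* value from the report */

              printf("uncapped: %d\n", (int32_t)(nsec / NSEC_PER_SEC)); /* negative */
              printf("capped:   %d\n", secs_capped(nsec));              /* INT32_MAX */
              return 0;
      }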
    • hrtimer: Add expiry time overflow check in hrtimer_interrupt · 8f294b5a
      Committed by Prarit Bhargava
      The settimeofday01 test in the LTP testsuite effectively does
      
              gettimeofday(current time);
              settimeofday(Jan 1, 1970 + 100 seconds);
              settimeofday(current time);
      
      This test causes a stack trace to be displayed on the console while the
      time of day is being set to Jan 1, 1970 + 100 seconds:
      
      [  131.066751] ------------[ cut here ]------------
      [  131.096448] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x135/0x140()
      [  131.104935] Hardware name: Dinar
      [  131.108150] Modules linked in: sg nfsv3 nfs_acl nfsv4 auth_rpcgss nfs dns_resolver fscache lockd sunrpc nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables kvm_amd kvm sp5100_tco bnx2 i2c_piix4 crc32c_intel k10temp fam15h_power ghash_clmulni_intel amd64_edac_mod pcspkr serio_raw edac_mce_amd edac_core microcode xfs libcrc32c sr_mod sd_mod cdrom ata_generic crc_t10dif pata_acpi radeon i2c_algo_bit drm_kms_helper ttm drm ahci pata_atiixp libahci libata usb_storage i2c_core dm_mirror dm_region_hash dm_log dm_mod
      [  131.176784] Pid: 0, comm: swapper/28 Not tainted 3.8.0+ #6
      [  131.182248] Call Trace:
      [  131.184684]  <IRQ>  [<ffffffff810612af>] warn_slowpath_common+0x7f/0xc0
      [  131.191312]  [<ffffffff8106130a>] warn_slowpath_null+0x1a/0x20
      [  131.197131]  [<ffffffff810b9fd5>] clockevents_program_event+0x135/0x140
      [  131.203721]  [<ffffffff810bb584>] tick_program_event+0x24/0x30
      [  131.209534]  [<ffffffff81089ab1>] hrtimer_interrupt+0x131/0x230
      [  131.215437]  [<ffffffff814b9600>] ? cpufreq_p4_target+0x130/0x130
      [  131.221509]  [<ffffffff81619119>] smp_apic_timer_interrupt+0x69/0x99
      [  131.227839]  [<ffffffff8161805d>] apic_timer_interrupt+0x6d/0x80
      [  131.233816]  <EOI>  [<ffffffff81099745>] ? sched_clock_cpu+0xc5/0x120
      [  131.240267]  [<ffffffff814b9ff0>] ? cpuidle_wrap_enter+0x50/0xa0
      [  131.246252]  [<ffffffff814b9fe9>] ? cpuidle_wrap_enter+0x49/0xa0
      [  131.252238]  [<ffffffff814ba050>] cpuidle_enter_tk+0x10/0x20
      [  131.257877]  [<ffffffff814b9c89>] cpuidle_idle_call+0xa9/0x260
      [  131.263692]  [<ffffffff8101c42f>] cpu_idle+0xaf/0x120
      [  131.268727]  [<ffffffff815f8971>] start_secondary+0x255/0x257
      [  131.274449] ---[ end trace 1151a50552231615 ]---
      
      When we change the system time to a low value like this,
      timekeeper->offs_real becomes negative.
      
      It seems that the WARN occurs because an hrtimer is started in the window
      between the release of the timekeeper lock and the IPI call (via
      on_each_cpu) in clock_was_set() in the do_settimeofday() code.  The end
      result is that a REALTIME_CLOCK timer is added with softexpires = expires =
      KTIME_MAX.  hrtimer_interrupt() then fires and the loop at
      kernel/hrtimer.c:1289 is executed.  In this loop the code subtracts the
      clock base's offset (which was set to timekeeper->offs_real in
      do_settimeofday()) from the current hrtimer_cpu_base->expiry value (which
      was KTIME_MAX):
      
      	KTIME_MAX - (a negative value) = overflow
      
      A simple overflow check resolves this problem.  Using KTIME_MAX
      instead of the overflowed value results in the hrtimer function being
      run, and the timer being reprogrammed after that.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      [jstultz: Tweaked commit subject]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
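
      A tiny userspace model of the overflow and the clamp (a sketch; the
      kernel does the equivalent subtraction with ktime_sub() inside the
      hrtimer_interrupt() expiry loop):

      #include <stdio.h>
      #include <stdint.h>

      #define KTIME_MAX INT64_MAX

      /* KTIME_MAX minus a negative offset wraps negative; the added check
       * clamps the wrapped value back to KTIME_MAX, so the timer is simply
       * treated as "not yet due" and reprogrammed later. */
      static int64_t expires_with_guard(int64_t soft_expires, int64_t offset)
      {
              /* subtract in unsigned space, then reinterpret (two's
               * complement assumed, as on the kernel's targets) */
              int64_t expires =
                      (int64_t)((uint64_t)soft_expires - (uint64_t)offset);

              if (expires < 0)                /* the overflow check */
                      expires = KTIME_MAX;
              return expires;
      }

      int main(void)
      {
              /* offs_real goes negative when the clock is set back to 1970 */
              int64_t offs_real = -1365000000000000000LL;

              printf("%lld\n",
                     (long long)expires_with_guard(KTIME_MAX, offs_real));
              return 0;
      }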
  3. 05 Apr 2013, 12 commits
  4. 26 Mar 2013, 1 commit
  5. 25 Mar 2013, 2 commits
  6. 23 Mar 2013, 7 commits
  7. 16 Mar 2013, 3 commits
    • timekeeping: utilize the suspend-nonstop clocksource to count suspended time · e445cf1c
      Committed by Feng Tang
      There are some new processors whose TSC clocksource won't stop during
      suspend. Currently, after the system resumes, the kernel uses the
      persistent clock or the RTC to compensate for the sleep time, but with
      these nonstop clocksources we can skip that special compensation from
      external sources and just use the current clocksource to recount time.
      
      This can solve time drift bugs caused by not-so-accurate or
      error-prone RTC devices.
      
      The current way to count suspended time is to try the persistent clock
      first and fall back to the RTC if the persistent clock can't be used.
      This patch changes the trying order to:
      	suspend-nonstop clocksource -> persistent clock -> RTC
      
      When counting the sleep time with a nonstop clocksource, an accurate
      method suggested by Jason Gunthorpe is used to cover very large cycle
      deltas, as modeled below.
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      [jstultz: Small optimization, avoiding re-reading the clocksource]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
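
      The overflow-safe conversion is the interesting part: a plain
      delta * mult can exceed 64 bits when the machine slept for a long
      time. A userspace model of that arithmetic (the mult/shift values
      are illustrative only):

      #include <stdio.h>
      #include <stdint.h>

      /* Convert a possibly huge cycle delta to nanoseconds without letting
       * delta * mult overflow: peel off the largest safe chunks first. */
      static uint64_t cyc2ns_large(uint64_t delta, uint32_t mult, uint32_t shift)
      {
              uint64_t max = UINT64_MAX / mult; /* largest safe multiplicand */
              uint64_t nsec = 0;

              if (delta > max) {
                      uint64_t num = delta / max;

                      nsec = ((max * mult) >> shift) * num;
                      delta -= num * max;
              }
              nsec += (delta * mult) >> shift;
              return nsec;
      }

      int main(void)
      {
              /* mult/shift here approximate a 1 GHz clocksource */
              printf("%llu ns\n", (unsigned long long)
                     cyc2ns_large(1ULL << 62, 1U << 22, 22));
              return 0;
      }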
    • timekeeping: Use inject_offset in warp_clock · 7859e404
      Committed by John Stultz
      When warping the clock (from a local time RTC), use
      timekeeping_inject_offset() to atomically add the offset.
      
      This avoids any minor time error caused by the delay between reading
      the time and then setting the adjusted time.
      Signed-off-by: John Stultz <john.stultz@linaro.org>
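
      The shape of the change, sketched in kernel style (not necessarily
      the verbatim upstream hunk):

      struct timespec adjust;

      /* before: read, adjust, write back; any delay between the read and
       * the write becomes a time error
       *
       *   adjust = current_kernel_time();
       *   adjust.tv_sec += sys_tz.tz_minuteswest * 60;
       *   do_settimeofday(&adjust);
       *
       * after: hand only the delta to the timekeeping core, which applies
       * it atomically under its own lock */
      adjust.tv_sec = sys_tz.tz_minuteswest * 60;
      adjust.tv_nsec = 0;
      timekeeping_inject_offset(&adjust);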
    • timekeeping: Avoid adjust kernel time once hwclock kept in UTC time · c30bd099
      Committed by Dong Zhu
      If the hardware clock is kept in local time, the kernel adjusts the
      time to be UTC time. But if the hardware clock is kept in UTC time,
      the system makes a dummy settimeofday call first (sys_tz.tz_minuteswest
      = 0) to make sure the time is not shifted, so at that point there is no
      need to set the kernel time when sys_tz.tz_minuteswest is zero.
      Signed-off-by: Dong Zhu <bluezhudong@gmail.com>
      [jstultz: Updated to merge with conflicting changes]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
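
      In do_sys_settimeofday() terms the idea amounts to the following
      sketch (not the verbatim hunk):

      if (firsttime) {
              firsttime = 0;
              /* a hwclock kept in UTC announces itself with a dummy
               * settimeofday(tz_minuteswest = 0); there is nothing to
               * warp in that case */
              if (!tv && sys_tz.tz_minuteswest)
                      warp_clock();
      }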
  8. 13 Mar 2013, 3 commits
    • tick: Provide a check for a forced broadcast pending · eaa907c5
      Committed by Thomas Gleixner
      On the CPU which gets woken along with the target CPU of the broadcast
      the following happens:
      
        deep_idle()
      			<-- spurious wakeup
        broadcast_exit()
          set forced bit
        
        enable interrupts
          
      			<-- Nothing happens
      
        disable interrupts
      
        broadcast_enter()
      			<-- Here we observe the forced bit is set
        deep_idle()
      
      Now after that the target CPU of the broadcast runs the broadcast
      handler and finds the other CPU in both the broadcast and the forced
      mask, sends the IPI and stuff gets back to normal.
      
      So it's not actually harmful, just more evidence for the theory that
      hardware designers have access to very special drug supplies.
      
      Now there is no point in going back to deep idle just to wake up again
      right away via an IPI. Provide a check which allows the idle code to
      avoid the deep idle transition.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: LAK <linux-arm-kernel@lists.infradead.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Arjan van de Veen <arjan@infradead.org>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Jason Liu <liu.h.jason@gmail.com>
      Link: http://lkml.kernel.org/r/20130306111537.565418308@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
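
      The check itself is essentially a one-liner that the idle code can
      poll before committing to deep idle (a sketch of the idea):

      /* true when this cpu sits in the forced-broadcast mask, i.e. the
       * broadcast IPI is already on its way and entering deep idle would
       * only produce another spurious round trip */
      int tick_check_broadcast_expired(void)
      {
              return cpumask_test_cpu(smp_processor_id(),
                                      tick_broadcast_force_mask);
      }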
    • tick: Handle broadcast wakeup of multiple cpus · 989dcb64
      Committed by Thomas Gleixner
      Some brilliant hardware implementations wake multiple cores when the
      broadcast timer fires. This leads to the following interesting
      problem:
      
      CPU0				CPU1
      wakeup from idle		wakeup from idle
      
      leave broadcast mode		leave broadcast mode
       restart per cpu timer		 restart per cpu timer
       	     	 		go back to idle
      handle broadcast
       (empty mask)			
      				enter broadcast mode
      				program broadcast device
      enter broadcast mode
      program broadcast device
      
      So what happens is that due to the forced reprogramming of the cpu
      local timer, we need to set an event in the future. Now if we manage to
      go back to idle before the timer fires, we switch off the timer and
      arm the broadcast device with an already expired time (covered by
      forced mode). So in the worst case we repeat the above ping pong
      forever.
      					
      Unfortunately we have no information about what caused the wakeup, but
      we can check the current time against the expiry time of the local cpu.
      If the local event is already in the past, we know that the broadcast
      timer is about to fire and send an IPI. So we mark ourselves as an IPI
      target even though we left broadcast mode, and avoid reprogramming the
      local cpu timer.
      
      This still leaves the possibility that a CPU which is not handling the
      broadcast interrupt is going to reach idle again before the IPI
      arrives. This can't be solved in the core code and will be handled in
      follow up patches.
      Reported-by: Jason Liu <liu.h.jason@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: LAK <linux-arm-kernel@lists.infradead.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Arjan van de Veen <arjan@infradead.org>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Link: http://lkml.kernel.org/r/20130306111537.492045206@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
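
      The exit-path logic amounts to the following sketch (not the
      verbatim hunk):

      /* leaving broadcast mode: if our local event is already in the
       * past, the broadcast timer is about to fire and send an IPI, so
       * mark this cpu as an IPI target instead of reprogramming the
       * local timer with an expired event */
      now = ktime_get();
      if (dev->next_event.tv64 <= now.tv64)
              cpumask_set_cpu(cpu, tick_broadcast_force_mask);
      else
              clockevents_program_event(dev, dev->next_event, 1);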
    • tick: Avoid programming the local cpu timer if broadcast pending · 26517f3e
      Committed by Thomas Gleixner
      If the local cpu timer stops in deep idle, we arm the broadcast device
      and get woken by an IPI. Now when we return from deep idle we reenable
      the local cpu timer unconditionally before handling the IPI. But
      that's a pointless exercise: the timer is already expired and the IPI
      is on the way. And it's an expensive exercise as we use the forced
      reprogramming mode so that we do not lose a timer event. This forced
      reprogramming loops at least once through the retry path.
      
      To avoid this reprogramming, we mark the cpu in a pending bit mask
      before we send the IPI. Now when the IPI target cpu wakes up, it will
      see the pending bit set and skip the reprogramming. The reprogramming
      of the cpu local timer will happen in the IPI handler which runs the
      cpu local timer interrupt function.
      Reported-by: Jason Liu <liu.h.jason@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: LAK <linux-arm-kernel@lists.infradead.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Arjan van de Veen <arjan@infradead.org>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Link: http://lkml.kernel.org/r/20130306111537.431082074@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
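
      The handshake, sketched (not the verbatim hunk):

      /* broadcast handler, for each cpu it is about to kick: */
      cpumask_set_cpu(cpu, tick_broadcast_pending_mask);
      /* ... send the IPI ... */

      /* woken cpu, on leaving broadcast mode: */
      if (cpumask_test_and_clear_cpu(cpu, tick_broadcast_pending_mask))
              goto out;       /* IPI in flight; its handler runs the cpu
                               * local timer interrupt code, so the forced
                               * reprogram can be skipped */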
  9. 09 Mar 2013, 1 commit
  10. 07 Mar 2013, 5 commits
  11. 03 Mar 2013, 2 commits
    • fix compat_sys_rt_sigprocmask() · db61ec29
      Committed by Al Viro
      Converting bitmask to 32bit granularity is fine, but we'd better
      _do_ something with the result.  Such as "copy it to userland"...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
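
      The missing step amounts to the following sketch (helper names as in
      the compat code of that era):

      if (oset) {
              compat_sigset_t old32;

              sigset_to_compat(&old32, &old_set);
              /* the conversion alone changes nothing the caller can see;
               * the result has to reach userland */
              if (copy_to_user(oset, &old32, sizeof(compat_sigset_t)))
                      return -EFAULT;
      }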
    • trace/ring_buffer: handle 64bit aligned structs · 649508f6
      Committed by James Hogan
      Some 32 bit architectures require 64 bit values to be aligned (for
      example Meta, which has 64 bit read/write instructions). These require 8
      byte alignment of event data too, so use
      !CONFIG_HAVE_64BIT_ALIGNED_ACCESS instead of !CONFIG_64BIT ||
      CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to decide alignment, and align
      buffer_data_page::data accordingly.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org> (previous version subtly different)
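
      The alignment choice, sketched from the ring buffer code (not the
      verbatim hunk):

      #ifdef CONFIG_HAVE_64BIT_ALIGNED_ACCESS
      /* the architecture needs 64 bit values naturally aligned, so pad
       * event data out to 8 bytes */
      # define RB_FORCE_8BYTE_ALIGNMENT       1
      # define RB_ARCH_ALIGNMENT              8U
      #else
      # define RB_FORCE_8BYTE_ALIGNMENT       0
      # define RB_ARCH_ALIGNMENT              RB_ALIGNMENT
      #endif

      with buffer_data_page::data then declared __aligned(RB_ARCH_ALIGNMENT).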
  12. 02 Mar 2013, 1 commit