1. 24 7月, 2014 2 次提交
  2. 23 6月, 2014 6 次提交
  3. 05 6月, 2014 1 次提交
    • J
      timekeeping: use printk_deferred when holding timekeeping seqlock · 6d9bcb62
      John Stultz 提交于
      Jiri Bohac pointed out that there are rare but potential deadlock
      possibilities when calling printk while holding the timekeeping
      seqlock.
      
      This is due to printk() triggering console sem wakeup, which can
      cause scheduling code to trigger hrtimers which may try to read
      the time.
      
      Specifically, as Jiri pointed out, that path is:
        printk
          vprintk_emit
            console_unlock
              up(&console_sem)
                __up
      	    wake_up_process
      	      try_to_wake_up
      	        ttwu_do_activate
      		  ttwu_activate
      		    activate_task
      		      enqueue_task
      		        enqueue_task_fair
      			  hrtick_update
      			    hrtick_start_fair
      			      hrtick_start_fair
      			        get_time
      				  ktime_get
      				    --> endless loop on
      				    read_seqcount_retry(&timekeeper_seq, ...)
      
      This patch tries to avoid this issue by using printk_deferred (previously
      named printk_sched) which should defer printing via a irq_work_queue.
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Reported-by: NJiri Bohac <jbohac@suse.cz>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6d9bcb62
  4. 13 5月, 2014 2 次提交
  5. 23 4月, 2014 1 次提交
  6. 16 4月, 2014 3 次提交
  7. 08 4月, 2014 1 次提交
  8. 28 3月, 2014 1 次提交
  9. 26 3月, 2014 2 次提交
  10. 03 3月, 2014 1 次提交
  11. 20 2月, 2014 1 次提交
    • S
      sched_clock: Prevent callers from seeing half-updated data · 5ae8aabe
      Stephen Boyd 提交于
      The generic sched_clock registration function was previously
      done lockless, due to the fact that it was expected to be called
      only once. However, now there are systems that may register
      multiple sched_clock sources, for which the lack of locking has
      casued problems:
      
      If two sched_clock sources are registered we may end up in a
      situation where a call to sched_clock() may be accessing the
      epoch cycle count for the old counter and the cycle count for the
      new counter. This can lead to confusing results where
      sched_clock() values jump and then are reset to 0 (due to the way
      the registration function forces the epoch_ns to be 0).
      
      Fix this by reorganizing the registration function to hold the
      seqlock for as short a time as possible while we update the
      clock_data structure for a new counter. We also put any
      accumulated time into epoch_ns instead of resetting the time to
      0 so that the clock doesn't reset after each successful
      registration.
      
      [jstultz: Added extra context to the commit message]
      Reported-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Josh Cartwright <joshc@codeaurora.org>
      Link: http://lkml.kernel.org/r/1392662736-7803-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NJohn Stultz <john.stultz@linaro.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      5ae8aabe
  12. 15 2月, 2014 1 次提交
  13. 14 2月, 2014 1 次提交
    • T
      tick: Clear broadcast pending bit when switching to oneshot · dd5fd9b9
      Thomas Gleixner 提交于
      AMD systems which use the C1E workaround in the amd_e400_idle routine
      trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.
      
      The reason is that the idle routine of those AMD systems switches the
      cpu into forced broadcast mode early on before the newly brought up
      CPU can switch over to high resolution / NOHZ mode. The timer related
      CPU1 bringup looks like this:
      
        clockevent_register_device(local_apic);
        tick_setup(local_apic);
        ...
        idle()
      	tick_broadcast_on_off(FORCE);
      	tick_broadcast_oneshot_control(ENTER)
      	  cpumask_set(cpu, broadcast_oneshot_mask);
      	halt();
      
      Now the broadcast interrupt on CPU0 sets CPU1 in the
      broadcast_pending_mask and wakes CPU1. So CPU1 continues:
      
      	local_apic_timer_interrupt()
      	   tick_handle_periodic();
      	   softirq()
      	     tick_init_highres();
      	       cpumask_clr(cpu, broadcast_oneshot_mask);
      	
      	tick_broadcast_oneshot_control(ENTER)
      	   WARN_ON(cpumask_test(cpu, broadcast_pending_mask);
      
      So while we remove CPU1 from the broadcast_oneshot_mask when we switch
      over to highres mode, we do not clear the pending bit, which then
      triggers the warning when we go back to idle.
      
      The reason why this is only visible on C1E affected AMD systems is
      that the other machines enter the deep sleep states via
      acpi_idle/intel_idle and exit the broadcast mode before executing the
      remote triggered local_apic_timer_interrupt. So the pending bit is
      already cleared when the switch over to highres mode is clearing the
      oneshot mask.
      
      The solution is simple: Clear the pending bit together with the mask
      bit when we switch over to highres mode.
      
      Stanislaw came up independently with the same patch by enforcing the
      C1E workaround and debugging the fallout. I picked mine, because mine
      has a changelog :)
      Reported-by: Npoma <pomidorabelisima@gmail.com>
      Debugged-by: NStanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Olaf Hering <olaf@aepfle.de>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Justin M. Forbes <jforbes@redhat.com>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402111434180.21991@ionos.tec.linutronix.de
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      dd5fd9b9
  14. 09 2月, 2014 1 次提交
  15. 07 2月, 2014 6 次提交
  16. 06 2月, 2014 1 次提交
  17. 16 1月, 2014 3 次提交
  18. 13 1月, 2014 1 次提交
    • P
      sched/clock, x86: Use a static_key for sched_clock_stable · 35af99e6
      Peter Zijlstra 提交于
      In order to avoid the runtime condition and variable load turn
      sched_clock_stable into a static_key.
      
      Also provide a shorter implementation of local_clock() and
      cpu_clock(int) when sched_clock_stable==1.
      
                              MAINLINE   PRE       POST
      
          sched_clock_stable: 1          1         1
          (cold) sched_clock: 329841     221876    215295
          (cold) local_clock: 301773     234692    220773
          (warm) sched_clock: 38375      25602     25659
          (warm) local_clock: 100371     33265     27242
          (warm) rdtsc:       27340      24214     24208
          sched_clock_stable: 0          0         0
          (cold) sched_clock: 382634     235941    237019
          (cold) local_clock: 396890     297017    294819
          (warm) sched_clock: 38194      25233     25609
          (warm) local_clock: 143452     71234     71232
          (warm) rdtsc:       27345      24245     24243
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      35af99e6
  19. 12 1月, 2014 1 次提交
  20. 24 12月, 2013 4 次提交
    • J
      timekeeping: Remove comment that's mostly out of date · 38aef31c
      John Stultz 提交于
      Prior to 92bb1fcf (Only
      do nanosecond rounding on GENERIC_TIME_VSYSCALL_OLD
      systems), the comment here was accuate, but now we can
      mostly avoid the extra rounding which causes the unlikey
      to be actually likely here.
      
      So remove the out of date comment.
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      38aef31c
    • Y
      timekeeper: fix comment typo for tk_setup_internals() · d26e4fe0
      Yijing Wang 提交于
      Fix trivial comment typo for tk_setup_internals().
      Signed-off-by: NYijing Wang <wangyijing@huawei.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      d26e4fe0
    • J
      timekeeping: Fix missing timekeeping_update in suspend path · 330a1617
      John Stultz 提交于
      Since 48cdc135 (Implement a shadow timekeeper), we have to
      call timekeeping_update() after any adjustment to the timekeeping
      structure in order to make sure that any adjustments to the structure
      persist.
      
      In the timekeeping suspend path, we udpate the timekeeper
      structure, so we should be sure to update the shadow-timekeeper
      before releasing the timekeeping locks. Currently this isn't done.
      
      In most cases, the next time related code to run would be
      timekeeping_resume, which does update the shadow-timekeeper, but
      in an abundence of caution, this patch adds the call to
      timekeeping_update() in the suspend path.
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable <stable@vger.kernel.org> #3.10+
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      330a1617
    • J
      timekeeping: Fix CLOCK_TAI timer/nanosleep delays · 04005f60
      John Stultz 提交于
      A think-o in the calculation of the monotonic -> tai time offset
      results in CLOCK_TAI timers and nanosleeps to expire late (the
      latency is ~2x the tai offset).
      
      Fix this by adding the tai offset from the realtime offset instead
      of subtracting.
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable <stable@vger.kernel.org> #3.10+
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      04005f60