1. 21 June 2017, 1 commit
    • time: Clean up CLOCK_MONOTONIC_RAW time handling · fc6eead7
      John Stultz authored
      Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW,
      remove the duplicative tk->raw_time.tv_nsec, which can be
      stored in tk->tkr_raw.xtime_nsec (similarly to how it is handled
      for monotonic time).
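
      A minimal sketch of the resulting layout, assuming the shifted
      xtime_nsec representation described above (field names are
      illustrative, not the verbatim patch):

      struct timekeeper {
      	struct tk_read_base tkr_raw;	/* sub-ns remainder kept in
      					 * tkr_raw.xtime_nsec, shifted
      					 * left by tkr_raw.shift */
      	u64 raw_sec;			/* whole seconds of
      					 * CLOCK_MONOTONIC_RAW */
      	/* tk->raw_time.tv_nsec is gone */
      };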
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Daniel Mentz <danielmentz@google.com>
      Tested-by: Daniel Mentz <danielmentz@google.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      fc6eead7
  2. 20 June 2017, 2 commits
    • time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting · 3d88d56c
      John Stultz authored
      Due to how the MONOTONIC_RAW accumulation logic was handled,
      there is the potential for a 1ns discontinuity when we do
      accumulations. This small discontinuity has for the most part
      gone unnoticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
      in their vDSO clock_gettime implementation, we've seen failures
      with the inconsistency-check test in kselftest.
      
      This patch addresses the issue by using the same sub-ns
      accumulation handling that CLOCK_MONOTONIC uses, which avoids
      the issue for in-kernel users.
      
      Since the ARM64 vDSO implementation has its own clock_gettime
      calculation logic, this patch reduces the frequency of errors,
      but failures are still seen. The ARM64 vDSO will need to be
      updated to include the sub-nanosecond xtime_nsec values in its
      calculation for this issue to be completely fixed.
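
      As a rough illustration (a sketch of the idea, not the verbatim
      patch), the raw clock is then resolved the same way CLOCK_MONOTONIC
      is, carrying the sub-ns remainder in the shifted xtime_nsec field
      instead of truncating it at each accumulation:

      	/* mult/shift arithmetic as used by the clocksource core */
      	u64 nsec;

      	nsec = tk->tkr_raw.xtime_nsec + cycle_delta * tk->tkr_raw.mult;
      	nsec >>= tk->tkr_raw.shift;	/* remainder survives into the
      					 * next accumulation, so no 1ns
      					 * step at the boundary */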
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Tested-by: Daniel Mentz <danielmentz@google.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: "stable #4 . 8+" <stable@vger.kernel.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      3d88d56c
    • time: Fix clock->read(clock) race around clocksource changes · ceea5e37
      John Stultz authored
      In tests which exercise switching of clocksources, a NULL
      pointer dereference can be observed on ARM64 platforms in the
      clocksource read() function:
      
      u64 clocksource_mmio_readl_down(struct clocksource *c)
      {
      	return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
      }
      
      This is called from the core timekeeping code via:
      
      	cycle_now = tkr->read(tkr->clock);
      
      tkr->read is the cached tkr->clock->read() function pointer.
      When the clocksource is changed then tkr->clock and tkr->read
      are updated sequentially. The code above results in a sequential
      load operation of tkr->read and tkr->clock as well.
      
      If the store to tkr->clock hits between the loads of tkr->read
      and tkr->clock, then the old read() function is called with the
      new clock pointer. As a consequence the read() function
      dereferences a different data structure and the resulting 'reg'
      pointer can point anywhere including NULL.
      
      This problem was introduced when the timekeeping code was
      switched over to use struct tk_read_base. Before that, it was
      theoretically possible as well when the compiler decided to
      reload clock in the code sequence:
      
           now = tk->clock->read(tk->clock);
      
      Add a helper function which avoids the issue by reading
      tk_read_base->clock once into a local variable clk and then
      issuing the read function via clk->read(clk). This guarantees
      that the read() function always gets the proper clocksource
      pointer handed in.
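
      The helper is essentially a one-liner (shown here as a sketch
      matching the description above):

      static inline u64 tk_clock_read(struct tk_read_base *tkr)
      {
      	/* read the clock pointer exactly once */
      	struct clocksource *clock = READ_ONCE(tkr->clock);

      	return clock->read(clock);
      }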
      
      Since there is now no use for the tkr.read pointer, this patch
      also removes it, and to address stopping the fast timekeeper
      during suspend/resume, it introduces a dummy clocksource to use
      rather than just a dummy read function.
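
      The dummy clocksource might look roughly like this (a sketch,
      assuming a cycle value cached at suspend time):

      static u64 cycles_at_suspend;

      static u64 dummy_clock_read(struct clocksource *cs)
      {
      	return cycles_at_suspend;	/* frozen across suspend */
      }

      static struct clocksource dummy_clock = {
      	.read = dummy_clock_read,
      };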
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Daniel Mentz <danielmentz@google.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      ceea5e37
  3. 13 June 2017, 1 commit
  4. 04 June 2017, 2 commits
    • alarmtimer: Rate limit periodic intervals · ff86bf0c
      Thomas Gleixner authored
      The alarmtimer code has another source of potentially rearming itself too
      fast. Interval timers with a very small interval have a similar CPU hog
      effect as the previously fixed overflow issue.

      The reason is that alarmtimers do not implement the normal protection
      against this kind of problem which the other posix timers use:
        timer expires -> queue signal -> deliver signal -> rearm timer
      
      This scheme brings the rearming under scheduler control and prevents
      permanently firing timers which hog the CPU.
      
      Bringing this scheme to the alarm timer code would be a major overhaul
      because it completely lacks the necessary mechanisms.
      
      So as a quick fix, limit the interval to one jiffy. This is not
      problematic in practice as alarmtimers are usually backed by an RTC for
      suspend, which has 1 second resolution. It could therefore be argued that
      the resolution of this clock should be set to 1 second in general, but
      that's outside the scope of this fix.
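
      A sketch of the quick fix (its placement in the timer-set path is
      an assumption of this sketch):

      	/* rate-limit the rearm interval to one jiffy */
      	if (timr->it.alarm.interval < TICK_NSEC)
      		timr->it.alarm.interval = TICK_NSEC;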
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de
      ff86bf0c
    • alarmtimer: Prevent overflow of relative timers · f4781e76
      Thomas Gleixner authored
      Andrey reported an alarmtimer-related RCU stall while fuzzing the kernel
      with syzkaller.
      
      The reason for this is an overflow in ktime_add() which brings the
      resulting time into negative space and causes immediate expiry of the
      timer. The following rearm with a small interval does not bring the timer
      back into positive space due to the same issue.
      
      This results in a permanent firing alarmtimer which hogs the CPU.
      
      Use ktime_add_safe() instead which detects the overflow and clamps the
      result to KTIME_SEC_MAX.
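
      For reference, the clamping behaves roughly like this sketch of
      ktime_add_safe():

      ktime_t ktime_add_safe(const ktime_t lhs, const ktime_t rhs)
      {
      	ktime_t res = ktime_add_unsafe(lhs, rhs);

      	/* clamp to KTIME_SEC_MAX instead of wrapping negative */
      	if (res < 0 || res < lhs || res < rhs)
      		res = ktime_set(KTIME_SEC_MAX, 0);

      	return res;
      }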
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de
      f4781e76
  5. 24 May 2017, 1 commit
  6. 13 May 2017, 1 commit
  7. 20 April 2017, 1 commit
  8. 17 April 2017, 1 commit
  9. 15 April 2017, 7 commits
  10. 31 March 2017, 1 commit
  11. 24 March 2017, 4 commits
    • treewide: Fix typo in xml/driver-api/basics.xml · 0ba42a59
      Masanari Iida authored
      This patch fixes spelling typos found in
      Documentation/output/xml/driver-api/basics.xml.
      Because the xml file is generated from comments in the source,
      the comments themselves had to be fixed.
      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      0ba42a59
    • sysrq: Reset the watchdog timers while displaying high-resolution timers · 01070427
      Tom Hromatka authored
      On systems with a large number of CPUs, running sysrq-<q> can cause
      watchdog timeouts.  There are two slow sections of code in the sysrq-<q>
      path in timer_list.c.
      
      1. print_active_timers() - This function is called by print_cpu() and
         contains a slow goto loop.  On a machine with hundreds of CPUs, this
         loop took approximately 100ms for the first CPU in a NUMA node.
         (Subsequent CPUs in the same node ran much quicker.)  The total time
         to print all of the CPUs is ultimately long enough to trigger the
         soft lockup watchdog.
      
      2. print_tickdevice() - This function outputs a large amount of textual
         information.  This function also took approximately 100ms per CPU.
      
      Since sysrq-<q> is not a performance critical path, there should be no
      harm in touching the NMI watchdog in both slow sections above.  Touching
      it in just one location was insufficient on systems with hundreds of
      CPUs, as occasional timeouts were still observed during testing.
      
      This issue was observed on an Oracle T7 machine with 128 CPUs, but I
      anticipate it may affect other systems with similarly large numbers of
      CPUs.
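
      A sketch of the shape of the fix (abbreviated; the idea is simply to
      feed the watchdogs inside the loops of both functions named above):

      	/* in print_active_timers() and print_tickdevice(): */
      	touch_nmi_watchdog();	/* also pets the softlockup detector */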
      Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
      Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      01070427
    • timers, sched_clock: Update timeout for clock wrap · 1b8955bc
      David Engraf authored
      The scheduler clock framework may not use the correct timeout for the clock
      wrap. This happens when a new clock driver calls sched_clock_register()
      after the kernel has called sched_clock_postinit(). In this case the clock
      wrap timeout is too long; thus sched_clock_poll() is called too late and
      the clock has already wrapped.
      
      On my ARM system the scheduler was no longer scheduling any task other
      than the idle task because sched_clock() wrapped.
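
      A sketch of the late-registration case (assuming the wrap timeout
      cd.wrap_kt has just been recomputed for the newly registered clock):

      	/* in sched_clock_register(): if sched_clock_postinit() has
      	 * already armed the poll timer, re-arm it with the new,
      	 * possibly much shorter, wrap timeout */
      	if (sched_clock_timer.function != NULL)
      		hrtimer_start(&sched_clock_timer, cd.wrap_kt,
      			      HRTIMER_MODE_REL);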
      Signed-off-by: David Engraf <david.engraf@sysgo.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      1b8955bc
    • clockevents: Make clockevents_config() static · 0695bd99
      Nicolai Stange authored
      A clockevent device's rate should be configured before or at registration
      and changed afterwards through clockevents_update_freq() only.
      
      For the configuration at registration, we already have
      clockevents_config_and_register().
      
      Right now, there are no clockevents_config() users outside of the
      clockevents core.
      
      To mitigate the risk of drivers erroneously reconfiguring their rates
      through clockevents_config() *after* device registration, make
      clockevents_config() static.
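
      A hedged usage example of the intended API surface (the device and
      the frequency/delta values are placeholders):

      	/* configure the rate once, at registration time */
      	clockevents_config_and_register(&my_ced, freq_hz,
      					min_delta, max_delta);

      	/* later rate changes go through the dedicated helper */
      	clockevents_update_freq(&my_ced, new_freq_hz);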
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      0695bd99
  12. 23 March 2017, 1 commit
    • cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely · b7eaf1aa
      Rafael J. Wysocki authored
      The way the schedutil governor uses the PELT metric causes it to
      underestimate the CPU utilization in some cases.
      
      That can be easily demonstrated by running kernel compilation on
      a Sandy Bridge Intel processor, running turbostat in parallel with
      it and looking at the values written to the MSR_IA32_PERF_CTL
      register.  Namely, the expected result would be that when all CPUs
      were 100% busy, all of them would be requested to run in the maximum
      P-state, but observation shows that this clearly isn't the case.
      The CPUs run in the maximum P-state for a while and then are
      requested to run slower and go back to the maximum P-state after
      a while again.  That causes the actual frequency of the processor to
      visibly oscillate below the sustainable maximum in a jittery fashion
      which clearly is not desirable.
      
      That has been attributed to CPU utilization metric updates on task
      migration that cause the total utilization value for the CPU to be
      reduced by the utilization of the migrated task.  If that happens,
      the schedutil governor may see a CPU utilization reduction and will
      attempt to reduce the CPU frequency accordingly right away.  That
      may be premature, though, for example if the system is generally
      busy and there are other runnable tasks waiting to be run on that
      CPU already.
      
      This is unlikely to be an issue on systems where cpufreq policies are
      shared between multiple CPUs, because in those cases the policy
      utilization is computed as the maximum of the CPU utilization values
      over the whole policy and if that turns out to be low, reducing the
      frequency for the policy most likely is a good idea anyway.  On
      systems with one CPU per policy, however, it may affect performance
      adversely and even lead to increased energy consumption in some cases.
      
      On those systems it may be addressed by taking another utilization
      metric into consideration, like whether or not the CPU whose
      frequency is about to be reduced has been idle recently, because if
      that's not the case, the CPU is likely to be busy in the near future
      and its frequency should not be reduced.
      
      To that end, use the counter of idle calls in the timekeeping code.
      Namely, make the schedutil governor look at that counter for the
      current CPU every time before its frequency is about to be reduced.
      If the counter has not changed since the previous iteration of the
      governor computations for that CPU, the CPU has been busy for all
      that time and its frequency should not be decreased, so if the new
      frequency would be lower than the one set previously, the governor
      will skip the frequency update.
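
      A sketch of the busy check described above (per-CPU policy case,
      matching the description rather than the verbatim patch):

      static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
      {
      	unsigned long idle_calls = tick_nohz_get_idle_calls();
      	bool ret = idle_calls == sg_cpu->saved_idle_calls;

      	sg_cpu->saved_idle_calls = idle_calls;
      	return ret;	/* true: no idle entry on this CPU since the
      			 * previous governor iteration */
      }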
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reviewed-by: Joel Fernandes <joelaf@google.com>
      b7eaf1aa
  13. 18 March 2017, 1 commit
  14. 14 March 2017, 1 commit
  15. 07 March 2017, 1 commit
  16. 02 March 2017, 10 commits
  17. 18 February 2017, 1 commit
  18. 17 February 2017, 1 commit
    • Revert "nohz: Fix collision between tick and other hrtimers" · 558e8e27
      Linus Torvalds authored
      This reverts commit 24b91e36 and commit
      7bdb59f1 ("tick/nohz: Fix possible missing clock reprog after tick
      soft restart") that depends on it.
      
      Pavel reports that it causes occasional boot hangs for him that seem to
      depend on just how the machine was booted.  In particular, his machine
      hangs at around the PCI fixups of the EHCI USB host controller, but only
      hangs from cold boot, not from a warm boot.
      
      Thomas Gleixner suspects it's a CPU hotplug interaction, particularly
      since Pavel also saw suspend/resume issues that seem to be related.
      We're reverting for now while trying to figure out the root cause.
      Reported-bisected-and-tested-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org  # reverted commits were marked for stable
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      558e8e27
  19. 15 February 2017, 1 commit
    • timekeeping: Use deferred printk() in debug code · f222449c
      Sergey Senozhatsky authored
      We cannot do printk() from tk_debug_account_sleep_time(), because
      tk_debug_account_sleep_time() is called under the tk_core seq lock.
      The reason why printk() is unsafe there is that console_sem may
      invoke the scheduler (up()->wake_up_process()->activate_task()), which,
      in turn, can return back to the timekeeping code, for instance via
      get_time()->ktime_get(), deadlocking the system on the tk_core seq lock.
      
      [   48.950592] ======================================================
      [   48.950622] [ INFO: possible circular locking dependency detected ]
      [   48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted
      [   48.950622] -------------------------------------------------------
      [   48.950622] kworker/0:0/3 is trying to acquire lock:
      [   48.950653]  (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90
      [   48.950683]
                     but task is already holding lock:
      [   48.950683]  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
      [   48.950714]
                     which lock already depends on the new lock.
      
      [   48.950714]
                     the existing dependency chain (in reverse order) is:
      [   48.950714]
                     -> #5 (hrtimer_bases.lock){-.-...}:
      [   48.950744]        _raw_spin_lock_irqsave+0x50/0x64
      [   48.950775]        lock_hrtimer_base+0x28/0x58
      [   48.950775]        hrtimer_start_range_ns+0x20/0x5c8
      [   48.950775]        __enqueue_rt_entity+0x320/0x360
      [   48.950805]        enqueue_rt_entity+0x2c/0x44
      [   48.950805]        enqueue_task_rt+0x24/0x94
      [   48.950836]        ttwu_do_activate+0x54/0xc0
      [   48.950836]        try_to_wake_up+0x248/0x5c8
      [   48.950836]        __setup_irq+0x420/0x5f0
      [   48.950836]        request_threaded_irq+0xdc/0x184
      [   48.950866]        devm_request_threaded_irq+0x58/0xa4
      [   48.950866]        omap_i2c_probe+0x530/0x6a0
      [   48.950897]        platform_drv_probe+0x50/0xb0
      [   48.950897]        driver_probe_device+0x1f8/0x2cc
      [   48.950897]        __driver_attach+0xc0/0xc4
      [   48.950927]        bus_for_each_dev+0x6c/0xa0
      [   48.950927]        bus_add_driver+0x100/0x210
      [   48.950927]        driver_register+0x78/0xf4
      [   48.950958]        do_one_initcall+0x3c/0x16c
      [   48.950958]        kernel_init_freeable+0x20c/0x2d8
      [   48.950958]        kernel_init+0x8/0x110
      [   48.950988]        ret_from_fork+0x14/0x24
      [   48.950988]
                     -> #4 (&rt_b->rt_runtime_lock){-.-...}:
      [   48.951019]        _raw_spin_lock+0x40/0x50
      [   48.951019]        rq_offline_rt+0x9c/0x2bc
      [   48.951019]        set_rq_offline.part.2+0x2c/0x58
      [   48.951049]        rq_attach_root+0x134/0x144
      [   48.951049]        cpu_attach_domain+0x18c/0x6f4
      [   48.951049]        build_sched_domains+0xba4/0xd80
      [   48.951080]        sched_init_smp+0x68/0x10c
      [   48.951080]        kernel_init_freeable+0x160/0x2d8
      [   48.951080]        kernel_init+0x8/0x110
      [   48.951080]        ret_from_fork+0x14/0x24
      [   48.951110]
                     -> #3 (&rq->lock){-.-.-.}:
      [   48.951110]        _raw_spin_lock+0x40/0x50
      [   48.951141]        task_fork_fair+0x30/0x124
      [   48.951141]        sched_fork+0x194/0x2e0
      [   48.951141]        copy_process.part.5+0x448/0x1a20
      [   48.951171]        _do_fork+0x98/0x7e8
      [   48.951171]        kernel_thread+0x2c/0x34
      [   48.951171]        rest_init+0x1c/0x18c
      [   48.951202]        start_kernel+0x35c/0x3d4
      [   48.951202]        0x8000807c
      [   48.951202]
                     -> #2 (&p->pi_lock){-.-.-.}:
      [   48.951232]        _raw_spin_lock_irqsave+0x50/0x64
      [   48.951232]        try_to_wake_up+0x30/0x5c8
      [   48.951232]        up+0x4c/0x60
      [   48.951263]        __up_console_sem+0x2c/0x58
      [   48.951263]        console_unlock+0x3b4/0x650
      [   48.951263]        vprintk_emit+0x270/0x474
      [   48.951293]        vprintk_default+0x20/0x28
      [   48.951293]        printk+0x20/0x30
      [   48.951324]        kauditd_hold_skb+0x94/0xb8
      [   48.951324]        kauditd_thread+0x1a4/0x56c
      [   48.951324]        kthread+0x104/0x148
      [   48.951354]        ret_from_fork+0x14/0x24
      [   48.951354]
                     -> #1 ((console_sem).lock){-.....}:
      [   48.951385]        _raw_spin_lock_irqsave+0x50/0x64
      [   48.951385]        down_trylock+0xc/0x2c
      [   48.951385]        __down_trylock_console_sem+0x24/0x80
      [   48.951385]        console_trylock+0x10/0x8c
      [   48.951416]        vprintk_emit+0x264/0x474
      [   48.951416]        vprintk_default+0x20/0x28
      [   48.951416]        printk+0x20/0x30
      [   48.951446]        tk_debug_account_sleep_time+0x5c/0x70
      [   48.951446]        __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0
      [   48.951446]        timekeeping_resume+0x218/0x23c
      [   48.951477]        syscore_resume+0x94/0x42c
      [   48.951477]        suspend_enter+0x554/0x9b4
      [   48.951477]        suspend_devices_and_enter+0xd8/0x4b4
      [   48.951507]        enter_state+0x934/0xbd4
      [   48.951507]        pm_suspend+0x14/0x70
      [   48.951507]        state_store+0x68/0xc8
      [   48.951538]        kernfs_fop_write+0xf4/0x1f8
      [   48.951538]        __vfs_write+0x1c/0x114
      [   48.951538]        vfs_write+0xa0/0x168
      [   48.951568]        SyS_write+0x3c/0x90
      [   48.951568]        __sys_trace_return+0x0/0x10
      [   48.951568]
                     -> #0 (tk_core){----..}:
      [   48.951599]        lock_acquire+0xe0/0x294
      [   48.951599]        ktime_get_update_offsets_now+0x5c/0x1d4
      [   48.951629]        retrigger_next_event+0x4c/0x90
      [   48.951629]        on_each_cpu+0x40/0x7c
      [   48.951629]        clock_was_set_work+0x14/0x20
      [   48.951660]        process_one_work+0x2b4/0x808
      [   48.951660]        worker_thread+0x3c/0x550
      [   48.951660]        kthread+0x104/0x148
      [   48.951690]        ret_from_fork+0x14/0x24
      [   48.951690]
                     other info that might help us debug this:
      
      [   48.951690] Chain exists of:
                       tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
      
      [   48.951721]  Possible unsafe locking scenario:
      
      [   48.951721]        CPU0                    CPU1
      [   48.951721]        ----                    ----
      [   48.951721]   lock(hrtimer_bases.lock);
      [   48.951751]                                lock(&rt_b->rt_runtime_lock);
      [   48.951751]                                lock(hrtimer_bases.lock);
      [   48.951751]   lock(tk_core);
      [   48.951782]
                      *** DEADLOCK ***
      
      [   48.951782] 3 locks held by kworker/0:0/3:
      [   48.951782]  #0:  ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808
      [   48.951812]  #1:  (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808
      [   48.951843]  #2:  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
      [   48.951843]   stack backtrace:
      [   48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+
      [   48.951904] Workqueue: events clock_was_set_work
      [   48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14)
      [   48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0)
      [   48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308)
      [   48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324)
      [   48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8)
      [   48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294)
      [   48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4)
      [   48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90)
      [   48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c)
      [   48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20)
      [   48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808)
      [   48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550)
      [   48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148)
      [   48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24)
      
      Replace printk() with printk_deferred(), which does not call into
      the scheduler.
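
      The change itself is a one-liner in tk_debug_account_sleep_time(),
      roughly (message text abbreviated from the original print):

      	printk_deferred(KERN_INFO "Suspended for %lld.%03lu seconds\n",
      			(s64)t->tv_sec, t->tv_nsec / NSEC_PER_MSEC);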
      
      Fixes: 0bf43f15 ("timekeeping: Prints the amounts of time spent during suspend")
      Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: "[4.9+]" <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      f222449c
  20. 13 February 2017, 1 commit
    • tick/broadcast: Prevent deadlock on tick_broadcast_lock · 202461e2
      Mike Galbraith authored
      tick_broadcast_lock is taken from interrupt context, but the following call
      chain takes the lock without disabling interrupts:
      
      [   12.703736]  _raw_spin_lock+0x3b/0x50
      [   12.703738]  tick_broadcast_control+0x5a/0x1a0
      [   12.703742]  intel_idle_cpu_online+0x22/0x100
      [   12.703744]  cpuhp_invoke_callback+0x245/0x9d0
      [   12.703752]  cpuhp_thread_fun+0x52/0x110
      [   12.703754]  smpboot_thread_fn+0x276/0x320
      
      So the following deadlock can happen:
      
         lock(tick_broadcast_lock);
         <Interrupt>
            lock(tick_broadcast_lock);
      
      intel_idle_cpu_online() is the only place which violates the calling
      convention of tick_broadcast_control(). This was caused by the removal of
      the SMP function call in the course of the CPU hotplug rework.
      
      Instead of slapping local_irq_disable/enable() at the call site, we can
      relax the calling convention and handle it in the core code, which makes
      the whole machinery more robust.
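
      A sketch of the relaxed calling convention (the existing mode
      handling is elided):

      void tick_broadcast_control(enum tick_broadcast_mode mode)
      {
      	unsigned long flags;

      	/* interrupt-safe on its own, so callers need not be */
      	raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
      	/* ... existing mode handling ... */
      	raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
      }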
      
      Fixes: 29d7bbad ("intel_idle: Remove superfluous SMP fuction call")
      Reported-by: Gabriel C <nix.or.die@gmail.com>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Ruslan Ruslichenko <rruslich@cisco.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: lwn@lwn.net
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1486953115.5912.4.camel@gmx.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      202461e2