1. 20 Aug, 2014 7 commits
  2. 12 Aug, 2014 6 commits
  3. 01 Aug, 2014 1 commit
    • J
      timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks · 504d5874
      Jan Kara committed
      clockevents_increase_min_delta() calls printk() from under
      hrtimer_bases.lock. That causes lock inversion on scheduler locks because
      printk() can call into the scheduler. Lockdep puts it as:
      
      ======================================================
      [ INFO: possible circular locking dependency detected ]
      3.15.0-rc8-06195-g939f04be #2 Not tainted
      -------------------------------------------------------
      trinity-main/74 is trying to acquire lock:
       (&port_lock_key){-.....}, at: [<811c60be>] serial8250_console_write+0x8c/0x10c
      
      but task is already holding lock:
       (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #5 (hrtimer_bases.lock){-.-...}:
             [<8104a942>] lock_acquire+0x92/0x101
             [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
             [<8103c918>] __hrtimer_start_range_ns+0x1c/0x197
             [<8107ec20>] perf_swevent_start_hrtimer.part.41+0x7a/0x85
             [<81080792>] task_clock_event_start+0x3a/0x3f
             [<810807a4>] task_clock_event_add+0xd/0x14
             [<8108259a>] event_sched_in+0xb6/0x17a
             [<810826a2>] group_sched_in+0x44/0x122
             [<81082885>] ctx_sched_in.isra.67+0x105/0x11f
             [<810828e6>] perf_event_sched_in.isra.70+0x47/0x4b
             [<81082bf6>] __perf_install_in_context+0x8b/0xa3
             [<8107eb8e>] remote_function+0x12/0x2a
             [<8105f5af>] smp_call_function_single+0x2d/0x53
             [<8107e17d>] task_function_call+0x30/0x36
             [<8107fb82>] perf_install_in_context+0x87/0xbb
             [<810852c9>] SYSC_perf_event_open+0x5c6/0x701
             [<810856f9>] SyS_perf_event_open+0x17/0x19
             [<8142f8ee>] syscall_call+0x7/0xb
      
      -> #4 (&ctx->lock){......}:
             [<8104a942>] lock_acquire+0x92/0x101
             [<8142f04c>] _raw_spin_lock+0x21/0x30
             [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f
             [<8142cacc>] __schedule+0x4c6/0x4cb
             [<8142cae0>] schedule+0xf/0x11
             [<8142f9a6>] work_resched+0x5/0x30
      
      -> #3 (&rq->lock){-.-.-.}:
             [<8104a942>] lock_acquire+0x92/0x101
             [<8142f04c>] _raw_spin_lock+0x21/0x30
             [<81040873>] __task_rq_lock+0x33/0x3a
             [<8104184c>] wake_up_new_task+0x25/0xc2
             [<8102474b>] do_fork+0x15c/0x2a0
             [<810248a9>] kernel_thread+0x1a/0x1f
             [<814232a2>] rest_init+0x1a/0x10e
             [<817af949>] start_kernel+0x303/0x308
             [<817af2ab>] i386_start_kernel+0x79/0x7d
      
      -> #2 (&p->pi_lock){-.-...}:
             [<8104a942>] lock_acquire+0x92/0x101
             [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
             [<810413dd>] try_to_wake_up+0x1d/0xd6
             [<810414cd>] default_wake_function+0xb/0xd
             [<810461f3>] __wake_up_common+0x39/0x59
             [<81046346>] __wake_up+0x29/0x3b
             [<811b8733>] tty_wakeup+0x49/0x51
             [<811c3568>] uart_write_wakeup+0x17/0x19
             [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb
             [<811c5f28>] serial8250_handle_irq+0x54/0x6a
             [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c
             [<811c56d8>] serial8250_interrupt+0x38/0x9e
             [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2
             [<81051296>] handle_irq_event+0x2c/0x43
             [<81052cee>] handle_level_irq+0x57/0x80
             [<81002a72>] handle_irq+0x46/0x5c
             [<810027df>] do_IRQ+0x32/0x89
             [<8143036e>] common_interrupt+0x2e/0x33
             [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49
             [<811c25a4>] uart_start+0x2d/0x32
             [<811c2c04>] uart_write+0xc7/0xd6
             [<811bc6f6>] n_tty_write+0xb8/0x35e
             [<811b9beb>] tty_write+0x163/0x1e4
             [<811b9cd9>] redirected_tty_write+0x6d/0x75
             [<810b6ed6>] vfs_write+0x75/0xb0
             [<810b7265>] SyS_write+0x44/0x77
             [<8142f8ee>] syscall_call+0x7/0xb
      
      -> #1 (&tty->write_wait){-.....}:
             [<8104a942>] lock_acquire+0x92/0x101
             [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
             [<81046332>] __wake_up+0x15/0x3b
             [<811b8733>] tty_wakeup+0x49/0x51
             [<811c3568>] uart_write_wakeup+0x17/0x19
             [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb
             [<811c5f28>] serial8250_handle_irq+0x54/0x6a
             [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c
             [<811c56d8>] serial8250_interrupt+0x38/0x9e
             [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2
             [<81051296>] handle_irq_event+0x2c/0x43
             [<81052cee>] handle_level_irq+0x57/0x80
             [<81002a72>] handle_irq+0x46/0x5c
             [<810027df>] do_IRQ+0x32/0x89
             [<8143036e>] common_interrupt+0x2e/0x33
             [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49
             [<811c25a4>] uart_start+0x2d/0x32
             [<811c2c04>] uart_write+0xc7/0xd6
             [<811bc6f6>] n_tty_write+0xb8/0x35e
             [<811b9beb>] tty_write+0x163/0x1e4
             [<811b9cd9>] redirected_tty_write+0x6d/0x75
             [<810b6ed6>] vfs_write+0x75/0xb0
             [<810b7265>] SyS_write+0x44/0x77
             [<8142f8ee>] syscall_call+0x7/0xb
      
      -> #0 (&port_lock_key){-.....}:
             [<8104a62d>] __lock_acquire+0x9ea/0xc6d
             [<8104a942>] lock_acquire+0x92/0x101
             [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
             [<811c60be>] serial8250_console_write+0x8c/0x10c
             [<8104e402>] call_console_drivers.constprop.31+0x87/0x118
             [<8104f5d5>] console_unlock+0x1d7/0x398
             [<8104fb70>] vprintk_emit+0x3da/0x3e4
             [<81425f76>] printk+0x17/0x19
             [<8105bfa0>] clockevents_program_min_delta+0x104/0x116
             [<8105c548>] clockevents_program_event+0xe7/0xf3
             [<8105cc1c>] tick_program_event+0x1e/0x23
             [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f
             [<8103c49e>] __remove_hrtimer+0x5b/0x79
             [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66
             [<8103cb4b>] hrtimer_cancel+0xd/0x18
             [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
             [<81080705>] task_clock_event_stop+0x20/0x64
             [<81080756>] task_clock_event_del+0xd/0xf
             [<81081350>] event_sched_out+0xab/0x11e
             [<810813e0>] group_sched_out+0x1d/0x66
             [<81081682>] ctx_sched_out+0xaf/0xbf
             [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f
             [<8142cacc>] __schedule+0x4c6/0x4cb
             [<8142cae0>] schedule+0xf/0x11
             [<8142f9a6>] work_resched+0x5/0x30
      
      other info that might help us debug this:
      
      Chain exists of:
        &port_lock_key --> &ctx->lock --> hrtimer_bases.lock
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(hrtimer_bases.lock);
                                     lock(&ctx->lock);
                                     lock(hrtimer_bases.lock);
        lock(&port_lock_key);
      
       *** DEADLOCK ***
      
      4 locks held by trinity-main/74:
       #0:  (&rq->lock){-.-.-.}, at: [<8142c6f3>] __schedule+0xed/0x4cb
       #1:  (&ctx->lock){......}, at: [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f
       #2:  (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66
       #3:  (console_lock){+.+...}, at: [<8104fb5d>] vprintk_emit+0x3c7/0x3e4
      
      stack backtrace:
      CPU: 0 PID: 74 Comm: trinity-main Not tainted 3.15.0-rc8-06195-g939f04be #2
       00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570
       8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0
       8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003
      Call Trace:
       [<81426f69>] dump_stack+0x16/0x18
       [<81425a99>] print_circular_bug+0x18f/0x19c
       [<8104a62d>] __lock_acquire+0x9ea/0xc6d
       [<8104a942>] lock_acquire+0x92/0x101
       [<811c60be>] ? serial8250_console_write+0x8c/0x10c
       [<811c6032>] ? wait_for_xmitr+0x76/0x76
       [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
       [<811c60be>] ? serial8250_console_write+0x8c/0x10c
       [<811c60be>] serial8250_console_write+0x8c/0x10c
       [<8104af87>] ? lock_release+0x191/0x223
       [<811c6032>] ? wait_for_xmitr+0x76/0x76
       [<8104e402>] call_console_drivers.constprop.31+0x87/0x118
       [<8104f5d5>] console_unlock+0x1d7/0x398
       [<8104fb70>] vprintk_emit+0x3da/0x3e4
       [<81425f76>] printk+0x17/0x19
       [<8105bfa0>] clockevents_program_min_delta+0x104/0x116
       [<8105cc1c>] tick_program_event+0x1e/0x23
       [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f
       [<8103c49e>] __remove_hrtimer+0x5b/0x79
       [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66
       [<8103cb4b>] hrtimer_cancel+0xd/0x18
       [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
       [<81080705>] task_clock_event_stop+0x20/0x64
       [<81080756>] task_clock_event_del+0xd/0xf
       [<81081350>] event_sched_out+0xab/0x11e
       [<810813e0>] group_sched_out+0x1d/0x66
       [<81081682>] ctx_sched_out+0xaf/0xbf
       [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f
       [<8104416d>] ? __dequeue_entity+0x23/0x27
       [<81044505>] ? pick_next_task_fair+0xb1/0x120
       [<8142cacc>] __schedule+0x4c6/0x4cb
       [<81047574>] ? trace_hardirqs_off_caller+0xd7/0x108
       [<810475b0>] ? trace_hardirqs_off+0xb/0xd
       [<81056346>] ? rcu_irq_exit+0x64/0x77
      
      Fix the problem by using printk_deferred() which does not call into the
      scheduler.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      504d5874
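The idea behind the fix above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: it assumes a flag standing in for hrtimer_bases.lock and a small buffer standing in for the deferred message path. All names in it (timer_lock_held, console_emit, log_deferred, flush_deferred, increase_min_delta) are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEFERRED_MAX 8
#define MSG_LEN 64

/* Hypothetical stand-ins; none of these names are kernel APIs. */
static int timer_lock_held;   /* models holding hrtimer_bases.lock */
static int unsafe_prints;     /* console output attempted under that lock */
static int emitted_count;     /* messages that reached the console */
static char deferred[DEFERRED_MAX][MSG_LEN];
static int deferred_count;

/* Models the console path: it takes further locks, so it must not run
 * while timer_lock_held is set. */
static void console_emit(const char *msg)
{
    if (timer_lock_held)
        unsafe_prints++;
    emitted_count++;
    printf("%s\n", msg);
}

/* Models the deferred variant: only copy the message, take no other lock. */
static void log_deferred(const char *msg)
{
    assert(deferred_count < DEFERRED_MAX);
    strncpy(deferred[deferred_count], msg, MSG_LEN - 1);
    deferred[deferred_count][MSG_LEN - 1] = '\0';
    deferred_count++;
}

/* Called later, from a context that does not hold the timer lock. */
static void flush_deferred(void)
{
    for (int i = 0; i < deferred_count; i++)
        console_emit(deferred[i]);
    deferred_count = 0;
}

/* Models a min-delta adjustment that runs with the timer lock held. */
static void increase_min_delta(void)
{
    timer_lock_held = 1;
    log_deferred("min_delta increased");
    timer_lock_held = 0;
}
```

Calling increase_min_delta() followed by flush_deferred() emits the message only after the modeled lock has been dropped, so the console path never runs under it.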
  4. 31 Jul, 2014 3 commits
  5. 30 Jul, 2014 1 commit
  6. 29 Jul, 2014 1 commit
  7. 28 Jul, 2014 5 commits
  8. 24 Jul, 2014 5 commits
    • S
      ftrace: Add warning if tramp hash does not match nr_trampolines · dc6f03f2
      Steven Rostedt (Red Hat) committed
      After adding all the records to the tramp_hash, check that the
      number of records added matches the number of records expected.
      If they do not match, do a WARN_ON and disable ftrace.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      dc6f03f2
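The check described above amounts to counting what was added and comparing it with the expected counter. Below is a minimal userspace sketch under stated assumptions: struct rec, struct ops_state and save_tramp_records() are hypothetical names for illustration, a plain counter stands in for the hash insertion, and a message on stderr stands in for WARN_ON.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record: true if it should land in the trampoline hash. */
struct rec {
    bool uses_tramp;
};

/* Hypothetical per-ops state; only nr_trampolines is named in the source. */
struct ops_state {
    size_t nr_trampolines;  /* number of records expected in the hash */
    size_t hashed;          /* number of records actually added */
    bool disabled;          /* set when the consistency check fails */
};

/* Add every matching record, then verify the count against the
 * expected value. Returns false and disables tracing on a mismatch. */
static bool save_tramp_records(struct ops_state *ops,
                               const struct rec *recs, size_t n)
{
    size_t added = 0;

    assert(ops != NULL);
    for (size_t i = 0; i < n; i++) {
        if (recs[i].uses_tramp)
            added++;        /* stands in for inserting into the hash */
    }
    ops->hashed = added;

    if (added != ops->nr_trampolines) {
        fprintf(stderr, "warning: added %zu records, expected %zu\n",
                added, ops->nr_trampolines);
        ops->disabled = true;
        return false;
    }
    return true;
}
```

The point of the design is that the mismatch is detected once, right after the hash is populated, rather than surfacing later as a harder-to-diagnose failure.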
    • S
      ftrace: Fix trampoline hash update check on rec->flags · 2a0343ba
      Steven Rostedt (Red Hat) committed
      In the loop of ftrace_save_ops_tramp_hash(), it adds all the recs
      to the ops hash if the rec has only one callback attached and the
      ops is connected to the rec. It gives a nasty warning and shuts down
      ftrace if the rec doesn't have a trampoline set for it. But this
      can happen with the following scenario:
      
        # cd /sys/kernel/debug/tracing
        # echo schedule do_IRQ > set_ftrace_filter
        # mkdir instances/foo
        # echo schedule > instances/foo/set_ftrace_filter
        # echo function_graph > current_tracer
        # echo function > instances/foo/current_tracer
        # echo nop > instances/foo/current_tracer
      
      The above would then trigger the following warning and disable
      ftrace:
      
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 3145 at kernel/trace/ftrace.c:2212 ftrace_run_update_code+0xe4/0x15b()
       Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ip [...]
       CPU: 1 PID: 3145 Comm: bash Not tainted 3.16.0-rc3-test+ #136
       Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
        0000000000000000 ffffffff81808a88 ffffffff81502130 0000000000000000
        ffffffff81040ca1 ffff880077c08000 ffffffff810bd286 0000000000000001
        ffffffff81a56830 ffff88007a041be0 ffff88007a872d60 00000000000001be
       Call Trace:
        [<ffffffff81502130>] ? dump_stack+0x4a/0x75
        [<ffffffff81040ca1>] ? warn_slowpath_common+0x7e/0x97
        [<ffffffff810bd286>] ? ftrace_run_update_code+0xe4/0x15b
        [<ffffffff810bd286>] ? ftrace_run_update_code+0xe4/0x15b
        [<ffffffff810bda1a>] ? ftrace_shutdown+0x11c/0x16b
        [<ffffffff810bda87>] ? unregister_ftrace_function+0x1e/0x38
        [<ffffffff810cc7e1>] ? function_trace_reset+0x1a/0x28
        [<ffffffff810c924f>] ? tracing_set_tracer+0xc1/0x276
        [<ffffffff810c9477>] ? tracing_set_trace_write+0x73/0x91
        [<ffffffff81132383>] ? __sb_start_write+0x9a/0xcc
        [<ffffffff8120478f>] ? security_file_permission+0x1b/0x31
        [<ffffffff81130e49>] ? vfs_write+0xac/0x11c
        [<ffffffff8113115d>] ? SyS_write+0x60/0x8e
        [<ffffffff81508112>] ? system_call_fastpath+0x16/0x1b
       ---[ end trace 938c4415cbc7dc96 ]---
       ------------[ cut here ]------------
      
      Link: http://lkml.kernel.org/r/20140723120805.GB21376@redhat.com
      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      2a0343ba
    • S
      sched_clock: Avoid corrupting hrtimer tree during suspend · f723aa18
      Stephen Boyd committed
      During suspend we call sched_clock_poll() to update the epoch and
      accumulated time and reprogram the sched_clock_timer to fire
      before the next wrap-around time. Unfortunately,
      sched_clock_poll() doesn't restart the timer, instead it relies
      on the hrtimer layer to do that and during suspend we aren't
      calling that function from the hrtimer layer. Instead, we're
      reprogramming the expires time while the hrtimer is enqueued,
      which can cause the hrtimer tree to be corrupted. Furthermore, we
      restart the timer during suspend but we update the epoch during
      resume which seems counter-intuitive.
      
      Let's fix this by saving the accumulated state and canceling the
      timer during suspend. On resume we can update the epoch and
      restart the timer similar to what we would do if we were starting
      the clock for the first time.
      
      Fixes: a08ca5d1 "sched_clock: Use an hrtimer instead of timer"
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/1406174630-23458-1-git-send-email-john.stultz@linaro.org
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      f723aa18
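The suspend/resume ordering described above can be sketched in userspace. This is an illustrative model only, under stated assumptions: struct clock_state, fake_counter, clock_suspend(), clock_resume() and clock_read_ns() are hypothetical names, one counter cycle is assumed to equal one nanosecond, and a boolean stands in for the hrtimer being armed or cancelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clock state; field names are for illustration only. */
struct clock_state {
    uint64_t epoch_cyc;   /* counter value at the last epoch update */
    uint64_t epoch_ns;    /* nanoseconds accumulated up to that epoch */
    bool timer_armed;     /* models the wrap-around timer being queued */
    bool suspended;
};

/* Hypothetical hardware counter; assumed to tick once per nanosecond. */
static uint64_t fake_counter;

/* Suspend: save the accumulated time and cancel the timer, instead of
 * changing the expiry of a timer that is still queued. */
static void clock_suspend(struct clock_state *cs)
{
    assert(cs);
    cs->epoch_ns += fake_counter - cs->epoch_cyc;
    cs->epoch_cyc = fake_counter;
    cs->timer_armed = false;
    cs->suspended = true;
}

/* Resume: take a fresh epoch from the counter and restart the timer,
 * much like starting the clock for the first time. */
static void clock_resume(struct clock_state *cs)
{
    assert(cs);
    cs->epoch_cyc = fake_counter;
    cs->timer_armed = true;
    cs->suspended = false;
}

/* Time stays frozen at the saved value while suspended. */
static uint64_t clock_read_ns(const struct clock_state *cs)
{
    if (cs->suspended)
        return cs->epoch_ns;
    return cs->epoch_ns + (fake_counter - cs->epoch_cyc);
}
```

In this model the counter may jump or reset while suspended without affecting the reported time, because the epoch is re-read only on resume.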
    • S
      ring-buffer: Use rb_page_size() instead of open coded head_page size · 10e83fd0
      Steven Rostedt (Red Hat) committed
      There's a helper function to get a ring buffer page size (the number
      of bytes of data recorded on the page), called rb_page_size().
      Use that instead of open coding it.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      10e83fd0
    • S
      ftrace: Rename ftrace_ops field from trampolines to nr_trampolines · 0162d621
      Steven Rostedt (Red Hat) committed
      Having two fields within the same struct whose names differ by only
      one character can be confusing and error prone. Rename the counter
      "trampolines" to "nr_trampolines" to show explicitly that it is a
      counter and to avoid confusion with the "trampoline" field.
      Suggested-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      0162d621
  9. 23 Jul, 2014 6 commits
    • L
      workqueue: use nr_node_ids instead of wq_numa_tbl_len · ddcb57e2
      Lai Jiangshan committed
      They are the same and nr_node_ids is provided by the memory subsystem.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      ddcb57e2
    • L
      workqueue: remove the misnamed out_unlock label in get_unbound_pool() · 3fb1823c
      Lai Jiangshan committed
      After the locking was moved up to the caller of get_unbound_pool(),
      the out_unlock label no longer performs any unlock operation, so its
      name became misleading. Remove the label and replace its only use,
      "goto out_unlock", with "return pool".
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      3fb1823c
    • L
      workqueue: remove the stale comment in pwq_unbound_release_workfn() · 29b1cb41
      Lai Jiangshan committed
      In 75ccf595 ("workqueue: prepare flush_workqueue() for dynamic
      creation and destruction of unbound pool_workqueues"), a comment
      about the synchronization for the pwq in pwq_unbound_release_workfn()
      was added. The comment claimed that the flush_mutex wasn't strictly
      necessary. That was correct at the time, because the pwq was
      protected by workqueue_lock.
      
      But it is incorrect now: wq->flush_mutex was renamed to wq->mutex
      and workqueue_lock was removed, so wq->mutex is strictly needed.
      The comment was not updated when the synchronization was changed.
      
      This patch removes the incorrect comment and doesn't add any new
      comment to explain why wq->mutex is needed here, which is obvious;
      wq->pwqs_node also has the "WQ" notation in its definition, which
      is a better comment.
      
      The old commit mentioned above also introduced a comment in link_pwq()
      about the synchronization. This comment is also removed in this patch
      since the whole of link_pwq() is protected by wq->mutex.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      29b1cb41
    • L
      workqueue: move rescuer pool detachment to the end · 13b1d625
      Lai Jiangshan committed
      In 51697d39 ("workqueue: use generic attach/detach routine for
      rescuers"), the rescuer detaches itself from the pool before put_pwq()
      so that put_unbound_pool() will not destroy the rescuer-attached
      pool.
      
      This is unnecessary.  worker_detach_from_pool() can be the last
      statement that accesses the pool, just as for regular workers;
      put_unbound_pool() will wait for it to detach and then free the pool.
      
      So move worker_detach_from_pool() down to make it consistent with
      the regular workers.
      
      tj: Minor description update.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      13b1d625
    • L
      workqueue: unfold start_worker() into create_worker() · 051e1850
      Lai Jiangshan committed
      Simply unfold the code of start_worker() into create_worker() and
      remove the original start_worker() and create_and_start_worker().
      
      The only trade-off is the added overhead of releasing and
      regrabbing pool->lock after the new worker is started.
      The overhead is acceptable since the manager is a slow path.
      
      Because of this new locking behavior, the newly created worker
      may grab the lock earlier than the manager and start processing
      work items. In this case, the recheck of need_to_create_worker() may
      be true, as expected, and the manager restarts, which is the
      correct behavior.
      
      tj: Minor updates to description and comments.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      051e1850
    • L
      workqueue: remove @wakeup from worker_set_flags() · 228f1d00
      Lai Jiangshan committed
      worker_set_flags() has only two callers, each specifying %true and
      %false for @wakeup.  Let's push the wake up to the caller and remove
      @wakeup from worker_set_flags().  The caller can use the following
      instead if wakeup is necessary:
      
      	worker_set_flags();
      	if (need_more_worker(pool))
      		wake_up_worker(pool);
      
      This makes the code simpler.  This patch doesn't introduce behavior
      changes.
      
      tj: Updated description and comments.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      228f1d00
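The caller-side pattern from the commit message above can be tried in a small userspace model. This is a sketch under stated assumptions, not the workqueue code: the function names worker_set_flags(), need_more_worker() and wake_up_worker() come from the message, but the struct layouts, the FLAG_NOT_RUNNING bit, the counters and mark_cpu_intensive() are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_NOT_RUNNING 0x1u   /* hypothetical flag bit */

/* Hypothetical, heavily simplified pool and worker. */
struct pool {
    int nr_running;     /* workers currently counted as running */
    int pending_work;   /* queued work items */
    int wakeups;        /* how many times a wake-up was issued */
};

struct worker {
    unsigned int flags;
    struct pool *pool;
};

/* No @wakeup parameter: the function only updates flags and accounting. */
static void worker_set_flags(struct worker *w, unsigned int flags)
{
    assert(w && w->pool);
    if ((flags & FLAG_NOT_RUNNING) && !(w->flags & FLAG_NOT_RUNNING))
        w->pool->nr_running--;
    w->flags |= flags;
}

/* Assumed condition: work is pending and nobody is running. */
static bool need_more_worker(const struct pool *p)
{
    return p->pending_work > 0 && p->nr_running == 0;
}

static void wake_up_worker(struct pool *p)
{
    p->wakeups++;       /* stands in for waking an idle worker */
}

/* A caller that needs the wake-up performs it explicitly. */
static void mark_cpu_intensive(struct worker *w)
{
    worker_set_flags(w, FLAG_NOT_RUNNING);
    if (need_more_worker(w->pool))
        wake_up_worker(w->pool);
}
```

Callers that never need a wake-up simply call worker_set_flags() alone, which is why dropping the parameter keeps behavior unchanged in this model.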
  10. 22 Jul, 2014 1 commit
    • L
      workqueue: remove an unneeded UNBOUND test before waking up the next worker · a489a03e
      Lai Jiangshan committed
      In process_one_work():
      
      	if ((worker->flags & WORKER_UNBOUND) && need_more_worker(pool))
      		wake_up_worker(pool);
      
      the first test is unneeded.  Even if the first test is removed, it
      doesn't affect the wake-up logic for WORKER_UNBOUND, and it will not
      introduce any useless wake-ups for normal per-cpu workers since
      nr_running is always >= 1.  It will introduce useless/redundant
      wake-ups for CPU_INTENSIVE, but this case is rare and the next patch
      will also remove this redundant wake-up.
      
      tj: Minor updates to the description and comment.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      a489a03e
  11. 21 Jul, 2014 1 commit
  12. 19 Jul, 2014 3 commits