- 20 August 2014, 7 commits
-
-
Committed by Kirill Tkhai
Implement task_on_rq_queued() and use it everywhere instead of the bare on_rq check. No functional changes. The only exception is that we do not use the wrapper in check_for_tasks(), because that would require exporting task_on_rq_queued() in global header files. The next patch in the series brings it back, so we avoid moving it back and forth here. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Kirill Tkhai <tkhai@yandex.ru> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1408528052.23412.87.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
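A minimal sketch of the wrapper being introduced (the TASK_ON_RQ_QUEUED constant is from this series; its exact placement in kernel/sched/sched.h is assumed):

    /* task_struct::on_rq states: */
    #define TASK_ON_RQ_QUEUED   1

    static inline int task_on_rq_queued(struct task_struct *p)
    {
        return p->on_rq == TASK_ON_RQ_QUEUED;
    }

Call sites then test task_on_rq_queued(p) rather than reading p->on_rq directly.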
-
Committed by Kirill Tkhai
(sched_entity::on_rq == 1) does not guarantee the task is pickable; changes on a throttled cfs_rq must not lead to a reschedule. Check task_struct::on_rq instead. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1407312361.8424.35.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
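A hedged sketch of the shape of the fix (illustrative only; the real call site in kernel/sched/fair.c differs):

    /* before: se->on_rq may be 1 even on a throttled cfs_rq,
     * so a reschedule could be triggered for an unpickable task */
    if (!se->on_rq)
        return;

    /* after: test the task-level flag instead */
    if (!p->on_rq)
        return;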
-
Committed by Pranith Kumar
Match the declaration of runqueues with the definition. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1407950893-32731-1-git-send-email-bobby.prani@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
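The match in question, sketched (file placement as in the mainline scheduler code):

    /* kernel/sched/core.c: the definition */
    DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

    /* kernel/sched/sched.h: the declaration, now using the same variant */
    DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);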
-
Committed by Oleg Nesterov
Change autogroup_move_group() to use for_each_thread() instead of the buggy while_each_thread(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Frank Mayhar <fmayhar@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sanjay Rao <srao@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140813192003.GA19334@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Oleg Nesterov
Change thread_group_cputime() to use for_each_thread() instead of the buggy while_each_thread(). This also makes the pid_alive() check unnecessary. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Frank Mayhar <fmayhar@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sanjay Rao <srao@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140813192000.GA19327@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
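The conversion pattern used throughout this series, sketched (the accumulation body is illustrative):

    struct task_struct *t;

    /* before: while_each_thread() is racy against exit and can misbehave */
    t = tsk;
    do {
        times->utime += t->utime;
    } while_each_thread(tsk, t);

    /* after: for_each_thread() is exit-safe, so no pid_alive() check needed */
    for_each_thread(tsk, t) {
        times->utime += t->utime;
    }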
-
Committed by Oleg Nesterov
Change kernel/sched/debug.c to use for_each_process_thread(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Frank Mayhar <fmayhar@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sanjay Rao <srao@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140813191956.GA19324@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Oleg Nesterov
Change kernel/sched/core.c to use for_each_process_thread(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Frank Mayhar <fmayhar@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sanjay Rao <srao@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140813191953.GA19315@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 12 August 2014, 6 commits
-
-
Committed by Rik van Riel
Commit c61037e9 fixes the phenomenon of 'phantom' cores, caused by N*frac(smt_power) >= 1, by limiting the capacity to the actual number of cores in the load-balancing code. This patch applies the same correction to the NUMA-balancing code. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mgorman@suse.de Cc: vincent.guittot@linaro.org Cc: Morten.Rasmussen@arm.com Cc: nicolas.pitre@linaro.org Cc: efault@gmx.de Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1407173008-9334-3-git-send-email-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Rik van Riel
Commit a43455a1 ensures that task_numa_migrate() will call task_numa_compare() on the preferred node all the time, even when the preferred node has no free capacity. This could lead to a performance regression if nr_running == capacity on both the source and the destination node. This can be avoided by also checking for nr_running == capacity on the source node, which is one step stricter than checking .has_free_capacity. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mgorman@suse.de Cc: vincent.guittot@linaro.org Cc: Morten.Rasmussen@arm.com Cc: nicolas.pitre@linaro.org Cc: efault@gmx.de Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1407173008-9334-2-git-send-email-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Zhihui Zhang
The child variable in build_overlap_sched_groups() actually refers to the peer or sibling domain of the given CPU. Rename it to sibling, consistent with the naming in build_group_mask(). Signed-off-by: Zhihui Zhang <zzhsuny@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1406942283-18249-1-git-send-email-zzhsuny@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Peter Zijlstra
Allow calculate_imbalance() to 'create' idle CPUs in the busiest group if there are idle CPUs in the local group. Suggested-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140729152705.GX12054@laptop.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Rik van Riel
Currently update_sd_pick_busiest() only identifies the busiest sd that is either overloaded or has a group imbalance. When no sd is imbalanced or overloaded, the load balancer fails to find the busiest domain. This breaks load balancing between domains that are not overloaded, in the !SD_ASYM_PACKING case. This patch makes update_sd_pick_busiest() return true whenever the busiest sd so far is encountered. Groups are ranked in the order overloaded > imbalanced > other, with higher-ranked groups getting priority even when their load is lower. This is necessary because of the possibility of unequal capacities and cpumasks between domains within a sched group. The behaviour for SD_ASYM_PACKING does not seem to match the comment, but I have no hardware to test that, so I have left the behaviour of that code unchanged. Enum for group classification suggested by Peter Zijlstra. Signed-off-by: Rik van Riel <riel@redhat.com> [peterz: replaced sg_lb_stats::group_imb with the new enum group_type in an attempt to avoid endless recalculation] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Michael Neuling <mikey@neuling.org> Cc: ktkhai@parallels.com Cc: tim.c.chen@linux.intel.com Cc: nicolas.pitre@linaro.org Cc: jhladky@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140729152743.GI3935@laptop Signed-off-by: Ingo Molnar <mingo@kernel.org>
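A sketch of the classification and ranking (names as described above; the exact code in kernel/sched/fair.c may differ slightly):

    enum group_type {
        group_other = 0,
        group_imbalanced,
        group_overloaded,
    };

    /* in update_sd_pick_busiest(): a higher-ranked group wins outright,
     * even if its load is lower */
    if (sgs->group_type > busiest->group_type)
        return true;
    if (sgs->group_type < busiest->group_type)
        return false;
    /* same class: fall back to comparing load */
    if (sgs->avg_load <= busiest->avg_load)
        return false;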
-
Committed by Peter Zijlstra
Rik noticed that calculate_imbalance() relies on update_sd_pick_busiest() to guarantee that busiest->sum_nr_running > busiest->group_capacity_factor. Break this implicit assumption (with the intent of no longer providing it) by having calculate_imbalance() verify it rather than rely on others. Reported-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20140729152631.GW12054@laptop.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 01 August 2014, 1 commit
-
-
Committed by Jan Kara
clockevents_increase_min_delta() calls printk() from under hrtimer_bases.lock. That causes lock inversion on scheduler locks, because printk() can call into the scheduler. Lockdep puts it as:

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.15.0-rc8-06195-g939f04be #2 Not tainted
    -------------------------------------------------------
    trinity-main/74 is trying to acquire lock:
     (&port_lock_key){-.....}, at: [<811c60be>] serial8250_console_write+0x8c/0x10c

    but task is already holding lock:
     (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #5 (hrtimer_bases.lock){-.-...}:
       [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<8103c918>] __hrtimer_start_range_ns+0x1c/0x197 [<8107ec20>] perf_swevent_start_hrtimer.part.41+0x7a/0x85 [<81080792>] task_clock_event_start+0x3a/0x3f [<810807a4>] task_clock_event_add+0xd/0x14 [<8108259a>] event_sched_in+0xb6/0x17a [<810826a2>] group_sched_in+0x44/0x122 [<81082885>] ctx_sched_in.isra.67+0x105/0x11f [<810828e6>] perf_event_sched_in.isra.70+0x47/0x4b [<81082bf6>] __perf_install_in_context+0x8b/0xa3 [<8107eb8e>] remote_function+0x12/0x2a [<8105f5af>] smp_call_function_single+0x2d/0x53 [<8107e17d>] task_function_call+0x30/0x36 [<8107fb82>] perf_install_in_context+0x87/0xbb [<810852c9>] SYSC_perf_event_open+0x5c6/0x701 [<810856f9>] SyS_perf_event_open+0x17/0x19 [<8142f8ee>] syscall_call+0x7/0xb

    -> #4 (&ctx->lock){......}:
       [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] _raw_spin_lock+0x21/0x30 [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30

    -> #3 (&rq->lock){-.-.-.}:
       [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] _raw_spin_lock+0x21/0x30 [<81040873>] __task_rq_lock+0x33/0x3a [<8104184c>] wake_up_new_task+0x25/0xc2 [<8102474b>] do_fork+0x15c/0x2a0 [<810248a9>] kernel_thread+0x1a/0x1f [<814232a2>] rest_init+0x1a/0x10e [<817af949>] start_kernel+0x303/0x308 [<817af2ab>] i386_start_kernel+0x79/0x7d

    -> #2 (&p->pi_lock){-.-...}:
       [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<810413dd>] try_to_wake_up+0x1d/0xd6 [<810414cd>] default_wake_function+0xb/0xd [<810461f3>] __wake_up_common+0x39/0x59 [<81046346>] __wake_up+0x29/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb

    -> #1 (&tty->write_wait){-.....}:
       [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<81046332>] __wake_up+0x15/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb

    -> #0 (&port_lock_key){-.....}:
       [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105c548>] clockevents_program_event+0xe7/0xf3 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30

    other info that might help us debug this:

    Chain exists of: &port_lock_key --> &ctx->lock --> hrtimer_bases.lock

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(hrtimer_bases.lock);
                                   lock(&ctx->lock);
                                   lock(hrtimer_bases.lock);
      lock(&port_lock_key);

     *** DEADLOCK ***

    4 locks held by trinity-main/74:
     #0: (&rq->lock){-.-.-.}, at: [<8142c6f3>] __schedule+0xed/0x4cb
     #1: (&ctx->lock){......}, at: [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f
     #2: (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66
     #3: (console_lock){+.+...}, at: [<8104fb5d>] vprintk_emit+0x3c7/0x3e4

    stack backtrace:
    CPU: 0 PID: 74 Comm: trinity-main Not tainted 3.15.0-rc8-06195-g939f04be #2
     00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570 8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0 8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003
    Call Trace:
     [<81426f69>] dump_stack+0x16/0x18 [<81425a99>] print_circular_bug+0x18f/0x19c [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104af87>] ? lock_release+0x191/0x223 [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8104416d>] ? __dequeue_entity+0x23/0x27 [<81044505>] ? pick_next_task_fair+0xb1/0x120 [<8142cacc>] __schedule+0x4c6/0x4cb [<81047574>] ? trace_hardirqs_off_caller+0xd7/0x108 [<810475b0>] ? trace_hardirqs_off+0xb/0xd [<81056346>] ? rcu_irq_exit+0x64/0x77

Fix the problem by using printk_deferred(), which does not call into the scheduler. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
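A sketch of the fix: emit the message via the deferred path, which queues the text and prints it from a safe context instead of taking console locks under hrtimer_bases.lock (the exact message wording is assumed):

    /* kernel/time/clockevents.c, called under hrtimer_bases.lock */
    printk_deferred(KERN_WARNING
                    "CE: %s increased min_delta_ns to %llu nsec\n",
                    dev->name ? dev->name : "?",
                    (unsigned long long) dev->min_delta_ns);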
-
- 31 July 2014, 3 commits
-
-
Committed by David Rientjes
free_huge_page() is undefined without CONFIG_HUGETLBFS, and there's no need to filter PageHuge() pages in such a configuration either, so avoid exporting the symbol to fix a build error:

    In file included from kernel/kexec.c:14:0:
    kernel/kexec.c: In function 'crash_save_vmcoreinfo_init':
    kernel/kexec.c:1623:20: error: 'free_huge_page' undeclared (first use in this function)
      VMCOREINFO_SYMBOL(free_huge_page);
                        ^

Introduced by commit 8f1d26d0 ("kexec: export free_huge_page to VMCOREINFO"). Reported-by: kbuild test robot <fengguang.wu@intel.com> Acked-by: Olof Johansson <olof@lixom.net> Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
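The fix amounts to guarding the export (a sketch; the surrounding code in kernel/kexec.c is elided):

    #ifdef CONFIG_HUGETLBFS
        VMCOREINFO_SYMBOL(free_huge_page);
    #endif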
-
Committed by Josh Triplett
My IBM email addresses haven't worked for years; also map some old-but-functional forwarding addresses to my canonical address. Update my GPG key fingerprint; I moved to 4096R a long time ago. Update description. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Atsushi Kumagai
PG_head_mask was added to VMCOREINFO to filter huge pages in b3acc56b ("kexec: save PG_head_mask in VMCOREINFO"), but makedumpfile still needs another symbol to filter *hugetlbfs* pages. If a user hopes to filter user pages, makedumpfile tries to exclude them by checking whether the page is anonymous, but hugetlbfs pages aren't anonymous even though they are also user pages. We know it's possible to detect them the same way PageHuge() does, so we need the start address of free_huge_page():

    int PageHuge(struct page *page)
    {
        if (!PageCompound(page))
            return 0;
        page = compound_head(page);
        return get_compound_page_dtor(page) == free_huge_page;
    }

For that reason, this patch makes free_huge_page() public in order to export it to VMCOREINFO. Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Acked-by: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 July 2014, 1 commit
-
-
Committed by Li Zefan
The WARN_ON() is used to check that we don't break the legal hierarchy, in which the effective mems should be equal to the configured mems. Reported-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Tested-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Li Zefan <lizefan@huawei.com>
-
- 29 July 2014, 1 commit
-
-
Committed by Lai Jiangshan
If the worker is already executing a work item when another is queued, we can safely skip the wakeup without worrying about stalling the queue, thus avoiding waking up the busy worker spuriously. Spurious wakeups should be fine, but they still aren't nice, and avoiding them is trivial here. tj: Updated description. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 28 July 2014, 5 commits
-
-
Committed by Masanari Iida
This patch fixes the following warning, caused by the missing description of the "overload" parameter in kernel/sched/fair.c:

    Warning(.//kernel/sched/fair.c:5906): No description found for parameter 'overload'

Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1406518686-7274-1-git-send-email-standby24x7@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Steven Rostedt
Instead of passing around the magic number -1 for the sched_setparam() policy, use a more descriptive macro name, SETPARAM_POLICY. [ based on top of Daniel's sched_setparam() fix ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140723112826.6ed6cbce@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
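A sketch of the macro and its use (placement in kernel/sched/core.c assumed):

    #define SETPARAM_POLICY -1

    SYSCALL_DEFINE2(sched_setparam, pid_t, pid,
                    struct sched_param __user *, param)
    {
        return do_sched_setscheduler(pid, SETPARAM_POLICY, param);
    }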
-
Committed by Peter Zijlstra
We hard-assume that higher topology levels are supersets of lower levels. Detect, warn, and try to fix up when we find this violated. Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Josh Boyer <jwboyer@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Bruno Wolff III <bruno@wolff.to> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140722094740.GJ12054@laptop.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Jiri Olsa
There's no need to check a cloned event's permission once the parent was already checked. Also, the code checks the 'current' process's permissions, which is not the owner process for cloned events, and thus could end up with the wrong permission check result. Reported-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Tested-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1405079782-8139-1-git-send-email-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Daniel Bristot de Oliveira
The scheduler uses policy == -1 to preserve the current policy state when implementing sched_setparam(). But since (int) -1 equals 0xffffffff, it matches the if (policy & SCHED_RESET_ON_FORK) test in _sched_setscheduler(). This match changes the policy value to an invalid value, breaking the sched_setparam() syscall. This patch checks for policy == -1 before checking the SCHED_RESET_ON_FORK flag. The following program demonstrates the bug:

    int main(void)
    {
        struct sched_param param = {
            .sched_priority = 5,
        };

        sched_setscheduler(0, SCHED_FIFO, &param);
        param.sched_priority = 1;
        sched_setparam(0, &param);
        param.sched_priority = 0;
        sched_getparam(0, &param);
        if (param.sched_priority != 1)
            printf("failed priority setting (found %d instead of 1)\n",
                   param.sched_priority);
        else
            printf("priority setting fine\n");
    }

Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> # 3.14+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Fixes: 7479f3c9 "sched: Move SCHED_RESET_ON_FORK into attr::sched_flags" Link: http://lkml.kernel.org/r/9ebe0566a08dbbb3999759d3f20d6004bb2dbcfa.1406079891.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
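A sketch of the fix in _sched_setscheduler(): interpret the flag bits only when a real policy was passed, since -1 means "keep the current policy":

    /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
    if ((policy != -1) && (policy & SCHED_RESET_ON_FORK)) {
        attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
        policy &= ~SCHED_RESET_ON_FORK;
        attr.sched_policy = policy;
    }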
-
- 24 July 2014, 5 commits
-
-
Committed by Steven Rostedt (Red Hat)
After adding all the records to the tramp_hash, add a check that makes sure the number of records added matches the number of records expected to match; if they do not match, WARN_ON() and disable ftrace. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt (Red Hat)
In the loop of ftrace_save_ops_tramp_hash(), it adds all the recs to the ops hash if the rec has only one callback attached and the ops is connected to the rec. It gives a nasty warning and shuts down ftrace if the rec doesn't have a trampoline set for it. But this can happen with the following scenario:

    # cd /sys/kernel/debug/tracing
    # echo schedule do_IRQ > set_ftrace_filter
    # mkdir instances/foo
    # echo schedule > instances/foo/set_ftrace_filter
    # echo function_graph > current_tracer
    # echo function > instances/foo/current_tracer
    # echo nop > instances/foo/current_tracer

The above would then trigger the following warning and disable ftrace:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 3145 at kernel/trace/ftrace.c:2212 ftrace_run_update_code+0xe4/0x15b()
    Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ip [...]
    CPU: 1 PID: 3145 Comm: bash Not tainted 3.16.0-rc3-test+ #136
    Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
     0000000000000000 ffffffff81808a88 ffffffff81502130 0000000000000000 ffffffff81040ca1 ffff880077c08000 ffffffff810bd286 0000000000000001 ffffffff81a56830 ffff88007a041be0 ffff88007a872d60 00000000000001be
    Call Trace:
     [<ffffffff81502130>] ? dump_stack+0x4a/0x75 [<ffffffff81040ca1>] ? warn_slowpath_common+0x7e/0x97 [<ffffffff810bd286>] ? ftrace_run_update_code+0xe4/0x15b [<ffffffff810bd286>] ? ftrace_run_update_code+0xe4/0x15b [<ffffffff810bda1a>] ? ftrace_shutdown+0x11c/0x16b [<ffffffff810bda87>] ? unregister_ftrace_function+0x1e/0x38 [<ffffffff810cc7e1>] ? function_trace_reset+0x1a/0x28 [<ffffffff810c924f>] ? tracing_set_tracer+0xc1/0x276 [<ffffffff810c9477>] ? tracing_set_trace_write+0x73/0x91 [<ffffffff81132383>] ? __sb_start_write+0x9a/0xcc [<ffffffff8120478f>] ? security_file_permission+0x1b/0x31 [<ffffffff81130e49>] ? vfs_write+0xac/0x11c [<ffffffff8113115d>] ? SyS_write+0x60/0x8e [<ffffffff81508112>] ? system_call_fastpath+0x16/0x1b
    ---[ end trace 938c4415cbc7dc96 ]---
    ------------[ cut here ]------------

Link: http://lkml.kernel.org/r/20140723120805.GB21376@redhat.com Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Stephen Boyd
During suspend we call sched_clock_poll() to update the epoch and accumulated time, and to reprogram the sched_clock_timer to fire before the next wrap-around time. Unfortunately, sched_clock_poll() doesn't restart the timer; instead it relies on the hrtimer layer to do that, and during suspend we aren't calling that function from the hrtimer layer. Instead, we're reprogramming the expires time while the hrtimer is enqueued, which can corrupt the hrtimer tree. Furthermore, we restart the timer during suspend but update the epoch during resume, which seems counter-intuitive. Let's fix this by saving the accumulated state and canceling the timer during suspend. On resume we can update the epoch and restart the timer, similar to what we would do if we were starting the clock for the first time. Fixes: a08ca5d1 "sched_clock: Use an hrtimer instead of timer" Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1406174630-23458-1-git-send-email-john.stultz@linaro.org Cc: Ingo Molnar <mingo@kernel.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
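A sketch of the resulting suspend/resume pair, following the description above (field and helper names in kernel/time/sched_clock.c are assumptions):

    static int sched_clock_suspend(void)
    {
        update_sched_clock();               /* bank the accumulated time */
        hrtimer_cancel(&sched_clock_timer); /* never reprogram while enqueued */
        cd.suspended = true;
        return 0;
    }

    static void sched_clock_resume(void)
    {
        cd.epoch_cyc = read_sched_clock();  /* fresh epoch, as on first start */
        hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
        cd.suspended = false;
    }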
-
Committed by Steven Rostedt (Red Hat)
There's a helper function to get a ring buffer page's size (the number of bytes of data recorded on the page), called rb_page_size(). Use it instead of open-coding the calculation. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt (Red Hat)
Having two fields within the same struct whose names differ by only one character can be confusing and error-prone. Rename the counter "trampolines" to "nr_trampolines" to make explicit that it is a counter and to keep it from being confused with the "trampoline" field. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 23 July 2014, 6 commits
-
-
Committed by Lai Jiangshan
They are the same, and nr_node_ids is provided by the memory subsystem. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Committed by Lai Jiangshan
After the locking was moved up to the caller of get_unbound_pool(), the out_unlock label no longer performs any unlock operation and its name became misleading, so remove the label; its only use site, "goto out_unlock", is substituted with "return pool". Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Committed by Lai Jiangshan
In 75ccf595 ("workqueue: prepare flush_workqueue() for dynamic creation and destruction of unbound pool_workqueues"), a comment was added about the synchronization for the pwq in pwq_unbound_release_workfn(). The comment claimed the flush_mutex wasn't strictly necessary, which was correct at the time because the pwq was protected by workqueue_lock. It is incorrect now, though: wq->flush_mutex was renamed to wq->mutex and workqueue_lock was removed, so wq->mutex is strictly needed. The comment was simply not updated when the synchronization changed. This patch removes the incorrect comment and doesn't add a new one explaining why wq->mutex is needed here, which is obvious; wq->pwqs_node carries the "WQ" annotation in its definition, which is a better comment. The old commit mentioned above also introduced a comment in link_pwq() about the synchronization. That comment is removed in this patch as well, since the whole of link_pwq() is protected by wq->mutex. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Committed by Lai Jiangshan
In 51697d39 ("workqueue: use generic attach/detach routine for rescuers"), the rescuer detaches itself from the pool before put_pwq() so that put_unbound_pool() will not destroy the rescuer-attached pool. That is unnecessary. worker_detach_from_pool() can be used as the last statement that accesses the pool, just as with the regular workers; put_unbound_pool() will wait for it to detach and then free the pool. So we move worker_detach_from_pool() down, making it coincide with the regular workers. tj: Minor description update. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Committed by Lai Jiangshan
Simply unfold the code of start_worker() into create_worker() and remove the original start_worker() and create_and_start_worker(). The only trade-off is the introduced overhead of releasing and regrabbing pool->lock after the new worker is started. The overhead is acceptable since the manager is a slow path. Because of this new locking behavior, the newly created worker may grab the lock earlier than the manager and go on to process work items. In that case, the recheck of need_to_create_worker() may be true as expected and the manager goes to restart, which is the correct behavior. tj: Minor updates to description and comments. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Committed by Lai Jiangshan
worker_set_flags() has only two callers, each specifying %true and %false for @wakeup. Let's push the wake-up to the callers and remove @wakeup from worker_set_flags(). A caller can use the following instead if a wakeup is necessary:

    worker_set_flags();
    if (need_more_worker(pool))
        wake_up_worker(pool);

This makes the code simpler. This patch doesn't introduce behavior changes. tj: Updated description and comments. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 22 July 2014, 1 commit
-
-
Committed by Lai Jiangshan
In process_one_work():

    if ((worker->flags & WORKER_UNBOUND) && need_more_worker(pool))
        wake_up_worker(pool);

the first test is unneeded. Even with the first test removed, the wake-up logic for WORKER_UNBOUND is unaffected, and no useless wake-ups are introduced for normal per-cpu workers, since nr_running is always >= 1 for them. It does introduce useless/redundant wake-ups for CPU_INTENSIVE, but this case is rare and the next patch will also remove this redundant wake-up. tj: Minor updates to the description and comment. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 21 July 2014, 1 commit
-
-
Committed by Tony Luck
The "uptime" trace clock added in commit 8aacf017 ("tracing: Add "uptime" trace clock that uses jiffies") has wraparound problems once the system has been up more than 1 hour 11 minutes and 34 seconds. It converts jiffies to nanoseconds using:

    (u64)jiffies_to_usecs(jiffy) * 1000ULL

but since jiffies_to_usecs() only returns a 32-bit value, it truncates at 2^32 microseconds. An additional problem on 32-bit systems is that the argument is "unsigned long", so fixing the return value only helps until 2^32 jiffies (49.7 days on a HZ=1000 system). Avoid these problems by using jiffies_64 as our basis and not converting to nanoseconds (we do convert to clock_t because the user-facing API must not depend on internal kernel HZ values). Link: http://lkml.kernel.org/p/99d63c5bfe9b320a3b428d773825a37095bf6a51.1405708254.git.tony.luck@intel.com Cc: stable@vger.kernel.org # 3.10+ Fixes: 8aacf017 "tracing: Add "uptime" trace clock that uses jiffies" Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 19 July 2014, 3 commits
-
-
Committed by Steven Rostedt (Red Hat)
Local functions should be static. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Lai Jiangshan
We don't need to wake up a regular worker when nr_running == 1, so need_more_worker() is sufficient here. need_more_worker() also reads better: the name "keep_working()" implies the rescuer should keep working, when in fact it is leaving. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Committed by Wang Nan
Do not waste time copying the old hash if the hash is going to be reset. Just allocate a new hash and free the old one, as that gives the same result as copying the old one and then resetting it. Link: http://lkml.kernel.org/p/1405384820-48837-1-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> [ SDR: Removed unused ftrace_filter_reset() function ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
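A sketch of the change (helper names as used in kernel/trace/ftrace.c, to the best of my knowledge):

    if (reset)
        /* old contents are irrelevant: start from an empty hash */
        hash = alloc_ftrace_hash(FTRACE_HASH_DEFAULT_BITS);
    else
        hash = alloc_and_copy_ftrace_hash(FTRACE_HASH_DEFAULT_BITS,
                                          orig_hash);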
-