- 05 6月, 2014 21 次提交
-
-
由 Petr Mladek 提交于
There was no check for too long messages. The check for free space always passed when first_seq and next_seq were equal. Enough free space was not guaranteed, though. log_store() might be called to store messages up to 64kB + 64kB + 16B. This is sum of maximal text_len, dict_len values, and the size of the structure printk_log. On the other hand, the minimal size for the main log buffer currently is 4kB and it is enforced only by Kconfig. The good news is that the usage looks safe right now. log_store() is called only from vprintk_emit() and cont_flush(). Here the "text" part is always passed via a static buffer and the length is limited to LOG_LINE_MAX which is 1024. The "dict" part is NULL in most cases. The only exceptions is when vprintk_emit() is called from printk_emit() and dev_vprintk_emit(). But printk_emit() is currently used only in devkmsg_writev() and here "dict" is NULL as well. In dev_vprintk_emit(), "dict" is limited by the static buffer "hdr" of the size 128 bytes. It meas that the current maximal printed text is 1024B + 128B + 16B and it always fit the log buffer. But it is only matter of time when someone calls printk_emit() with unsafe parameters, especially the "dict" one. This patch adds a check for the free space when the buffer is empty. It reuses the already existing log_has_space() function but it has to add an extra parameter. It defines whether the buffer is empty. Note that the same values of "first_idx" and "next_idx" might also mean that the buffer is full. If the buffer is empty, we must respect the current position of the indexes. We cannot reset them to the beginning of the buffer. Otherwise, the functions reading the buffer would get crazy. The question is what to do when the message is too long. This patch uses the easiest solution and just ignores the problematic message. Let's do something better in a followup patch. Signed-off-by: NPetr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Petr Mladek 提交于
The check for free space in the log buffer always passes when "first_seq" and "next_seq" are equal. In theory, it might cause writing outside of the log buffer. Fortunately, the current usage looks safe because the used "text" and "dict" buffers are quite limited. See the second patch for more details. Anyway, it is better to be on the safe side and add a check. An easy solution is done in the 2nd patch and it is improved in the 4th patch. 5th patch fixes the computation of the printed message length. 1st and 3rd patches just do some code refactoring to make the other patches easier. This patch (of 5): There will be needed some fixes in the check for free space. They will be easier if the code is moved outside of the quite long log_store() function. This patch does not change the existing behavior. Signed-off-by: NPetr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
Nobody seems uses it for a long time. Let's drop it. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
sysctl_hung_task_panic has been changed to unsigned int. use kstrtouint instead of obsolete simple_strtoul Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
Also fixes checkpatch warnings on proc_dostring function parameters Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
Replace obsolete function. kstrtoint is used as reboot_cpu is an integer. Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
[akpm@linux-foundation.org: don't overwrite kstrtoull()'s errno] Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Michal Hocko <mhocko@suse.cz> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
This patch also fixes one function declaration over 80 characters. Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
Fix checkpatch warnings about EXPORT_SYMBOL and return() Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
- EXPORT_SYMBOL - typo: unexpectidly->unexpectedly - function prototype over 80 characters Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
Signed-off-by: NFabian Frederick <fabf@skynet.be> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
no level printk converted to pr_warn (if err) no level printk converted to pr_info (disabling non-boot cpus) Other printk converted to respective level. Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabian Frederick 提交于
sys_sgetmask and sys_ssetmask are obsolete system calls no longer supported in libc. This patch replaces architecture related __ARCH_WANT_SYS_SGETMAX by expert mode configuration.That option is enabled by default for those architectures. Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Steven Miao <realmz6@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
If cpusets are not in use then we still check a global variable on every page allocation. Use jump labels to avoid the overhead. Signed-off-by: NMel Gorman <mgorman@suse.de> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
for_each_process_thread() is sub-optimal. All threads share the same ->mm, we can swicth to the next process once we found a thread with ->mm != NULL and ->mm != mm. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NMichal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Peter Chiang <pchiang@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
"Search through everything else" in mm_update_next_owner() can hit a kthread which adopted this "mm" via use_mm(), it should not be used as mm->owner. Add the PF_KTHREAD check. While at it, change this code to use for_each_process_thread() instead of deprecated do_each_thread/while_each_thread. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NMichal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Peter Chiang <pchiang@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
CONFIG_MM_OWNER makes no sense. It is not user-selectable, it is only selected by CONFIG_MEMCG automatically. So we can kill this option in init/Kconfig and do s/CONFIG_MM_OWNER/CONFIG_MEMCG/ globally. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vladimir Davydov 提交于
Currently to allocate a page that should be charged to kmemcg (e.g. threadinfo), we pass __GFP_KMEMCG flag to the page allocator. The page allocated is then to be freed by free_memcg_kmem_pages. Apart from looking asymmetrical, this also requires intrusion to the general allocation path. So let's introduce separate functions that will alloc/free pages charged to kmemcg. The new functions are called alloc_kmem_pages and free_kmem_pages. They should be used when the caller actually would like to use kmalloc, but has to fall back to the page allocator for the allocation is large. They only differ from alloc_pages and free_pages in that besides allocating or freeing pages they also charge them to the kmem resource counter of the current memory cgroup. [sfr@canb.auug.org.au: export kmalloc_order() to modules] Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Acked-by: NGreg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: NMichal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tetsuo Handa 提交于
Commit 786235ee ("kthread: make kthread_create() killable") meant for allowing kthread_create() to abort as soon as killed by the OOM-killer. But returning -ENOMEM is wrong if killed by SIGKILL from userspace. Change kthread_create() to return -EINTR upon SIGKILL. Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [3.13+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 6月, 2014 1 次提交
-
-
由 Niv Yehezkel 提交于
Fix a trivial comment typo (s/mam/map) in kernel/power/swap.c. Signed-off-by: NNiv Yehezkel <executerx@gmail.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 29 5月, 2014 1 次提交
-
-
由 Stephen Chivers 提交于
Commit 5f5c9ae5 "serial_core: Unregister console in uart_remove_one_port()" fixed a crash where a serial port was removed but not deregistered as a console. There is a side effect of that commit for platforms having serial consoles and of_serial configured (CONFIG_SERIAL_OF_PLATFORM). The serial console is disabled midway through the boot process. This cessation of the serial console affects PowerPC computers such as the MVME5100 and SAM440EP. The sequence is: bootconsole [udbg0] enabled .... serial8250/16550 driver initialises and registers its UARTS, one of these is the serial console. console [ttyS0] enabled .... of_serial probes "platform" devices, registering them as it goes. One of these is the serial console. console [ttyS0] disabled. The disabling of the serial console is due to: a. unregister_console in printk not clearing the CONS_ENABLED bit in the console flags, even though it has announced that the console is disabled; and b. of_platform_serial_probe in of_serial not setting the port type before it registers with serial8250_register_8250_port. This patch ensures that the serial console is re-enabled when of_serial registers a serial port that corresponds to the designated console. Signed-off-by: NStephen Chivers <schivers@csc.com> Tested-by: NStephen Chivers <schivers@csc.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [unregister_console] Cc: stable <stable@vger.kernel.org> # 3.15 === The above failure was identified in Linux-3.15-rc2. Tested using MVME5100 and SAM440EP PowerPC computers with kernels built from Linux-3.15-rc5 and tty-next. The continued operation of the serial console is vital for computers such as the MVME5100 as that Single Board Computer does not have any grapical/display hardware. Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 28 5月, 2014 3 次提交
-
-
由 Thomas Gleixner 提交于
The current deadlock detection logic does not work reliably due to the following early exit path: /* * Drop out, when the task has no waiters. Note, * top_waiter can be NULL, when we are in the deboosting * mode! */ if (top_waiter && (!task_has_pi_waiters(task) || top_waiter != task_top_pi_waiter(task))) goto out_unlock_pi; So this not only exits when the task has no waiters, it also exits unconditionally when the current waiter is not the top priority waiter of the task. So in a nested locking scenario, it might abort the lock chain walk and therefor miss a potential deadlock. Simple fix: Continue the chain walk, when deadlock detection is enabled. We also avoid the whole enqueue, if we detect the deadlock right away (A-A). It's an optimization, but also prevents that another waiter who comes in after the detection and before the task has undone the damage observes the situation and detects the deadlock and returns -EDEADLOCK, which is wrong as the other task is not in a deadlock situation. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Srivatsa S. Bhat 提交于
If we try to perform a kexec when the machine is in ST (Single-Threaded) mode (ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we get the following messages during boot: [ 0.089866] POWER8 performance monitor hardware support registered [ 0.089985] power8-pmu: PMAO restore workaround active. [ 5.095419] Processor 1 is stuck. [ 10.097933] Processor 2 is stuck. [ 15.100480] Processor 3 is stuck. [ 20.102982] Processor 4 is stuck. [ 25.105489] Processor 5 is stuck. [ 30.108005] Processor 6 is stuck. [ 35.110518] Processor 7 is stuck. [ 40.113369] Processor 9 is stuck. [ 45.115879] Processor 10 is stuck. [ 50.118389] Processor 11 is stuck. [ 55.120904] Processor 12 is stuck. [ 60.123425] Processor 13 is stuck. [ 65.125970] Processor 14 is stuck. [ 70.128495] Processor 15 is stuck. [ 75.131316] Processor 17 is stuck. Note that only the sibling threads are stuck, while the primary threads (0, 8, 16 etc) boot just fine. Looking closer at the previous step of kexec, we observe that kexec tries to wakeup (bring online) the sibling threads of all the cores, before performing kexec: [ 9464.131231] Starting new kernel [ 9464.148507] kexec: Waking offline cpu 1. [ 9464.148552] kexec: Waking offline cpu 2. [ 9464.148600] kexec: Waking offline cpu 3. [ 9464.148636] kexec: Waking offline cpu 4. [ 9464.148671] kexec: Waking offline cpu 5. [ 9464.148708] kexec: Waking offline cpu 6. [ 9464.148743] kexec: Waking offline cpu 7. [ 9464.148779] kexec: Waking offline cpu 9. [ 9464.148815] kexec: Waking offline cpu 10. [ 9464.148851] kexec: Waking offline cpu 11. [ 9464.148887] kexec: Waking offline cpu 12. [ 9464.148922] kexec: Waking offline cpu 13. [ 9464.148958] kexec: Waking offline cpu 14. [ 9464.148994] kexec: Waking offline cpu 15. [ 9464.149030] kexec: Waking offline cpu 17. Instrumenting this piece of code revealed that the cpu_up() operation actually fails with -EBUSY. Thus, only the primary threads of all the cores are online during kexec, and hence this is a sure-shot receipe for disaster, as explained in commit e8e5c215 (powerpc/kexec: Fix orphaned offline CPUs across kexec), as well as in the comment above wake_offline_cpus(). It turns out that cpu_up() was returning -EBUSY because the variable 'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done by migrate_to_reboot_cpu() inside kernel_kexec(). Now, migrate_to_reboot_cpu() was originally written with the assumption that any further code will not need to perform CPU hotplug, since we are anyway in the reboot path. However, kexec is clearly not such a case, since we depend on onlining CPUs, atleast on powerpc. So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the kexec path, to fix this regression in kexec on powerpc. Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we can catch such issues more easily in the future. Fixes: c97102ba (kexec: migrate to reboot cpu) Cc: stable@vger.kernel.org Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Jianyu Zhan 提交于
There is still one residue of sysfs remaining: the sb_magic SYSFS_MAGIC. However this should be kernfs user specific, so this patch moves it out. Kerrnfs user should specify their magic number while mouting. Signed-off-by: NJianyu Zhan <nasa4836@gmail.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 5月, 2014 3 次提交
-
-
由 Rafael J. Wysocki 提交于
On some systems the platform doesn't support neither PM_SUSPEND_MEM nor PM_SUSPEND_STANDBY, so PM_SUSPEND_FREEZE is the only available system sleep state. However, some user space frameworks only use the "mem" and (sometimes) "standby" sleep state labels, so the users of those systems need to modify user space in order to be able to use system suspend at all and that is not always possible. For this reason, add a new kernel command line argument, relative_sleep_states, allowing the users of those systems to change the way in which the kernel assigns labels to system sleep states. Namely, for relative_sleep_states=1, the "mem", "standby" and "freeze" labels will enumerate the available system sleem states from the deepest to the shallowest, respectively, so that "mem" is always present in /sys/power/state and the other state strings may or may not be presend depending on what is supported by the platform. Update system sleep states documentation to reflect this change. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
Use the observation that, for platform-dependent sleep states (PM_SUSPEND_STANDBY, PM_SUSPEND_MEM), a given state is either always supported or always unsupported and store that information in pm_states[] instead of calling valid_state() every time we need to check it. Also do not use valid_state() for PM_SUSPEND_FREEZE, which is always valid, and move the pm_test_level validity check for PM_SUSPEND_FREEZE directly into enter_state(). Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rafael J. Wysocki 提交于
To allow sleep states corresponding to the "mem", "standby" and "freeze" lables to be different from the pm_states[] indexes of those strings, introduce struct pm_sleep_state, consisting of a string label and a state number, and turn pm_states[] into an array of objects of that type. This modification should not lead to any functional changes. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 24 5月, 2014 1 次提交
-
-
由 Bjorn Helgaas 提交于
The resource map sanity check message is a bit confusing. Change it to be more readable: -resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01 +resource sanity check: requesting [mem 0xfed10000-0xfed15fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 22 5月, 2014 10 次提交
-
-
由 Rik van Riel 提交于
Affine wakeups have the potential to interfere with NUMA placement. If a task wakes up too many other tasks, affine wakeups will get disabled. However, regardless of how many other tasks it wakes up, it gets re-enabled once a second, potentially interfering with NUMA placement of other tasks. By decaying wakee_wakes in half instead of zeroing it, we can avoid that problem for some workloads. Signed-off-by: NRik van Riel <riel@redhat.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: chegu_vinod@hp.com Cc: umgwanakikbuti@gmail.com Link: http://lkml.kernel.org/r/20140516001332.67f91af2@annuminas.surriel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Rik van Riel 提交于
Update the migrate_improves/degrades_locality() functions with knowledge of pseudo-interleaving. Do not consider moving tasks around within the set of group's active nodes as improving or degrading locality. Instead, leave the load balancer free to balance the load between a numa_group's active nodes. Also, switch from the group/task_weight functions to the group/task_fault functions. The "weight" functions involve a division, but both calls use the same divisor, so there's no point in doing that from these functions. On a 4 node (x10 core) system, performance of SPECjbb2005 seems unaffected, though the number of migrations with 2 8-warehouse wide instances seems to have almost halved, due to the scheduler running each instance on a single node. Signed-off-by: NRik van Riel <riel@redhat.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: mgorman@suse.de Cc: chegu_vinod@hp.com Link: http://lkml.kernel.org/r/20140515130306.61aae7db@cuia.bos.redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Rik van Riel 提交于
Currently the NUMA balancing code only allows moving tasks between NUMA nodes when the load on both nodes is in balance. This breaks down when the load was imbalanced to begin with. Allow tasks to be moved between NUMA nodes if the imbalance is small, or if the new imbalance is be smaller than the original one. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NRik van Riel <riel@redhat.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: mgorman@suse.de Cc: chegu_vinod@hp.com Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/20140514132221.274b3463@annuminas.surriel.com
-
由 xiaofeng.yan 提交于
sched/rt: Fix 'struct sched_dl_entity' and dl_task_time() comments, to match the current upstream code Signed-off-by: Nxiaofeng.yan <xiaofeng.yan@huawei.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1399605687-18094-1-git-send-email-xiaofeng.yan@huawei.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dongsheng Yang 提交于
sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice() Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/a568a1e3cc8e78648f41b5035fa5e381d36274da.1399532322.git.yangds.fnst@cn.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Corey Minyard 提交于
If the sched_clock time starts at a large value, the kernel will spin in sched_avg_update for a long time while rq->age_stamp catches up with rq->clock. The comment in kernel/sched/clock.c says that there is no strict promise that it starts at zero. So initialize rq->age_stamp when a cpu starts up to avoid this. I was seeing long delays on a simulator that didn't start the clock at zero. This might also be an issue on reboots on processors that don't re-initialize the timer to zero on reset, and when using kexec. Signed-off-by: NCorey Minyard <cminyard@mvista.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1399574859-11714-1-git-send-email-minyard@acm.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kirill Tkhai 提交于
Sometimes ->nr_running may cross 2 but interrupt is not being sent to rq's cpu. In this case we don't reenable the timer. Looks like this may be the reason for rare unexpected effects, if nohz is enabled. Patch replaces all places of direct changing of nr_running and makes add_nr_running() caring about crossing border. Signed-off-by: NKirill Tkhai <tkhai@yandex.ru> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140508225830.2469.97461.stgit@localhostSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jason Low 提交于
Currently, in idle_balance(), we update rq->next_balance when we pull_tasks. However, it is also important to update this in the !pulled_tasks case too. When the CPU is "busy" (the CPU isn't idle), rq->next_balance gets computed using sd->busy_factor (so we increase the balance interval when the CPU is busy). However, when the CPU goes idle, rq->next_balance could still be set to a large value that was computed with the sd->busy_factor. Thus, we need to also update rq->next_balance in idle_balance() in the cases where !pulled_tasks too, so that rq->next_balance gets updated without taking the busy_factor into account when the CPU is about to go idle. This patch makes rq->next_balance get updated independently of whether or not we pulled_task. Also, we add logic to ensure that we always traverse at least 1 of the sched domains to get a proper next_balance value for updating rq->next_balance. Additionally, since load_balance() modifies the sd->balance_interval, we need to re-obtain the sched domain's interval after the call to load_balance() in rebalance_domains() before we update rq->next_balance. This patch adds and uses 2 new helper functions, update_next_balance() and get_sd_balance_interval() to update next_balance and obtain the sched domain's balance_interval. Signed-off-by: NJason Low <jason.low2@hp.com> Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: daniel.lezcano@linaro.org Cc: alex.shi@linaro.org Cc: efault@gmx.de Cc: vincent.guittot@linaro.org Cc: morten.rasmussen@arm.com Cc: aswin@hp.com Link: http://lkml.kernel.org/r/1399596562.2200.7.camel@j-VirtualBoxSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dongsheng Yang 提交于
Suggested-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1399541715-19568-1-git-send-email-yangds.fnst@cn.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dietmar Eggemann 提交于
There is no need to zero struct sched_group member cpumask and struct sched_group_power member power since both structures are already allocated as zeroed memory in __sdt_alloc(). This patch has been tested with BUG_ON(!cpumask_empty(sched_group_cpus(sg))); and BUG_ON(sg->sgp->power); in build_sched_groups() on ARM TC2 and INTEL i5 M520 platform including CPU hotplug scenarios. Signed-off-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1398865178-12577-1-git-send-email-dietmar.eggemann@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-