- 08 3月, 2011 1 次提交
-
-
由 Al Viro 提交于
a) struct inode is not going to be freed under ->d_compare(); however, the thing PROC_I(inode)->sysctl points to just might. Fortunately, it's enough to make freeing that sucker delayed, provided that we don't step on its ->unregistering, clear the pointer to it in PROC_I(inode) before dropping the reference and check if it's NULL in ->d_compare(). b) I'm not sure that we *can* walk into NULL inode here (we recheck dentry->seq between verifying that it's still hashed / fetching dentry->d_inode and passing it to ->d_compare() and there's no negative hashed dentries in /proc/sys/*), but if we can walk into that, we really should not have ->d_compare() return 0 on it! Said that, I really suspect that this check can be simply killed. Nick? Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 03 3月, 2011 1 次提交
-
-
由 Tao Ma 提交于
If we enable trace events to trace block actions, We use blk_fill_rwbs_rq to analyze the corresponding actions in request's cmd_flags, but we only choose the minor 2 bits from it, so most of other flags(e.g, REQ_SYNC) are missing. For example, with a sync write we get: write_test-2409 [001] 160.013869: block_rq_insert: 3,64 W 0 () 258135 + = 8 [write_test] Since now we have integrated the flags of both bio and request, it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and blk_fill_rwbs_rq isn't needed any more. With this patch, after a sync write we get: write_test-2417 [000] 226.603878: block_rq_insert: 3,64 WS 0 () 258135 += 8 [write_test] Signed-off-by: NTao Ma <boyu.mt@taobao.com> Acked-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 26 2月, 2011 1 次提交
-
-
由 Thomas Gleixner 提交于
When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only can switch into oneshot mode, when the backup broadcast device supports oneshot mode as well. Otherwise we would try to switch the broadcast device into an unsupported mode unconditionally. This went unnoticed so far as the current available broadcast devices support oneshot mode. Seth unearthed this problem while debugging and working around an hpet related BIOS wreckage. Add the necessary check to tick_is_oneshot_available(). Reported-and-tested-by: NSeth Forshee <seth.forshee@canonical.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> LKML-Reference: <alpine.LFD.2.00.1102252231200.2701@localhost6.localdomain6> Cc: stable@kernel.org # .21 ->
-
- 19 2月, 2011 2 次提交
-
-
由 Thomas Gleixner 提交于
With CONFIG_SHIRQ_DEBUG=y we call a newly installed interrupt handler in request_threaded_irq(). The original implementation (commit a304e1b8) called the handler _BEFORE_ it was installed, but that caused problems with handlers calling disable_irq_nosync(). See commit 377bf1e4. It's braindead in the first place to call disable_irq_nosync in shared handlers, but .... Moving this call after we installed the handler looks innocent, but it is very subtle broken on SMP. Interrupt handlers rely on the fact, that the irq core prevents reentrancy. Now this debug call violates that promise because we run the handler w/o the IRQ_INPROGRESS protection - which we cannot apply here because that would result in a possibly forever masked interrupt line. A concurrent real hardware interrupt on a different CPU results in handler reentrancy and can lead to complete wreckage, which was unfortunately observed in reality and took a fricking long time to debug. Leave the code here for now. We want this debug feature, but that's not easy to fix. We really should get rid of those disable_irq_nosync() abusers and remove that function completely. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: stable@kernel.org # .28 -> .37
-
由 Thomas Gleixner 提交于
Lars-Peter Clausen pointed out: I stumbled upon this while looking through the existing archs using SPARSE_IRQ. Even with SPARSE_IRQ the NR_IRQS is still the upper limit for the number of IRQs. Both PXA and MMP set NR_IRQS to IRQ_BOARD_START, with IRQ_BOARD_START being the number of IRQs used by the core. In various machine files the nr_irqs field of the ARM machine defintion struct is then set to "IRQ_BOARD_START + NR_BOARD_IRQS". As a result "nr_irqs" will greater then NR_IRQS which then again causes the "allocated_irqs" bitmap in the core irq code to be accessed beyond its size overwriting unrelated data. The core code really misses a sanity check there. This went unnoticed so far as by chance the compiler/linker places data behind that bitmap which gets initialized later on those affected platforms. So the obvious fix would be to add a sanity check in early_irq_init() and break all affected platforms. Though that check wants to be backported to stable as well, which will require to fix all known problematic platforms and probably some more yet not known ones as well. Lots of churn. A way simpler solution is to allocate a slightly larger bitmap and avoid the whole churn w/o breaking anything. Add a few warnings when an arch returns utter crap. Reported-by: NLars-Peter Clausen <lars@metafoo.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org # .37 Cc: Haojian Zhuang <haojian.zhuang@marvell.com> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 17 2月, 2011 3 次提交
-
-
由 Stanislaw Gruszka 提交于
Currently we return 0 in swsusp_alloc() when alloc_image_page() fails. Fix that. Also remove unneeded "error" variable since the only useful value of error is -ENOMEM. [rjw: Fixed up the changelog and changed subject.] Signed-off-by: NStanislaw Gruszka <stf_xl@wp.pl> Cc: stable@kernel.org Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
-
由 Tejun Heo 提交于
MAYDAY_INITIAL_TIMEOUT is defined as HZ / 100 and depending on configuration may end up 0 or 1. Even when it's 1, depending on when the mayday timer is added in the current jiffy interval, it may expire way before a jiffy has passed. Make sure MAYDAY_INITIAL_TIMEOUT is at least two to guarantee that at least a full jiffy has passed before calling rescuers. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NRay Jui <rjui@broadcom.com> Cc: stable@kernel.org
-
由 Tejun Heo 提交于
There are two spellings in use for 'freeze' + 'able' - 'freezable' and 'freezeable'. The former is the more prominent one. The latter is mostly used by workqueue and in a few other odd places. Unify the spelling to 'freezable'. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NAlan Stern <stern@rowland.harvard.edu> Acked-by: N"Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: NGreg Kroah-Hartman <gregkh@suse.de> Acked-by: NDmitry Torokhov <dtor@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Dubov <oakad@yahoo.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Steven Whitehouse <swhiteho@redhat.com>
-
- 16 2月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
It was possible to call pmu::start() on an already running event. In particular this lead so some wreckage as the hrtimer events would re-initialize active timers. This was due to throttled events being activated again by scheduling. Scheduling in a context would add and force start events, resulting in running events with a possible throttle status. The next tick to hit that task will then try to unthrottle the event and call ->start() on an already running event. Reported-by: NJeff Moyer <jmoyer@redhat.com> Cc: <stable@kernel.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 14 2月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
After executing the matching works, a rescuer leaves the gcwq whether there are more pending works or not. This may decrease the concurrency level to zero and stall execution until a new work item is queued on the gcwq. Make rescuer wake up a regular worker when it leaves a gcwq if there are more works to execute, so that execution isn't stalled. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NRay Jui <rjui@broadcom.com> Cc: stable@kernel.org
-
- 12 2月, 2011 2 次提交
-
-
由 Kees Cook 提交于
In the continuing effort to avoid kernel addresses leaking to unprivileged users, this patch switches to %pK for /proc/timer_list reporting. Signed-off-by: NKees Cook <kees.cook@canonical.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Dan Rosenberg <drosenberg@vsecurity.com> Cc: Eugene Teo <eugeneteo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110212032125.GA23571@outflux.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
The wake_up_process() call in ptrace_detach() is spurious and not interlocked with the tracee state. IOW, the tracee could be running or sleeping in any place in the kernel by the time wake_up_process() is called. This can lead to the tracee waking up unexpectedly which can be dangerous. The wake_up is spurious and should be removed but for now reduce its toxicity by only waking up if the tracee is in TRACED or STOPPED state. This bug can possibly be used as an attack vector. I don't think it will take too much effort to come up with an attack which triggers oops somewhere. Most sleeps are wrapped in condition test loops and should be safe but we have quite a number of places where sleep and wakeup conditions are expected to be interlocked. Although the window of opportunity is tiny, ptrace can be used by non-privileged users and with some loading the window can definitely be extended and exploited. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NRoland McGrath <roland@redhat.com> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 2月, 2011 2 次提交
-
-
由 Chris Wright 提交于
Expand security_capable() to include cred, so that it can be usable in a wider range of call sites. Signed-off-by: NChris Wright <chrisw@sous-sol.org> Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Linus Torvalds 提交于
In commit ce6ada35 ("security: Define CAP_SYSLOG") Serge Hallyn introduced CAP_SYSLOG, but broke backwards compatibility by no longer accepting CAP_SYS_ADMIN as an override (it would cause a warning and then reject the operation). Re-instate CAP_SYS_ADMIN - but keeping the warning - as an acceptable capability until any legacy applications have been updated. There are apparently applications out there that drop all capabilities except for CAP_SYS_ADMIN in order to access the syslog. (This is a re-implementation of a patch by Serge, cleaning the logic up and making the code more readable) Acked-by: NSerge Hallyn <serge@hallyn.com> Reviewed-by: NJames Morris <jmorris@namei.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 2月, 2011 1 次提交
-
-
由 Don Zickus 提交于
During boot if the hardlockup detector fails to initialize, it complains very loudly. Some failures should be expected under certain situations, ie no lapics, or resource in-use. Tone those error messages down a bit. Keep the rest at a high level. Reported-by: NPaul Bolle <pebolle@tiscali.nl> Tested-by: NPaul Bolle <pebolle@tiscali.nl> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1297278153-21111-1-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 08 2月, 2011 3 次提交
-
-
由 Peter Zijlstra 提交于
Both attempts at trying to allow softirq usage for del_timer_sync() failed (produced bogus warnings), so revert the commit for this release: f266a511: lockdep, timer: Fix del_timer_sync() annotation and try again later. Reported-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Yong Zhang <yong.zhang0@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1297174680.13327.107.camel@laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Tetsuo Handa 提交于
In prepare_kernel_cred() since 2.6.29, put_cred(new) is called without assigning new->usage when security_prepare_creds() returned an error. As a result, memory for new and refcount for new->{user,group_info,tgcred} are leaked because put_cred(new) won't call __put_cred() unless old->usage == 1. Fix these leaks by assigning new->usage (and new->subscribers which was added in 2.6.32) before calling security_prepare_creds(). Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tetsuo Handa 提交于
In cred_alloc_blank() since 2.6.32, abort_creds(new) is called with new->security == NULL and new->magic == 0 when security_cred_alloc_blank() returns an error. As a result, BUG() will be triggered if SELinux is enabled or CONFIG_DEBUG_CREDENTIALS=y. If CONFIG_DEBUG_CREDENTIALS=y, BUG() is called from __invalid_creds() because cred->magic == 0. Failing that, BUG() is called from selinux_cred_free() because selinux_cred_free() is not expecting cred->security == NULL. This does not affect smack_cred_free(), tomoyo_cred_free() or apparmor_cred_free(). Fix these bugs by (1) Set new->magic before calling security_cred_alloc_blank(). (2) Handle null cred->security in creds_are_invalid() and selinux_cred_free(). Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 2月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
Calling local_bh_enable() will want to actually start processing softirqs, which isn't a good idea since this can get called with IRQs disabled. Cure this by using _local_bh_enable() which doesn't start processing softirqs, and use raw_local_irq_save() to avoid any softirqs from happening without letting lockdep think IRQs are in fact disabled. Reported-by: NNick Bowler <nbowler@elliptictech.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: NYong Zhang <yong.zhang0@gmail.com> LKML-Reference: <20110203141548.039540914@chello.nl> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 03 2月, 2011 6 次提交
-
-
由 Steven Rostedt 提交于
Currently the syscall_meta structures for the syscall tracepoints are placed in the __syscall_metadata section, and at link time, the linker makes one large array of all these syscall metadata structures. On boot up, this array is read (much like the initcall sections) and the syscall data is processed. The problem is that there is no guarantee that gcc will place complex structures nicely together in an array format. Two structures in the same file may be placed awkwardly, because gcc has no clue that they are suppose to be in an array. A hack was used previous to force the alignment to 4, to pack the structures together. But this caused alignment issues with other architectures (sparc). Instead of packing the structures into an array, the structures' addresses are now put into the __syscall_metadata section. As pointers are always the natural alignment, gcc should always pack them tightly together (otherwise initcall, extable, etc would also fail). By having the pointers to the structures in the section, we can still iterate the trace_events without causing unnecessary alignment problems with other architectures, or depending on the current behaviour of gcc that will likely change in the future just to tick us kernel developers off a little more. The __syscall_metadata section is also moved into the .init.data section as it is now only needed at boot up. Suggested-by: NDavid Miller <davem@davemloft.net> Acked-by: NDavid S. Miller <davem@davemloft.net> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Mathieu Desnoyers 提交于
Make the tracepoints more robust, making them solid enough to handle compiler changes by not relying on anything based on compiler-specific behavior with respect to structure alignment. Implement an approach proposed by David Miller: use an array of const pointers to refer to the individual structures, and export this pointer array through the linker script rather than the structures per se. It will consume 32 extra bytes per tracepoint (24 for structure padding and 8 for the pointers), but are less likely to break due to compiler changes. History: commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() added the aligned(32) type and variable attribute to the tracepoint structures to deal with gcc happily aligning statically defined structures on 32-byte multiples. One attempt was to use a 8-byte alignment for tracepoint structures by applying both the variable and type attribute to tracepoint structures definitions and declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5. The reason is that the "aligned" attribute only specify the _minimum_ alignment for a structure, leaving both the compiler and the linker free to align on larger multiples. Because tracepoint.c expects the structures to be placed as an array within each section, up-alignment cause NULL-pointer exceptions due to the extra unexpected padding. (this patch applies on top of -tip) Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: NDavid S. Miller <davem@davemloft.net> LKML-Reference: <20110126222622.GA10794@Krystal> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> CC: Andrew Morton <akpm@linux-foundation.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Peter Zijlstra 提交于
cpu_stopper_thread() migration_cpu_stop() __migrate_task() deactivate_task() dequeue_task() dequeue_task_rq() update_curr_rt() Will call update_curr_rt() on rq->curr, which at that time is rq->stop. The problem is that rq->stop.prio matches an RT prio and thus falsely assumes its a rt_sched_class task. Reported-Debuged-Tested-Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Cc: stable@kernel.org # .37 Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
It is quite possible for the event to have been disabled between perf_event_read() sending the IPI and the CPU servicing the IPI and calling __perf_event_read(), hence revalidate the state. Reported-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Steven Rostedt 提交于
Currently the trace_event structures are placed in the _ftrace_events section, and at link time, the linker makes one large array of all the trace_event structures. On boot up, this array is read (much like the initcall sections) and the events are processed. The problem is that there is no guarantee that gcc will place complex structures nicely together in an array format. Two structures in the same file may be placed awkwardly, because gcc has no clue that they are suppose to be in an array. A hack was used previous to force the alignment to 4, to pack the structures together. But this caused alignment issues with other architectures (sparc). Instead of packing the structures into an array, the structures' addresses are now put into the _ftrace_event section. As pointers are always the natural alignment, gcc should always pack them tightly together (otherwise initcall, extable, etc would also fail). By having the pointers to the structures in the section, we can still iterate the trace_events without causing unnecessary alignment problems with other architectures, or depending on the current behaviour of gcc that will likely change in the future just to tick us kernel developers off a little more. The _ftrace_event section is also moved into the .init.data section as it is now only needed at boot up. Suggested-by: NDavid Miller <davem@davemloft.net> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
move_native_irq() masks and unmasks the interrupt line unconditionally, but the interrupt line might be masked due to a threaded oneshot handler in progress. Unmasking the line in that case can lead to interrupt storms. Observed on PREEMPT_RT. Originally-from: Ingo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
-
- 31 1月, 2011 4 次提交
-
-
由 Marcin Slusarz 提交于
Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com> [ add {}'s to fix a warning ] Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: <stable@kernel.org> LKML-Reference: <1296230433-6261-3-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Marcin Slusarz 提交于
If it was not possible to enable watchdog for any cpu, switch watchdog_enabled back to 0, because it's visible via kernel.watchdog sysctl. Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: <stable@kernel.org> LKML-Reference: <1296230433-6261-2-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Marcin Slusarz 提交于
Passing nowatchdog to kernel disables 2 things: creation of watchdog threads AND initialization of percpu watchdog_hrtimer. As hrtimers are initialized only at boot it's not possible to enable watchdog later - for me all watchdog threads started to eat 100% of CPU time, but they could just crash. Additionally, even if these threads would start properly, watchdog_disable_all_cpus was guarded by no_watchdog check, so you couldn't disable watchdog. To fix this, remove no_watchdog variable and use already existing watchdog_enabled variable. Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com> [ removed another no_watchdog instance ] Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: <stable@kernel.org> LKML-Reference: <1296230433-6261-1-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Kacper Kornet 提交于
Since check_prlimit_permission always fails in the case of SUID/GUID processes, such processes are not able to read or set their own limits. This commit changes this by assuming that process can always read/change its own limits. Signed-off-by: NKacper Kornet <kornet@camk.edu.pl> Acked-by: NJiri Slaby <jslaby@suse.cz> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 1月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Commit 927c7a9e ("perf: Fix race in callchains") introduced a mismatch in the sizing of struct callchain_cpus_entries. nr_cpu_ids must be used instead of num_possible_cpus(), or we might get out of bound memory accesses on some machines. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Stephane Eranian <eranian@google.com> CC: stable@kernel.org LKML-Reference: <1295980851.3588.351.camel@edumazet-laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 1月, 2011 4 次提交
-
-
由 Paul Turner 提交于
The delta in clock_task is a more fair attribution of how much time a tg has been contributing load to the current cpu. While not really important it also means we're more in sync (by magnitude) with respect to periodic updates (since __update_curr deltas are clock_task based). Signed-off-by: NPaul Turner <pjt@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110122044852.007092349@google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Paul Turner 提交于
Since updates are against an entity's queuing cfs_rq it's not possible to enter update_cfs_{shares,load} with a NULL cfs_rq. (Indeed, update_cfs_load would crash prior to the check if we did anyway since we load is examined during the initializers). Also, in the update_cfs_load case there's no point in maintaining averages for rq->cfs_rq since we don't perform shares distribution at that level -- NULL check is replaced accordingly. Thanks to Dan Carpenter for pointing out the deference before NULL check. Signed-off-by: NPaul Turner <pjt@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110122044851.825284940@google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Paul Turner 提交于
While care is taken around the zero-point in effective_load to not exceed the instantaneous rq->weight, it's still possible (e.g. using wake_idx != 0) for (load + effective_load) to underflow. In this case the comparing the unsigned values can result in incorrect balanced decisions. Signed-off-by: NPaul Turner <pjt@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110122044851.734245014@google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Torben Hohn 提交于
The -rt patches change the console_semaphore to console_mutex. As a result, a quite large chunk of the patches changes all acquire/release_console_sem() to acquire/release_console_mutex() This commit makes things use more neutral function names which dont make implications about the underlying lock. The only real change is the return value of console_trylock which is inverted from try_acquire_console_sem() This patch also paves the way to switching console_sem from a semaphore to a mutex. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: make console_trylock return 1 on success, per Geert] Signed-off-by: NTorben Hohn <torbenh@gmx.de> Cc: Thomas Gleixner <tglx@tglx.de> Cc: Greg KH <gregkh@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 1月, 2011 1 次提交
-
-
由 Andy Whitcroft 提交于
Currently sysrq_enabled and __sysrq_enabled are initialised separately and inconsistently, leading to sysrq being actually enabled by reported as not enabled in sysfs. The first change to the sysfs configurable synchronises these two: static int __read_mostly sysrq_enabled = 1; static int __sysrq_enabled; Add a common define to carry the default for these preventing them becoming out of sync again. Default this to 1 to mirror previous behaviour. Signed-off-by: NAndy Whitcroft <apw@canonical.com> Cc: stable@kernel.org Signed-off-by: NDmitry Torokhov <dtor@mail.ru>
-
- 24 1月, 2011 2 次提交
-
-
由 Yong Zhang 提交于
Michael Witten and Christian Kujau reported that the autogroup scheduling feature hurts interactivity on their UP systems. It turns out that this is an older bug in the group scheduling code, and the wider appeal provided by the autogroup feature exposed it more prominently. When on UP with FAIR_GROUP_SCHED enabled, tune shares only affect tg->shares, but is not reflected in tg->se->load. The reason is that update_cfs_shares() does nothing on UP. So introduce update_cfs_shares() for UP && FAIR_GROUP_SCHED. This issue was found when enable autogroup scheduling was enabled, but it is an older bug that also exists on cgroup.cpu on UP. Reported-and-Tested-by: NMichael Witten <mfwitten@gmail.com> Reported-and-Tested-by: NChristian Kujau <christian@nerdbynature.de> Signed-off-by: NYong Zhang <yong.zhang0@gmail.com> Acked-by: NPekka Enberg <penberg@kernel.org> Acked-by: NMike Galbraith <efault@gmx.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110124073352.GA24186@windriver.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dmitry Torokhov 提交于
Currently only drivers that are built as modules have their versions shown in /sys/module/<module_name>/version, but this information might also be useful for built-in drivers as well. This especially important for drivers that do not define any parameters - such drivers, if built-in, are completely invisible from userspace. This patch changes MODULE_VERSION() macro so that in case when we are compiling built-in module, version information is stored in a separate section. Kernel then uses this data to create 'version' sysfs attribute in the same fashion it creates attributes for module parameters. Signed-off-by: NDmitry Torokhov <dtor@vmware.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 22 1月, 2011 1 次提交
-
-
由 Oleg Nesterov 提交于
In theory, almost every user of task->child->perf_event_ctxp[] is wrong. find_get_context() can install the new context at any moment, we need read_barrier_depends(). dbe08d82 "perf: Fix find_get_context() vs perf_event_exit_task() race" added rcu_dereference() into perf_event_exit_task_context() to make the precedent, but this makes __rcu_dereference_check() unhappy. Use rcu_dereference_raw() to shut up the warning. Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Cc: acme@redhat.com Cc: paulus@samba.org Cc: stern@rowland.harvard.edu Cc: a.p.zijlstra@chello.nl Cc: fweisbec@gmail.com Cc: roland@redhat.com Cc: prasad@linux.vnet.ibm.com Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <20110121174547.GA8796@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 1月, 2011 2 次提交
-
-
由 Peter Zijlstra 提交于
Lockdep spotted: loop_1b_instruc/1899 is trying to acquire lock: (event_mutex){+.+.+.}, at: [<ffffffff810e1908>] perf_trace_init+0x3b/0x2f7 but task is already holding lock: (&ctx->mutex){+.+.+.}, at: [<ffffffff810eb45b>] perf_event_init_context+0xc0/0x218 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&ctx->mutex){+.+.+.}: -> #2 (cpu_hotplug.lock){+.+.+.}: -> #1 (module_mutex){+.+...}: -> #0 (event_mutex){+.+.+.}: But because the deadlock would be cpuhotplug (cpu-event) vs fork (task-event) it cannot, in fact, happen. We can annotate this by giving the perf_event_context used for the cpuctx a different lock class from those used by tasks. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
All architectures are finally converted. Remove the cruft. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Mike Frysinger <vapier@gentoo.org> Cc: David Howells <dhowells@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Michal Simek <monstr@monstr.eu> Acked-by: NDavid Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com>
-