- 28 4月, 2008 5 次提交
-
-
由 Eric Paris 提交于
This patch standardized the string auditing interfaces. No userspace changes will be visible and this is all just cleanup and consistancy work. We have the following string audit interfaces to use: void audit_log_n_hex(struct audit_buffer *ab, const unsigned char *buf, size_t len); void audit_log_n_string(struct audit_buffer *ab, const char *buf, size_t n); void audit_log_string(struct audit_buffer *ab, const char *buf); void audit_log_n_untrustedstring(struct audit_buffer *ab, const char *string, size_t n); void audit_log_untrustedstring(struct audit_buffer *ab, const char *string); This may be the first step to possibly fixing some of the issues that people have with the string output from the kernel audit system. But we still don't have an agreed upon solution to that problem. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Eric Paris 提交于
A deadlock is possible between kauditd and auditd under load if auditd receives a signal. When auditd receives a signal it sends a netlink message to the kernel asking for information about the sender of the signal. In that same context the audit system will attempt to send a netlink message back to the userspace auditd. If kauditd has already filled the socket buffer (see netlink_attachskb()) auditd will now put itself to sleep waiting for room to send the message. Since auditd is responsible for draining that socket we have a deadlock. The fix, since the response from the kernel does not need to be synchronous is to send the signal information back to auditd in a separate thread. And thus auditd can continue to drain the audit queue normally. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Eric Paris 提交于
This patch causes the kernel audit subsystem to store up to audit_backlog_limit messages for use by auditd if it ever appears sometime in the future in userspace. This is useful to collect audit messages during bootup and even when auditd is stopped. This is NOT a reliable mechanism, it does not ever call audit_panic, nor should it. audit_log_lost()/audit_panic() are called during the normal delivery mechanism. The messages are still sent to printk/syslog as usual and if too many messages appear to be queued they will be silently discarded. I liked doing it by default, but this patch only uses the queue in question if it was booted with audit=1 or if the kernel was built enabling audit by default. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Eric Paris 提交于
Previously I added sessionid output to all audit messages where it was available but we still didn't know the sessionid of the sender of netlink messages. This patch adds that information to netlink messages so we can audit who sent netlink messages. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Eric Paris 提交于
A couple of audit printk statements did not have a newline. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 27 4月, 2008 2 次提交
-
-
由 Carsten Otte 提交于
The SIE instruction on s390 uses the 2nd half of the page table page to virtualize the storage keys of a guest. This patch offers the s390_enable_sie function, which reorganizes the page tables of a single-threaded process to reserve space in the page table: s390_enable_sie makes sure that the process is single threaded and then uses dup_mm to create a new mm with reorganized page tables. The old mm is freed and the process has now a page status extended field after every page table. Code that wants to exploit pgstes should SELECT CONFIG_PGSTE. This patch has a small common code hit, namely making dup_mm non-static. Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's review feedback. Now we do have the prototype for dup_mm in include/linux/sched.h. Following Martin's suggestion, s390_enable_sie() does now call task_lock() to prevent race against ptrace modification of mm_users. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Al Viro 提交于
Arrgghhh... Sorry about that, I'd been sure I'd folded that one, but it actually got lost. Please apply - that breaks execve(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 4月, 2008 1 次提交
-
-
由 Al Viro 提交于
Arrgghhh... Sorry about that, I'd been sure I'd folded that one, but it actually got lost. Please apply - that unbreaks execve(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
- 25 4月, 2008 8 次提交
-
-
由 Al Viro 提交于
* let unshare_files() give caller the displaced files_struct * don't bother with grabbing reference only to drop it in the caller if it hadn't been shared in the first place * in that form unshare_files() is trivially implemented via unshare_fd(), so we eliminate the duplicate logics in fork.c * reset_files_struct() is not just only called for current; it will break the system if somebody ever calls it for anything else (we can't modify ->files of somebody else). Lose the task_struct * argument. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
* unshare_files() can fail; doing it after irreversible actions is wrong and de_thread() is certainly irreversible. * since we do it unconditionally anyway, we might as well do it in do_execve() and save ourselves the PITA in binfmt handlers, etc. * while we are at it, binfmt_som actually leaked files_struct on failure. As a side benefit, unshare_files(), put_files_struct() and reset_files_struct() become unexported. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
updating current->files requires task_lock Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Miller 提交于
There is no guarantee that there is physical ram below 4GB, and in fact many boxes don't have exactly that. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
fix __aggregate_redistribute_shares() related lockup reported by David S. Miller. The problem this code tries to solve is 'accurately' calculating the 'fair' share of the group weight for each cpu. The current code falls back to a global group rebalance in case the sched_domain's span it looks at has no shares, but does have tasks. The reason it gets stuck here, is because its inherently racy - if someone steals the last task after we compute the agg->rq_weight, but before we rebalance, we'll never get out of the loop. We could of course go fix that, but while looking at this issue I found that this 'fallback' wasn't nearly as rare as I'd hoped it to be. In fact its quite common - and given it walks the whole machine, thats very bad. The new approach is simple (why didn't I think of it before?), we set the aggregate shares to the full task group weight, and each larger sched domain that encounters an aggregate shares larger than the weight, clips it (it already re-distributes anyway). This nicely converges to the desired global picture where the sum of all shares equals the task group weight. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
David Miller reported: |---------------> the following commit: | commit 27ec4407 | Author: Ingo Molnar <mingo@elte.hu> | Date: Thu Feb 28 21:00:21 2008 +0100 | | sched: make cpu_clock() globally synchronous | | Alexey Zaytsev reported (and bisected) that the introduction of | cpu_clock() in printk made the timestamps jump back and forth. | | Make cpu_clock() more reliable while still keeping it fast when it's | called frequently. | | Signed-off-by: Ingo Molnar <mingo@elte.hu> causes watchdog triggers when a cpu exits NOHZ state when it has been there for >= the soft lockup threshold, for example here are some messages from a 128 cpu Niagara2 box: [ 168.106406] BUG: soft lockup - CPU#11 stuck for 128s! [dd:3239] [ 168.989592] BUG: soft lockup - CPU#21 stuck for 86s! [swapper:0] [ 168.999587] BUG: soft lockup - CPU#29 stuck for 91s! [make:4511] [ 168.999615] BUG: soft lockup - CPU#2 stuck for 85s! [swapper:0] [ 169.020514] BUG: soft lockup - CPU#37 stuck for 91s! [swapper:0] [ 169.020514] BUG: soft lockup - CPU#45 stuck for 91s! [sh:4515] [ 169.020515] BUG: soft lockup - CPU#69 stuck for 92s! [swapper:0] [ 169.020515] BUG: soft lockup - CPU#77 stuck for 92s! [swapper:0] [ 169.020515] BUG: soft lockup - CPU#61 stuck for 92s! [swapper:0] [ 169.112554] BUG: soft lockup - CPU#85 stuck for 92s! [swapper:0] [ 169.112554] BUG: soft lockup - CPU#101 stuck for 92s! [swapper:0] [ 169.112554] BUG: soft lockup - CPU#109 stuck for 92s! [swapper:0] [ 169.112554] BUG: soft lockup - CPU#117 stuck for 92s! [swapper:0] [ 169.171483] BUG: soft lockup - CPU#40 stuck for 80s! [dd:3239] [ 169.331483] BUG: soft lockup - CPU#13 stuck for 86s! [swapper:0] [ 169.351500] BUG: soft lockup - CPU#43 stuck for 101s! [dd:3239] [ 169.531482] BUG: soft lockup - CPU#9 stuck for 129s! [mkdir:4565] [ 169.595754] BUG: soft lockup - CPU#20 stuck for 93s! [swapper:0] [ 169.626787] BUG: soft lockup - CPU#52 stuck for 93s! [swapper:0] [ 169.626787] BUG: soft lockup - CPU#84 stuck for 92s! [swapper:0] [ 169.636812] BUG: soft lockup - CPU#116 stuck for 94s! [swapper:0] It's simple enough to trigger this by doing a 10 minute sleep after a fresh bootup then starting a parallel kernel build. I suspect this might be reintroducing a problem we've had and fixed before, see the thread: http://marc.info/?l=linux-kernel&m=119546414004065&w=2 <---------------| touch the softlockup watchdog when exiting NOHZ state - we are obviously not locked up. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
Regression caused by 434d53b0Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Russ Anderson 提交于
A recent change prevents SGI Altix from booting. This patch fixes the problem. The regresson was introduced in commit 434d53b0Signed-off-by: NRuss Anderson <rja@sgi.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 23 4月, 2008 2 次提交
-
-
由 Al Viro 提交于
The only reason to have separated __...() for those was to keep them inlined for local users in exit.c. Since Alexey removed the inline on those, there's no reason whatsoever to keep them around; just collapse with normal variants. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Randy Dunlap 提交于
Add missing kernel-doc in kernel/sched.c: Warning(linux-2.6.25-git3//kernel/sched.c:7044): No description found for parameter 'span' Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 4月, 2008 3 次提交
-
-
由 YOSHIFUJI Hideaki 提交于
Sorry I have just realized set_normalized_timespec() (used in timespec_sub()) is not exported, and link will fail because of it... Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Roland McGrath 提交于
This adds support for PTRACE_GETSIGINFO and PTRACE_SETSIGINFO in compat_ptrace_request. It relies on existing arch definitions for copy_siginfo_to_user32 and copy_siginfo_from_user32. On powerpc, this fixes a longstanding regression of 32-bit ptrace calls on 64-bit kernels vs native calls (64-bit calls or 32-bit kernels). This can be seen in a 32-bit call using PTRACE_GETSIGINFO to examine e.g. siginfo_t.si_addr from a signal that sets it. (This was broken as of 2.6.24 and, I presume, many or all prior versions.) Signed-off-by: NRoland McGrath <roland@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Machek 提交于
These are small cleanups all over the tree. Trivial style and comment changes to fs/select.c, kernel/signal.c, kernel/stop_machine.c & mm/pdflush.c Signed-off-by: NPavel Machek <pavel@suse.cz> Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com>
-
- 21 4月, 2008 4 次提交
-
-
由 Thomas Gleixner 提交于
The previous optimization did not take the case into account where a clock provides its own softirq_get_time() function. Check for the availablitiy of the clock get time function first and then check if we need to retrieve the time for both clocks via hrtimer_softirq_gettime() to avoid a double evaluation of time in that case as well. Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Dimitri Sivanich 提交于
It seems that hrtimer_run_queues() is calling hrtimer_get_softirq_time() more often than it needs to. This can cause frequent contention on systems with large numbers of processors/cores. With this patch, hrtimer_run_queues only calls hrtimer_get_softirq_time() if there is a pending timer in one of the hrtimer bases, and only once. This also combines hrtimer_run_queues() and the inline run_hrtimer_queue() into one function. [ tglx@linutronix.de: coding style ] Signed-off-by: NDimitri Sivanich <sivanich@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Glauber Costa 提交于
braodcast -> broadcast Signed-off-by: NGlauber Costa <gcosta@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ivan Kokshaysky 提交于
Done per Linus' request and suggestions. Linus has explained that better than I'll be able to explain: On Thu, Mar 27, 2008 at 10:12:10AM -0700, Linus Torvalds wrote: > Actually, before we go any further, there might be a less intrusive > alternative: add just a couple of flags to the resource flags field (we > still have something like 8 unused bits on 32-bit), and use those to > implement a generic "resource_alignment()" routine. > > Two flags would do it: > > - IORESOURCE_SIZEALIGN: size indicates alignment (regular PCI device > resources) > > - IORESOURCE_STARTALIGN: start field is alignment (PCI bus resources > during probing) > > and then the case of both flags zero (or both bits set) would actually be > "invalid", and we would also clear the IORESOURCE_STARTALIGN flag when we > actually allocate the resource (so that we don't use the "start" field as > alignment incorrectly when it no longer indicates alignment). > > That wouldn't be totally generic, but it would have the nice property of > automatically at least add sanity checking for that whole "res->start has > the odd meaning of 'alignment' during probing" and remove the need for a > new field, and it would allow us to have a generic "resource_alignment()" > routine that just gets a resource pointer. Besides, I removed IORESOURCE_BUS_HAS_VGA flag which was unused for ages. Signed-off-by: NIvan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Gary Hade <garyhade@us.ibm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 20 4月, 2008 15 次提交
-
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
provide a text based interface to the scheduler features; this saves the 'user' from setting bits using decimal arithmetic. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
unused at the moment. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Print a tree of weights. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
In order to level the hierarchy, we need to calculate load based on the root view. That is, each task's load is in the same unit. A / \ B 1 / \ 2 3 To compute 1's load we do: weight(1) -------------- rq_weight(A) To compute 2's load we do: weight(2) weight(B) ------------ * ----------- rq_weight(B) rw_weight(A) This yields load fractions in comparable units. The consequence is that it changes virtual time. We used to have: time_{i} vtime_{i} = ------------ weight_{i} vtime = \Sum vtime_{i} = time / rq_weight. But with the new way of load calculation we get that vtime equals time. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
De-couple load-balancing from the rb-trees, so that I can change their organization. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Currently FAIR_GROUP sched grows the scheduler latency outside of sysctl_sched_latency, invert this so it stays within. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Now that the group hierarchy can have an arbitrary depth the O(n^2) nature of RT task dequeues will really hurt. Optimize this by providing space to store the tree path, so we can walk it the other way. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Add some extra debug output so we can get a better overview of the full hierarchy. We print the cgroup path after each cfs_rq, so we can see what group we're looking at. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Implement SMP nice support for the full group hierarchy. On each load-balance action, compile a sched_domain wide view of the full task_group tree. We compute the domain wide view when walking down the hierarchy, and readjust the weights when walking back up. After collecting and readjusting the domain wide view, we try to balance the tasks within the task_groups. The current approach is a naively balance each task group until we've moved the targeted amount of load. Inspired by Srivatsa Vaddsgiri's previous code and Abhishek Chandra's H-SMP paper. XXX: there will be some numerical issues due to the limited nature of SCHED_LOAD_SCALE wrt to representing a task_groups influence on the total weight. When the tree is deep enough, or the task weight small enough, we'll run out of bits. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> CC: Abhishek Chandra <chandra@cs.umn.edu> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Hidetoshi Seto 提交于
[rebased for sched-devel/latest] - Add a new cpuset file, having levels: sched_relax_domain_level - Modify partition_sched_domains() and build_sched_domains() to take attributes parameter passed from cpuset. - Fill newidle_idx for node domains which currently unused but might be required if sched_relax_domain_level become higher. - We can change the default level by boot option 'relax_domain_level='. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
multi level rt constraints Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Add the full parent<->child relation thing into task_groups as well. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-