- 07 8月, 2008 1 次提交
-
-
由 Dmitry Baryshkov 提交于
Spotted by Randy. Acked-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NDmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 06 8月, 2008 5 次提交
-
-
由 Richard Hughes 提交于
A documentation cleanup patch. With a minor tweak to clarify units for kbs. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Nmark gross <mgross@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Beulich 提交于
get_order() takes byte-sized input, not a page-granular one. Irrespective of this fix I'm inclined to believe that this doesn't work right anyway - bitmap_allocate_region() has an implicit assumption of 'pos' being suitable for 'order', which this function doesn't seem to enforce (and since it's being called with a byte-granular value there's no reason to believe that the callers would make sure device_addr is passed accordingly - it's also not documented that way). Signed-off-by: NJan Beulich <jbeulich@novell.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Dmitry Baryshkov <dbaryshkov@gmail.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Brownell 提交于
While I'm glad to finally see the hole fixed whereby passing an invalid IRQ trigger type to request_irq() would be ignored, the current diagnostic isn't quite useful. Fixed by also listing the trigger type which was rejected. Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net> Acked-by: NUwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
Change __down_common() to use signal_pending_state() instead of open coding. The changes in kernel/semaphore.o are just artifacts, the state checks are optimized away. Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tom Zanussi 提交于
In relay's current read implementation, if the buffer is completely full but hasn't triggered the buffer-full condition (i.e. the last write didn't cross the subbuffer boundary) and the last subbuffer is exactly full, the subbuffer accounting code erroneously finds nothing available. This patch fixes the problem. Signed-off-by: NTom Zanussi <tzanussi@gmail.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 8月, 2008 1 次提交
-
-
由 Jeremy Fitzhardinge 提交于
The "user" parameter to __sched_setscheduler indicates whether the change is being done on behalf of a user process or not. If not, we shouldn't apply any permissions checks, so don't call security_task_setscheduler(). Signed-off-by: NJeremy Fitzhardinge <jeremy@goop.org> Tested-by: NSteve Wise <swise@opengridcomputing.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 8月, 2008 1 次提交
-
-
由 zhangxiliang 提交于
Sorry, I miss a blank between if and "(". And I add "unlikely" to check "ctx" in audit_match_perm() and audit_match_filetype(). This is a new patch for it. Signed-off-by: NZhang Xiliang <zhangxiliang@cn.fujitsu.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 02 8月, 2008 6 次提交
-
-
由 Roland McGrath 提交于
My commit 2b2a1ff6 introduced a regression (sorry about that) for the odd case of exit_signal=0 (e.g. clone_flags=0). This is not a normal use, but it's used by a case in the glibc test suite. Dying with exit_signal=0 sends no signal, but it's supposed to wake up a parent's blocked wait*() calls (unlike the delayed_group_leader case). This fixes tracehook_notify_death() and its caller to distinguish a "signal 0" wakeup from the delayed_group_leader case (with no wakeup). Signed-off-by: NRoland McGrath <roland@redhat.com> Tested-by: NSerge Hallyn <serue@us.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 zhangxiliang 提交于
When the "status_get->mask" is "AUDIT_STATUS_RATE_LIMIT || AUDIT_STATUS_BACKLOG_LIMIT". If "audit_set_rate_limit" fails and "audit_set_backlog_limit" succeeds, the "err" value will be greater than or equal to 0. It will miss the failure of rate set. Signed-off-by: NZhang Xiliang <zhangxiliang@cn.fujitsu.com> Acked-by: NEric Paris <eparis@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 zhangxiliang 提交于
When calling audit_filter_task(), it calls audit_filter_rules() with audit_context is NULL. If the key field is set, the result in audit_filter_rules() will be set to 1 and ctx->filterkey will be set to key. But the ctx is NULL in this condition, so kernel will panic. Signed-off-by: NZhang Xiliang <zhangxiliang@cn.fujitsu.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 zhangxiliang 提交于
> shouldn't these be using the "audit_get_loginuid(current)" and if we > are going to output loginuid we also should be outputting sessionid Thanks for your detailed explanation. I have made a new patch for outputing "loginuid" and "sessionid" by audit_get_loginuid(current) and audit_get_sessionid(current). If there are some deficiencies, please give me your indication. Signed-off-by: NZhang Xiliang <zhangxiliang@cn.fujitsu.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Vesa-Matti J Kari 提交于
Hello, According to my understanding there is an off-by-one bug in the function: audit_string_contains_control() in: kernel/audit.c Patch is included. I do not know from how many places the function is called from, but for example, SELinux Access Vector Cache tries to log untrusted filenames via call path: avc_audit() audit_log_untrustedstring() audit_log_n_untrustedstring() audit_string_contains_control() If audit_string_contains_control() detects control characters, then the string is hex-encoded. But the hex=0x7f dec=127, DEL-character, is not detected. I guess this could have at least some minor security implications, since a user can create a filename with 0x7f in it, causing logged filename to possibly look different when someone reads it on the terminal. Signed-off-by: NVesa-Matti Kari <vmkari@cc.helsinki.fi> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Eric Paris 提交于
Makes the kernel audit subsystem collect information about the sending process when that process sends SIGUSR2 to the userspace audit daemon. SIGUSR2 is a new interesting signal to auditd telling auditd that it should try to start logging to disk again and the error condition which caused it to stop logging to disk (usually out of space) has been rectified. Signed-off-by: NEric Paris <eparis@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 01 8月, 2008 2 次提交
-
-
由 Jason Wessel 提交于
The command "info threads" did not work correctly with kgdb. It would result in a silent kernel hang if used. This patach addresses several problems. - Fix use of deprecated NR_CPUS - Fix kgdb to not walk linearly through the pid space - Correctly implement shadow pids - Change the threads per query to a #define - Fix kgdb_hex2long to work with negated values The threads 0 and -1 are reserved to represent the current task. That means that CPU 0 will start with a shadow thread id of -2, and CPU 1 will have a shadow thread id of -3, etc... From the debugger you can switch to a shadow thread to see what one of the other cpus was doing, however it is not possible to execute run control operations on any other cpu execept the cpu executing the kgdb_handle_exception(). Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
-
由 Jason Wessel 提交于
A regression to the kgdb core was found in the case of using the CONFIG_DEBUG_RODATA kernel option. When this option is on, a breakpoint cannot be written into any readonly memory page. When an external debugger requests a breakpoint to get set, the kgdb_validate_break_address() was only checking to see if the address to place the breakpoint was readable and lacked a write check. This patch changes the validate routine to try reading (via the breakpoint set request) and also to try immediately writing the break point. If either fails, an error is correctly returned and the debugger behaves correctly. Then an end user can make the descision to use hardware breakpoints. Also update the documentation to reflect that using CONFIG_DEBUG_RODATA will inhibit the use of software breakpoints. Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
-
- 31 7月, 2008 12 次提交
-
-
由 Andi Kleen 提交于
They are really class devices, but were incorrectly declared. This leads to crashes with the recent changes that makes non normal sysdevs use a different prototype. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pierre Ossman <drzeus-list@drzeus.cx> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
Dmitry Adamushko pointed out that the error handling in __create_workqueue_key() is not clear, add the comment. Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru> Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Uwe Kleine-König 提交于
The comment assumed the burst to be one and the ratelimit used to be named printk_ratelimit_jiffies. Signed-off-by: NUwe Kleine-König <Uwe.Kleine-Koenig@digi.com> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mathieu Desnoyers 提交于
Paul pointed out two incorrect read barriers in the marker handler code in the path where multiple probes are connected. Those are ordering reads of "ptype" (single or multi probe marker), "multi" array pointer, and "multi" array data access. It should be ordered like this : read ptype smp_rmb() read multi array pointer smp_read_barrier_depends() access data referenced by multi array pointer The code with a single probe connected (optimized case, does not have to allocate an array) has correct memory ordering. It applies to kernel 2.6.26.x, 2.6.25.x and linux-next. Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Use cpuset.stack_list rather than kfifo, so we avoid memory allocation for kfifo. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
When multiple cpusets are overlapping in their 'cpus' and hence they form a single sched domain, the largest sched_relax_domain_level among those should be used. But when top_cpuset's sched_load_balance is set, its sched_relax_domain_level is used regardless other sub-cpusets'. This patch fixes it by walking the cpuset hierarchy to find the largest sched_relax_domain_level. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: NPaul Jackson <pj@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lai Jiangshan 提交于
All child cpusets contain a subset of the parent's cpus, so we can skip them when partitioning sched domains. This decreases 'csa' greately for cpusets with multi-level hierarchy. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: NPaul Jackson <pj@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
clean up hierarchy traversal code Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Jackson <pj@sgi.com> Cc: Cliff Wickman <cpw@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
It's not small enough, and has 2 call sites. text data bss dec hex filename 12813 1676 4832 19321 4b79 cgroup.o.orig 12775 1676 4832 19283 4b53 cgroup.o Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
- just call free_cg_links() in allocate_cg_links() - the list will get initialized in allocate_cg_links(), so don't init it twice Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
There's a leak if copy_from_user() returns failure. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Magnus Damm 提交于
Avoid one-off errors by introducing a resource_size() function. Signed-off-by: NMagnus Damm <damm@igel.co.jp> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 7月, 2008 3 次提交
-
-
由 Andrea Arcangeli 提交于
With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages. There are secondary MMUs (with secondary sptes and secondary tlbs) too. sptes in the kvm case are shadow pagetables, but when I say spte in mmu-notifier context, I mean "secondary pte". In GRU case there's no actual secondary pte and there's only a secondary tlb because the GRU secondary MMU has no knowledge about sptes and every secondary tlb miss event in the MMU always generates a page fault that has to be resolved by the CPU (this is not the case of KVM where the a secondary tlb miss will walk sptes in hardware and it will refill the secondary tlb transparently to software if the corresponding spte is present). The same way zap_page_range has to invalidate the pte before freeing the page, the spte (and secondary tlb) must also be invalidated before any page is freed and reused. Currently we take a page_count pin on every page mapped by sptes, but that means the pages can't be swapped whenever they're mapped by any spte because they're part of the guest working set. Furthermore a spte unmap event can immediately lead to a page to be freed when the pin is released (so requiring the same complex and relatively slow tlb_gather smp safe logic we have in zap_page_range and that can be avoided completely if the spte unmap event doesn't require an unpin of the page previously mapped in the secondary MMU). The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know when the VM is swapping or freeing or doing anything on the primary MMU so that the secondary MMU code can drop sptes before the pages are freed, avoiding all page pinning and allowing 100% reliable swapping of guest physical address space. Furthermore it avoids the code that teardown the mappings of the secondary MMU, to implement a logic like tlb_gather in zap_page_range that would require many IPI to flush other cpu tlbs, for each fixed number of spte unmapped. To make an example: if what happens on the primary MMU is a protection downgrade (from writeable to wrprotect) the secondary MMU mappings will be invalidated, and the next secondary-mmu-page-fault will call get_user_pages and trigger a do_wp_page through get_user_pages if it called get_user_pages with write=1, and it'll re-establishing an updated spte or secondary-tlb-mapping on the copied page. Or it will setup a readonly spte or readonly tlb mapping if it's a guest-read, if it calls get_user_pages with write=0. This is just an example. This allows to map any page pointed by any pte (and in turn visible in the primary CPU MMU), into a secondary MMU (be it a pure tlb like GRU, or an full MMU with both sptes and secondary-tlb like the shadow-pagetable layer with kvm), or a remote DMA in software like XPMEM (hence needing of schedule in XPMEM code to send the invalidate to the remote node, while no need to schedule in kvm/gru as it's an immediate event like invalidating primary-mmu pte). At least for KVM without this patch it's impossible to swap guests reliably. And having this feature and removing the page pin allows several other optimizations that simplify life considerably. Dependencies: 1) mm_take_all_locks() to register the mmu notifier when the whole VM isn't doing anything with "mm". This allows mmu notifier users to keep track if the VM is in the middle of the invalidate_range_begin/end critical section with an atomic counter incraese in range_begin and decreased in range_end. No secondary MMU page fault is allowed to map any spte or secondary tlb reference, while the VM is in the middle of range_begin/end as any page returned by get_user_pages in that critical section could later immediately be freed without any further ->invalidate_page notification (invalidate_range_begin/end works on ranges and ->invalidate_page isn't called immediately before freeing the page). To stop all page freeing and pagetable overwrites the mmap_sem must be taken in write mode and all other anon_vma/i_mmap locks must be taken too. 2) It'd be a waste to add branches in the VM if nobody could possibly run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only enabled if CONFIG_KVM=m/y. In the current kernel kvm won't yet take advantage of mmu notifiers, but this already allows to compile a KVM external module against a kernel with mmu notifiers enabled and from the next pull from kvm.git we'll start using them. And GRU/XPMEM will also be able to continue the development by enabling KVM=m in their config, until they submit all GRU/XPMEM GPLv2 code to the mainline kernel. Then they can also enable MMU_NOTIFIERS in the same way KVM does it (even if KVM=n). This guarantees nobody selects MMU_NOTIFIER=y if KVM and GRU and XPMEM are all =n. The mmu_notifier_register call can fail because mm_take_all_locks may be interrupted by a signal and return -EINTR. Because mmu_notifier_reigster is used when a driver startup, a failure can be gracefully handled. Here an example of the change applied to kvm to register the mmu notifiers. Usually when a driver startups other allocations are required anyway and -ENOMEM failure paths exists already. struct kvm *kvm_arch_create_vm(void) { struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL); + int err; if (!kvm) return ERR_PTR(-ENOMEM); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); + kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops; + err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm); + if (err) { + kfree(kvm); + return ERR_PTR(err); + } + return kvm; } mmu_notifier_unregister returns void and it's reliable. The patch also adds a few needed but missing includes that would prevent kernel to compile after these changes on non-x86 archs (x86 didn't need them by luck). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix mm/filemap_xip.c build] [akpm@linux-foundation.org: fix mm/mmu_notifier.c build] Signed-off-by: NAndrea Arcangeli <andrea@qumranet.com> Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Clean up and optimize cpumask_of_cpu(), by sharing all the zero words. Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns creating a huge array of constant bitmasks, realize that the zero words can be shared. In other words, on a 64-bit architecture, we only ever need 64 of these arrays - with a different bit set in one single world (with enough zero words around it so that we can create any bitmask by just offsetting in that big array). And then we just put enough zeroes around it that we can point every single cpumask to be one of those things. So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each, with one bit set in each array - 2MB memory total), we have exactly 64 arrays instead, each 8k bits in size (64kB total). And then we just point cpumask(n) to the right position (which we can calculate dynamically). Once we have the right arrays, getting "cpumask(n)" ends up being: static inline const cpumask_t *get_cpu_mask(unsigned int cpu) { const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG]; p -= cpu / BITS_PER_LONG; return (const cpumask_t *)p; } This brings other advantages and simplifications as well: - we are not wasting memory that is just filled with a single bit in various different places - we don't need all those games to re-create the arrays in some dense format, because they're already going to be dense enough. if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory is a non-issue (especially since by doing this "overlapping" trick we probably get better cache behaviour anyway). [ mingo@elte.hu: Converted Linus's mails into a commit. See: http://lkml.org/lkml/2008/7/27/156 http://lkml.org/lkml/2008/7/28/320 Also applied a family filter - which also has the side-effect of leaving out the bits where Linus calls me an idio... Oh, never mind ;-) ] Signed-off-by: NIngo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Mike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Randy Dunlap 提交于
Fix @key parameter to mutex_init() and one of its callers. Warning(linux-2.6.26-git11//drivers/base/class.c:210): No description found for parameter 'key' Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Acked-by: NGreg Kroah-Hartman <gregkh@suse.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 7月, 2008 9 次提交
-
-
由 Rusty Russell 提交于
Simple conversion. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Abhishek Sagar <sagar.abhishek@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org>
-
由 Rusty Russell 提交于
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all cpus), pass a cpumask_t. Allow NULL for the common case (where we don't care which CPU the function is run on): temporary cpumask_t's are usually considered bad for stack space. This deprecates stop_machine_run, to be removed soon when all the callers are dead. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Akinobu points out that if take_cpu_down() succeeds, the cpu must be offline. Remove the cpu_online() check, and put a BUG_ON(). Quoting Akinobu Mita: Actually the cpu_online() check was necessary before appling this stop_machine: simplify patch. With old __stop_machine_run(), __stop_machine_run() could succeed (return !IS_ERR(p) value) even if take_cpu_down() returned non-zero value. The return value of take_cpu_down() was obtained through kthread_stop().. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: "Akinobu Mita" <akinobu.mita@gmail.com>
-
由 Rusty Russell 提交于
stop_machine creates a kthread which creates kernel threads. We can create those threads directly and simplify things a little. Some care must be taken with CPU hotunplug, which has special needs, but that code seems more robust than it was in the past. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
由 Jason Baron 提交于
-allow stop_mahcine_run() to call a function on all cpus. Calling stop_machine_run() with a 'ALL_CPUS' invokes this new behavior. stop_machine_run() proceeds as normal until the calling cpu has invoked 'fn'. Then, we tell all the other cpus to call 'fn'. Signed-off-by: NJason Baron <jbaron@redhat.com> Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> CC: Adrian Bunk <bunk@stusta.de> CC: Andi Kleen <andi@firstfloor.org> CC: Alexey Dobriyan <adobriyan@gmail.com> CC: Christoph Hellwig <hch@infradead.org> CC: mingo@elte.hu CC: akpm@osdl.org
-
由 WANG Cong 提交于
This patch fixed the warning: CC kernel/module.o /home/wangcong/Projects/linux-2.6/kernel/module.c:332: warning: ‘lookup_symbol’ defined but not used Signed-off-by: NWANG Cong <wangcong@zeuux.org> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Andrea Righi 提交于
Simplify the code of include/linux/task_io_accounting.h. It is also more reasonable to have all the task i/o-related statistics in a single struct (task_io_accounting). Signed-off-by: NAndrea Righi <righi.andrea@gmail.com> Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Righi 提交于
Put all i/o statistics in struct proc_io_accounting and use inline functions to initialize and increment statistics, removing a lot of single variable assignments. This also reduces the kernel size as following (with CONFIG_TASK_XACCT=y and CONFIG_TASK_IO_ACCOUNTING=y). text data bss dec hex filename 11651 0 0 11651 2d83 kernel/exit.o.before 11619 0 0 11619 2d63 kernel/exit.o.after 10886 132 136 11154 2b92 kernel/fork.o.before 10758 132 136 11026 2b12 kernel/fork.o.after 3082029 807968 4818600 8708597 84e1f5 vmlinux.o.before 3081869 807968 4818600 8708437 84e155 vmlinux.o.after Signed-off-by: NAndrea Righi <righi.andrea@gmail.com> Acked-by: NOleg Nesterov <oleg@tv-sign.ru> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-