- 17 7月, 2010 3 次提交
-
-
由 Pekka Enberg 提交于
As of commit dcce284a ("mm: Extend gfp masking to the page allocator") and commit 7e85ee0c ("slab,slub: don't enable interrupts during early boot"), the slab allocator makes sure we don't attempt to sleep during boot. Therefore, remove bootmem special cases from the scheduler and use plain GFP_KERNEL instead. Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1279225102-2572-1-git-send-email-penberg@cs.helsinki.fi> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Currently we update cpu_power() too often, update_group_power() only updates the local group's cpu_power but it gets called for all groups. Furthermore, CPU_NEWLY_IDLE invocations will result in all cpus calling it, even though a slow update of cpu_power is sufficient. Therefore move the update under 'idle != CPU_NEWLY_IDLE && local_group' to reduce superfluous invocations. Reported-by: NVenkatesh Pallipadi <venki@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1278612989.1900.176.camel@laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Suresh Siddha 提交于
Suresh spotted that we don't update the rq->clock in the nohz load-balancer path. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1278626014.2834.74.camel@sbs-t61.sc.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 29 6月, 2010 1 次提交
-
-
由 Michael Neuling 提交于
No logic changes, only spelling. Signed-off-by: NMichael Neuling <mikey@neuling.org> Cc: linuxppc-dev@ozlabs.org Cc: David Howells <dhowells@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <15249.1277776921@neuling.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 22 6月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
Commit 3a101d05 (sched: adjust when cpu_active and cpuset configurations are updated during cpu on/offlining) added hotplug notifiers marked with __cpuexit; however, ia64 drops text in __cpuexit during link unlike x86. This means that functions which are referenced during init but used only for cpu hot unplugging afterwards shouldn't be marked with __cpuexit. Drop __cpuexit from those functions. Reported-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NTony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4C1FDF5B.1040301@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 6月, 2010 8 次提交
-
-
由 Oleg Nesterov 提交于
fastpath_timer_check()->thread_group_cputimer() is racy and unneeded. It is racy because another thread can clear ->running before thread_group_cputimer() takes cputimer->lock. In this case thread_group_cputimer() will set ->running = true again and call thread_group_cputime(). But since we do not hold tasklist or siglock, we can race with fork/exit and copy the wrong results into cputimer->cputime. It is unneeded because if ->running == true we can just use the numbers in cputimer->cputime we already have. Change fastpath_timer_check() to copy cputimer->cputime into the local variable under cputimer->lock. We do not re-check ->running under cputimer->lock, run_posix_cpu_timers() does this check later. Note: we can add more optimizations on top of this change. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100611180446.GA13025@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Oleg Nesterov 提交于
run_posix_cpu_timers() doesn't work if current has already passed exit_notify(). This was needed to prevent the races with do_wait(). Since ea6d290c ->signal is always valid and can't go away. We can remove the "tsk->exit_state == 0" in fastpath_timer_check() and convert run_posix_cpu_timers() to use lock_task_sighand(). Note: it makes sense to take group_leader's sighand instead, the sub-thread still uses CPU after release_task(). But we need more changes to do this. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100610231018.GA25942@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Oleg Nesterov 提交于
thread_group_cputime() looks as if it is rcu-safe, but in fact this was wrong until ea6d290c which pins task->signal to task_struct. It checks ->sighand != NULL under rcu, but this can't help if ->signal can go away. Fortunately the caller either holds ->siglock, or it is fastpath_timer_check() which uses current and checks exit_state == 0. - Since ea6d290c commit tsk->signal is stable, we can read it first and avoid the initialization from INIT_CPUTIME. - Even if tsk->signal is always valid, we still have to check it is safe to use next_thread() under rcu_read_lock(). Currently the code checks ->sighand != NULL, change it to use pid_alive() which is commonly used to ensure the task wasn't unhashed before we take rcu_read_lock(). Add the comment to explain this check. - Change the main loop to use the while_each_thread() helper. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100610230956.GA25921@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Oleg Nesterov 提交于
account_group_xxx() functions check ->exit_state to ensure that current->signal is valid and can't go away. This is not needed since ea6d290c, task->signal is pinned to task_struct. The comment and another hack in account_group_exec_runtime() refers to task_rq_unlock_wait() which was already removed by b7b8ff63. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100610230952.GA25914@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Oleg Nesterov 提交于
Remove the obsolete ->signal != NULL check in watchdog(). Since ea6d290c ->signal can't be NULL. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100610230948.GA25911@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Oleg Nesterov 提交于
__sched_setscheduler() takes lock_task_sighand() to access task->signal. This is not needed since ea6d290c, ->signal can't go away. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100610230944.GA25903@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Michael Neuling 提交于
Docbook fails in sched_fair.c due to comments added in the asymmetric packing patch series. This fixes these errors. No code changes. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <24737.1276135581@neuling.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Michael Neuling 提交于
The CPU power test is the wrong way around in fix_small_capacity. This was due to a small changes made in the posted patch on lkml to what was was taken upstream. This patch fixes asymmetric packing for POWER7. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <12629.1276124617@neuling.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 09 6月, 2010 12 次提交
-
-
由 Michael Neuling 提交于
Check to see if the group is packed in a sched doman. This is primarily intended to used at the sibling level. Some cores like POWER7 prefer to use lower numbered SMT threads. In the case of POWER7, it can move to lower SMT modes only when higher threads are idle. When in lower SMT modes, the threads will perform better since they share less core resources. Hence when we have idle threads, we want them to be the higher ones. This adds a hook into f_b_g() called check_asym_packing() to check the packing. This packing function is run on idle threads. It checks to see if the busiest CPU in this domain (core in the P7 case) has a higher CPU number than what where the packing function is being run on. If it is, calculate the imbalance and return the higher busier thread as the busiest group to f_b_g(). Here we are assuming a lower CPU number will be equivalent to a lower SMT thread number. It also creates a new SD_ASYM_PACKING flag to enable this feature at any scheduler domain level. It also creates an arch hook to enable this feature at the sibling level. The default function doesn't enable this feature. Based heavily on patch from Peter Zijlstra. Fixes from Srivatsa Vaddagiri. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20100608045702.2936CCC897@localhost.localdomain> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Srivatsa Vaddagiri 提交于
Handle cpu capacity being reported as 0 on cores with more number of hardware threads. For example on a Power7 core with 4 hardware threads, core power is 1177 and thus power of each hardware thread is 1177/4 = 294. This low power can lead to capacity for each hardware thread being calculated as 0, which leads to tasks bouncing within the core madly! Fix this by reporting capacity for hardware threads as 1, provided their power is not scaled down significantly because of frequency scaling or real-time tasks usage of cpu. Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20100608045702.21D03CC895@localhost.localdomain> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Venkatesh Pallipadi 提交于
In the new push model, all idle CPUs indeed go into nohz mode. There is still the concept of idle load balancer (performing the load balancing on behalf of all the idle cpu's in the system). Busy CPU kicks the nohz balancer when any of the nohz CPUs need idle load balancing. The kickee CPU does the idle load balancing on behalf of all idle CPUs instead of the normal idle balance. This addresses the below two problems with the current nohz ilb logic: * the idle load balancer continued to have periodic ticks during idle and wokeup frequently, even though it did not have any rebalancing to do on behalf of any of the idle CPUs. * On x86 and CPUs that have APIC timer stoppage on idle CPUs, this periodic wakeup can result in a periodic additional interrupt on a CPU doing the timer broadcast. Also currently we are migrating the unpinned timers from an idle to the cpu doing idle load balancing (when all the cpus in the system are idle, there is no idle load balancing cpu and timers get added to the same idle cpu where the request was made. So the existing optimization works only on semi idle system). And In semi idle system, we no longer have periodic ticks on the idle load balancer CPU. Using that cpu will add more delays to the timers than intended (as that cpu's timer base may not be uptodate wrt jiffies etc). This was causing mysterious slowdowns during boot etc. For now, in the semi idle case, use the nearest busy cpu for migrating timers from an idle cpu. This is good for power-savings anyway. Signed-off-by: NVenkatesh Pallipadi <venki@google.com> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1274486981.2840.46.camel@sbs-t61.sc.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Venkatesh Pallipadi 提交于
tickless idle has a negative side effect on update_cpu_load(), which in turn can affect load balancing behavior. update_cpu_load() is supposed to be called every tick, to keep track of various load indicies. With tickless idle, there are no scheduler ticks called on the idle CPUs. Idle CPUs may still do load balancing (with idle_load_balance CPU) using the stale cpu_load. It will also cause problems when all CPUs go idle for a while and become active again. In this case loads would not degrade as expected. This is how rq->nr_load_updates change looks like under different conditions: <cpu_num> <nr_load_updates change> All CPUS idle for 10 seconds (HZ=1000) 0 1621 10 496 11 139 12 875 13 1672 14 12 15 21 1 1472 2 2426 3 1161 4 2108 5 1525 6 701 7 249 8 766 9 1967 One CPU busy rest idle for 10 seconds 0 10003 10 601 11 95 12 966 13 1597 14 114 15 98 1 3457 2 93 3 6679 4 1425 5 1479 6 595 7 193 8 633 9 1687 All CPUs busy for 10 seconds 0 10026 10 10026 11 10026 12 10026 13 10025 14 10025 15 10025 1 10026 2 10026 3 10026 4 10026 5 10026 6 10026 7 10026 8 10026 9 10026 That is update_cpu_load works properly only when all CPUs are busy. If all are idle, all the CPUs get way lower updates. And when few CPUs are busy and rest are idle, only busy and ilb CPU does proper updates and rest of the idle CPUs will do lower updates. The patch keeps track of when a last update was done and fixes up the load avg based on current time. On one of my test system SPECjbb with warehouse 1..numcpus, patch improves throughput numbers by ~1% (average of 6 runs). On another test system (with different domain hierarchy) there is no noticable change in perf. Signed-off-by: NVenkatesh Pallipadi <venki@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <AANLkTilLtDWQsAUrIxJ6s04WTgmw9GuOODc5AOrYsaR5@mail.gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Oleg Nesterov 提交于
- Contrary to what 6d558c3a says, there is no need to reload prev = rq->curr after the context switch. You always schedule back to where you came from, prev must be equal to current even if cpu/rq was changed. - This also means reacquire_kernel_lock() can use prev instead of current. - No need to reassign switch_count if reacquire_kernel_lock() reports need_resched(), we can just move the initial assignment down, under the "need_resched_nonpreemptible:" label. - Try to update the comment after context_switch(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100519125711.GA30199@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
For people who otherwise get to write: cpu_clock(smp_processor_id()), there is now: local_clock(). Also, as per suggestion from Andrew, provide some documentation on the various clock interfaces, and minimize the unsigned long long vs u64 mess. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jens Axboe <jaxboe@fusionio.com> LKML-Reference: <1275052414.1645.52.camel@laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Concurrency managed workqueue needs to know when workers are going to sleep and waking up. Using these two hooks, cmwq keeps track of the current concurrency level and throttles execution of new works if it's too high and wakes up another worker from the sleep hook if it becomes too low. This patch introduces PF_WQ_WORKER to identify workqueue workers and adds the following two hooks. * wq_worker_waking_up(): called when a worker is woken up. * wq_worker_sleeping(): called when a worker is going to sleep and may return a pointer to a local task which should be woken up. The returned task is woken up using try_to_wake_up_local() which is simplified ttwu which is called under rq lock and can only wake up local tasks. Both hooks are currently defined as noop in kernel/workqueue_sched.h. Later cmwq implementation will replace them with proper implementation. These hooks are hard coded as they'll always be enabled. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Factor ttwu_activate() and ttwu_woken_up() out of try_to_wake_up(). The factoring out doesn't affect try_to_wake_up() much code-generation-wise. Depending on configuration options, it ends up generating the same object code as before or slightly different one due to different register assignment. This is to help future implementation of try_to_wake_up_local(). Mike Galbraith suggested rename to ttwu_post_activation() from ttwu_woken_up() and comment update in try_to_wake_up(). Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Currently, when a cpu goes down, cpu_active is cleared before CPU_DOWN_PREPARE starts and cpuset configuration is updated from a default priority cpu notifier. When a cpu is coming up, it's set before CPU_ONLINE but cpuset configuration again is updated from the same cpu notifier. For cpu notifiers, this presents an inconsistent state. Threads which a CPU_DOWN_PREPARE notifier expects to be bound to the CPU can be migrated to other cpus because the cpu is no more inactive. Fix it by updating cpu_active in the highest priority cpu notifier and cpuset configuration in the second highest when a cpu is coming up. Down path is updated similarly. This guarantees that all other cpu notifiers see consistent cpu_active and cpuset configuration. cpuset_track_online_cpus() notifier is converted to cpuset_update_active_cpus() which just updates the configuration and now called from cpuset_cpu_[in]active() notifiers registered from sched_init_smp(). If cpuset is disabled, cpuset_update_active_cpus() degenerates into partition_sched_domains() making separate notifier for !CONFIG_CPUSETS unnecessary. This problem is triggered by cmwq. During CPU_DOWN_PREPARE, hotplug callback creates a kthread and kthread_bind()s it to the target cpu, and the thread is expected to run on that cpu. * Ingo's test discovered __cpuinit/exit markups were incorrect. Fixed. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Menage <menage@google.com>
-
由 Tejun Heo 提交于
Instead of hardcoding priority 10 and 20 in sched and perf, collect them into CPU_PRI_* enums. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
-
由 Peter Zijlstra 提交于
PROVE_RCU has a few issues with the cpu_cgroup because the scheduler typically holds rq->lock around the css rcu derefs but the generic cgroup code doesn't (and can't) know about that lock. Provide means to add extra checks to the css dereference and use that in the scheduler to annotate its users. The addition of rq->lock to these checks is correct because the cgroup_subsys::attach() method takes the rq->lock for each task it moves, therefore by holding that lock, we ensure the task is pinned to the current cgroup and the RCU derefence is valid. That leaves one genuine race in __sched_setscheduler() where we used task_group() without holding any of the required locks and thus raced with the cgroup code. Solve this by moving the check under the appropriate lock. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Frederic reported that frequency driven swevents didn't work properly and even caused a division-by-zero error. It turns out there are two bugs, the division-by-zero comes from a failure to deal with that in perf_calculate_period(). The other was more interesting and turned out to be a wrong comparison in perf_adjust_period(). The comparison was between an s64 and u64 and got implicitly converted to an unsigned comparison. The problem is that period_left is typically < 0, so it ended up being always true. Cure this by making the local period variables s64. Reported-by: NFrederic Weisbecker <fweisbec@gmail.com> Tested-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 6月, 2010 11 次提交
-
-
由 Rusty Russell 提交于
Problem: it's hard to avoid an init routine stumbling over a request_module these days. And it's not clear it's always a bad idea: for example, a module like kvm with dynamic dependencies on kvm-intel or kvm-amd would be neater if it could simply request_module the right one. In this particular case, it's libcrc32c: libcrc32c_mod_init crypto_alloc_shash crypto_alloc_tfm crypto_find_alg crypto_alg_mod_lookup crypto_larval_lookup request_module If another module is waiting inside resolve_symbol() for libcrc32c to finish initializing (ie. bne2 depends on libcrc32c) then it does so holding the module lock, and our request_module() can't make progress until that is released. Waiting inside resolve_symbol() without the lock isn't all that hard: we just need to pass the -EBUSY up the call chain so we can sleep where we don't hold the lock. Error reporting is a bit trickier: we need to copy the name of the unfinished module before releasing the lock. Other notes: 1) This also fixes a theoretical issue where a weak dependency would allow symbol version mismatches to be ignored. 2) We rename use_module to ref_module to make life easier for the only external user (the out-of-tree ksplice patches). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tim Abbot <tabbott@ksplice.com> Tested-by: NBrandon Philips <bphilips@suse.de>
-
由 Rusty Russell 提交于
It disabled preempt so it was "safe", but nothing stops another module slipping in before this module is added to the global list now we don't hold the lock the whole time. So we check this just after we check for duplicate modules, and just before we put the module in the global list. (find_symbol finds symbols in coming and going modules, too). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Linus Torvalds 提交于
I think Rusty may have made the lock a bit _too_ finegrained there, and didn't add it to some places that needed it. It looks, for example, like PATCH 1/2 actually drops the lock in places where it's needed ("find_module()" is documented to need it, but now load_module() didn't hold it at all when it did the find_module()). Rather than adding a new "module_loading" list, I think we should be able to just use the existing "modules" list, and just fix up the locking a bit. In fact, maybe we could just move the "look up existing module" a bit later - optimistically assuming that the module doesn't exist, and then just undoing the work if it turns out that we were wrong, just before adding ourselves to the list. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Kay Sievers <kay.sievers@vrfy.org> reports that we still have some contention over module loading which is slowing boot. Linus also disliked a previous "drop lock and regrab" patch to fix the bne2 "gave up waiting for init of module libcrc32c" message. This is more ambitious: we only grab the lock where we need it. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Brandon Philips <brandon@ifup.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>
-
由 Rusty Russell 提交于
These were placed in the header in ef665c1a to get the various SYSFS/MODULE config combintations to compile. That may have been necessary then, but it's not now. These functions are all local to module.c. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Randy Dunlap <randy.dunlap@oracle.com>
-
由 Rusty Russell 提交于
This means a little extra work, but is more logical: we don't put anything in sysfs until we're about to put the module into the global list an parse its parameters. This also gives us a logical place to put duplicate module detection in the next patch. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Linus changed the structure, and luckily this didn't compile any more. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Martin Hicks <mort@sgi.com>
-
由 Linus Torvalds 提交于
When adding a module that depends on another one, we used to create a one-way list of "modules_which_use_me", so that module unloading could see who needs a module. It's actually quite simple to make that list go both ways: so that we not only can see "who uses me", but also see a list of modules that are "used by me". In fact, we always wanted that list in "module_unload_free()": when we unload a module, we want to also release all the other modules that are used by that module. But because we didn't have that list, we used to first iterate over all modules, and then iterate over each "used by me" list of that module. By making the list two-way, we simplify module_unload_free(), and it allows for some trivial fixes later too. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned & rebased)
-
由 Akinobu Mita 提交于
The commit 80b5184c ("kernel/: convert cpu notifier to return encapsulate errno value") changed the return value of cpu notifier callbacks. Those callbacks don't return NOTIFY_BAD on failures anymore. But there are a few callbacks which are called directly at init time and checking the return value. I forgot to change BUG_ON checking by the direct callers in the commit. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Greg Thelen 提交于
Child groups should have a greater depth than their parents. Prior to this change, the parent would incorrectly report zero memory usage for child cgroups when use_hierarchy is enabled. test script: mount -t cgroup none /cgroups -o memory cd /cgroups mkdir cg1 echo 1 > cg1/memory.use_hierarchy mkdir cg1/cg11 echo $$ > cg1/cg11/tasks dd if=/dev/zero of=/tmp/foo bs=1M count=1 echo echo CHILD grep cache cg1/cg11/memory.stat echo echo PARENT grep cache cg1/memory.stat echo $$ > tasks rmdir cg1/cg11 cg1 cd / umount /cgroups Using fae9c791, a recent patch that changed alloc_css_id() depth computation, the parent incorrectly reports zero usage: root@ubuntu:~# ./test 1+0 records in 1+0 records out 1048576 bytes (1.0 MB) copied, 0.0151844 s, 69.1 MB/s CHILD cache 1048576 total_cache 1048576 PARENT cache 0 total_cache 0 With this patch, the parent correctly includes child usage: root@ubuntu:~# ./test 1+0 records in 1+0 records out 1048576 bytes (1.0 MB) copied, 0.0136827 s, 76.6 MB/s CHILD cache 1052672 total_cache 1052672 PARENT cache 0 total_cache 1052672 Signed-off-by: NGreg Thelen <gthelen@google.com> Acked-by: NPaul Menage <menage@google.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: <stable@kernel.org> [2.6.34.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Oleg Nesterov 提交于
task_struct->pesonality is "unsigned int", but sys_personality() paths use "unsigned long pesonality". This means that every assignment or comparison is not right. In particular, if this argument does not fit into "unsigned int" __set_personality() changes the caller's personality and then sys_personality() returns -EINVAL. Turn this argument into "unsigned int" and avoid overflows. Obviously, this is the user-visible change, we just ignore the upper bits. But this can't break the sane application. There is another thing which can confuse the poorly written applications. User-space thinks that this syscall returns int, not long. This means that the returned value can be negative and look like the error code. But note that libc won't be confused and thus errno won't be set, and with this patch the user-space can never get -1 unless sys_personality() really fails. And, most importantly, the negative RET != -1 is only possible if that app previously called personality(RET). Pointed-out-by: NWenming Zhang <wezhang@redhat.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 6月, 2010 2 次提交
-
-
由 Peter Zijlstra 提交于
Frederic reported that because swevents handling doesn't disable IRQs anymore, we can get a recursion of perf_adjust_period(), once from overflow handling and once from the tick. If both call ->disable, we get a double hlist_del_rcu() and trigger a LIST_POISON2 dereference. Since we don't actually need to stop/start a swevent to re-programm the hardware (lack of hardware to program), simply nop out these callbacks for the swevent pmu. Reported-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1275557609.27810.35218.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jens Axboe 提交于
This changes the interface to be based on bytes instead. The API matches that of F_SETPIPE_SZ in that it rounds up the passed in size so that the resulting page array is a power-of-2 in size. The proc file is renamed to /proc/sys/fs/pipe-max-size to reflect this change. Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 02 6月, 2010 1 次提交
-
-
由 Daniel J Blueman 提交于
In commit e9fb7631 ("cpu-hotplug: introduce cpu_notify(), __cpu_notify(), cpu_notify_nofail()") the new helper functions access cpu_chain. As a result, it shouldn't be marked __cpuinitdata (via section mismatch warning). Alternatively, the helper functions should be forced inline, or marked __ref or __cpuinit. In the meantime, this patch silences the warning the trivial way. Signed-off-by: NDaniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 6月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Mike reports that since e9e9250b (sched: Scale down cpu_power due to RT tasks), wake_affine() goes funny on RT tasks due to them still having a !0 weight and wake_affine() still subtracts that from the rq weight. Since nobody should be using se->weight for RT tasks, set the value to zero. Also, since we now use ->cpu_power to normalize rq weights to account for RT cpu usage, add that factor into the imbalance computation. Reported-by: NMike Galbraith <efault@gmx.de> Tested-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1275316109.27810.22969.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-