1. 20 April 2008 (21 commits)
  2. 26 March 2008 (1 commit)
    • NOHZ: reevaluate idle sleep length after add_timer_on() · 06d8308c
      Authored by Thomas Gleixner
      add_timer_on() can add a timer on a CPU which is currently in a long
      idle sleep, but the timer wheel is not reevaluated by the nohz code on
      that CPU. So a timer can be delayed for quite a long time. This
      triggered a false positive in the clocksource watchdog code.
      
      To avoid this we need to wake up the idle CPU and enforce the
      reevaluation of the timer wheel for the next timer event.
      
      Add a function which checks whether a given CPU is idle, marks the
      idle task with NEED_RESCHED and sends a reschedule IPI to notify the
      other CPU of the change in the timer wheel.

      Call this function from add_timer_on(). (A sketch of the mechanism
      follows this entry.)
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: stable@kernel.org
      
      --
       include/linux/sched.h |    6 ++++++
       kernel/sched.c        |   43 +++++++++++++++++++++++++++++++++++++++++++
       kernel/timer.c        |   10 +++++++++-
       3 files changed, 58 insertions(+), 1 deletion(-)
      06d8308c
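
      A minimal C sketch of the mechanism described above, assuming the
      kernel-internal helpers cpu_rq(), set_tsk_thread_flag(),
      tsk_is_polling() and smp_send_reschedule(); details may differ from
      the actual commit:

      /*
       * Sketch: wake an idle CPU so its nohz code re-evaluates the timer
       * wheel.  Called from add_timer_on() after enqueueing the timer.
       */
      void wake_up_idle_cpu(int cpu)
      {
              struct rq *rq = cpu_rq(cpu);

              if (cpu == smp_processor_id())
                      return;

              /*
               * If the CPU is not running its idle task, it will notice
               * the new timer on its next timer interrupt anyway.
               */
              if (rq->curr != rq->idle)
                      return;

              /* Mark the remote idle task with NEED_RESCHED ... */
              set_tsk_thread_flag(rq->idle, TIF_NEED_RESCHED);

              /* ... and kick it with a reschedule IPI, unless it is
                 already polling on the flag. */
              smp_mb();
              if (!tsk_is_polling(rq->idle))
                      smp_send_reschedule(cpu);
      }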
  3. 21 March 2008 (4 commits)
  4. 19 March 2008 (2 commits)
    • sched: wakeup-buddy tasks are cache-hot · f540a608
      Authored by Ingo Molnar
      Wakeup-buddy tasks are cache-hot - this makes it a bit harder
      for the load-balancer to tear them apart. (It is still possible,
      though, if the load is sufficiently asymmetric; see the sketch
      below.)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f540a608
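
      A sketch of how such a change plausibly looks, assuming the CFS
      'next' buddy pointer and the task_hot() helper in kernel/sched.c of
      that era; the early return for the buddy is the addition:

      static int
      task_hot(struct task_struct *p, u64 now, struct sched_domain *sd)
      {
              s64 delta;

              /* Buddy candidates are cache hot: */
              if (&p->se == cfs_rq_of(&p->se)->next)
                      return 1;

              if (sysctl_sched_migration_cost == -1)
                      return 1;
              if (sysctl_sched_migration_cost == 0)
                      return 0;

              delta = now - p->se.exec_start;
              return delta < (s64)sysctl_sched_migration_cost;
      }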
    • sched: improve affine wakeups · 4ae7d5ce
      Authored by Ingo Molnar
      Improve affine wakeups: maintain an 'overlap' metric based on CFS's
      sum_exec_runtime - that is, the amount of time a task keeps executing
      after it wakes up some other task.

      Use the 'overlap' for the wakeup decisions: if the 'overlap' is short,
      there is strong workload coupling between this task and the woken-up
      task. If the 'overlap' is large, the workload is decoupled and the
      scheduler will move them to separate CPUs more easily. (A sketch of
      the metric follows this entry.)

      (Also, slightly move the preempt check within try_to_wake_up() - this
      has no effect on functionality but allows 'early wakeups', for
      still-on-rq tasks, to be accounted correctly as well.)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      4ae7d5ce
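
      A sketch of the metric's bookkeeping, assuming hypothetical field
      names last_wakeup and avg_overlap in struct sched_entity; the helper
      name update_overlap_on_sleep() is illustrative only:

      /* EWMA with weight 1/8, as commonly used in sched.c: */
      static void update_avg(u64 *avg, u64 sample)
      {
              s64 diff = sample - *avg;
              *avg += diff >> 3;
      }

      /*
       * When the waker goes to sleep, fold the CPU time it consumed
       * since it last woke another task into the overlap average.  A
       * small avg_overlap then indicates tight waker/wakee coupling,
       * which the affine-wakeup decision can favor.
       */
      static void update_overlap_on_sleep(struct sched_entity *se)
      {
              if (se->last_wakeup) {
                      update_avg(&se->avg_overlap,
                                 se->sum_exec_runtime - se->last_wakeup);
                      se->last_wakeup = 0;
              }
      }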
  5. 15 March 2008 (4 commits)
    • sched: fix overload performance: buddy wakeups · aa2ac252
      Authored by Peter Zijlstra
      Currently we schedule to the leftmost task in the runqueue. When the
      runtimes are very short because of some server/client ping-pong,
      especially in over-saturated workloads, this will cycle through all
      tasks, thrashing the cache.

      Reduce cache thrashing by keeping dependent tasks together, running
      newly woken tasks first. However, by not running the leftmost task
      first we could starve tasks, because the wakee can gain unlimited
      runtime.

      Therefore we only run the wakee if it is within a small
      (wakeup_granularity) window of the leftmost task. This preserves
      fairness, but still alternates server/client task groups. (See the
      sketch after this entry.)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      aa2ac252
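
      A sketch of the pick logic described above, assuming a cfs_rq->next
      buddy pointer recorded at wakeup time; the exact granularity
      handling in the commit may differ:

      /*
       * Prefer the newly woken buddy over the leftmost entity, but only
       * while it trails the leftmost vruntime by less than the wakeup
       * granularity, so fairness is preserved.
       */
      static struct sched_entity *
      pick_next(struct cfs_rq *cfs_rq, struct sched_entity *se)
      {
              s64 diff, gran;

              if (!cfs_rq->next)              /* no wakeup buddy recorded */
                      return se;

              diff = cfs_rq->next->vruntime - se->vruntime;
              if (diff < 0)                   /* buddy is leftmost anyway */
                      return se;

              gran = sysctl_sched_wakeup_granularity;
              if (diff < gran)                /* buddy lags only a little */
                      return cfs_rq->next;

              return se;                      /* stick with the leftmost */
      }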
    • sched: fix calc_delta_mine() · 27d11726
      Authored by Ingo Molnar
      lw->weight can be 0 for a short time during bootup; guard the
      inverse-weight computation against dividing by it (see the sketch
      below).
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      27d11726
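
      The guard presumably lands in the cached-inverse computation; a
      sketch of the relevant lines, with WMULT_CONST the fixed-point
      scale factor used by kernel/sched.c:

      /*
       * Recompute the cached inverse weight on demand.  Dividing by
       * (lw->weight + 1) instead of lw->weight makes a transient
       * lw->weight == 0 during bootup harmless.
       */
      if (!lw->inv_weight)
              lw->inv_weight = 1 + (WMULT_CONST - lw->weight/2)
                                      / (lw->weight + 1);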
    • sched: fix update_load_add()/sub() · e89996ae
      Authored by Ingo Molnar
      Clear the cached inverse value when updating the load. This is needed
      for calc_delta_mine() to work correctly when using the rq load (see
      the sketch below).
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      e89996ae
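
      A sketch of the fix as described; clearing inv_weight on every
      weight change is the addition:

      static inline void update_load_add(struct load_weight *lw,
                                         unsigned long inc)
      {
              lw->weight += inc;
              lw->inv_weight = 0;     /* drop the stale cached inverse */
      }

      static inline void update_load_sub(struct load_weight *lw,
                                         unsigned long dec)
      {
              lw->weight -= dec;
              lw->inv_weight = 0;     /* likewise */
      }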
    • sched: fix race in schedule() · 0e1f3483
      Authored by Hiroshi Shimamoto
      Fix a hard-to-trigger crash seen in the -rt kernel that also affects
      the vanilla scheduler.

      There is a race condition between schedule() and some dequeue/enqueue
      functions: rt_mutex_setprio(), __setscheduler() and sched_move_task().

      When scheduling to idle, idle_balance() is called to pull tasks from
      other busy processors, and it might drop the rq lock. This means those
      three functions can observe on_rq=0 while running=1. The current task
      must still be put (put_prev_task) whenever it is running.
      
      Here is a possible scenario:
      
         CPU0                               CPU1
          |                              schedule()
          |                              ->deactivate_task()
          |                              ->idle_balance()
          |                              -->load_balance_newidle()
      rt_mutex_setprio()                     |
          |                              --->double_lock_balance()
          *get lock                          *rel lock
          * on_rq=0, running=1               |
          * sched_class is changed           |
          *rel lock                          *get lock
          :                                  |
                                             :
                                         ->put_prev_task_rt()
                                         ->pick_next_task_fair()
                                             => panic
      
      The current process on CPU1 (P1) is scheduling. P1 is deactivated,
      and the scheduler looks for another process on other CPUs' runqueues
      because CPU1 will be idle. idle_balance(), load_balance_newidle() and
      double_lock_balance() are called, and double_lock_balance() can drop
      the rq lock. Meanwhile, CPU0 is trying to boost the priority of P1.
      As a result of the boost, only P1's prio and sched_class are changed
      to RT; the sched entities of P1 and P1's group are never put. This
      leaves the cfs_rq invalid, because the cfs_rq has a curr but no leaf;
      when pick_next_task_fair() is then called, the kernel panics. (The
      pattern of the fix is sketched after this entry.)
      Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0e1f3483
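
      A sketch of the pattern the fix applies in rt_mutex_setprio(),
      __setscheduler() and sched_move_task(): test 'running' independently
      of 'on_rq', since idle_balance() may have dequeued the task while it
      is still the current one:

      on_rq = p->se.on_rq;
      running = task_current(rq, p);
      if (on_rq)
              dequeue_task(rq, p, 0);
      if (running)
              p->sched_class->put_prev_task(rq, p);

      /* ... change p's prio / sched_class / task group here ... */

      if (running)
              p->sched_class->set_curr_task(rq);
      if (on_rq)
              enqueue_task(rq, p, 0);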
  6. 11 March 2008 (2 commits)
    • keep rd->online and cpu_online_map in sync · 08f503b0
      Authored by Gregory Haskins
      It is possible for the root-domain cache of online cpus to become
      out of sync with the global cpu_online_map. This is because we
      currently trigger removal of cpus too early in the notifier chain.
      Other DOWN_PREPARE handlers may in fact run and reconfigure the
      root-domain topology, thereby stomping on our own offline handling.

      The end result is that rd->online may become out of sync with
      cpu_online_map, which results in potential task misrouting.

      So change the offline handling to be more tightly coupled with the
      global offline process by triggering on CPU_DYING instead of
      CPU_DOWN_PREPARE (see the sketch after this entry).
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      08f503b0
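
      A sketch of where the handling plausibly moves, inside the
      scheduler's hotplug notifier (migration_call() in kernel/sched.c),
      using the cpumask API of that era:

      case CPU_DYING:
      case CPU_DYING_FROZEN:
              /* Update our root-domain while the rest of the world
                 still sees the cpu as online: */
              rq = cpu_rq(cpu);
              spin_lock_irqsave(&rq->lock, flags);
              if (rq->rd) {
                      BUG_ON(!cpu_isset(cpu, rq->rd->span));
                      cpu_clear(cpu, rq->rd->online);
              }
              spin_unlock_irqrestore(&rq->lock, flags);
              break;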
    • Revert "cpu hotplug: adjust root-domain->online span in response to hotplug event" · 1f94ef59
      Authored by Gregory Haskins
      This reverts commit 393d94d9.
      
      Let's fix this right.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1f94ef59
  7. 10 March 2008 (1 commit)
  8. 07 March 2008 (5 commits)