Commits · 95e904c7da715aa2dbfb595da66b63de37a0bb04 · OpenHarmony / kernel_linux

17 Jun, 2008 1 commit

sched: fix defined-but-unused warning · 95e904c7

Authored by Rabin Vincent on May 11, 2008

Fix this warning, which appears with !CONFIG_SMP:
kernel/sched.c:1216: warning: `init_hrtick' defined but not used
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

95e904c7

12 Jun, 2008 2 commits

sched: 64-bit: fix arithmetics overflow · 7a232e03

Authored by Lai Jiangshan on Jun 12, 2008

(overflow means weight >= 2^32 here, because inv_weight = 2^32/weight)

The weight of a cfs_rq is the sum of the weights of the entities
queued on this cfs_rq, so it will overflow when there are
too many entities.

Although overflow occurs very rarely, it breaks fairness when
it occurs. 64-bit systems have more memory than 32-bit systems
and can usually create more processes, so overflow may
occur more frequently.

This patch guarantees fairness when overflow happens on 64-bit systems.
Thanks to compiler optimization, it changes nothing on 32-bit.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

7a232e03
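
The overflow described in 7a232e03 can be illustrated in user-space C. This is only a sketch under stated assumptions, not the patch itself: `INV_SCALE`, `naive_inv_weight()` and `guarded_inv_weight()` are hypothetical names, and clamping the inverse to 1 is an assumed way of keeping it non-zero for very large weights.

```c
#include <stdint.h>

/* 2^32, the fixed-point scale the commit message refers to. */
#define INV_SCALE (UINT64_C(1) << 32)

/*
 * Naive inverse weight: 2^32 / weight.
 * Integer division rounds it down to 0 once weight exceeds 2^32,
 * so every delta scaled by it collapses to 0 as well.
 * weight must be non-zero.
 */
static uint64_t naive_inv_weight(uint64_t weight)
{
	return INV_SCALE / weight;
}

/*
 * Guarded variant (hypothetical): never lets the inverse drop below 1,
 * so heavier queues still make (slow) progress instead of none at all.
 */
static uint64_t guarded_inv_weight(uint64_t weight)
{
	if (weight >= INV_SCALE)
		return 1;
	return INV_SCALE / weight;
}
```

With a weight of 1024 both helpers return 2^22; with a weight of 2^33 the naive helper returns 0 while the guarded one returns 1.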

sched: fair group: fix overflow(was: fix divide by zero) · 2e084786

Authored by Lai Jiangshan on Jun 12, 2008

I found a bug which can be reproduced this way (linux-2.6.26-rc5, x86-64),
using 2^32, 2^33, ..., 2^63 as the shares value:

# mkdir /dev/cpuctl
# mount -t cgroup -o cpu cpuctl /dev/cpuctl
# cd /dev/cpuctl
# mkdir sub
# echo 0x8000000000000000 > sub/cpu.shares
# echo $$ > sub/tasks
oops here! divide by zero.

This is because do_div() expects the 2nd parameter to be 32 bits,
but unsigned long is 64 bits on x86_64.

Peter Zijlstra pointed out that the sane thing to do is to limit the
shares value to something smaller instead of using an even more
expensive divide.

Also, I found another bug about "the shares value is too large":

pid1 and pid2 are set affinity to cpu#0
pid1 is attached to cg1 and pid2 is attached to cg2

if cg1/cpu.shares = 1024 cg2/cpu.shares = 2000000000
then pid2 got 100% usage of cpu, and pid1 0%

if cg1/cpu.shares = 1024 cg2/cpu.shares = 20000000000
then pid2 got 0% usage of cpu, and pid1 100%

And the weight of a cfs_rq is the sum of the weights of the entities
queued on this cfs_rq, so the shares value should be limited
to a smaller value.

I think that (1UL << 18) is a good limit:

1) it's not too large, so we can create a lot of groups before overflow
2) it's several times the weight value for nice=-19 (not too small)
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

2e084786
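
The divide-by-zero in 2e084786 comes from a 64-bit shares value being seen through a 32-bit divisor. A minimal user-space sketch of that truncation, and of the clamp to (1UL << 18) proposed in the message, might look as follows. The names `SK_MAX_SHARES`, `divisor_as_32bit()` and `clamp_shares()` are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Upper limit proposed in the commit message. */
#define SK_MAX_SHARES (UINT64_C(1) << 18)

/*
 * What a divide helper that only accepts a 32-bit divisor effectively
 * sees: the low 32 bits of the 64-bit value. For 2^32 ... 2^63 those
 * bits are all zero, hence the divide by zero.
 */
static uint32_t divisor_as_32bit(uint64_t shares)
{
	return (uint32_t)shares;
}

/* Clamp the requested shares so the divisor always fits in 32 bits. */
static uint64_t clamp_shares(uint64_t shares)
{
	return shares > SK_MAX_SHARES ? SK_MAX_SHARES : shares;
}
```

Passing 0x8000000000000000 straight to `divisor_as_32bit()` yields 0, whereas clamping first yields 262144.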

10 Jun, 2008 1 commit

sched: fix TASK_WAKEKILL vs SIGKILL race · 16882c1e

Authored by Oleg Nesterov on Jun 08, 2008

schedule() has the special "TASK_INTERRUPTIBLE && signal_pending()" case,
this allows us to do

	current->state = TASK_INTERRUPTIBLE;
	schedule();

without fear of sleeping with a pending signal.

However, the code like

	current->state = TASK_KILLABLE;
	schedule();

is not right, schedule() doesn't take TASK_WAKEKILL into account. This means
that mutex_lock_killable(), wait_for_completion_killable(), down_killable(),
schedule_timeout_killable() can miss SIGKILL (and btw the second SIGKILL has
no effect).

Introduce the new helper, signal_pending_state(), and change schedule() to
use it. Hopefully it will have more users, that is why the task's state is
passed separately.

Note this "__TASK_STOPPED | __TASK_TRACED" check in signal_pending_state().
This is needed to preserve the current behaviour (ptrace_notify). I hope
this check will be removed soon, but this (afaics good) change needs the
separate discussion.

The fast path is "(state & (INTERRUPTIBLE | WAKEKILL)) + signal_pending(p)",
basically the same that schedule() does now. However, this patch of course
bloats schedule().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

16882c1e
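
The check described in 16882c1e can be modelled outside the kernel. The sketch below is an assumption-laden illustration, not the helper from the patch: the `SK_` constants use illustrative bit values, `struct sk_task` is a hypothetical stand-in for the task structure, and `sk_signal_pending_state()` only mirrors the behaviour the message describes (fast path on INTERRUPTIBLE or WAKEKILL, the stopped/traced exception, and SIGKILL-only wakeups for killable sleeps).

```c
#include <stdbool.h>

/* Illustrative bit values; the real definitions live in the kernel headers. */
enum {
	SK_TASK_INTERRUPTIBLE   = 0x01,
	SK_TASK_UNINTERRUPTIBLE = 0x02,
	SK_TASK_STOPPED_BIT     = 0x04, /* stands in for __TASK_STOPPED */
	SK_TASK_TRACED_BIT      = 0x08, /* stands in for __TASK_TRACED */
	SK_TASK_WAKEKILL        = 0x80,
	SK_TASK_KILLABLE        = SK_TASK_WAKEKILL | SK_TASK_UNINTERRUPTIBLE,
};

/* Hypothetical, minimal view of a task's signal state. */
struct sk_task {
	bool signal_pending;       /* any signal is pending */
	bool fatal_signal_pending; /* SIGKILL is pending */
};

/* Returns true if a task about to sleep in 'state' must not go to sleep. */
static bool sk_signal_pending_state(long state, const struct sk_task *p)
{
	/* Fast path: only interruptible or killable sleeps care about signals. */
	if (!(state & (SK_TASK_INTERRUPTIBLE | SK_TASK_WAKEKILL)))
		return false;
	if (!p->signal_pending)
		return false;
	/* Preserve the existing behaviour for stopped and traced tasks. */
	if (state & (SK_TASK_STOPPED_BIT | SK_TASK_TRACED_BIT))
		return false;
	/* Interruptible: any signal counts. Killable: only a fatal one. */
	return (state & SK_TASK_INTERRUPTIBLE) || p->fatal_signal_pending;
}
```

Passing the state as a separate argument, as the message notes, lets callers other than schedule() reuse the same check.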

29 May, 2008 4 commits

revert ("sched: fair-group: SMP-nice for group scheduling") · 6363ca57

Authored by Ingo Molnar on May 29, 2008

Yanmin Zhang reported:

Comparing with 2.6.25, volanoMark has big regression with kernel 2.6.26-rc1.
It's about 50% on my 8-core stoakley, 16-core tigerton, and Itanium Montecito.

With bisect, I located the following patch:

| 18d95a28 is first bad commit
| commit 18d95a28
| Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Date:   Sat Apr 19 19:45:00 2008 +0200
|
|     sched: fair-group: SMP-nice for group scheduling

Revert it so that we get v2.6.25 behavior.
Bisected-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6363ca57

sched: cleanup · 4285f594

Authored by Ingo Molnar on May 16, 2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>

4285f594

sched: unite unlikely pairs in rt_policy() and schedule_debug() · 3f33a7ce

Authored by Roel Kluin on May 13, 2008

Removes obfuscation and may improve assembly.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3f33a7ce

revert ("sched: fair: weight calculations") · f9305d4a

Authored by Ingo Molnar on May 29, 2008

Yanmin Zhang reported:

Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has many
regressions with 2.6.26-rc1:

 1) 8-core stoakley: 28%;
 2) 16-core tigerton: 20%;
 3) Itanium Montvale: 50%.

Bisect located this patch:

| 8f1bc385 is first bad commit
| commit 8f1bc385
| Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Date:   Sat Apr 19 19:45:00 2008 +0200
|
|     sched: fair: weight calculations

Revert it to the 2.6.25 state.
Bisected-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f9305d4a

15 May, 2008 1 commit

cgroups: fix compile warning · 0c70814c

Authored by Mirco Tischler on May 14, 2008

Return type of cpu_rt_runtime_write() should be int instead of ssize_t.
Signed-off-by: Mirco Tischler <mt-ml@gmx.de>
Acked-by: Paul Menage <menage@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0c70814c

12 May, 2008 1 commit

Add new 'cond_resched_bkl()' helper function · c3921ab7

Authored by Linus Torvalds on May 11, 2008

It acts exactly like a regular 'cond_resched()', but will not get
optimized away when CONFIG_PREEMPT is set.

Normal kernel code is already preemptable in the presence of
CONFIG_PREEMPT, so cond_resched() is optimized away (see commit
02b67cc3 "sched: do not do
cond_resched() when CONFIG_PREEMPT").

But when wanting to conditionally reschedule while holding a lock, you
need to use "cond_resched_lock(lock)", and the new function is the BKL
equivalent of that.

Also make fs/locks.c use it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c3921ab7

11 May, 2008 1 commit

BKL: revert back to the old spinlock implementation · 8e3e076c

Authored by Linus Torvalds on May 10, 2008

The generic semaphore rewrite had a huge performance regression on AIM7
(and potentially other BKL-heavy benchmarks) because the generic
semaphores had been rewritten to be simple to understand and fair.  The
latter, in particular, turns a semaphore-based BKL implementation into a
mess of scheduling.

The attempt to fix the performance regression failed miserably (see the
previous commit 00b41ec2 'Revert
"semaphore: fix"'), and so for now the simple and sane approach is to
instead just go back to the old spinlock-based BKL implementation that
never had any issues like this.

This patch also has the advantage of being reported to fix the
regression completely according to Yanmin Zhang, unlike the semaphore
hack which still left a couple percentage point regression.

As a spinlock, the BKL obviously has the potential to be a latency
issue, but it's not really any different from any other spinlock in that
respect.  We do want to get rid of the BKL asap, but that has been the
plan for several years.

These days, the biggest users are in the tty layer (open/release in
particular) and Alan holds out some hope:

  "tty release is probably a few months away from getting cured - I'm
   afraid it will almost certainly be the very last user of the BKL in
   tty to get fixed as it depends on everything else being sanely locked."

so while we're not there yet, we do have a plan of action.
Tested-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Alexander Viro <viro@ftp.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8e3e076c

06 May, 2008 10 commits

sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK · 3e51f33f

Authored by Peter Zijlstra on May 03, 2008

this replaces the rq->clock stuff (and possibly cpu_clock()).

 - architectures that have an 'imperfect' hardware clock can set
   CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

 - the 'jiffie' window might be superfluous when we update tick_gtod
   before the __update_sched_clock() call in sched_clock_tick()

 - cpu_clock() might be implemented as:

     sched_clock_cpu(smp_processor_id())

   if the accuracy proves good enough - how far can TSC drift in a
   single jiffie when considering the filtering and idle hooks?

[ mingo@elte.hu: various fixes and cleanups ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3e51f33f

sched: fix cpu clock · dfbf4a1b

Authored by Ingo Molnar on Apr 23, 2008

David Miller pointed out that nothing in cpu_clock() sets
prev_cpu_time. This caused __sync_cpu_clock() to be called
all the time - against the intention of this code.

The result was that in practice we hit a global spinlock every
time cpu_clock() is called - which - even though cpu_clock()
is used for tracing and debugging, is suboptimal.

While at it, also:

- move the irq disabling to the outermost layer,
  this should make cpu_clock() warp-free when called with irqs
  enabled.

- use long long instead of cycles_t - for platforms where cycles_t
  is 32-bit.
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

dfbf4a1b

sched: fair-group: fix a Div0 error of the fair group scheduler · cb4ad1ff

Authored by Miao Xie on Apr 28, 2008

When I echoed 0 into the "cpu.shares" file, a Div0 error occurred.

We found it is caused by the following call chain:

   sched_group_set_shares(tg, shares)
       set_se_shares(tg->se[i], shares/nr_cpu_ids)
           __set_se_shares(se, shares)
               div64_64((1ULL<<32), shares)

When the echoed value was less than the number of processors, the result of the
expression "shares/nr_cpu_ids" was 0, and when the system then called div64_64()
with that result as the divisor, the Div0 error occurred.

I think it is unnecessary to divide the shares value by nr_cpu_ids,
because the functions __update_group_shares_cpu() and init_tg_cfs_entry()
do not divide the shares value by nr_cpu_ids when setting the shares of the
sched entity.

This patch fixes this bug. Echoing the ULONG_MAX value into cpu.shares also
causes a Div0 error, so we add a macro MAX_SHARES to limit the maximum value of
shares.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

cb4ad1ff
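
The failure mode in cb4ad1ff is plain integer division: any shares value smaller than the CPU count divides down to 0 and then becomes a divisor. A small user-space sketch follows. The names `per_cpu_shares()` and `sk_inv_shares()` and the bounds passed to the latter are hypothetical and chosen only for illustration; the patch's actual limits are not shown in the message.

```c
#include <stdint.h>

/*
 * The problematic step: splitting shares across CPUs with integer
 * division. The result is 0 whenever shares < nr_cpus.
 */
static uint64_t per_cpu_shares(uint64_t shares, unsigned int nr_cpus)
{
	return shares / nr_cpus;
}

/*
 * Sketch of a safe inverse: clamp shares into [min_shares, max_shares]
 * before using it as a divisor. min_shares must be at least 1.
 */
static uint64_t sk_inv_shares(uint64_t shares, uint64_t min_shares,
			      uint64_t max_shares)
{
	if (shares < min_shares)
		shares = min_shares;
	if (shares > max_shares)
		shares = max_shares;
	return (UINT64_C(1) << 32) / shares;
}
```

For example, `per_cpu_shares(1, 4)` is 0, while `sk_inv_shares(0, 2, 1 << 18)` divides by 2 rather than by 0.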

sched: fix missing locking in sched_domains code · 712555ee

Authored by Heiko Carstens on Apr 28, 2008

Concurrent calls to detach_destroy_domains and arch_init_sched_domains
were prevented by the old scheduler subsystem cpu hotplug mutex. When
this got converted to get_online_cpus() the locking got broken.
Unlike before now several processes can concurrently enter the critical
sections that were protected by the old lock.

So use the already present doms_cur_mutex to protect these sections again.

Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

712555ee

sched: make clock sync tunable by architecture code · 690229a0

Authored by Ingo Molnar on Apr 23, 2008

make time_sync_thresh tunable to architecture code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

690229a0

sched: fix sched_info_switch not being called according to documentation · 673a90a1

Authored by David Simner on Apr 29, 2008

http://bugzilla.kernel.org/show_bug.cgi?id=10545

sched_stats.h says that __sched_info_switch is "called when prev !=
next" in the comment.  sched.c should therefore do that.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

673a90a1
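
The contract quoted in 673a90a1 amounts to guarding the accounting call with a `prev != next` test. The sketch below models that with hypothetical types and counters (`struct sk_task_info`, `sk_info_switch()`, `sk_context_switch()`); it is not the kernel code, only an illustration of the guard.

```c
/* Hypothetical per-task accounting record. */
struct sk_task_info {
	unsigned long switched_out;
	unsigned long switched_in;
};

/* Accounting hook: documented as "called when prev != next". */
static void sk_info_switch(struct sk_task_info *prev,
			   struct sk_task_info *next)
{
	prev->switched_out++;
	next->switched_in++;
}

/* Caller honours the documented contract by checking first. */
static void sk_context_switch(struct sk_task_info *prev,
			      struct sk_task_info *next)
{
	if (prev != next)
		sk_info_switch(prev, next);
}
```

Without the guard, a task that is picked again immediately would be counted as both leaving and arriving.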

sched: fix hrtick_start_fair and CPU-Hotplug · b328ca18

Authored by Peter Zijlstra on Apr 29, 2008

Gautham R Shenoy reported:

 > While running the usual CPU-Hotplug stress tests on linux-2.6.25,
 > I noticed the following in the console logs.
 >
 > This is a wee bit difficult to reproduce. In the past 10 runs I hit this
 > only once.
 >
 > ------------[ cut here ]------------
 >
 > WARNING: at kernel/sched.c:962 hrtick+0x2e/0x65()
 >
 > Just wondering if we are doing a good job at handling the cancellation
 > of any per-cpu scheduler timers during CPU-Hotplug.

This looks like it's indeed not cancelled at all and it migrates to
another cpu. Fix it via a proper hotplug notifier mechanism.
Reported-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b328ca18

sched: add statics, don't return void expressions · 983ed7a6

Authored by Harvey Harrison on Apr 24, 2008

Noticed by sparse:
kernel/sched.c:760:20: warning: symbol 'sched_feat_names' was not declared. Should it be static?
kernel/sched.c:767:5: warning: symbol 'sched_feat_open' was not declared. Should it be static?
kernel/sched_fair.c:845:3: warning: returning void-valued expression
kernel/sched.c:4386:3: warning: returning void-valued expression
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

983ed7a6

sched: add debug checks to idle functions · d478c2cf

Authored by Andrew Morton on Apr 26, 2008

Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: "Justin Mattock" <justinmattock@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d478c2cf

sched: optimize calc_delta_mine() · e05510d0

Authored by Peter Zijlstra on May 05, 2008

Joel noticed that the !lw->inv_weight condition isn't unlikely anymore so
remove the unlikely annotation. Also, remove the two div64_u64() inv_weight
calculations, which makes them rely on the calc_delta_mine() path as well.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e05510d0

01 May, 2008 1 commit

rename div64_64 to div64_u64 · 6f6d6a1a

Authored by Roman Zippel on May 01, 2008

Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide.  Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6f6d6a1a

29 Apr, 2008 2 commits

CGroups _s64 files: use read_s64/write_s64 in CFS cgroup for rt_runtime file · 06ecb27c

Authored by Paul Menage on Apr 29, 2008

This removes some filesystem boilerplate from the CFS cgroup subsystem.
Signed-off-by: Paul Menage <menage@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

06ecb27c

CGroup API files: rename read/write_uint methods to read_write_u64 · f4c753b7

Authored by Paul Menage on Apr 29, 2008

Several people have justifiably complained that the "_uint" suffix is
inappropriate for functions that handle u64 values, so this patch just renames
all these functions and their users to have the suffix _u64.

[peterz@infradead.org: build fix]
Signed-off-by: Paul Menage <menage@google.com>
Cc: "Li Zefan" <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "YAMAMOTO Takashi" <yamamoto@valinux.co.jp>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f4c753b7

25 Apr, 2008 4 commits

sched: use alloc_bootmem() instead of alloc_bootmem_low() · 5a9d3225

Authored by David Miller on Apr 24, 2008

There is no guarantee that there is physical ram below 4GB, and in
fact many boxes don't have exactly that.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5a9d3225

sched: fix share (re)distribution · 3f5087a2

Authored by Peter Zijlstra on Apr 25, 2008

fix __aggregate_redistribute_shares() related lockup reported by
David S. Miller.

The problem this code tries to solve is 'accurately' calculating the 'fair'
share of the group weight for each cpu. The current code falls back to a global
group rebalance in case the sched_domain's span it looks at has no shares, but
does have tasks.

The reason it gets stuck here is that it's inherently racy - if someone
steals the last task after we compute the agg->rq_weight, but before we
rebalance, we'll never get out of the loop.

We could of course go fix that, but while looking at this issue I found that
this 'fallback' wasn't nearly as rare as I'd hoped it to be. In fact it's quite
common - and given it walks the whole machine, that's very bad.

The new approach is simple (why didn't I think of it before?), we set the
aggregate shares to the full task group weight, and each larger sched domain
that encounters an aggregate shares larger than the weight, clips it (it
already re-distributes anyway).

This nicely converges to the desired global picture where the sum of all
shares equals the task group weight.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3f5087a2

[PATCH] Build fix for CONFIG_NUMA=y && CONFIG_SMP=n · 03970f06

Authored by Mike Travis on Apr 22, 2008

Regression caused by 434d53b0
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

03970f06

[IA64] fix bootmem regression on Altix · 472613b9

Authored by Russ Anderson on Apr 24, 2008

A recent change prevents SGI Altix from booting.
This patch fixes the problem.

The regression was introduced in commit 434d53b0
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

472613b9

23 Apr, 2008 1 commit

kernel-doc: fix sched.c missing parameter · 73486722

Authored by Randy Dunlap on Apr 22, 2008

Add missing kernel-doc in kernel/sched.c:

Warning(linux-2.6.25-git3//kernel/sched.c:7044): No description found for parameter 'span'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

73486722

20 Apr, 2008 11 commits

sched: features fix · c24b7c52

Authored by Ingo Molnar on Apr 18, 2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>

c24b7c52

sched: /debug/sched_features · f00b45c1

Authored by Peter Zijlstra on Apr 19, 2008

provide a text based interface to the scheduler features; this saves the
'user' from setting bits using decimal arithmetic.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f00b45c1
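
A text interface of the kind described in f00b45c1 maps feature names onto bits so that nobody has to compute the mask by hand. The sketch below assumes that writing a name sets its bit and writing the same name with a `NO_` prefix clears it. The feature names `ALPHA`, `BETA` and `GAMMA` and the function `sk_feat_write()` are hypothetical placeholders, not the scheduler's real feature list or code.

```c
#include <string.h>

/* Hypothetical feature names; bit i corresponds to sk_feat_names[i]. */
static const char *const sk_feat_names[] = { "ALPHA", "BETA", "GAMMA" };
#define SK_NR_FEATS (sizeof(sk_feat_names) / sizeof(sk_feat_names[0]))

/*
 * Apply one textual command to the feature mask.
 * "NAME" sets the bit, "NO_NAME" clears it.
 * Returns 0 on success, -1 if the name is unknown.
 */
static int sk_feat_write(unsigned int *features, const char *cmd)
{
	int disable = 0;
	size_t i;

	if (strncmp(cmd, "NO_", 3) == 0) {
		disable = 1;
		cmd += 3;
	}

	for (i = 0; i < SK_NR_FEATS; i++) {
		if (strcmp(cmd, sk_feat_names[i]) == 0) {
			if (disable)
				*features &= ~(1u << i);
			else
				*features |= 1u << i;
			return 0;
		}
	}
	return -1;
}
```

Starting from an empty mask, writing `BETA` and then `GAMMA` yields 6, and a subsequent `NO_BETA` leaves 4.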

sched: add SCHED_FEAT_DEADLINE · 06379aba

Authored by Ingo Molnar on Apr 19, 2008

unused at the moment.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

06379aba

sched: fair: weight calculations · 8f1bc385

Authored by Peter Zijlstra on Apr 19, 2008

In order to level the hierarchy, we need to calculate load based on the
root view. That is, each task's load is in the same unit.

             A
            / \
           B   1
          / \
         2   3

To compute 1's load we do:

	   weight(1)
	--------------
	 rq_weight(A)

To compute 2's load we do:

	  weight(2)      weight(B)
	------------ * -----------
	rq_weight(B)   rq_weight(A)

This yields load fractions in comparable units.

The consequence is that it changes virtual time. We used to have:

                time_{i}
  vtime_{i} = ------------
               weight_{i}

  vtime = \Sum vtime_{i} = time / rq_weight.

But with the new way of load calculation we get that vtime equals time.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8f1bc385
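
The load fractions in 8f1bc385 can be worked through numerically for the small tree in the message (group B and task 1 on A, tasks 2 and 3 inside B). The helpers below are a floating-point sketch with hypothetical names, `sk_load_at_root()` and `sk_load_in_group()`; the kernel itself uses fixed-point weights, so this only illustrates the arithmetic.

```c
/* Load of a task queued directly on the root runqueue A. */
static double sk_load_at_root(double weight, double rq_weight_root)
{
	return weight / rq_weight_root;
}

/*
 * Load of a task queued inside group B, seen from the root:
 * its fraction of B's runqueue times B's fraction of A's runqueue.
 */
static double sk_load_in_group(double weight, double rq_weight_group,
			       double group_weight, double rq_weight_root)
{
	return (weight / rq_weight_group) * (group_weight / rq_weight_root);
}
```

Assuming every entity has weight 1024, rq_weight(A) and rq_weight(B) are both 2048, so task 1 gets a load of 0.5 and tasks 2 and 3 get 0.25 each, and the three fractions sum to 1.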

sched: fair-group: de-couple load-balancing from the rb-trees · 4a55bd5e

Authored by Peter Zijlstra on Apr 19, 2008

De-couple load-balancing from the rb-trees, so that I can change their
organization.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

4a55bd5e

sched: fair-group: SMP-nice for group scheduling · 18d95a28

Authored by Peter Zijlstra on Apr 19, 2008

Implement SMP nice support for the full group hierarchy.

On each load-balance action, compile a sched_domain wide view of the full
task_group tree. We compute the domain wide view when walking down the
hierarchy, and readjust the weights when walking back up.

After collecting and readjusting the domain wide view, we try to balance the
tasks within the task_groups. The current approach is to naively balance each
task group until we've moved the targeted amount of load.

Inspired by Srivatsa Vaddagiri's previous code and Abhishek Chandra's H-SMP
paper.

XXX: there will be some numerical issues due to the limited nature of
     SCHED_LOAD_SCALE wrt representing a task_group's influence on the
     total weight. When the tree is deep enough, or the task weight small
     enough, we'll run out of bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Abhishek Chandra <chandra@cs.umn.edu>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

18d95a28

sched, cpuset: customize sched domains, core · 1d3504fc

Authored by Hidetoshi Seto on Apr 15, 2008

[rebased for sched-devel/latest]

 - Add a new cpuset file, having levels:
     sched_relax_domain_level

 - Modify partition_sched_domains() and build_sched_domains()
   to take attributes parameter passed from cpuset.

 - Fill newidle_idx for node domains, which is currently unused but
   might be required if sched_relax_domain_level becomes higher.

 - We can change the default level by boot option 'relax_domain_level='.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

1d3504fc

sched: rt: multi level group constraints · b40b2e8e

Authored by Peter Zijlstra on Apr 19, 2008

multi level rt constraints
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b40b2e8e

sched: task_group hierarchy · f473aa5e

Authored by Peter Zijlstra on Apr 19, 2008

Add the full parent<->child relation thing into task_groups as well.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f473aa5e

sched: fix the task_group hierarchy for UID grouping · eff766a6

Authored by Peter Zijlstra on Apr 19, 2008

UID grouping doesn't actually have a task_group representing the root of
the task_group tree. Add one.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

eff766a6

sched: allow the group scheduler to have multiple levels · ec7dc8ac

Authored by Dhaval Giani on Apr 19, 2008

This patch makes the group scheduler multi hierarchy aware.

[a.p.zijlstra@chello.nl: rt-parts and assorted fixes]
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ec7dc8ac

OpenHarmony / kernel_linux · last synced more than 3 years ago