提交 · 2d452c9b10caeec455eb5e56a0ef4ed485178213 · openeuler / raspberrypi-kernel

29 6月, 2008 1 次提交

sched: sched_clock_cpu() based cpu_clock(), lockdep fix · 2d452c9b

由 Ingo Molnar 提交于 6月 29, 2008

Vegard Nossum reported:

> WARNING: at kernel/lockdep.c:2738 check_flags+0x142/0x160()

which happens due to:

 unsigned long long cpu_clock(int cpu)
 {
         unsigned long long clock;
         unsigned long flags;

         raw_local_irq_save(flags);

as lower level functions can take locks, we must not do that, use
proper lockdep-annotated irq save/restore.
Reported-by: NVegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2d452c9b

27 6月, 2008 32 次提交

sched: export cpu_clock · 4c9fe8ad

由 Ingo Molnar 提交于 6月 27, 2008

the rcutorture module relies on cpu_clock.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

4c9fe8ad

sched: make sched_{rt,fair}.c ifdefs more readable · 55e12e5e

由 Dhaval Giani 提交于 6月 24, 2008

Signed-off-by: NDhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

55e12e5e

sched: bias effective_load() error towards failing wake_affine(). · f5bfb7d9

由 Peter Zijlstra 提交于 6月 27, 2008

Measurement shows that the difference between cgroup:/ and cgroup:/foo
wake_affine() results is that the latter succeeds significantly more.

Therefore bias the calculations towards failing the test.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f5bfb7d9

sched: incremental effective_load() · f1d239f7

由 Peter Zijlstra 提交于 6月 27, 2008

Increase the accuracy of the effective_load values.

Not only consider the current increment (as per the attempted wakeup), but
also consider the delta between when we last adjusted the shares and the
current situation.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f1d239f7

sched: correct wakeup weight calculations · 83378269

由 Peter Zijlstra 提交于 6月 27, 2008

rw_i = {2, 4, 1, 0}
s_i = {2/7, 4/7, 1/7, 0}

wakeup on cpu0, weight=1

rw'_i = {3, 4, 1, 0}
s'_i = {3/8, 4/8, 1/8, 0}

s_0 = S * rw_0 / \Sum rw_j ->
  \Sum rw_j = S*rw_0/s_0 = 1*2*7/2 = 7 (correct)

s'_0 = S * (rw_0 + 1) / (\Sum rw_j + 1) =
       1 * (2+1) / (7+1) = 3/8 (correct

so we find that adding 1 to cpu0 gains 5/56 in weight
if say the other cpu were, cpu1, we'd also have to calculate its 4/56 loss
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

83378269

sched: fix mult overflow · 243e0e7b

由 Srivatsa Vaddagiri 提交于 6月 27, 2008

It was observed these mults can overflow.
Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

243e0e7b

sched: update shares on wakeup · 2398f2c6

由 Peter Zijlstra 提交于 6月 27, 2008

We found that the affine wakeup code needs rather accurate load figures
to be effective. The trouble is that updating the load figures is fairly
expensive with group scheduling. Therefore ratelimit the updating.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2398f2c6

sched: fix shares boost logic · cd80917e

由 Peter Zijlstra 提交于 6月 27, 2008

In case the domain is empty, pretend there is a single task on each cpu, so
that together with the boost logic we end up giving 1/n shares to each
cpu.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cd80917e

sched: disable source/target_load bias · 93b75217

由 Peter Zijlstra 提交于 6月 27, 2008

The bias given by source/target_load functions can be very large, disable
it by default to get faster convergence.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

93b75217

sched: optimize effective_load() · cb5ef42a

由 Peter Zijlstra 提交于 6月 27, 2008

s_i = S * rw_i / \Sum_j rw_j

 -> \Sum_j rw_j = S * rw_i / s_i

 -> s'_i = S * (rw_i + w) / (\Sum_j rw_j + w)

delta s = s' - s = S * (rw + w) / ((S * rw / s) + w)
        = s * (S * (rw + w) / (S * rw + s * w) - 1)

 a = S*(rw+w), b = S*rw + s*w

delta s = s * (a-b) / b

IOW, trade one divide for two multiplies
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cb5ef42a

sched: remove prio preference from balance decisions · 051c6764

由 Peter Zijlstra 提交于 6月 27, 2008

Priority looses much of its meaning in a hierarchical context. So don't
use it in balance decisions.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

051c6764

sched: fix task_h_load() · 4be9daaa

由 Peter Zijlstra 提交于 6月 27, 2008

Currently task_h_load() computes the load of a task and uses that to either
subtract it from the total, or add to it.

However, removing or adding a task need not have any effect on the total load
at all. Imagine adding a task to a group that is local to one cpu - in that
case the total load of that cpu is unaffected.

So properly compute addition/removal:

 s_i = S * rw_i / \Sum_j rw_j
 s'_i = S * (rw_i + wl) / (\Sum_j rw_j + wg)

then s'_i - s_i gives the change in load.

Where s_i is the shares for cpu i, S the group weight, rw_i the runqueue weight
for that cpu, wl the weight we add (subtract) and wg the weight contribution to
the runqueue.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

4be9daaa

sched: fix load scaling in group balancing · 42a3ac7d

由 Peter Zijlstra 提交于 6月 27, 2008

doing the load balance will change cfs_rq->load.weight (that's the whole point)
but since that's part of the scale factor, we'll scale back with a different
amount.

Weight getting smaller would result in an inflated moved_load which causes
it to stop balancing too soon.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

42a3ac7d

sched: hierarchical load vs find_busiest_group · 408ed066

由 Peter Zijlstra 提交于 6月 27, 2008

find_busiest_group() has some assumptions about task weight being in the
NICE_0_LOAD range. Hierarchical task groups break this assumption - fix this
by replacing it with the average task weight, which will adapt the situation.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

408ed066

sched: hierarchical load vs affine wakeups · bb3469ac

由 Peter Zijlstra 提交于 6月 27, 2008

With hierarchical grouping we can't just compare task weight to rq weight - we
need to scale the weight appropriately.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bb3469ac

sched: persistent average load per task · a8a51d5e

由 Peter Zijlstra 提交于 6月 27, 2008

Remove the fall-back to SCHED_LOAD_SCALE by remembering the previous value of
cpu_avg_load_per_task() - this is useful because of the hierarchical group
model in which task weight can be much smaller.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a8a51d5e

sched: fix sched_balance_self() smp group balancing · 039a1c41

由 Peter Zijlstra 提交于 6月 27, 2008

Finding the least idle cpu is more accurate when done with updated shares.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

039a1c41

sched: fix newidle smp group balancing · 3e5459b4

由 Peter Zijlstra 提交于 6月 27, 2008

Re-compute the shares on newidle - so we can make a decision based on
recent data.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3e5459b4

sched: simplify the group load balancer · c8cba857

由 Peter Zijlstra 提交于 6月 27, 2008

While thinking about the previous patch - I realized that using per domain
aggregate load values in load_balance_fair() is wrong. We should use the
load value for that CPU.

By not needing per domain hierarchical load values we don't need to store
per domain aggregate shares, which greatly simplifies all the math.

It basically falls apart in two separate computations:
 - per domain update of the shares
 - per CPU update of the hierarchical load

Also get rid of the move_group_shares() stuff - just re-compute the shares
again after a successful load balance.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c8cba857

sched: no need to aggregate task_weight · a25b5aca

由 Peter Zijlstra 提交于 6月 27, 2008

We only need to know the task_weight of the busiest rq - nothing to do
if there are no tasks there.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a25b5aca

sched: dont micro manage share losses · d3f40dba

由 Peter Zijlstra 提交于 6月 27, 2008

We used to try and contain the loss of 'shares' by playing arithmetic
games. Replace that by noticing that at the top sched_domain we'll
always have the full weight in shares to distribute.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d3f40dba

sched: kill task_group balancing · 53fecd8a

由 Srivatsa Vaddagiri 提交于 6月 27, 2008

The idea was to balance groups until we've reached the global goal, however
Vatsa rightly pointed out that we might never reach that goal this way -
hence take out this logic.

[ the initial rationale for this 'feature' was to promote max concurrency
  within a group - it does not however affect fairness ]
Reported-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

53fecd8a

sched: update aggregate when holding the RQs · 4d8d595d

由 Peter Zijlstra 提交于 6月 27, 2008

It was observed that in __update_group_shares_cpu()

  rq_weight > aggregate()->rq_weight

This is caused by forks/wakeups in between the initial aggregate pass and
locking of the RQs for load balance. To avoid this situation partially re-do
the aggregation once we have the RQs locked (which avoids new tasks from
appearing).
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

4d8d595d

sched: fix sched_domain aggregation · b6a86c74

由 Peter Zijlstra 提交于 6月 27, 2008

Keeping the aggregate on the first cpu of the sched domain has two problems:
 - it could collide between different sched domains on different cpus
 - it could slow things down because of the remote accesses
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b6a86c74

sched: add full schedstats to /proc/sched_debug · 32df2ee8

由 Peter Zijlstra 提交于 6月 27, 2008

show all the schedstats in /debug/sched_debug as well.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

32df2ee8

sched: fix wakeup granularity and buddy granularity · 103638d9

由 Peter Zijlstra 提交于 6月 27, 2008

Uncouple buddy selection from wakeup granularity.

The initial idea was that buddies could run ahead as far as a normal task
can - do this by measuring a pair 'slice' just as we do for a normal task.

This means we can drop the wakeup_granularity back to 5ms.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

103638d9

sched: sched_clock_cpu() based cpu_clock() · 76a2a6ee

由 Peter Zijlstra 提交于 6月 27, 2008

with sched_clock_cpu() being reasonably in sync between cpus (max 1 jiffy
difference) use this to provide cpu_clock().
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

76a2a6ee

sched: revert revert of: fair-group: SMP-nice for group scheduling · c09595f6

由 Peter Zijlstra 提交于 6月 27, 2008

Try again..

Initial commit: 18d95a28
Revert: 6363ca57Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c09595f6

sched: fix calc_delta_asym, #2 · ced8aa16

由 Peter Zijlstra 提交于 6月 27, 2008

Ok, so why are we in this mess, it was:

  1/w

but now we mixed that rw in the mix like:

 rw/w

rw being \Sum w suggests: fiddling w, we should also fiddle rw, humm?
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ced8aa16

sched: fix calc_delta_asym() · c9c294a6

由 Peter Zijlstra 提交于 6月 27, 2008

calc_delta_asym() is supposed to do the same as calc_delta_fair() except
linearly shrink the result for negative nice processes - this causes them
to have a smaller preemption threshold so that they are more easily preempted.

The problem is that for task groups se->load.weight is the per cpu share of
the actual task group weight; take that into account.

Also provide a debug switch to disable the asymmetry (which I still don't
like - but it does greatly benefit some workloads)

This would explain the interactivity issues reported against group scheduling.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c9c294a6

sched: revert the revert of: weight calculations · a7be37ac

由 Peter Zijlstra 提交于 6月 27, 2008

Try again..

initial commit: 8f1bc385
revert: f9305d4aSigned-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a7be37ac

sched: clean up some unused variables · bf647b62

由 Peter Zijlstra 提交于 6月 27, 2008

In file included from /mnt/build/linux-2.6/kernel/sched.c:1496:
/mnt/build/linux-2.6/kernel/sched_rt.c: In function '__enable_runtime':
/mnt/build/linux-2.6/kernel/sched_rt.c:339: warning: unused variable 'rd'
/mnt/build/linux-2.6/kernel/sched_rt.c: In function 'requeue_rt_entity':
/mnt/build/linux-2.6/kernel/sched_rt.c:692: warning: unused variable 'queue'
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bf647b62

25 6月, 2008 7 次提交

I
Merge branch 'linus' into sched/devel · f57aec5a
由 Ingo Molnar 提交于 6月 25, 2008
```
Conflicts:

	kernel/sched_rt.c
Signed-off-by: NIngo Molnar <mingo@elte.hu>
```
f57aec5a
L

Linux 2.6.26-rc8 · 543cf4cb
由 Linus Torvalds 提交于 6月 24, 2008

543cf4cb

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 · bd8c540f

由 Linus Torvalds 提交于 6月 24, 2008

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Eliminate NULL test after alloc_bootmem in iosapic_alloc_rte()
  [IA64] Handle count==0 in sn2_ptc_proc_write()
  [IA64] Fix boot failure on ia64/sn2

bd8c540f

Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes · 035cfc61

由 Linus Torvalds 提交于 6月 24, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
  [GFS2] fix gfs2 block allocation (cleaned up)
  [GFS2] BUG: unable to handle kernel paging request at ffff81002690e000

035cfc61

Merge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm · 919c0d14

由 Linus Torvalds 提交于 6月 24, 2008

* 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: Remove now unused structs from kvm_para.h
  x86: KVM guest: Use the paravirt clocksource structs and functions
  KVM: Make kvm host use the paravirt clocksource structs
  x86: Make xen use the paravirt clocksource structs and functions
  x86: Add structs and functions for paravirt clocksource
  KVM: VMX: Fix host msr corruption with preemption enabled
  KVM: ioapic: fix lost interrupt when changing a device's irq
  KVM: MMU: Fix oops on guest userspace access to guest pagetable
  KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend)
  KVM: MMU: Fix rmap_write_protect() hugepage iteration bug
  KVM: close timer injection race window in __vcpu_run
  KVM: Fix race between timer migration and vcpu migration

919c0d14

Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog · de08341a

由 Linus Torvalds 提交于 6月 24, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  Revert "[WATCHDOG] hpwdt: Add CFLAGS to get driver working"

de08341a

Merge branch 'x86-fixes-for-linus' of... · 9bf8a943

由 Linus Torvalds 提交于 6月 24, 2008

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  xen: remove support for non-PAE 32-bit

9bf8a943