Commits · 6cecd084d0fd27bb1e498e2829fd45846d806856 · openeuler / raspberrypi-kernel

09 Dec, 2009 1 commit

Authored by Peter Zijlstra on Nov 30, 2009

WAKEUP_RUNNING was an experiment, not sure why that ever ended up being
merged...
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6cecd084

17 Sep, 2009 1 commit

sched: Add new wakeup preemption mode: WAKEUP_RUNNING · ad4b78bb

Authored by Peter Zijlstra on Sep 16, 2009

Create a new wakeup preemption mode, preempt towards tasks that run
shorter on avg. It sets next buddy to be sure we actually run the task
we preempted for.

Test results:

 root@twins:~# while :; do :; done &
 [1] 6537
 root@twins:~# while :; do :; done &
 [2] 6538
 root@twins:~# while :; do :; done &
 [3] 6539
 root@twins:~# while :; do :; done &
 [4] 6540

 root@twins:/home/peter# ./latt -c4 sleep 4
 Entries: 48 (clients=4)

 Averages:
 ------------------------------
        Max          4750 usec
        Avg           497 usec
        Stdev         737 usec

 root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features

 root@twins:/home/peter# ./latt -c4 sleep 4
 Entries: 48 (clients=4)

 Averages:
 ------------------------------
        Max            14 usec
        Avg             5 usec
        Stdev           3 usec

Disabled by default - needs more testing.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <new-submission>

ad4b78bb
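
The WAKEUP_RUNNING rule described in ad4b78bb can be sketched as a small decision function. This is only an illustrative model, not the kernel code: the `Task` and `RunQueue` types, the `avg_running_ns` field and the function name are hypothetical, chosen to mirror the wording "preempt towards tasks that run shorter on avg" and "sets next buddy".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    avg_running_ns: int  # hypothetical: average time the task runs per stint


@dataclass
class RunQueue:
    current: Task
    next_buddy: Optional[Task] = None


def check_wakeup_preempt(rq: RunQueue, woken: Task) -> bool:
    """Return True if the woken task should preempt the current one.

    Assumed rule: preempt when the woken task runs shorter on average
    than the current task, and mark it as next buddy so that it is the
    task actually picked after the preemption.
    """
    if woken.avg_running_ns < rq.current.avg_running_ns:
        rq.next_buddy = woken
        return True
    return False


# A CPU hog is running; a short-running latency-sensitive task wakes up.
hog = Task("busy-loop", avg_running_ns=4_000_000)
sleeper = Task("latt-client", avg_running_ns=50_000)
rq = RunQueue(current=hog)
preempted = check_wakeup_preempt(rq, sleeper)
```

In this model a busy loop never preempts the short-running task on wakeup, which is the direction of the latency change shown in the test results above.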

16 Sep, 2009 3 commits

sched: Optimize cgroup vs wakeup a bit · 3b640894

Authored by Peter Zijlstra on Sep 16, 2009

We don't need to call update_shares() for each domain we iterate,
just for the largest one.

However, we should call it before wake_affine() as well, so that
wake_affine() can use up-to-date values too.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3b640894

sched: Implement a gentler fair-sleepers feature · 51e0304c

Authored by Ingo Molnar on Sep 16, 2009

Add back FAIR_SLEEPERS and GENTLE_FAIR_SLEEPERS.

FAIR_SLEEPERS is the old logic: credit sleepers with their sleep time.

GENTLE_FAIR_SLEEPERS dampens this a bit: 50% of their sleep time gets
credited.

The hope here is to still give the benefits of fair-sleepers logic
(quick wakeups, etc.) while not allowing them to have 100% of their
sleep time as if they were running.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

51e0304c
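
The difference between the two flags in 51e0304c can be illustrated with a toy model. It follows the commit message literally (full credit versus 50% credit) and is not the kernel's placement code; the function names, parameters and the `min_vruntime` clamp are assumptions made for the sketch.

```python
def sleeper_credit_ns(slept_ns: int, fair_sleepers: bool = True,
                      gentle: bool = True) -> int:
    """Credit granted to a waking task, per the commit description.

    FAIR_SLEEPERS alone credits the whole sleep time; adding
    GENTLE_FAIR_SLEEPERS halves that credit; without FAIR_SLEEPERS
    there is no credit at all.
    """
    if not fair_sleepers:
        return 0
    return slept_ns // 2 if gentle else slept_ns


def place_woken(own_vruntime_ns: int, min_vruntime_ns: int,
                credit_ns: int) -> int:
    """Hypothetical placement: start the task 'credit' behind the queue
    minimum, but never move it backwards from its own vruntime."""
    return max(own_vruntime_ns, min_vruntime_ns - credit_ns)


slept = 10_000_000  # 10 ms asleep
full = sleeper_credit_ns(slept, fair_sleepers=True, gentle=False)
half = sleeper_credit_ns(slept, fair_sleepers=True, gentle=True)
none = sleeper_credit_ns(slept, fair_sleepers=False)
```

A smaller credit places the waking task closer to the rest of the queue, so it still wakes quickly but gains less of an advantage over tasks that kept running.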

sched: Add a few SYNC hint knobs to play with · e69b0f1b

Authored by Peter Zijlstra on Sep 15, 2009

Currently we use overlap to weaken the SYNC hint, but allow it to
set the hint as well.

 echo NO_SYNC_WAKEUP > /debug/sched_features
 echo SYNC_MORE > /debug/sched_features

preserves pipe-test behaviour without using the WF_SYNC hint.

Worth playing with on more workloads...
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e69b0f1b

15 Sep, 2009 5 commits

sched: Feature to disable APERF/MPERF cpu_power · 8e6598af

Authored by Peter Zijlstra on Sep 03, 2009

I suspect a feed-back loop between cpuidle and the aperf/mperf
cpu_power bits: idle C-states lower the ratio, which leads to
lower cpu_power and then less load, which generates more idle
time, etc.

Put in a knob to disable it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8e6598af

sched: Improve latencies and throughput · 0ec9fab3

Authored by Mike Galbraith on Sep 15, 2009

Make the idle balancer more aggressive, to improve an
x264 encoding workload provided by Jason Garrett-Glaser:

 NEXT_BUDDY NO_LB_BIAS
 encoded 600 frames, 252.82 fps, 22096.60 kb/s
 encoded 600 frames, 250.69 fps, 22096.60 kb/s
 encoded 600 frames, 245.76 fps, 22096.60 kb/s

 NO_NEXT_BUDDY LB_BIAS
 encoded 600 frames, 344.44 fps, 22096.60 kb/s
 encoded 600 frames, 346.66 fps, 22096.60 kb/s
 encoded 600 frames, 352.59 fps, 22096.60 kb/s

 NO_NEXT_BUDDY NO_LB_BIAS
 encoded 600 frames, 425.75 fps, 22096.60 kb/s
 encoded 600 frames, 425.45 fps, 22096.60 kb/s
 encoded 600 frames, 422.49 fps, 22096.60 kb/s

Peter pointed out that this is better done via newidle_idx,
not via LB_BIAS, newidle balancing should look for where
there is load _now_, not where there was load 2 ticks ago.

Worst-case latencies are improved as well as no buddies
means less vruntime spread. (as per prior lkml discussions)

This change improves kbuild-peak parallelism as well.
Reported-by: Jason Garrett-Glaser <darkshikari@gmail.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1253011667.9128.16.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0ec9fab3
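
To compare the three x264 configurations reported in 0ec9fab3, the per-run figures can be averaged and expressed relative to the slowest setup. The script below only re-uses the fps numbers quoted in the commit message; the variable names are arbitrary.

```python
from statistics import mean

# Frames per second for the three runs of each feature combination,
# copied from the commit message.
fps = {
    "NEXT_BUDDY NO_LB_BIAS": [252.82, 250.69, 245.76],
    "NO_NEXT_BUDDY LB_BIAS": [344.44, 346.66, 352.59],
    "NO_NEXT_BUDDY NO_LB_BIAS": [425.75, 425.45, 422.49],
}

averages = {name: mean(runs) for name, runs in fps.items()}
baseline = averages["NEXT_BUDDY NO_LB_BIAS"]
speedup = {name: avg / baseline for name, avg in averages.items()}

for name, avg in averages.items():
    print(f"{name:26s} {avg:7.2f} fps  {speedup[name]:.2f}x")
```

With both NEXT_BUDDY and LB_BIAS disabled, the average throughput is roughly 1.7 times that of the NEXT_BUDDY NO_LB_BIAS runs.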

sched: Add some comments to the sched features · e26af0e8

Authored by Peter Zijlstra on Sep 11, 2009

Add text...
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e26af0e8

sched: Complete buddy switches · 3cb63d52

Authored by Mike Galbraith on Sep 11, 2009

Add a NEXT_BUDDY feature flag to aid in debugging.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3cb63d52

sched: Split WAKEUP_OVERLAP · e6b1b2c9

Authored by Peter Zijlstra on Sep 11, 2009

It consists of two conditions, split them out in separate toggles
so we can test them independently.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e6b1b2c9

11 Sep, 2009 1 commit

sched: Disable NEW_FAIR_SLEEPERS for now · 3f2aa307

Authored by Ingo Molnar on Sep 10, 2009

Nikos Chantziaras and Jens Axboe reported that turning off
NEW_FAIR_SLEEPERS improves desktop interactivity visibly.

Nikos described his experiences the following way:

  " With this setting, I can do "nice -n 19 make -j20" and
    still have a very smooth desktop and watch a movie at
    the same time.  Various other annoyances (like the
    "logout/shutdown/restart" dialog of KDE not appearing
    at all until the background fade-out effect has finished)
    are also gone.  So this seems to be the single most
    important setting that vastly improves desktop behavior,
    at least here. "

Jens described it the following way, referring to a 10-second
xmodmap scheduling delay he was trying to debug:

  " Then I tried switching NO_NEW_FAIR_SLEEPERS on, and then
    I get:

    Performance counter stats for 'xmodmap .xmodmap-carl':

         9.009137  task-clock-msecs         #      0.447 CPUs
               18  context-switches         #      0.002 M/sec
                1  CPU-migrations           #      0.000 M/sec
              315  page-faults              #      0.035 M/sec

    0.020167093  seconds time elapsed

    Woot! "

So disable it for now. In perf trace output I can see weird
delta timestamps:

  cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]

That nsec field is not supposed to be that large. More digging
is needed - but let's turn it off while the real bug is found.
Reported-by: Nikos Chantziaras <realnc@arcor.de>
Tested-by: Nikos Chantziaras <realnc@arcor.de>
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <4AA93D34.8040500@arcor.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3f2aa307

15 Jan, 2009 2 commits

sched: prefer wakers · e52fb7c0

Authored by Peter Zijlstra on Jan 14, 2009

Prefer tasks that wake other tasks to preempt quickly. This improves
performance because more work is available sooner.

The workload that prompted this patch was a kernel build over NFS4 (for some
curious and not understood reason we had to revert commit:
18de9735 to make any progress at all)

Without this patch a make -j8 bzImage (of x86-64 defconfig) would take
3m30-ish, with this patch we're down to 2m50-ish.

psql-sysbench/mysql-sysbench show a slight improvement in peak performance as
well, tbench and vmark seemed to not care.

It is possible to improve upon the build time (to 2m20-ish) but that seriously
destroys other benchmarks (just shows that there's more room for tinkering).

Much thanks to Mike who put in a lot of effort to benchmark things and proved
a worthy opponent with a competing patch.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e52fb7c0

mutex: implement adaptive spinning · 0d66bf6d

Authored by Peter Zijlstra on Jan 12, 2009

Change mutex contention behaviour such that it will sometimes busy wait on
acquisition - moving its behaviour closer to that of spinlocks.

This concept got ported to mainline from the -rt tree, where it was originally
implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.

Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
gave a 345% boost for VFS scalability on my testbox:

 # ./test-mutex-shm V 16 10 | grep "^avg ops"
 avg ops/sec:               296604

 # ./test-mutex-shm V 16 10 | grep "^avg ops"
 avg ops/sec:               85870

The key criterion for the busy wait is that the lock owner has to be running on
a (different) cpu. The idea is that as long as the owner is running, there is a
fair chance it'll release the lock soon, and thus we'll be better off spinning
instead of blocking/scheduling.

Since regular mutexes (as opposed to rtmutexes) do not atomically track the
owner, we add the owner in a non-atomic fashion and deal with the races in
the slowpath.

Furthermore, to ease the testing of the performance impact of this new code,
there is a means to disable this behaviour at runtime (without having to reboot
the system), when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y),
by issuing the following command:

 # echo NO_OWNER_SPIN > /debug/sched_features

This command re-enables spinning (this is also the default):

 # echo OWNER_SPIN > /debug/sched_features
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0d66bf6d
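
The spin-or-block criterion stated in 0d66bf6d can be written down as a predicate. This is a simplified model for illustration, not the mutex slowpath: the `Owner` type and the `should_spin` name are hypothetical, and the races around the non-atomically tracked owner are ignored.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    cpu: int        # CPU the lock owner was last seen on
    running: bool   # whether the owner is currently executing


def should_spin(owner: Owner, this_cpu: int, owner_spin: bool = True) -> bool:
    """Decide whether a contending task busy-waits instead of blocking.

    Per the commit message: spin only while the owner is running on a
    different CPU, and only if the OWNER_SPIN feature is enabled.
    """
    if not owner_spin:
        return False
    return owner.running and owner.cpu != this_cpu


# Owner is running on CPU 1 while we contend from CPU 0: worth spinning.
spin_case = should_spin(Owner(cpu=1, running=True), this_cpu=0)
# Owner has been scheduled out: spinning would only burn cycles.
block_case = should_spin(Owner(cpu=1, running=False), this_cpu=0)
```

The `owner_spin` argument stands in for the OWNER_SPIN / NO_OWNER_SPIN toggle shown above.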

05 Nov, 2008 1 commit

sched: backward looking buddy · 4793241b

Authored by Peter Zijlstra on Nov 04, 2008

Impact: improve/change/fix wakeup-buddy scheduling

Currently we only have a forward looking buddy, that is, we prefer to
schedule to the task we last woke up, under the presumption that it's
going to consume the data we just produced, and therefore will have
cache hot benefits.

This allows co-waking producer/consumer task pairs to run ahead of the
pack for a little while, keeping their cache warm. Without this, we
would interleave all pairs, utterly trashing the cache.

This patch introduces a backward looking buddy, that is, suppose that
in the above scenario, the consumer preempts the producer before it
can go to sleep, we will therefore miss the wakeup from consumer to
producer (it's already running, after all), breaking the cycle and
reverting to the cache-trashing interleaved schedule pattern.

The backward buddy will try to schedule back to the task that woke us
up in case the forward buddy is not available, under the assumption
that the last task will be the most cache hot task around, barring
current.

This will basically allow a task to continue after it got preempted.

In order to avoid starvation, we allow either buddy to get wakeup_gran
ahead of the pack.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

4793241b
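
The buddy preference described in 4793241b can be sketched as a pick function: try the forward (next) buddy, then the backward (last) buddy, and fall back to the leftmost entity. The `Entity` type, the function name and the exact comparison are assumptions made for this sketch; only the ordering and the wakeup_gran bound come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    name: str
    vruntime: int  # virtual runtime; smaller means more deserving


def pick_next(leftmost: Entity, next_buddy: Optional[Entity],
              last_buddy: Optional[Entity], wakeup_gran: int) -> Entity:
    """Prefer the forward buddy, then the backward buddy, else leftmost.

    To avoid starvation a buddy is only chosen while it is at most
    wakeup_gran ahead of the leftmost (fairest) entity.
    """
    for buddy in (next_buddy, last_buddy):
        if buddy is not None and buddy.vruntime - leftmost.vruntime <= wakeup_gran:
            return buddy
    return leftmost


fair = Entity("leftmost", vruntime=1_000)
producer = Entity("producer", vruntime=1_400)   # task that woke us (last buddy)
consumer = Entity("consumer", vruntime=9_000)   # task we woke (next buddy)

# The forward buddy has run too far ahead, so the backward buddy is used.
chosen = pick_next(fair, next_buddy=consumer, last_buddy=producer, wakeup_gran=500)
```

If neither buddy is within the bound, the ordinary fairest task runs, which is what limits how far a producer/consumer pair can get ahead of the pack.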

20 Oct, 2008 1 commit

sched: disable the hrtick for now · 0c4b83da

Authored by Ingo Molnar on Oct 20, 2008

David Miller reported that hrtick update overhead has tripled the
wakeup overhead on Sparc64.

That is too much - disable the HRTICK feature for now by default,
until a faster implementation is found.
Reported-by: David Miller <davem@davemloft.net>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0c4b83da

22 Sep, 2008 2 commits

sched: turn off WAKEUP_OVERLAP · f681bbd6

Authored by Ingo Molnar on Sep 22, 2008

WAKEUP_OVERLAP is not a winner on a 16way box, running psql+sysbench:

       .27-rc7-NO_WAKEUP_OVERLAP  .27-rc7-WAKEUP_OVERLAP
-------------------------------------------------
    1:             694              811    +14.39%
    2:            1454             1427    -1.86%
    4:            3017             3070    +1.70%
    8:            5694             5808    +1.96%
   16:           10592            10612    +0.19%
   32:            9693             9647    -0.48%
   64:            8507             8262    -2.97%
  128:            8402             7087    -18.55%
  256:            8419             5124    -64.30%
  512:            7990             3671    -117.62%
-------------------------------------------------
  SUM:           64466            55524    -16.11%

... so turn it off by default.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f681bbd6

sched: wakeup preempt when small overlap · 15afe09b

Authored by Peter Zijlstra on Sep 20, 2008

Lin Ming reported a 10% OLTP regression against 2.6.27-rc4.

The difference seems to come from different preemption aggressiveness,
which affects the cache footprint of the workload and its effective
cache trashing.

Aggressively preempt a task if its avg overlap is very small, this should
avoid the task going to sleep and find it still running when we schedule
back to it - saving a wakeup.
Reported-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

15afe09b
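
The "small overlap" test from 15afe09b can be illustrated as a threshold check. The commit message does not spell out what "very small" means, so the threshold value, the parameter names and the choice to compare both tasks against it are assumptions of this sketch.

```python
# Hypothetical threshold below which an average overlap counts as
# "very small"; the value is an assumption for this sketch.
SMALL_OVERLAP_NS = 500_000


def preempt_on_small_overlap(curr_avg_overlap_ns: int,
                             woken_avg_overlap_ns: int,
                             threshold_ns: int = SMALL_OVERLAP_NS) -> bool:
    """Preempt aggressively when both tasks overlap only briefly.

    'Overlap' is taken to be how long a task keeps running after it
    wakes another task. If that is short, the waker is about to sleep
    anyway, so switching to the wakee right away saves a wakeup.
    """
    return (curr_avg_overlap_ns < threshold_ns
            and woken_avg_overlap_ns < threshold_ns)


# A tightly coupled pair: each runs only ~20 us after waking the other.
tight_pair = preempt_on_small_overlap(20_000, 25_000)
# The current task keeps computing for milliseconds after a wakeup.
loose_pair = preempt_on_small_overlap(3_000_000, 25_000)
```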

21 Aug, 2008 1 commit

sched: enable LB_BIAS by default · efc2dead

Authored by Peter Zijlstra on Aug 20, 2008

Yanmin reported a significant regression on his 16-core machine due to:

  commit 93b75217
  Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Date:   Fri Jun 27 13:41:33 2008 +0200

Flip back to the old behaviour.
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

efc2dead

27 Jun, 2008 5 commits

sched: bias effective_load() error towards failing wake_affine(). · f5bfb7d9

Authored by Peter Zijlstra on Jun 27, 2008

Measurement shows that the difference between cgroup:/ and cgroup:/foo
wake_affine() results is that the latter succeeds significantly more.

Therefore bias the calculations towards failing the test.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f5bfb7d9

sched: update shares on wakeup · 2398f2c6

Authored by Peter Zijlstra on Jun 27, 2008

We found that the affine wakeup code needs rather accurate load figures
to be effective. The trouble is that updating the load figures is fairly
expensive with group scheduling. Therefore ratelimit the updating.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

2398f2c6
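
Ratelimiting an expensive update, as 2398f2c6 does for the group load figures, follows a common pattern: remember when the last update ran and skip calls that arrive too soon. The class below is a generic sketch of that pattern; its names and the interval are hypothetical and it does not reproduce the kernel's share calculation.

```python
from typing import Callable, Optional


class RateLimitedUpdater:
    """Run an expensive recomputation at most once per interval."""

    def __init__(self, min_interval_ns: int, recompute: Callable[[], None]):
        self.min_interval_ns = min_interval_ns
        self._recompute = recompute
        self._last_ns: Optional[int] = None

    def maybe_update(self, now_ns: int) -> bool:
        """Recompute if enough time has passed; return whether it ran."""
        if self._last_ns is not None and now_ns - self._last_ns < self.min_interval_ns:
            return False  # too soon: keep using the slightly stale figures
        self._recompute()
        self._last_ns = now_ns
        return True


calls = []
updater = RateLimitedUpdater(min_interval_ns=250_000,
                             recompute=lambda: calls.append("update"))

# Wakeups at 0 us, 100 us and 300 us: only the first and third recompute.
results = [updater.maybe_update(t) for t in (0, 100_000, 300_000)]
```

The trade-off is the one the commit message names: the wakeup path gets reasonably fresh load figures without paying the full update cost on every wakeup.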

sched: disable source/target_load bias · 93b75217

Authored by Peter Zijlstra on Jun 27, 2008

The bias given by source/target_load functions can be very large, disable
it by default to get faster convergence.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

93b75217

sched: fix calc_delta_asym() · c9c294a6

Authored by Peter Zijlstra on Jun 27, 2008

calc_delta_asym() is supposed to do the same as calc_delta_fair() except
linearly shrink the result for negative nice processes - this causes them
to have a smaller preemption threshold so that they are more easily preempted.

The problem is that for task groups se->load.weight is the per cpu share of
the actual task group weight; take that into account.

Also provide a debug switch to disable the asymmetry (which I still don't
like - but it does greatly benefit some workloads)

This would explain the interactivity issues reported against group scheduling.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c9c294a6

sched: revert the revert of: weight calculations · a7be37ac

Authored by Peter Zijlstra on Jun 27, 2008

Try again..

initial commit: 8f1bc385
revert: f9305d4a
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a7be37ac

10 Jun, 2008 1 commit

sched: trivial sched_features cleanup · 6492c7f8

Authored by Mike Galbraith on Jun 08, 2008

Remove unused debug/tuning features.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6492c7f8

20 Apr, 2008 1 commit

sched: /debug/sched_features · f00b45c1

Authored by Peter Zijlstra on Apr 19, 2008

Provide a text-based interface to the scheduler features; this saves the
'user' from setting bits using decimal arithmetic.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f00b45c1
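
The idea behind the text interface in f00b45c1 is that writing a feature name sets its bit and writing the name with a NO_ prefix clears it, as in the `echo NO_OWNER_SPIN > /debug/sched_features` examples above. A minimal user-space model of that mapping follows; the feature list, its ordering and the function names are illustrative assumptions, not the kernel's table.

```python
# Illustrative feature table: the position in the list is the bit number.
# The real set of features and their bit positions differ per kernel.
FEATURES = ["NEXT_BUDDY", "LB_BIAS", "HRTICK", "OWNER_SPIN"]


def apply_feature(mask: int, token: str) -> int:
    """Return the mask after applying 'NAME' (set) or 'NO_NAME' (clear)."""
    token = token.strip()
    negate = token.startswith("NO_")
    name = token[3:] if negate else token
    if name not in FEATURES:
        raise ValueError(f"unknown feature: {token!r}")
    bit = 1 << FEATURES.index(name)
    return mask & ~bit if negate else mask | bit


def format_features(mask: int) -> str:
    """Render the mask as text, listing disabled features as NO_NAME."""
    return " ".join(
        name if mask & (1 << i) else "NO_" + name
        for i, name in enumerate(FEATURES)
    )


mask = 0b0101                                   # NEXT_BUDDY and HRTICK enabled
before = format_features(mask)
mask = apply_feature(mask, "NO_HRTICK\n")       # trailing newline, as from echo
mask = apply_feature(mask, "OWNER_SPIN")
after = format_features(mask)
```

Without such an interface the user would have to work out the numeric value of the mask by hand, which is the "decimal arithmetic" the commit message refers to.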