1. 10 1月, 2018 5 次提交
    • J
      sched/fair: Consider RT/IRQ pressure in capacity_spare_wake() · f453ae22
      Joel Fernandes 提交于
      capacity_spare_wake() in the slow path influences choice of idlest groups,
      as we search for groups with maximum spare capacity. In scenarios where
      RT pressure is high, a sub optimal group can be chosen and hurt
      performance of the task being woken up.
      
      Fix this by using capacity_of() instead of capacity_orig_of() in capacity_spare_wake().
      
      Tests results from improvements with this change are below. More tests
      were also done by myself and Matt Fleming to ensure no degradation in
      different benchmarks.
      
      1) Rohit ran barrier.c test (details below) with following improvements:
      ------------------------------------------------------------------------
      This was Rohit's original use case for a patch he posted at [1] however
      from his recent tests he showed my patch can replace his slow path
      changes [1] and there's no need to selectively scan/skip CPUs in
      find_idlest_group_cpu in the slow path to get the improvement he sees.
      
      barrier.c (open_mp code) as a micro-benchmark. It does a number of
      iterations and barrier sync at the end of each for loop.
      
      Here barrier,c is running in along with ping on CPU 0 and 1 as:
      'ping -l 10000 -q -s 10 -f hostX'
      
      barrier.c can be found at:
      http://www.spinics.net/lists/kernel/msg2506955.html
      
      Following are the results for the iterations per second with this
      micro-benchmark (higher is better), on a 44 core, 2 socket 88 Threads
      Intel x86 machine:
      +--------+------------------+---------------------------+
      |Threads | Without patch    | With patch                |
      |        |                  |                           |
      +--------+--------+---------+-----------------+---------+
      |        | Mean   | Std Dev | Mean            | Std Dev |
      +--------+--------+---------+-----------------+---------+
      |1       | 539.36 | 60.16   | 572.54 (+6.15%) | 40.95   |
      |2       | 481.01 | 19.32   | 530.64 (+10.32%)| 56.16   |
      |4       | 474.78 | 22.28   | 479.46 (+0.99%) | 18.89   |
      |8       | 450.06 | 24.91   | 447.82 (-0.50%) | 12.36   |
      |16      | 436.99 | 22.57   | 441.88 (+1.12%) | 7.39    |
      |32      | 388.28 | 55.59   | 429.4  (+10.59%)| 31.14   |
      |64      | 314.62 | 6.33    | 311.81 (-0.89%) | 11.99   |
      +--------+--------+---------+-----------------+---------+
      
      2) ping+hackbench test on bare-metal sever (by Rohit)
      -----------------------------------------------------
      Here hackbench is running in threaded mode along
      with, running ping on CPU 0 and 1 as:
      'ping -l 10000 -q -s 10 -f hostX'
      
      This test is running on 2 socket, 20 core and 40 threads Intel x86
      machine:
      Number of loops is 10000 and runtime is in seconds (Lower is better).
      
      +--------------+-----------------+--------------------------+
      |Task Groups   | Without patch   |  With patch              |
      |              +-------+---------+----------------+---------+
      |(Groups of 40)| Mean  | Std Dev |  Mean          | Std Dev |
      +--------------+-------+---------+----------------+---------+
      |1             | 0.851 | 0.007   |  0.828 (+2.77%)| 0.032   |
      |2             | 1.083 | 0.203   |  1.087 (-0.37%)| 0.246   |
      |4             | 1.601 | 0.051   |  1.611 (-0.62%)| 0.055   |
      |8             | 2.837 | 0.060   |  2.827 (+0.35%)| 0.031   |
      |16            | 5.139 | 0.133   |  5.107 (+0.63%)| 0.085   |
      |25            | 7.569 | 0.142   |  7.503 (+0.88%)| 0.143   |
      +--------------+-------+---------+----------------+---------+
      
      [1] https://patchwork.kernel.org/patch/9991635/
      
      Matt Fleming also ran several different hackbench tests and cyclic test
      to santiy-check that the patch doesn't harm other usecases.
      Tested-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Tested-by: NRohit Jain <rohit.k.jain@oracle.com>
      Signed-off-by: NJoel Fernandes <joelaf@google.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NVincent Guittot <vincent.guittot@linaro.org>
      Reviewed-by: NDietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Atish Patra <atish.patra@oracle.com>
      Cc: Brendan Jackman <brendan.jackman@arm.com>
      Cc: Chris Redpath <Chris.Redpath@arm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Morten Ramussen <morten.rasmussen@arm.com>
      Cc: Patrick Bellasi <patrick.bellasi@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Saravana Kannan <skannan@quicinc.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Steve Muckle <smuckle@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vikram Mulukutla <markivx@codeaurora.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Link: http://lkml.kernel.org/r/20171214212158.188190-1-joelaf@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f453ae22
    • P
      sched/fair: Use 'unsigned long' for utilization, consistently · f01415fd
      Patrick Bellasi 提交于
      Utilization and capacity are tracked as 'unsigned long', however some
      functions using them return an 'int' which is ultimately assigned back to
      'unsigned long' variables.
      
      Since there is not scope on using a different and signed type,
      consolidate the signature of functions returning utilization to always
      use the native type.
      
      This change improves code consistency, and it also benefits
      code paths where utilizations should be clamped by avoiding
      further type conversions or ugly type casts.
      Signed-off-by: NPatrick Bellasi <patrick.bellasi@arm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NChris Redpath <chris.redpath@arm.com>
      Reviewed-by: NBrendan Jackman <brendan.jackman@arm.com>
      Reviewed-by: NDietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Todd Kjos <tkjos@android.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Link: http://lkml.kernel.org/r/20171205171018.9203-2-patrick.bellasi@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f01415fd
    • R
      sched/core: Rework and clarify prepare_lock_switch() · 31cb1bc0
      rodrigosiqueira 提交于
      The prepare_lock_switch() function has an unused parameter, and also the
      function name was not descriptive. To improve readability and remove
      the extra parameter, do the following changes:
      
      * Move prepare_lock_switch() from kernel/sched/sched.h to
        kernel/sched/core.c, rename it to prepare_task(), and remove the
        unused parameter.
      
      * Split the smp_store_release() out from finish_lock_switch() to a
        function named finish_task.
      
      * Comments ajdustments.
      Signed-off-by: NRodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20171215140603.gxe5i2y6fg5ojfpp@smtp.gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      31cb1bc0
    • I
      cb1f34dd
    • M
      membarrier: Disable preemption when calling smp_call_function_many() · 54167607
      Mathieu Desnoyers 提交于
      smp_call_function_many() requires disabling preemption around the call.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: <stable@vger.kernel.org> # v4.14+
      Cc: Andrea Parri <parri.andrea@gmail.com>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Avi Kivity <avi@scylladb.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maged Michael <maged.michael@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20171215192310.25293-1-mathieu.desnoyers@efficios.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      54167607
  2. 09 1月, 2018 1 次提交
  3. 06 1月, 2018 10 次提交
  4. 05 1月, 2018 24 次提交