1. 19 Sep 2014 (3 commits)
  2. 09 Sep 2014 (1 commit)
  3. 07 Sep 2014 (1 commit)
    • sched/deadline: Fix a precision problem in the microseconds range · 177ef2a6
      Committed by xiaofeng.yan
      An overrun could happen in function start_hrtick_dl()
      when a task with SCHED_DEADLINE runs in the microseconds
      range.
      
      For example, if a task with SCHED_DEADLINE has the following parameters:
      
        Task  runtime  deadline  period
         P1   200us     500us    500us
      
      The deadline and period from task P1 are less than 1ms.
      
      In order to achieve microsecond precision, we need to enable the HRTICK
      feature with the following commands:
      
        PC#echo "HRTICK" > /sys/kernel/debug/sched_features
        PC#trace-cmd record -e sched_switch &
        PC#./schedtool -E -t 200000:500000:500000 -e ./test
      
      The test binary here is just an endless while(1) loop.
      Some pieces of trace.dat are as follows:
      
        <idle>-0   157.603157: sched_switch: :R ==> 2481:4294967295: test
        test-2481  157.603203: sched_switch:  2481:R ==> 0:120: swapper/2
        <idle>-0   157.605657: sched_switch:  :R ==> 2481:4294967295: test
        test-2481  157.608183: sched_switch:  2481:R ==> 2483:120: trace-cmd
        trace-cmd-2483 157.609656: sched_switch:2483:R==>2481:4294967295: test
      
      We can get the runtime of P1 from the information above:
      
        runtime = 157.608183 - 157.605657
        runtime = 0.002526(2.526ms)
      
      The correct runtime should be less than or equal to 200us per period.
      
      The problem is caused by the conditional check "delta > 10000" in
      start_hrtick_dl(): when the remaining runtime is less than 10us, no
      hrtimer is started to bound it, so the task keeps running until the
      next tick period arrives.
      
      To fix this, move the minimum-time-slice limit from hrtick_start_fair()
      into hrtick_start(), because the deadline (EDF) scheduling class also
      needs it in start_hrtick_dl(). start_hrtick_dl() then starts the hrtick
      unconditionally, and hrtick_start() makes sure the scheduling slice is
      never smaller than 10us.
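
      As a rough illustration, here is a minimal user-space C sketch of the
      clamp the patch places in hrtick_start(); min_slice_ns and clamp_slice()
      are illustrative names, not the kernel's:

        #include <stdint.h>
        #include <stdio.h>

        /* Illustrative lower bound: never arm a slice shorter than 10us. */
        static const int64_t min_slice_ns = 10000;

        /* Sketch of the clamp hrtick_start() applies before arming the timer. */
        static int64_t clamp_slice(int64_t requested_ns)
        {
            return requested_ns < min_slice_ns ? min_slice_ns : requested_ns;
        }

        int main(void)
        {
            /* A deadline task with only 3us of runtime left still gets 10us. */
            printf("slice = %lld ns\n", (long long)clamp_slice(3000));
            return 0;
        }
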
      Signed-off-by: Xiaofeng Yan <xiaofeng.yan@huawei.com>
      Reviewed-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Juri Lelli <juri.lelli@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1409022941-5880-1-git-send-email-xiaofeng.yan@huawei.com
      [ Massaged the changelog and the code. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      177ef2a6
  4. 05 Sep 2014 (1 commit)
  5. 20 Aug 2014 (4 commits)
  6. 12 Aug 2014 (5 commits)
  7. 28 Jul 2014 (1 commit)
  8. 16 Jul 2014 (2 commits)
  9. 05 Jul 2014 (10 commits)
    • sched/fair: Disable runtime_enabled on dying rq · 0e59bdae
      Committed by Kirill Tkhai
      We kill rq->rd on the CPU_DOWN_PREPARE stage:
      
      	cpuset_cpu_inactive -> cpuset_update_active_cpus -> partition_sched_domains ->
      	-> cpu_attach_domain -> rq_attach_root -> set_rq_offline
      
      This unthrottles all throttled cfs_rqs.
      
      But the cpu is still able to call schedule() till
      
      	take_cpu_down->__cpu_disable()
      
      is called from stop_machine.
      
      In this window the tasks from the just-unthrottled cfs_rqs are pickable
      in the standard scheduler way, and they are picked by the dying CPU.
      The cfs_rqs become throttled again, and migrate_tasks()
      in migration_call skips their tasks (one more unthrottle
      in migrate_tasks()->CPU_DYING does not happen, because rq->rd
      is already NULL).
      
      This patch sets runtime_enabled to zero. That guarantees runtime is not
      accounted, the cfs_rqs won't exceed the cfs_rq->runtime_remaining = 1
      they were given, and tasks stay pickable in migrate_tasks().
      runtime_enabled is recalculated when the rq becomes online again.
      
      Ben Segall also noticed that we always enable runtime in
      tg_set_cfs_bandwidth(); we should do that for online CPUs only.
      To prevent races with unthrottle_offline_cfs_rqs()
      we take the get_online_cpus() lock.
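
      A hedged C sketch of the idea, with simplified stand-in types rather than
      the kernel's structures: clear runtime_enabled when the rq goes offline so
      no further runtime is accounted, and recompute it from the group's quota
      when the rq comes back online.

        #include <stdbool.h>

        struct cfs_rq_sketch {
            bool runtime_enabled;          /* bandwidth accounting active? */
            long long runtime_remaining;
        };

        /* Sketch: called while the rq goes offline (rq->rd is being killed). */
        static void rq_offline_cfs(struct cfs_rq_sketch *cfs_rq)
        {
            /* Stop accounting runtime so this cfs_rq can never throttle again
             * on the dying CPU; migrate_tasks() can then always pick its tasks. */
            cfs_rq->runtime_enabled = false;
            cfs_rq->runtime_remaining = 1;
        }

        /* Sketch: recomputed from the task group's quota when the rq is online. */
        static void rq_online_cfs(struct cfs_rq_sketch *cfs_rq, bool tg_has_quota)
        {
            cfs_rq->runtime_enabled = tg_has_quota;
        }
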
      Reviewed-by: Ben Segall <bsegall@google.com>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      CC: Konstantin Khorenko <khorenko@parallels.com>
      CC: Paul Turner <pjt@google.com>
      CC: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403684382.3462.42.camel@tkhai
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0e59bdae
    • sched/numa: Change scan period code to match intent · a22b4b01
      Committed by Rik van Riel
      Reading through the scan period code and comment, it appears the
      intent was to slow down NUMA scanning when a majority of accesses
      are on the local node, specifically a local:remote ratio of 3:1.
      
      However, the code actually tests local / (local + remote), and
      the actual cut-off point was around 30% local accesses, well before
      a task has actually converged on a node.
      
      Changing the threshold to 7 means scanning slows down when a task
      has around 70% of its accesses local, which appears to match the
      intent of the code more closely.
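
      A small C sketch of the cut-off described above; the slot and threshold
      constants follow the commit's description (7 of 10 slots, i.e. roughly
      70% local accesses) and the helper name is illustrative:

        #include <stdbool.h>

        #define NUMA_PERIOD_SLOTS     10
        #define NUMA_PERIOD_THRESHOLD 7    /* was 3, i.e. roughly 30% local */

        /* Sketch: may NUMA scanning slow down, given the recent fault counts? */
        static bool mostly_local(unsigned long local, unsigned long remote)
        {
            unsigned long ratio;

            if (local + remote == 0)
                return false;

            /* local / (local + remote), scaled to NUMA_PERIOD_SLOTS */
            ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote);
            return ratio >= NUMA_PERIOD_THRESHOLD;
        }
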
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: mgorman@suse.de
      Cc: chegu_vinod@hp.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403538095-31256-8-git-send-email-riel@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a22b4b01
    • sched/numa: Rework best node setting in task_numa_migrate() · db015dae
      Committed by Rik van Riel
      Fix up the best node setting in task_numa_migrate() to deal with a task
      in a pseudo-interleaved NUMA group, which is already running in the
      best location.
      
      Set the task's preferred nid to the current nid, so task migration is
      not retried at a high rate.
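
      A hedged sketch of the early-out, with illustrative names: when the best
      node found is the one the task is already running on, just record it as
      the preferred nid instead of retrying the migration.

        struct task_sketch {
            int numa_preferred_nid;
        };

        /* Sketch of the task_numa_migrate()-style decision. */
        static void settle_or_migrate(struct task_sketch *p,
                                      int best_nid, int current_nid)
        {
            if (best_nid == current_nid) {
                /* Already in the best location: remember it so migration
                 * is not retried at a high rate. */
                p->numa_preferred_nid = current_nid;
                return;
            }
            /* ...otherwise try to migrate toward best_nid (not shown). */
        }
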
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: mgorman@suse.de
      Cc: chegu_vinod@hp.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403538095-31256-7-git-send-email-riel@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      db015dae
    • sched/numa: Examine a task move when examining a task swap · 0132c3e1
      Committed by Rik van Riel
      Running "perf bench numa mem -0 -m -P 1000 -p 8 -t 20" on a 4
      node system results in 160 runnable threads on a system with 80
      CPU threads.
      
      Once a process has nearly converged, with 39 threads on one node
      and 1 thread on another node, the remaining thread will be unable
      to migrate to its preferred node through a task swap.
      
      However, a simple task move would make the workload converge,
      without causing an imbalance.
      
      Test for this unlikely occurrence, and attempt a task move to
      the preferred nid when it happens.
      
       # Running main, "perf bench numa mem -p 8 -t 20 -0 -m -P 1000"
      
       ###
       # 160 tasks will execute (on 4 nodes, 80 CPUs):
       #         -1x     0MB global  shared mem operations
       #         -1x  1000MB process shared mem operations
       #         -1x     0MB thread  local  mem operations
       ###
      
       ###
       #
       #    0.0%  [0.2 mins]  0/0   1/1  36/2   0/0  [36/3 ] l:  0-0   (  0) {0-2}
       #    0.0%  [0.3 mins] 43/3  37/2  39/2  41/3  [ 6/10] l:  0-1   (  1) {1-2}
       #    0.0%  [0.4 mins] 42/3  38/2  40/2  40/2  [ 4/9 ] l:  1-2   (  1) [50.0%] {1-2}
       #    0.0%  [0.6 mins] 41/3  39/2  40/2  40/2  [ 2/9 ] l:  2-4   (  2) [50.0%] {1-2}
       #    0.0%  [0.7 mins] 40/2  40/2  40/2  40/2  [ 0/8 ] l:  3-5   (  2) [40.0%] (  41.8s converged)
      
      Without this patch, this same perf bench numa mem run had to
      rely on the scheduler load balancer to first balance out the
      load (moving a random task), before a task swap could complete
      the NUMA convergence.
      
      The load balancer does not normally take action unless the load
      difference exceeds 25%. Convergence times of over half an hour
      have been observed without this patch.
      
      With this patch, the NUMA balancing code will simply migrate the
      task, if that does not cause an imbalance.
      
      Also skip examining a CPU in detail if the improvement on that CPU
      is no more than the best we already have.
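
      A hedged C sketch of the fallback, with illustrative names and a
      simplified imbalance test (not the kernel's exact arithmetic): a CPU that
      cannot beat the best improvement so far is skipped, and a plain task move
      is only taken when it does not create an imbalance.

        #include <stdbool.h>

        struct numa_env_sketch {
            long src_load, dst_load;
            long best_improvement;
        };

        /* Hypothetical "would a plain move unbalance the two nodes?" test. */
        static bool move_would_imbalance(const struct numa_env_sketch *env,
                                         long task_load)
        {
            return env->dst_load + task_load > env->src_load - task_load;
        }

        /* Sketch: after looking for swap candidates, also consider a move. */
        static bool try_task_move(struct numa_env_sketch *env,
                                  long task_load, long improvement)
        {
            /* Skip detailed examination if this CPU cannot beat the best. */
            if (improvement <= env->best_improvement)
                return false;
            if (move_would_imbalance(env, task_load))
                return false;
            env->best_improvement = improvement;
            return true;
        }
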
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: chegu_vinod@hp.com
      Cc: mgorman@suse.de
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-ggthh0rnh0yua6o5o3p6cr1o@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0132c3e1
    • sched/numa: Simplify task_numa_compare() · 1c5d3eb3
      Committed by Rik van Riel
      When a task is part of a numa_group, the comparison should always use
      the group weight, in order to make workloads converge.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: chegu_vinod@hp.com
      Cc: mgorman@suse.de
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403538378-31571-4-git-send-email-riel@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1c5d3eb3
    • sched/numa: Use effective_load() to balance NUMA loads · 6dc1a672
      Committed by Rik van Riel
      When CONFIG_FAIR_GROUP_SCHED is enabled, the load that a task places
      on a CPU is determined by the group the task is in. The active groups
      on the source and destination CPU can be different, resulting in a
      different load contribution by the same task at its source and at its
      destination. As a result, the load needs to be calculated separately
      for each CPU, instead of estimated once with task_h_load().
      
      Getting this calculation right allows some workloads to converge,
      where previously the last thread could get stuck on another node,
      without being able to migrate to its final destination.
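
      A hedged toy sketch of the accounting change, with illustrative names:
      the task's contribution is evaluated once against the source CPU's group
      hierarchy and once against the destination's, instead of reusing a single
      task_h_load() value.

        /* Toy stand-in for effective_load(): the load a task adds on a CPU
         * depends on the share its group already has there, so the same task
         * can contribute different amounts on source and destination. */
        static long task_load_on(long group_weight_on_cpu,
                                 long group_total_weight, long task_weight)
        {
            if (group_total_weight == 0)
                return task_weight;
            /* Toy model only; the real effective_load() walks the group tree. */
            return task_weight * group_weight_on_cpu / group_total_weight;
        }

        /* Sketch: account the move with per-CPU contributions. */
        static void account_numa_move(long src_grp_weight, long dst_grp_weight,
                                      long grp_total, long task_weight,
                                      long *src_load, long *dst_load)
        {
            *src_load -= task_load_on(src_grp_weight, grp_total, task_weight);
            *dst_load += task_load_on(dst_grp_weight, grp_total, task_weight);
        }
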
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: mgorman@suse.de
      Cc: chegu_vinod@hp.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403538378-31571-3-git-send-email-riel@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6dc1a672
    • sched/numa: Move power adjustment into load_too_imbalanced() · 28a21745
      Committed by Rik van Riel
      Currently the NUMA code scales the load on each node with the
      amount of CPU power available on that node, but it does not
      apply any adjustment to the load of the task that is being
      moved over.
      
      On systems with SMT/HT, this results in a task being weighed
      much more heavily than a CPU core, and a task move that would
      even out the load between nodes being disallowed.
      
      The correct thing is to first apply the move of the tasks' loads to
      the source and destination numbers, and only then apply the power
      correction to them.
      
      This also allows us to do the power correction with a multiplication,
      rather than a division.
      
      Also drop two function arguments from load_too_imbalanced(), since it
      already takes various factors from env.
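
      A hedged C sketch of the cross-multiplication (threshold handling is
      simplified): each side's load is scaled by the other side's capacity, so
      the comparison needs no division.

        #include <stdbool.h>

        /* Sketch: is the load still too imbalanced after moving the task's
         * load over?  imbalance_pct is a percentage such as 125 (25% slack). */
        static bool load_too_imbalanced(long src_load, long dst_load,
                                        long src_capacity, long dst_capacity,
                                        int imbalance_pct)
        {
            long scaled_src = src_load * dst_capacity;
            long scaled_dst = dst_load * src_capacity;

            return scaled_dst * 100 > scaled_src * imbalance_pct;
        }
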
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: chegu_vinod@hp.com
      Cc: mgorman@suse.de
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403538378-31571-2-git-send-email-riel@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      28a21745
    • sched/numa: Use group's max nid as task's preferred nid · f0b8a4af
      Committed by Rik van Riel
      From task_numa_placement, always try to consolidate the tasks
      in a group on the group's top nid.
      
      In case this task is part of a group that is interleaved over
      multiple nodes, task_numa_migrate will set the task's preferred
      nid to the best node it could find for the task, so this patch
      will cause at most one run through task_numa_migrate.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: mgorman@suse.de
      Cc: chegu_vinod@hp.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403538095-31256-2-git-send-email-riel@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f0b8a4af
    • sched/fair: Implement fast idling of CPUs when the system is partially loaded · 4486edd1
      Committed by Tim Chen
      When a system is lightly loaded (i.e. no more than 1 job per CPU),
      attempting to pull a job to a CPU before putting it to idle is unnecessary
      and can be skipped.  This patch adds an indicator so the scheduler knows
      when there is no more than 1 active job on any CPU in the system and can
      skip the needless job pulls.
      
      On a 4 socket machine with a request/response kind of workload from
      clients, we saw about 0.13 msec delay when we go through a full load
      balance to try to pull a job from all the other CPUs.  While 0.1 msec was
      spent on processing the request and generating a response, the 0.13 msec
      load balance overhead was actually more than the actual work being done.
      This overhead can be skipped much of the time for lightly loaded systems.
      
      With this patch, we tested with a netperf request/response workload that
      has the server busy with half the cpus in a 4 socket system.  We found
      the patch eliminated 75% of the load balance attempts before idling a cpu.
      
      The overhead of setting/clearing the indicator is low, as we already
      gather the necessary info while we call add_nr_running() and
      update_sd_lb_stats().  We switch to full load balancing immediately if
      any CPU gets more than one job on its run queue in add_nr_running().
      We clear the indicator, to avoid load balancing, when we detect that no
      CPU has more than one job while scanning the run queues in
      update_sg_lb_stats().  We are aggressive in turning on the load balance
      and opportunistic in skipping the load balance.
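
      A hedged C sketch of the indicator, with simplified stand-in types:
      add_nr_running() sets an overload flag as soon as any run queue holds
      more than one task, and the load-balance statistics scan clears it once
      no run queue does.

        #include <stdbool.h>

        struct root_domain_sketch {
            bool overload;      /* any CPU with more than one runnable task? */
        };

        struct rq_sketch {
            unsigned int nr_running;
            struct root_domain_sketch *rd;
        };

        /* Sketch of add_nr_running(): turn the indicator on aggressively. */
        static void add_nr_running_sketch(struct rq_sketch *rq, unsigned int n)
        {
            rq->nr_running += n;
            if (rq->nr_running > 1)
                rq->rd->overload = true;
        }

        /* Sketch of the update_sg_lb_stats()/update_sd_lb_stats() side:
         * clear the indicator only after a full scan saw no busy run queue. */
        static void finish_lb_stats_scan(struct root_domain_sketch *rd,
                                         unsigned int max_nr_running_seen)
        {
            if (max_nr_running_seen <= 1)
                rd->overload = false;
        }
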
      Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Acked-by: Jason Low <jason.low2@hp.com>
      Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403551009.2970.613.camel@schen9-DESK
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4486edd1
    • sched: Fix potential near-infinite distribute_cfs_runtime() loop · c06f04c7
      Committed by Ben Segall
      distribute_cfs_runtime() intentionally only hands out enough runtime to
      bring each cfs_rq to 1 ns of runtime, expecting the cfs_rqs to then take
      the runtime they need only once they actually get to run. However, if
      they get to run sufficiently quickly, the period timer is still in
      distribute_cfs_runtime() and no runtime is available, causing them to
      throttle. Then distribute has to handle them again, and this can go on
      until distribute has handed out all of the runtime 1ns at a time, which
      takes far too long.
      
      Instead allow access to the same runtime that distribute is handing out,
      accepting that corner cases with very low quota may be able to spend the
      entire cfs_b->runtime during distribute_cfs_runtime, meaning that the
      runtime directly handed out by distribute_cfs_runtime was over quota. In
      addition, if a cfs_rq does manage to throttle like this, make sure the
      existing distribute_cfs_runtime no longer loops over it again.
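
      A hedged C sketch of the distribution loop, with simplified structures:
      each throttled cfs_rq is topped up to 1 ns, drawing from the live pool so
      the loop naturally stops once the pool is exhausted.

        struct cfs_bandwidth_sketch { long long runtime; };
        struct cfs_rq_sketch        { long long runtime_remaining; };

        /* Sketch: hand each throttled cfs_rq just enough to reach 1 ns of
         * runtime, taking it straight from the shared pool. */
        static void distribute_runtime_sketch(struct cfs_bandwidth_sketch *cfs_b,
                                              struct cfs_rq_sketch **throttled,
                                              int n)
        {
            for (int i = 0; i < n && cfs_b->runtime > 0; i++) {
                long long want = -throttled[i]->runtime_remaining + 1;

                if (want <= 0)                 /* already has runtime */
                    continue;
                if (want > cfs_b->runtime)
                    want = cfs_b->runtime;

                throttled[i]->runtime_remaining += want;
                cfs_b->runtime -= want;
            }
        }
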
      Signed-off-by: Ben Segall <bsegall@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140620222120.13814.21652.stgit@sword-of-the-dawn.mtv.corp.google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c06f04c7
  10. 19 Jun 2014 (3 commits)
  11. 09 Jun 2014 (1 commit)
  12. 05 Jun 2014 (8 commits)
    • sched: Rename capacity related flags · 5d4dfddd
      Committed by Nicolas Pitre
      It is better not to think about compute capacity as being equivalent
      to "CPU power".  The upcoming "power aware" scheduler work may create
      confusion with the notion of energy consumption if "power" is used too
      liberally.
      
      Let's rename the following feature flags since they do relate to capacity:
      
      	SD_SHARE_CPUPOWER  -> SD_SHARE_CPUCAPACITY
      	ARCH_POWER         -> ARCH_CAPACITY
      	NONTASK_POWER      -> NONTASK_CAPACITY
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Andy Fleming <afleming@freescale.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: devicetree@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/n/tip-e93lpnxb87owfievqatey6b5@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5d4dfddd
    • sched: Final power vs. capacity cleanups · ca8ce3d0
      Committed by Nicolas Pitre
      It is better not to think about compute capacity as being equivalent
      to "CPU power".  The upcoming "power aware" scheduler work may create
      confusion with the notion of energy consumption if "power" is used too
      liberally.
      
      This contains the architecture visible changes.  Incidentally, only ARM
      takes advantage of the available pow^H^H^Hcapacity scaling hooks and
      therefore those changes outside kernel/sched/ are confined to one ARM
      specific file.  The default arch_scale_smt_power() hook is not overridden
      by anyone.
      
      Replacements are as follows:
      
      	arch_scale_freq_power  --> arch_scale_freq_capacity
      	arch_scale_smt_power   --> arch_scale_smt_capacity
      	SCHED_POWER_SCALE      --> SCHED_CAPACITY_SCALE
      	SCHED_POWER_SHIFT      --> SCHED_CAPACITY_SHIFT
      
      The local usage of "power" in arch/arm/kernel/topology.c is also changed
      to "capacity" as appropriate.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Brown <broonie@linaro.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: devicetree@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-48zba9qbznvglwelgq2cfygh@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ca8ce3d0
    • sched: Remove remaining dubious usage of "power" · ced549fa
      Committed by Nicolas Pitre
      It is better not to think about compute capacity as being equivalent
      to "CPU power".  The upcoming "power aware" scheduler work may create
      confusion with the notion of energy consumption if "power" is used too
      liberally.
      
      This is the remaining "power" -> "capacity" rename for local symbols.
      Those symbols visible to the rest of the kernel are not included yet.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-yyyhohzhkwnaotr3lx8zd5aa@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ced549fa
    • sched: Let 'struct sched_group_power' care about CPU capacity · 63b2ca30
      Committed by Nicolas Pitre
      It is better not to think about compute capacity as being equivalent
      to "CPU power".  The upcoming "power aware" scheduler work may create
      confusion with the notion of energy consumption if "power" is used too
      liberally.
      
      Since struct sched_group_power is really about compute capacity of sched
      groups, let's rename it to struct sched_group_capacity. Similarly sgp
      becomes sgc. Related variables and functions dealing with groups are also
      adjusted accordingly.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-5yeix833vvgf2uyj5o36hpu9@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      63b2ca30
    • sched/fair: Disambiguate existing/remaining "capacity" usage · 0fedc6c8
      Committed by Nicolas Pitre
      We have "power" (which should actually become "capacity") and "capacity"
      which is a scaled down "capacity factor" in terms of unitary tasks.
      Let's use "capacity_factor" to make room for proper usage of "capacity"
      later.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-gk1co8sqdev3763opqm6ovml@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0fedc6c8
    • sched/fair: Change "has_capacity" to "has_free_capacity" · 1b6a7495
      Committed by Nicolas Pitre
      The capacity of a CPU/group should be some intrinsic value that doesn't
      change with task placement.  It is like a container whose capacity is
      stable regardless of the amount of liquid in it (its "utilization")...
      unless the container itself is crushed that is, but that's another story.
      
      Therefore let's rename "has_capacity" to "has_free_capacity" in order to
      better convey the wanted meaning.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-djzkk027jm0e8x8jxy70opzh@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1b6a7495
    • sched/fair: Remove "power" from 'struct numa_stats' · 5ef20ca1
      Committed by Nicolas Pitre
      It is better not to think about compute capacity as being equivalent
      to "CPU power".  The upcoming "power aware" scheduler work may create
      confusion with the notion of energy consumption if "power" is used too
      liberally.
      
      To make things explicit and not create more confusion with the existing
      "capacity" member, let's rename things as follows:
      
      	power    -> compute_capacity
      	capacity -> task_capacity
      
      Note: none of those fields are actually used outside update_numa_stats().
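
      An abbreviated sketch of 'struct numa_stats' after the rename (other
      members and comments trimmed):

        struct numa_stats {
            /* Total compute capacity of the CPUs on the node (was "power"). */
            unsigned long compute_capacity;

            /* Approximate capacity in terms of runnable tasks (was "capacity"). */
            unsigned long task_capacity;
        };
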
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-2e2ndymj5gyshyjq8am79f20@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5ef20ca1
    • sched/fair: Use time_after() in record_wakee() · 2538d960
      Committed by Manuel Schölling
      To be future-proof and for better readability, the time comparisons are
      modified to use time_after() instead of plain, error-prone math.
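
      A short user-space sketch of why this matters: the signed-difference form
      that time_after() expands to stays correct across counter wraparound,
      unlike a plain "a > b" comparison (the macro below only mirrors the
      kernel definition in spirit).

        #include <limits.h>
        #include <stdio.h>

        /* User-space stand-in for time_after(a, b): true if a is after b,
         * robust against wraparound of the underlying counter. */
        #define time_after_sketch(a, b)  ((long)((b) - (a)) < 0)

        int main(void)
        {
            unsigned long last = ULONG_MAX - 16;  /* just before wraparound */
            unsigned long now  = 16;              /* just after wraparound  */

            printf("plain:      %d\n", now > last);                   /* 0 */
            printf("time_after: %d\n", time_after_sketch(now, last)); /* 1 */
            return 0;
        }
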
      Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1400780723-24626-1-git-send-email-manuel.schoelling@gmx.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2538d960