1. 28 Oct, 2014 (9 commits)
  2. 03 Oct, 2014 (1 commit)
  3. 24 Sep, 2014 (7 commits)
  4. 21 Sep, 2014 (1 commit)
  5. 19 Sep, 2014 (4 commits)
  6. 09 Sep, 2014 (1 commit)
  7. 07 Sep, 2014 (1 commit)
    • sched/deadline: Fix a precision problem in the microseconds range · 177ef2a6
      Committed by xiaofeng.yan
      An overrun could happen in function start_hrtick_dl()
      when a task with SCHED_DEADLINE runs in the microseconds
      range.
      
      For example, if a task with SCHED_DEADLINE has the following parameters:
      
        Task  runtime  deadline  period
         P1   200us     500us    500us
      
      The deadline and period of task P1 are both less than 1ms.
      
      To achieve microsecond precision, we need to enable the HRTICK feature
      with the following commands:
      
        PC#echo "HRTICK" > /sys/kernel/debug/sched_features
        PC#trace-cmd record -e sched_switch &
        PC#./schedtool -E -t 200000:500000:500000 -e ./test
      
      The test binary simply spins in an endless while(1) loop.
      Some pieces of trace.dat are as follows:
      
        <idle>-0   157.603157: sched_switch: :R ==> 2481:4294967295: test
        test-2481  157.603203: sched_switch:  2481:R ==> 0:120: swapper/2
        <idle>-0   157.605657: sched_switch:  :R ==> 2481:4294967295: test
        test-2481  157.608183: sched_switch:  2481:R ==> 2483:120: trace-cmd
        trace-cmd-2483 157.609656: sched_switch:2483:R==>2481:4294967295: test
      
      We can get the runtime of P1 from the information above:
      
        runtime = 157.608183 - 157.605657
        runtime = 0.002526(2.526ms)
      
      The correct runtime should be less than or equal to 200us per period.
      
      The problem is caused by the conditional check "delta > 10000" in
      start_hrtick_dl(): when the remaining runtime is less than 10us, no
      hrtimer is started to bound it, so the task keeps running until the
      next tick period arrives.
      
      Move the code that enforces the minimum time slice from
      hrtick_start_fair() to hrtick_start(), because the deadline (EDF)
      scheduling class also needs it in start_hrtick_dl().
      
      To fix this problem, we call hrtimer_start() unconditionally in
      start_hrtick_dl(), and make sure the scheduling slice won't be smaller
      than 10us in hrtick_start().
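      A minimal sketch of the resulting logic, assuming the simplified
      non-SMP hrtick path (the in-tree SMP variant defers to an IPI helper);
      the bodies below are illustrative, not copied from the patch:
      
      	/* kernel/sched/core.c: enforce the minimum slice in hrtick_start() */
      	void hrtick_start(struct rq *rq, u64 delay)
      	{
      		/* never program the hrtimer for less than 10us */
      		delay = max_t(u64, delay, 10000LL);
      		hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
      			      HRTIMER_MODE_REL_PINNED);
      	}
      
      	/* kernel/sched/deadline.c: no "delta > 10000" check any more */
      	static void start_hrtick_dl(struct rq *rq, struct task_struct *p)
      	{
      		hrtick_start(rq, p->dl.runtime);
      	}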
      Signed-off-by: Xiaofeng Yan <xiaofeng.yan@huawei.com>
      Reviewed-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Juri Lelli <juri.lelli@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1409022941-5880-1-git-send-email-xiaofeng.yan@huawei.com
      [ Massaged the changelog and the code. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 25 Aug, 2014 (1 commit)
  9. 20 Aug, 2014 (4 commits)
    • sched: Remove double_rq_lock() from __migrate_task() · a1e01829
      Committed by Kirill Tkhai
      Avoid double_rq_lock() and use TASK_ON_RQ_MIGRATING for
      __migrate_task(). The advantage is (obviously) not holding two
      rq->lock's at the same time and thereby increasing parallelism.
      
      The important point to note is that, because we acquire dst->lock
      immediately after releasing src->lock, the potential wait time of
      task_rq_lock() callers on TASK_ON_RQ_MIGRATING is no longer than it
      would have been in the double rq lock scenario.
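      For context, the removed helper takes both rq->lock's at once, ordered
      by address to avoid ABBA deadlock, and they stay held across the whole
      move; roughly (a sketch of the classic pattern, not the exact in-tree
      body):
      
      	static void double_rq_lock(struct rq *rq1, struct rq *rq2)
      	{
      		if (rq1 == rq2) {
      			raw_spin_lock(&rq1->lock);
      		} else if (rq1 < rq2) {
      			/* always take the lower-addressed lock first */
      			raw_spin_lock(&rq1->lock);
      			raw_spin_lock_nested(&rq2->lock, SINGLE_DEPTH_NESTING);
      		} else {
      			raw_spin_lock(&rq2->lock);
      			raw_spin_lock_nested(&rq1->lock, SINGLE_DEPTH_NESTING);
      		}
      	}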
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Kirill Tkhai <tkhai@yandex.ru>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1408528070.23412.89.camel@tkhai
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: Teach scheduler to understand TASK_ON_RQ_MIGRATING state · cca26e80
      Committed by Kirill Tkhai
      This is a new p->on_rq state used to indicate that a task is in the
      process of migrating between two RQs. It allows us to get rid of
      double_rq_lock(), which we previously used to change the rq of a
      queued task.
      
      Let's consider an example. To move a task between src_rq and
      dst_rq we will do the following:
      
      	raw_spin_lock(&src_rq->lock);
      	/* p is a task which is queued on src_rq */
      	p = ...;
      
      	dequeue_task(src_rq, p, 0);
      	p->on_rq = TASK_ON_RQ_MIGRATING;
      	set_task_cpu(p, dst_cpu);
      	raw_spin_unlock(&src_rq->lock);
      
          	/*
          	 * Both RQs are unlocked here.
          	 * Task p is dequeued from src_rq
          	 * but its on_rq value is not zero.
          	 */
      
      	raw_spin_lock(&dst_rq->lock);
      	p->on_rq = TASK_ON_RQ_QUEUED;
      	enqueue_task(dst_rq, p, 0);
      	raw_spin_unlock(&dst_rq->lock);
      
      While p->on_rq is TASK_ON_RQ_MIGRATING, the task is considered to be
      "migrating", and other scheduler actions on it are not available to
      parallel callers. A parallel caller spins until the migration is
      completed.
      
      The unavailable actions are changing the cpu affinity, changing the
      priority, etc.; in other words, all the functionality related to the
      task which used to require task_rq(p)->lock.
      
      To implement TASK_ON_RQ_MIGRATING support, we primarily rely on the
      following fact: most scheduler users (from which we are protecting a
      migrating task) use task_rq_lock() and __task_rq_lock() to take the
      lock of task_rq(p). These primitives know that the task's cpu may
      change, and they spin until they hold the lock of the right RQ. We add
      one more condition to them, so they also spin until the migration is
      finished.
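      A minimal sketch of that extra condition, assuming a
      task_on_rq_migrating() helper that tests p->on_rq ==
      TASK_ON_RQ_MIGRATING (simplified: the real primitives also deal with
      p->pi_lock and irq flags):
      
      	static struct rq *__task_rq_lock(struct task_struct *p)
      	{
      		struct rq *rq;
      
      		for (;;) {
      			rq = task_rq(p);
      			raw_spin_lock(&rq->lock);
      			/* also wait out an in-flight migration */
      			if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
      				return rq;
      			raw_spin_unlock(&rq->lock);
      
      			while (unlikely(task_on_rq_migrating(p)))
      				cpu_relax();
      		}
      	}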
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Kirill Tkhai <tkhai@yandex.ru>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1408528062.23412.88.camel@tkhai
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: Add wrapper for checking task_struct::on_rq · da0c1e65
      Committed by Kirill Tkhai
      Implement task_on_rq_queued() and use it everywhere instead of the raw
      on_rq check. No functional changes.
      
      The only exception is check_for_tasks(), where we do not use the
      wrapper because that would require exporting task_on_rq_queued() in
      global header files. The next patch in the series converts it as well,
      so we do not shuffle it back and forth here.
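      A minimal sketch of the wrapper, assuming the on_rq states introduced
      in this series:
      
      	/* kernel/sched/sched.h (sketch) */
      	#define TASK_ON_RQ_QUEUED	1
      	#define TASK_ON_RQ_MIGRATING	2
      
      	static inline int task_on_rq_queued(struct task_struct *p)
      	{
      		return p->on_rq == TASK_ON_RQ_QUEUED;
      	}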
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Kirill Tkhai <tkhai@yandex.ru>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1408528052.23412.87.camel@tkhai
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: s/do_each_thread/for_each_process_thread/ in core.c · 5d07f420
      Committed by Oleg Nesterov
      Change kernel/sched/core.c to use for_each_process_thread().
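      The conversion pattern looks roughly like this (an illustrative sketch,
      not an exact hunk from the patch):
      
      	/* before: old-style paired macros */
      	do_each_thread(g, p) {
      		/* ... per-thread work ... */
      	} while_each_thread(g, p);
      
      	/* after: a single iterator with the same (g, p) cursors */
      	for_each_process_thread(g, p) {
      		/* ... per-thread work ... */
      	}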
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Frank Mayhar <fmayhar@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Sanjay Rao <srao@redhat.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140813191953.GA19315@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 13 Aug, 2014 (1 commit)
    • locking: Remove deprecated smp_mb__() barriers · 2e39465a
      Committed by Peter Zijlstra
      It's been a while and there are no in-tree users left, so remove the
      deprecated barriers.
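      For context, the deprecated wrappers were compatibility aliases around
      the generic atomic-op barriers; the mapping below is recalled from the
      earlier deprecation, not quoted from this patch's diff:
      
      	/* deprecated (removed here)     ->  generic replacement      */
      	smp_mb__before_atomic_dec();     /* smp_mb__before_atomic()   */
      	smp_mb__after_atomic_inc();      /* smp_mb__after_atomic()    */
      	smp_mb__before_clear_bit();      /* smp_mb__before_atomic()   */
      	smp_mb__after_clear_bit();       /* smp_mb__after_atomic()    */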
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Chen, Gong <gong.chen@linux.intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  11. 12 Aug, 2014 (1 commit)
  12. 07 Aug, 2014 (1 commit)
  13. 28 Jul, 2014 (3 commits)
  14. 16 Jul, 2014 (3 commits)
  15. 15 Jul, 2014 (1 commit)
    • cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes · 5577964e
      Committed by Tejun Heo
      Currently, cgroup_subsys->base_cftypes is used for both the unified
      default hierarchy and legacy ones and subsystems can mark each file
      with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear
      only on one of them.  This is quite hairy and error-prone.  Also, we
      may end up exposing interface files to the default hierarchy without
      thinking it through.
      
      cgroup_subsys will grow two separate cftype arrays and apply each only
      on the hierarchies of the matching type.  This will allow organizing
      cftypes in a much clearer way and encourage subsystems to scrutinize
      the interface which is being exposed in the new default hierarchy.
      
      In preparation, this patch renames cgroup_subsys->base_cftypes to
      cgroup_subsys->legacy_cftypes.  This patch is a pure rename.
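      For a controller, the change is just the field name in its
      cgroup_subsys definition; roughly (an illustrative sketch using the cpu
      controller's names, not an exact hunk from the patch):
      
      	/* before */
      	struct cgroup_subsys cpu_cgrp_subsys = {
      		/* ... callbacks ... */
      		.base_cftypes	= cpu_files,
      	};
      
      	/* after */
      	struct cgroup_subsys cpu_cgrp_subsys = {
      		/* ... callbacks ... */
      		.legacy_cftypes	= cpu_files,
      	};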
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  16. 05 Jul, 2014 (1 commit)
    • sched/fair: Disable runtime_enabled on dying rq · 0e59bdae
      Committed by Kirill Tkhai
      We kill rq->rd on the CPU_DOWN_PREPARE stage:
      
      	cpuset_cpu_inactive -> cpuset_update_active_cpus -> partition_sched_domains ->
      	-> cpu_attach_domain -> rq_attach_root -> set_rq_offline
      
      This unthrottles all throttled cfs_rqs.
      
      But the cpu is still able to call schedule() till
      
      	take_cpu_down->__cpu_disable()
      
      is called from stop_machine.
      
      In this case the tasks from the just-unthrottled cfs_rqs are pickable
      in the standard scheduler way, and they are picked by the dying cpu.
      The cfs_rqs become throttled again, and migrate_tasks() in
      migration_call() skips their tasks (another unthrottle in
      migrate_tasks()->CPU_DYING does not happen, because rq->rd is already
      NULL).
      
      The patch sets runtime_enabled to zero. This guarantees that runtime is
      not accounted, that the cfs_rqs won't run past the
      cfs_rq->runtime_remaining = 1 they are given, and that tasks remain
      pickable in migrate_tasks(). runtime_enabled is recalculated when the
      rq comes online again.
      
      Ben Segall also noticed that we always enable runtime in
      tg_set_cfs_bandwidth(). We should actually do that for online cpus
      only. To prevent races with unthrottle_offline_cfs_rqs(), we take the
      get_online_cpus() lock.
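      A minimal sketch of the offline-side change, assuming the usual
      unthrottle_offline_cfs_rqs() walk over the rq's leaf cfs_rqs (names per
      kernel/sched/fair.c; not an exact copy of the patch):
      
      	static void unthrottle_offline_cfs_rqs(struct rq *rq)
      	{
      		struct cfs_rq *cfs_rq;
      
      		for_each_leaf_cfs_rq(rq, cfs_rq) {
      			if (!cfs_rq->runtime_enabled)
      				continue;
      
      			/*
      			 * The dying rq is still schedulable until the cpu is
      			 * fully disabled: stop accounting runtime and leave
      			 * just enough so the queue never looks throttled.
      			 */
      			cfs_rq->runtime_remaining = 1;
      			cfs_rq->runtime_enabled = 0;
      
      			if (cfs_rq_throttled(cfs_rq))
      				unthrottle_cfs_rq(cfs_rq);
      		}
      	}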
      Reviewed-by: Ben Segall <bsegall@google.com>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      CC: Konstantin Khorenko <khorenko@parallels.com>
      CC: Paul Turner <pjt@google.com>
      CC: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1403684382.3462.42.camel@tkhai
      Signed-off-by: Ingo Molnar <mingo@kernel.org>