1. 27 June 2016, 9 commits
    • sched/fair: Rework throttle_count sync · 55e16d30
      Committed by Peter Zijlstra
      Since we already take rq->lock when creating a cgroup, use it to also
      sync the throttle_count and avoid the extra state and enqueue path
      branch.
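
      In code terms, the sync amounts to something like the following on the
      cgroup-creation path (a rough sketch; the helper name and the exact set
      of fields copied are assumptions, not a quote of the patch):

        /* Sketch: rq->lock is already held on this path, so the new cfs_rq can
         * simply inherit its parent's current throttle state, removing the
         * need for extra 'not yet synced' state and an enqueue-path branch. */
        static void sync_throttle(struct task_group *tg, int cpu)
        {
                struct cfs_rq *cfs_rq  = tg->cfs_rq[cpu];
                struct cfs_rq *pcfs_rq = tg->parent->cfs_rq[cpu];

                cfs_rq->throttle_count = pcfs_rq->throttle_count;
        }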
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bsegall@google.com
      Cc: linux-kernel@vger.kernel.org
      [ Fixed build warning. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      55e16d30
    • sched/fair: Reorder cgroup creation code · 8663e24d
      Committed by Peter Zijlstra
      A future patch needs rq->lock held _after_ we link the task_group into
      the hierarchy. In order to avoid taking every rq->lock twice, reorder
      things a little and create online_fair_sched_group() to be called
      after we link the task_group.
      
      All this code is still run from css_alloc(), so css_online() isn't in
      fact used for this.
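
      The new hook ends up looking roughly like this (a sketch; it assumes the
      per-CPU group entities already exist at this point and is not a verbatim
      copy of the patch):

        /* Sketch: called right after the task_group is linked into the
         * hierarchy; takes each rq->lock so the future patch can rely on it. */
        void online_fair_sched_group(struct task_group *tg)
        {
                struct sched_entity *se;
                struct rq *rq;
                int i;

                for_each_possible_cpu(i) {
                        rq = cpu_rq(i);
                        se = tg->se[i];

                        raw_spin_lock_irq(&rq->lock);
                        post_init_entity_util_avg(se);
                        raw_spin_unlock_irq(&rq->lock);
                }
        }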
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bsegall@google.com
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8663e24d
    • sched/fair: Apply more PELT fixes · 3d30544f
      Committed by Peter Zijlstra
      One additional 'rule' for using update_cfs_rq_load_avg() is that one
      should call update_tg_load_avg() if it returns true.
      
      Add a bunch of comments to hopefully clarify some of the rules:
      
       o  You need to update the cfs_rq _before_ any entity attach/detach.
          This is important because, while it isn't strictly needed for
          mathematical consistency, it is required for the physical
          interpretation of the model: you attach/detach _now_.
      
       o  When you modify the cfs_rq avg, you have to then call
          update_tg_load_avg() in order to propagate changes upwards.
      
       o  (Fair) entities are always attached, switched_{to,from}_fair()
          deal with !fair. This directly follows from the definition of the
          cfs_rq averages, namely that they are a direct sum of all
          (runnable or blocked) entities on that rq.
      
      It is the second rule that this patch enforces, but it adds comments
      pertaining to all of them.
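
      As a rough illustration of that second rule (a sketch only; the exact
      argument lists are as they stood at the time and should be treated as
      assumptions):

        /* Sketch: if update_cfs_rq_load_avg() reports that the cfs_rq averages
         * changed, propagate the change upwards to the task group. */
        if (update_cfs_rq_load_avg(now, cfs_rq))
                update_tg_load_avg(cfs_rq, 0);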
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3d30544f
    • sched/fair: Fix PELT integrity for new tasks · 7dc603c9
      Committed by Peter Zijlstra
      Vincent and Yuyang found another few scenarios in which entity
      tracking goes wobbly.
      
      The scenarios are basically due to the fact that new tasks are not
      immediately attached and thereby differ from the normal situation -- a
      task is always attached to a cfs_rq load average (such that it
      includes its blocked contribution) and is explicitly
      detached/attached on migration to another cfs_rq.
      
      Scenario 1: switch to fair class
      
        p->sched_class = fair_class;
        if (queued)
          enqueue_task(p);
            ...
              enqueue_entity()
      	  enqueue_entity_load_avg()
      	    migrated = !sa->last_update_time (true)
      	    if (migrated)
      	      attach_entity_load_avg()
        check_class_changed()
          switched_from() (!fair)
          switched_to()   (fair)
            switched_to_fair()
              attach_entity_load_avg()
      
      If @p is a new task that hasn't been fair before, it will have
      !last_update_time and, per the above, end up in
      attach_entity_load_avg() _twice_.
      
      Scenario 2: change between cgroups
      
        sched_move_group(p)
          if (queued)
            dequeue_task()
          task_move_group_fair()
            detach_task_cfs_rq()
              detach_entity_load_avg()
            set_task_rq()
            attach_task_cfs_rq()
              attach_entity_load_avg()
          if (queued)
            enqueue_task();
              ...
                enqueue_entity()
      	    enqueue_entity_load_avg()
      	      migrated = !sa->last_update_time (true)
      	      if (migrated)
      	        attach_entity_load_avg()
      
      As with scenario 1, if @p is a new task it will have
      !last_update_time and we'll end up in attach_entity_load_avg()
      _twice_.
      
      Furthermore, notice how we do a detach_entity_load_avg() on something
      that wasn't attached to begin with.
      
      As stated above, the problem is that the new task isn't yet attached
      to the load tracking and thereby violates the invariant assumption.
      
      This patch remedies this by ensuring a new task is indeed properly
      attached to the load tracking on creation, through
      post_init_entity_util_avg().
      
      Of course, this isn't entirely as straightforward as one might think,
      since the task is hashed before we call wake_up_new_task() and thus
      can be poked at. We avoid this by adding TASK_NEW and teaching
      cpu_cgroup_can_attach() to refuse such tasks.
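
      A minimal sketch of that last bit (the error code and iteration details
      are assumptions, not a quote of the patch):

        static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
        {
                struct task_struct *task;
                struct cgroup_subsys_state *css;

                cgroup_taskset_for_each(task, css, tset) {
                        /* A task still marked TASK_NEW has not been attached
                         * to the load tracking yet; refuse to move it. */
                        if (task->state == TASK_NEW)
                                return -EINVAL;
                }
                return 0;
        }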
      Reported-by: Yuyang Du <yuyang.du@intel.com>
      Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7dc603c9
    • sched/cgroup: Fix cpu_cgroup_fork() handling · ea86cb4b
      Committed by Vincent Guittot
      A new fair task is detached and attached from/to task_group with:
      
        cgroup_post_fork()
          ss->fork(child) := cpu_cgroup_fork()
            sched_move_task()
              task_move_group_fair()
      
      Which is wrong, because at this point in fork() the task isn't fully
      initialized and it cannot 'move' to another group, because it's not
      attached to any group yet.
      
      In fact, cpu_cgroup_fork() only needs a small part of sched_move_task(),
      so we can just call that small part directly instead of
      sched_move_task(). And the task doesn't really migrate because it is not
      yet attached, so we need the following sequence:
      
        do_fork()
          sched_fork()
            __set_task_cpu()
      
          cgroup_post_fork()
            set_task_rq() # set task group and runqueue
      
          wake_up_new_task()
            select_task_rq() can select a new cpu
            __set_task_cpu
            post_init_entity_util_avg
              attach_task_cfs_rq()
            activate_task
              enqueue_task
      
      This patch makes that happen.
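
      A sketch of what the fork-time hook then boils down to (the helper name
      and the depth handling are illustrative; only the set_task_rq() part is
      taken from the description above):

        /* Sketch: at fork time there is nothing to detach, the task was never
         * attached; just record its task group and runqueue. */
        static void task_set_group_fair(struct task_struct *p)
        {
                struct sched_entity *se = &p->se;

                set_task_rq(p, task_cpu(p));
                se->depth = se->parent ? se->parent->depth + 1 : 0;
        }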
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      [ Added TASK_SET_GROUP to set depth properly. ]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ea86cb4b
    • sched/fair: Fix PELT integrity for new groups · 01011473
      Committed by Peter Zijlstra
      Vincent reported that when a new task is moved into a new cgroup it
      gets attached twice to the load tracking:
      
        sched_move_task()
          task_move_group_fair()
            detach_task_cfs_rq()
            set_task_rq()
            attach_task_cfs_rq()
              attach_entity_load_avg()
                se->avg.last_load_update = cfs_rq->avg.last_load_update // == 0
      
        enqueue_entity()
          enqueue_entity_load_avg()
            update_cfs_rq_load_avg()
              now = clock()
              __update_load_avg(&cfs_rq->avg)
                cfs_rq->avg.last_load_update = now
                // ages load/util for: now - 0, load/util -> 0
            if (migrated)
              attach_entity_load_avg()
                se->avg.last_load_update = cfs_rq->avg.last_load_update; // now != 0
      
      The problem is that we don't update cfs_rq load_avg before all
      entity attach/detach operations. Only enqueue_task() and migrate_task()
      do this.
      
      By fixing this, the above will not happen, because the
      sched_move_task() attach will have updated cfs_rq's last_load_update
      time before attach, and in turn the attach will have set the entity's
      last_load_update stamp.
      
      Note that there is a further problem with sched_move_task() calling
      detach on a task that hasn't yet been attached; this will be taken
      care of in a subsequent patch.
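
      In code terms, the rule the fix enforces looks roughly like this in the
      attach path (a sketch; the exact update_cfs_rq_load_avg() signature of
      that era is assumed here):

        /* Sketch: age the cfs_rq averages up to 'now' first, so that the
         * attach below copies a non-zero last_update_time into the entity. */
        u64 now = cfs_rq_clock_task(cfs_rq);

        update_cfs_rq_load_avg(now, cfs_rq, false);
        attach_entity_load_avg(cfs_rq, se);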
      Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
      Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yuyang Du <yuyang.du@intel.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      01011473
    • sched/fair: Fix and optimize the fork() path · e210bffd
      Committed by Peter Zijlstra
      The task_fork_fair() callback already calls __set_task_cpu() and takes
      rq->lock.
      
      If we move the sched_class::task_fork callback in sched_fork() under
      the existing p->pi_lock, right after its set_task_cpu() call, we can
      avoid doing two such calls and omit the IRQ disabling on the rq->lock.
      
      Change to __set_task_cpu() to skip the migration bits: this is a new
      task, not a migration. Similarly, make wake_up_new_task() use
      __set_task_cpu() for the same reason; the task hasn't actually
      migrated as it has never run.
      
      This cures the problem of calling migrate_task_rq_fair(), which does
      remove_entity_from_load_avg() on tasks that have never been added to
      the load avg to begin with.
      
      This bug would result in transiently messed up load_avg values, averaged
      out after a few dozen milliseconds. This is probably the reason why
      this bug was not found for such a long time.
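
      The fork-side change then has roughly this shape (a sketch, surrounding
      code elided):

        /* sched_fork(), sketch: pin the CPU and run the class callback under
         * p->pi_lock; __set_task_cpu() skips the migration machinery because
         * this task has never run anywhere yet. */
        raw_spin_lock_irqsave(&p->pi_lock, flags);
        __set_task_cpu(p, smp_processor_id());
        if (p->sched_class->task_fork)
                p->sched_class->task_fork(p);
        raw_spin_unlock_irqrestore(&p->pi_lock, flags);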
      Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e210bffd
    • sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion · ea1dc6fc
      Committed by Peter Zijlstra
      Commit:
      
        fde7d22e ("sched/fair: Fix overly small weight for interactive group entities")
      
      did something non-obvious, and did it in a way that was buggy yet
      latent.
      
      The problem was exposed for real by a later commit in the v4.7 merge window:
      
        2159197d ("sched/core: Enable increased load resolution on 64-bit kernels")
      
      ... after which tg->load_avg and cfs_rq->load.weight had different
      units (10 bit fixed point and 20 bit fixed point resp.).
      
      Add a comment to explain the use of cfs_rq->load.weight over the
      'natural' cfs_rq->avg.load_avg and add scale_load_down() to correct
      for the difference in unit.
      
      Since this is (now, as per a previous commit) the only user of
      calc_tg_weight(), collapse it.
      
      The effects of this bug should be randomly inconsistent SMP balancing
      of cgroup workloads.
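
      The weight calculation then looks roughly as follows (a sketch; variable
      names follow the text above rather than the exact function body):

        /* Sketch: cfs_rq->load.weight is 20-bit fixed point on 64-bit while
         * tg->load_avg is 10-bit fixed point; scale_load_down() brings the
         * former into the latter's unit before the two are mixed. */
        load = scale_load_down(cfs_rq->load.weight);

        tg_weight  = atomic_long_read(&tg->load_avg);
        tg_weight -= cfs_rq->tg_load_avg_contrib;   /* drop the stale contribution */
        tg_weight += load;                          /* ... and use the current one */

        shares = tg->shares * load;
        if (tg_weight)
                shares /= tg_weight;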
      Reported-by: Jirka Hladky <jhladky@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 2159197d ("sched/core: Enable increased load resolution on 64-bit kernels")
      Fixes: fde7d22e ("sched/fair: Fix overly small weight for interactive group entities")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ea1dc6fc
    • sched/fair: Fix effective_load() to consistently use smoothed load · 7dd49125
      Committed by Peter Zijlstra
      Starting with the following commit:
      
        fde7d22e ("sched/fair: Fix overly small weight for interactive group entities")
      
      calc_tg_weight() doesn't compute the right value as expected by effective_load().
      
      The difference is in the 'correction' term. In order to ensure \Sum
      rw_j >= rw_i we cannot use tg->load_avg directly, since that might be
      lagging a correction on the current cfs_rq->avg.load_avg value.
      Therefore we use tg->load_avg - cfs_rq->tg_load_avg_contrib +
      cfs_rq->avg.load_avg.
      
      Now, per the referenced commit, calc_tg_weight() doesn't use
      cfs_rq->avg.load_avg, which is later used in @w, but uses
      cfs_rq->load.weight instead.
      
      So stop using calc_tg_weight() and do it explicitly.
      
      The effect of this bug is wake_affine() making randomly
      poor choices in cgroup-intense workloads.
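
      Done explicitly, the correction term from above reads roughly (a sketch
      using the names from the text):

        /* Sketch: w = rw_i + @wl, i.e. this cfs_rq's smoothed load plus the
         * weight being added. */
        w = cfs_rq_load_avg(cfs_rq) + wl;

        /* Sketch: W = @wg + \Sum rw_j; swap the possibly stale contribution of
         * this cfs_rq for its current value so that \Sum rw_j >= rw_i holds. */
        W  = wg + atomic_long_read(&tg->load_avg);
        W -= cfs_rq->tg_load_avg_contrib;
        W += w;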
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> # v4.3+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: fde7d22e ("sched/fair: Fix overly small weight for interactive group entities")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7dd49125
  2. 24 June 2016, 2 commits
  3. 20 June 2016, 1 commit
    • sched/fair: Fix cfs_rq avg tracking underflow · 89741892
      Committed by Peter Zijlstra
      As per commit:
      
        b7fa30c9 ("sched/fair: Fix post_init_entity_util_avg() serialization")
      
      > the code generated from update_cfs_rq_load_avg():
      >
      > 	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
      > 		s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
      > 		sa->load_avg = max_t(long, sa->load_avg - r, 0);
      > 		sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
      > 		removed_load = 1;
      > 	}
      >
      > turns into:
      >
      > ffffffff81087064:       49 8b 85 98 00 00 00    mov    0x98(%r13),%rax
      > ffffffff8108706b:       48 85 c0                test   %rax,%rax
      > ffffffff8108706e:       74 40                   je     ffffffff810870b0 <update_blocked_averages+0xc0>
      > ffffffff81087070:       4c 89 f8                mov    %r15,%rax
      > ffffffff81087073:       49 87 85 98 00 00 00    xchg   %rax,0x98(%r13)
      > ffffffff8108707a:       49 29 45 70             sub    %rax,0x70(%r13)
      > ffffffff8108707e:       4c 89 f9                mov    %r15,%rcx
      > ffffffff81087081:       bb 01 00 00 00          mov    $0x1,%ebx
      > ffffffff81087086:       49 83 7d 70 00          cmpq   $0x0,0x70(%r13)
      > ffffffff8108708b:       49 0f 49 4d 70          cmovns 0x70(%r13),%rcx
      >
      > Which you'll note ends up with sa->load_avg -= r in memory at
      > ffffffff8108707a.
      
      So I _should_ have looked at other unserialized users of ->load_avg,
      but alas. Luckily nikbor reported a similar /0 from task_h_load() which
      instantly triggered recollection of this here problem.
      
      Aside from the intermediate value hitting memory and causing problems,
      there's another problem: the underflow detection relies on the signed
      bit. This reduces the effective width of the variables; IOW, it's
      effectively the same as having these variables be of signed type.
      
      This patch switches to a different means of unsigned underflow
      detection that does not rely on the signed bit. This allows the variables
      to use the 'full' unsigned range. And it does so with an explicit LOAD -
      STORE to ensure no intermediate value is ever visible in memory, which
      allows these unserialized loads.
      
      Note: GCC generates crap code for this, might warrant a look later.
      
      Note2: I say 'full' above, if we end up at U*_MAX we'll still explode;
             maybe we should do clamping on add too.
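
      One way to express such an underflow-safe subtraction with a single
      store (a sketch of the approach; the helper name is illustrative):

        /* Subtract with clamping at 0: read once, compute, write once, so no
         * intermediate value is ever visible to unserialized readers. */
        #define sub_positive(_ptr, _val) do {                            \
                typeof(_ptr) ptr = (_ptr);                               \
                typeof(*ptr) val = (_val);                               \
                typeof(*ptr) res, var = READ_ONCE(*ptr);                 \
                res = var - val;                                         \
                if (res > var)          /* unsigned wrap => underflow */ \
                        res = 0;                                         \
                WRITE_ONCE(*ptr, res);                                   \
        } while (0)

        /* e.g.: */
        sub_positive(&sa->load_avg, r);
        sub_positive(&sa->load_sum, r * LOAD_AVG_MAX);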
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yuyang Du <yuyang.du@intel.com>
      Cc: bsegall@google.com
      Cc: kernel@kyup.com
      Cc: morten.rasmussen@arm.com
      Cc: pjt@google.com
      Cc: steve.muckle@linaro.org
      Fixes: 9d89c257 ("sched/fair: Rewrite runnable load and utilization average tracking")
      Link: http://lkml.kernel.org/r/20160617091948.GJ30927@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      89741892
  4. 14 June 2016, 2 commits
    • sched/debug: Fix deadlock when enabling sched events · eda8dca5
      Committed by Josh Poimboeuf
      I see a hang when enabling sched events:
      
        echo 1 > /sys/kernel/debug/tracing/events/sched/enable
      
      The printk buffer shows:
      
        BUG: spinlock recursion on CPU#1, swapper/1/0
         lock: 0xffff88007d5d8c00, .magic: dead4ead, .owner: swapper/1/0, .owner_cpu: 1
        CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc2+ #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
        ...
        Call Trace:
         <IRQ>  [<ffffffff8143d663>] dump_stack+0x85/0xc2
         [<ffffffff81115948>] spin_dump+0x78/0xc0
         [<ffffffff81115aea>] do_raw_spin_lock+0x11a/0x150
         [<ffffffff81891471>] _raw_spin_lock+0x61/0x80
         [<ffffffff810e5466>] ? try_to_wake_up+0x256/0x4e0
         [<ffffffff810e5466>] try_to_wake_up+0x256/0x4e0
         [<ffffffff81891a0a>] ? _raw_spin_unlock_irqrestore+0x4a/0x80
         [<ffffffff810e5705>] wake_up_process+0x15/0x20
         [<ffffffff810cebb4>] insert_work+0x84/0xc0
         [<ffffffff810ced7f>] __queue_work+0x18f/0x660
         [<ffffffff810cf9a6>] queue_work_on+0x46/0x90
         [<ffffffffa00cd95b>] drm_fb_helper_dirty.isra.11+0xcb/0xe0 [drm_kms_helper]
         [<ffffffffa00cdac0>] drm_fb_helper_sys_imageblit+0x30/0x40 [drm_kms_helper]
         [<ffffffff814babcd>] soft_cursor+0x1ad/0x230
         [<ffffffff814ba379>] bit_cursor+0x649/0x680
         [<ffffffff814b9d30>] ? update_attr.isra.2+0x90/0x90
         [<ffffffff814b5e6a>] fbcon_cursor+0x14a/0x1c0
         [<ffffffff81555ef8>] hide_cursor+0x28/0x90
         [<ffffffff81558b6f>] vt_console_print+0x3bf/0x3f0
         [<ffffffff81122c63>] call_console_drivers.constprop.24+0x183/0x200
         [<ffffffff811241f4>] console_unlock+0x3d4/0x610
         [<ffffffff811247f5>] vprintk_emit+0x3c5/0x610
         [<ffffffff81124bc9>] vprintk_default+0x29/0x40
         [<ffffffff811e965b>] printk+0x57/0x73
         [<ffffffff810f7a9e>] enqueue_entity+0xc2e/0xc70
         [<ffffffff810f7b39>] enqueue_task_fair+0x59/0xab0
         [<ffffffff8106dcd9>] ? kvm_sched_clock_read+0x9/0x20
         [<ffffffff8103fb39>] ? sched_clock+0x9/0x10
         [<ffffffff810e3fcc>] activate_task+0x5c/0xa0
         [<ffffffff810e4514>] ttwu_do_activate+0x54/0xb0
         [<ffffffff810e5cea>] sched_ttwu_pending+0x7a/0xb0
         [<ffffffff810e5e51>] scheduler_ipi+0x61/0x170
         [<ffffffff81059e7f>] smp_trace_reschedule_interrupt+0x4f/0x2a0
         [<ffffffff81893ba6>] trace_reschedule_interrupt+0x96/0xa0
         <EOI>  [<ffffffff8106e0d6>] ? native_safe_halt+0x6/0x10
         [<ffffffff8110fb1d>] ? trace_hardirqs_on+0xd/0x10
         [<ffffffff81040ac0>] default_idle+0x20/0x1a0
         [<ffffffff8104147f>] arch_cpu_idle+0xf/0x20
         [<ffffffff81102f8f>] default_idle_call+0x2f/0x50
         [<ffffffff8110332e>] cpu_startup_entry+0x37e/0x450
         [<ffffffff8105af70>] start_secondary+0x160/0x1a0
      
      Note the hang only occurs when echoing the above from a physical serial
      console, not from an ssh session.
      
      The bug is a deadlock in which the task tries to grab the rq lock twice,
      because printk() calls aren't safe in scheduler code.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: cb251765 ("sched/debug: Make schedstats a runtime tunable that is disabled by default")
      Link: http://lkml.kernel.org/r/20160613073209.gdvdybiruljbkn3p@treble
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      eda8dca5
    • sched/fair: Fix post_init_entity_util_avg() serialization · b7fa30c9
      Committed by Peter Zijlstra
      Chris Wilson reported a divide by 0 at:
      
       post_init_entity_util_avg():
      
       >    725	if (cfs_rq->avg.util_avg != 0) {
       >    726		sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
       > -> 727		sa->util_avg /= (cfs_rq->avg.load_avg + 1);
       >    728
       >    729		if (sa->util_avg > cap)
       >    730			sa->util_avg = cap;
       >    731	} else {
      
      Which, given the lack of serialization and the code generated from
      update_cfs_rq_load_avg(), is entirely possible:
      
      	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
      		s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
      		sa->load_avg = max_t(long, sa->load_avg - r, 0);
      		sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
      		removed_load = 1;
      	}
      
      turns into:
      
        ffffffff81087064:       49 8b 85 98 00 00 00    mov    0x98(%r13),%rax
        ffffffff8108706b:       48 85 c0                test   %rax,%rax
        ffffffff8108706e:       74 40                   je     ffffffff810870b0
        ffffffff81087070:       4c 89 f8                mov    %r15,%rax
        ffffffff81087073:       49 87 85 98 00 00 00    xchg   %rax,0x98(%r13)
        ffffffff8108707a:       49 29 45 70             sub    %rax,0x70(%r13)
        ffffffff8108707e:       4c 89 f9                mov    %r15,%rcx
        ffffffff81087081:       bb 01 00 00 00          mov    $0x1,%ebx
        ffffffff81087086:       49 83 7d 70 00          cmpq   $0x0,0x70(%r13)
        ffffffff8108708b:       49 0f 49 4d 70          cmovns 0x70(%r13),%rcx
      
      Which you'll note ends up with 'sa->load_avg - r' in memory at
      ffffffff8108707a.
      
      By calling post_init_entity_util_avg() under rq->lock we're sure to be
      fully serialized against PELT updates and cannot observe intermediate
      state like this.
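
      Concretely, that means doing the init with the rq->lock held in
      wake_up_new_task(), roughly (a sketch; the locking helper is shown in its
      rq_flags form and surrounding details are elided):

        /* Sketch: take the task's rq lock first, then initialize the util
         * average while fully serialized against PELT updates on that rq. */
        rq = __task_rq_lock(p, &rf);
        post_init_entity_util_avg(&p->se);
        activate_task(rq, p, 0);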
      Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yuyang Du <yuyang.du@intel.com>
      Cc: bsegall@google.com
      Cc: morten.rasmussen@arm.com
      Cc: pjt@google.com
      Cc: steve.muckle@linaro.org
      Fixes: 2b8c41da ("sched/fair: Initiate a new task's util avg to a bounded value")
      Link: http://lkml.kernel.org/r/20160609130750.GQ30909@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b7fa30c9
  5. 03 June 2016, 2 commits
  6. 12 May 2016, 6 commits
    • sched/fair: Correct unit of load_above_capacity · cfa10334
      Committed by Morten Rasmussen
      In calculate_imbalance(), load_above_capacity currently has the unit
      [capacity] while it is used as being [load/capacity]. Not only is this
      wrong, it also makes it unlikely that load_above_capacity is ever used,
      as the subsequent code picks the smaller of load_above_capacity and
      the avg_load.
      
      This patch ensures that load_above_capacity has the right unit
      [load/capacity].
      Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
      [ Changed changelog to note it was in capacity unit; +rebase. ]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1461958364-675-4-git-send-email-dietmar.eggemann@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cfa10334
    • sched/fair: Clean up scale confusion · 1be0eb2a
      Committed by Peter Zijlstra
      Wanpeng noted that the scale_load_down() in calculate_imbalance() was
      weird. I agree, it should be SCHED_CAPACITY_SCALE, since we're going
      to compare against busiest->group_capacity, which is in [capacity]
      units.
      Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yuyang Du <yuyang.du@intel.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1be0eb2a
    • sched/fair: Fix fairness issue on migration · 2f950354
      Committed by Peter Zijlstra
      Pavan reported that in the presence of very light tasks (or cgroups)
      the placement of migrated tasks can cause severe fairness issues.
      
      The problem is that enqueue_entity() places the task before it updates
      time, and can thereby place the task far in the past (remember that
      light tasks will shoot virtual time forward at a high speed, so in
      relation to the pre-existing light task, we can land far in the past).
      
      This is done because update_curr() needs the current task, and we
      might be placing the current task.
      
      The obvious solution is to differentiate between the current and any
      other task; placing the current before we update time, and placing any
      other task after, such that !curr tasks end up at the current moment
      in time, and not in the past.
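
      The resulting ordering in enqueue_entity() has roughly this shape (a
      sketch; flag names are those used by the scheduler of that era):

        bool renorm = !(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_MIGRATED);
        bool curr = cfs_rq->curr == se;

        /* The current entity must be renormalised before update_curr(),
         * since update_curr() operates on it. */
        if (renorm && curr)
                se->vruntime += cfs_rq->min_vruntime;

        update_curr(cfs_rq);

        /* Any other entity is renormalised after, so it is placed at the
         * current moment in time rather than somewhere in the past. */
        if (renorm && !curr)
                se->vruntime += cfs_rq->min_vruntime;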
      
      This commit re-introduces the previously reverted commit:
      
        3a47d512 ("sched/fair: Fix fairness issue on migration")
      
      ... which is now safe to do, after we've also fixed another
      underlying bug first, in:
      
        sched/fair: Prepare to fix fairness problems on migration
      
      and cleaned up other details in the migration code:
      
        sched/core: Kill sched_class::task_waking
      Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2f950354
    • sched/core: Kill sched_class::task_waking to clean up the migration logic · 59efa0ba
      Committed by Peter Zijlstra
      With sched_class::task_waking being called only when we do
      set_task_cpu(), we can make sched_class::migrate_task_rq() do the work
      and eliminate sched_class::task_waking entirely.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Pavan Kondeti <pkondeti@codeaurora.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: byungchul.park@lge.com
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      59efa0ba
    • sched/fair: Prepare to fix fairness problems on migration · b5179ac7
      Committed by Peter Zijlstra
      Mike reported that our recent attempt to fix migration problems:
      
        3a47d512 ("sched/fair: Fix fairness issue on migration")
      
      broke interactivity and the signal starve test. We reverted that
      commit and now let's try it again more carefully, with some other
      underlying problems fixed first.
      
      One problem is that I assumed ENQUEUE_WAKING was only set when we do a
      cross-cpu wakeup (migration), which isn't true. This means we now
      destroy the vruntime history of tasks and wakeup-preemption suffers.
      
      Cure this by making my assumption true: only call
      sched_class::task_waking() when we do a cross-cpu wakeup. This avoids
      the indirect call in the case of a local wakeup.
      Reported-by: Mike Galbraith <mgalbraith@suse.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Pavan Kondeti <pkondeti@codeaurora.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: byungchul.park@lge.com
      Cc: linux-kernel@vger.kernel.org
      Fixes: 3a47d512 ("sched/fair: Fix fairness issue on migration")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b5179ac7
    • sched/fair: Move record_wakee() · c58d25f3
      Committed by Peter Zijlstra
      Since I want to make ->task_woken() conditional on the task getting
      migrated, we cannot use it to call record_wakee().
      
      Move it to select_task_rq_fair(), which gets called in almost all the
      same conditions. The only exception is if the woken task (@p) is
      CPU-bound (as per the nr_cpus_allowed test in select_task_rq()).
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Pavan Kondeti <pkondeti@codeaurora.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: byungchul.park@lge.com
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c58d25f3
  7. 11 May 2016, 1 commit
  8. 07 May 2016, 1 commit
  9. 06 May 2016, 1 commit
  10. 05 May 2016, 7 commits
  11. 23 April 2016, 7 commits
    • sched/fair: Optimize !CONFIG_NO_HZ_COMMON CPU load updates · 9fd81dd5
      Committed by Frederic Weisbecker
      Some of the CPU load update code only concerns NO_HZ configs but it is
      built in all configurations. When NO_HZ isn't built, that code is harmless
      but needlessly consumes some CPU and memory resources:
      
      1) one useless field in struct rq
      2) jiffies record on every tick that is never used (cpu_load_update_periodic)
       3) decay_load_missed() is called twice on every tick only to return
          immediately with no action taken; in that case the function is
          effectively dead code.
      
      For pure optimization purposes, let's conditionally build the NO_HZ-
      related code.
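
      The shape of the change is plain conditional compilation, e.g. (a sketch,
      not the full patch; the function body shown here is an assumption):

        static void cpu_load_update_periodic(struct rq *this_rq, unsigned long load)
        {
        #ifdef CONFIG_NO_HZ_COMMON
                /* Only the NO_HZ catch-up logic needs to know when the load
                 * was last updated. */
                this_rq->last_load_update_tick = jiffies;
        #endif
                cpu_load_update(this_rq, load, 1);
        }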
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1461080211-16271-1-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9fd81dd5
    • sched/fair: Correctly handle nohz ticks CPU load accounting · 1f41906a
      Committed by Frederic Weisbecker
      Ticks can happen while the CPU is in dynticks-idle or dynticks-singletask
      mode. In fact "nohz" or "dynticks" only mean that we exit the periodic
      mode and we try to minimize the ticks as much as possible. The nohz
      subsystem uses a confusing terminology with the internal state
      "ts->tick_stopped" which is also available through its public interface
      with tick_nohz_tick_stopped(). This is a misnomer, as the tick is instead
      reduced on a best-effort basis rather than stopped. In the best case the
      tick can indeed be stopped, but there is no guarantee of that.
      If a timer needs to fire one second later, a tick will fire while the
      CPU is in nohz mode and this is a very common scenario.
      
      Now this confusion happens to be a problem with CPU load updates:
      cpu_load_update_active() doesn't handle nohz ticks correctly because it
      assumes that ticks are completely stopped in nohz mode and that
      cpu_load_update_active() can't be called in dynticks mode. When that
      happens, the whole previous tickless load is ignored and the function
      just records the load for the current tick, ignoring potentially long
      idle periods behind.
      
      In order to solve this, we could account the current load for the
      previous nohz time but there is a risk that we account the load of a
      task that got freshly enqueued for the whole nohz period.
      
      So instead, let's record the dynticks load on nohz frame entry so we know
      what to record in case of nohz ticks, then use this record to account
      the tickless load on nohz ticks and at nohz frame end.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1460555812-25375-3-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1f41906a
    • sched/fair: Gather CPU load functions under a more conventional namespace · cee1afce
      Committed by Frederic Weisbecker
      The CPU load update related functions have a weak naming convention
      currently, starting with update_cpu_load_*() which isn't ideal as
      "update" is a very generic concept.
      
      Since two of these functions are public already (and a third is to come)
      that's enough to introduce a more conventional naming scheme. So let's
      do the following rename instead:
      
      	update_cpu_load_*() -> cpu_load_update_*()
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1460555812-25375-2-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cee1afce
    • sched/fair: Call cpufreq hook in additional paths · a2c6c91f
      Committed by Steve Muckle
      The cpufreq hook should be called any time the root CFS rq utilization
      changes. This can occur when a task is switched to or from the fair
      class, or a task moves between groups or CPUs, but these paths
      currently do not call the cpufreq hook.
      
      Fix this by adding the hook to attach_entity_load_avg() and
      detach_entity_load_avg().
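
      A sketch of the idea (the helper name and the cpufreq_update_util()
      signature shown here reflect the interface of that era as best recalled,
      and should be read as assumptions):

        static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
        {
                struct rq *rq = rq_of(cfs_rq);

                if (&rq->cfs == cfs_rq) {
                        /* Only the root cfs_rq drives the CPU utilization
                         * estimate, so only it needs to poke cpufreq. */
                        unsigned long max = rq->cpu_capacity_orig;

                        cpufreq_update_util(rq_clock(rq),
                                            min(cfs_rq->avg.util_avg, max), max);
                }
        }

        /* ... called from both attach_entity_load_avg() and
         * detach_entity_load_avg(). */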
      Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Steve Muckle <smuckle@linaro.org>
      [ Added the .update_freq argument to update_cfs_rq_load_avg() to avoid a double cpufreq call. ]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <Juri.Lelli@arm.com>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Patrick Bellasi <patrick.bellasi@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1458858367-2831-1-git-send-email-smuckle@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a2c6c91f
    • sched/fair: Do not call cpufreq hook unless util changed · 41e0d37f
      Committed by Steve Muckle
      There's no reason to call the cpufreq hook if the root cfs_rq
      utilization has not been modified.
      Signed-off-by: Steve Muckle <smuckle@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <Juri.Lelli@arm.com>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Patrick Bellasi <patrick.bellasi@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Link: http://lkml.kernel.org/r/1458606068-7476-2-git-send-email-smuckle@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      41e0d37f
    • sched/fair: Move cpufreq hook to update_cfs_rq_load_avg() · 21e96f88
      Committed by Steve Muckle
      The cpufreq hook should be called whenever the root cfs_rq
      utilization changes so update_cfs_rq_load_avg() is a better
      place for it. The current location is not invoked in the
      enqueue_entity() or update_blocked_averages() paths.
      Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Steve Muckle <smuckle@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <Juri.Lelli@arm.com>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Patrick Bellasi <patrick.bellasi@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1458606068-7476-1-git-send-email-smuckle@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      21e96f88
    • sched/fair: Fix asym packing to select correct CPU · 1f621e02
      Committed by Srikar Dronamraju
      When asymmetric packing is set in the sched_domain and target CPU is
      busy, update_sd_pick_busiest() may not select the busiest runqueue.
      When the target CPU is busy, find_busiest_group() will ignore the asym
      packing checks and may continue to load balance using the currently
      selected, not-the-busiest runqueue as the source runqueue.
      Selecting the busiest runqueue as the source when the target CPU is busy
      should result in much better load balance.
      
      Also, when the target CPU is not busy and asymmetric packing is set in
      the sd, select the higher CPU as the source CPU for load balancing.
      
      While doing this change, move the check to see if target CPU is busy
      into check_asym_packing().
      
      The extent of the performance benefit from this change decreases with
      increasing load. However, there is a benefit in undercommit as well as
      overcommit conditions.
      
      1. Record per second ebizzy (32 threads) on a 64 CPU power 7 box. (5 iterations)
      4.6.0-rc2
      	Testcase:         Min         Max         Avg      StdDev
      	  ebizzy:  5223767.00 10368236.00  7946971.00  1753094.76
      
      4.6.0-rc2+asym-changes
      	Testcase:         Min         Max         Avg      StdDev     %Change
      	  ebizzy:  8617191.00 13872356.00 11383980.00  1783400.89     +24.78%
      
      2. Record per second ebizzy (64 threads) on a 64 CPU power 7 box. (5 iterations)
      4.6.0-rc2
      	Testcase:         Min         Max         Avg      StdDev
      	  ebizzy:  6497666.00 18399783.00 10818093.20  4051452.08
      
      4.6.0-rc2+asym-changes
      	Testcase:         Min         Max         Avg      StdDev     %Change
      	  ebizzy:  7567365.00 19456937.00 11674063.60  4295407.48      +4.40%
      
      3. Record per second ebizzy (128 threads) on a 64 CPU power 7 box. (5 iterations)
      4.6.0-rc2
      	Testcase:         Min         Max         Avg      StdDev
      	  ebizzy: 37073983.00 40341911.00 38776241.80  1259766.82
      
      4.6.0-rc2+asym-changes
      	Testcase:         Min         Max         Avg      StdDev     %Change
      	  ebizzy: 38030399.00 41333378.00 39827404.40  1255001.86      +2.54%
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1459948660-16073-1-git-send-email-srikar@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1f621e02
  12. 31 March 2016, 1 commit
    • sched/fair: Initiate a new task's util avg to a bounded value · 2b8c41da
      Committed by Yuyang Du
      A new task's util_avg is set to the full utilization of a CPU (100% of
      the time running). This accelerates a new task's utilization ramp-up,
      which is useful to boost its execution early on. However, it may result
      in (insanely) high utilization for a transient period when a flood of
      tasks is spawned. Importantly, it violates the "fundamentally bounded"
      nature of CPU utilization, and its side effect is negative if we don't
      take any measure to bound it.
      
      This patch proposes an algorithm to address this issue. It has
      two methods to approach a sensible initial util_avg:
      
      (1) An expected (or average) util_avg based on its cfs_rq's util_avg:
      
        util_avg = cfs_rq->util_avg / (cfs_rq->load_avg + 1) * se.load.weight
      
      (2) A trajectory of how successive new tasks' util develops, which
      gives 1/2 of the remaining utilization budget to a new task, such that
      the additional util is noticeably large (when overall util is low) or
      unnoticeably small (when overall util is high enough). Meanwhile, the
      aggregate utilization stays well bounded:
      
        util_avg_cap = (1024 - cfs_rq->avg.util_avg) / 2^n
      
      where n denotes the nth task.
      
      If util_avg is larger than util_avg_cap, then the effective util is
      clamped to the util_avg_cap.
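
      Putting the two methods together, the initialization looks roughly like
      this (a sketch following the formulas above, with SCHED_CAPACITY_SCALE
      playing the role of the 1024 constant):

        void post_init_entity_util_avg(struct sched_entity *se)
        {
                struct cfs_rq *cfs_rq = cfs_rq_of(se);
                struct sched_avg *sa = &se->avg;
                /* Half of the remaining utilization budget (method 2). */
                long cap = (long)(SCHED_CAPACITY_SCALE - cfs_rq->avg.util_avg) / 2;

                if (cap > 0) {
                        if (cfs_rq->avg.util_avg != 0) {
                                /* Expected util based on the cfs_rq (method 1)... */
                                sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
                                sa->util_avg /= (cfs_rq->avg.load_avg + 1);
                                /* ...clamped by the remaining budget. */
                                if (sa->util_avg > cap)
                                        sa->util_avg = cap;
                        } else {
                                sa->util_avg = cap;
                        }
                        sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
                }
        }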
      Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Yuyang Du <yuyang.du@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bsegall@google.com
      Cc: morten.rasmussen@arm.com
      Cc: pjt@google.com
      Cc: steve.muckle@linaro.org
      Link: http://lkml.kernel.org/r/1459283456-21682-1-git-send-email-yuyang.du@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2b8c41da