1. 04 4月, 2017 3 次提交
    • P
      sched,tracing: Update trace_sched_pi_setprio() · b91473ff
      Peter Zijlstra 提交于
      Pass the PI donor task, instead of a numerical priority.
      
      Numerical priorities are not sufficient to describe state ever since
      SCHED_DEADLINE.
      
      Annotate all sched tracepoints that are currently broken; fixing them
      will bork userspace. *hate*.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: juri.lelli@arm.com
      Cc: bigeasy@linutronix.de
      Cc: xlpang@redhat.com
      Cc: mathieu.desnoyers@efficios.com
      Cc: jdesfossez@efficios.com
      Cc: bristot@redhat.com
      Link: http://lkml.kernel.org/r/20170323150216.353599881@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      b91473ff
    • P
      sched/rtmutex: Refactor rt_mutex_setprio() · acd58620
      Peter Zijlstra 提交于
      With the introduction of SCHED_DEADLINE the whole notion that priority
      is a single number is gone, therefore the @prio argument to
      rt_mutex_setprio() doesn't make sense anymore.
      
      So rework the code to pass a pi_task instead.
      
      Note this also fixes a problem with pi_top_task caching; previously we
      would not set the pointer (call rt_mutex_update_top_task) if the
      priority didn't change, this could lead to a stale pointer.
      
      As for the XXX, I think its fine to use pi_task->prio, because if it
      differs from waiter->prio, a PI chain update is immenent.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: juri.lelli@arm.com
      Cc: bigeasy@linutronix.de
      Cc: xlpang@redhat.com
      Cc: rostedt@goodmis.org
      Cc: mathieu.desnoyers@efficios.com
      Cc: jdesfossez@efficios.com
      Cc: bristot@redhat.com
      Link: http://lkml.kernel.org/r/20170323150216.303827095@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      acd58620
    • X
      sched/rtmutex/deadline: Fix a PI crash for deadline tasks · e96a7705
      Xunlei Pang 提交于
      A crash happened while I was playing with deadline PI rtmutex.
      
          BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
          IP: [<ffffffff810eeb8f>] rt_mutex_get_top_task+0x1f/0x30
          PGD 232a75067 PUD 230947067 PMD 0
          Oops: 0000 [#1] SMP
          CPU: 1 PID: 10994 Comm: a.out Not tainted
      
          Call Trace:
          [<ffffffff810b658c>] enqueue_task+0x2c/0x80
          [<ffffffff810ba763>] activate_task+0x23/0x30
          [<ffffffff810d0ab5>] pull_dl_task+0x1d5/0x260
          [<ffffffff810d0be6>] pre_schedule_dl+0x16/0x20
          [<ffffffff8164e783>] __schedule+0xd3/0x900
          [<ffffffff8164efd9>] schedule+0x29/0x70
          [<ffffffff8165035b>] __rt_mutex_slowlock+0x4b/0xc0
          [<ffffffff81650501>] rt_mutex_slowlock+0xd1/0x190
          [<ffffffff810eeb33>] rt_mutex_timed_lock+0x53/0x60
          [<ffffffff810ecbfc>] futex_lock_pi.isra.18+0x28c/0x390
          [<ffffffff810ed8b0>] do_futex+0x190/0x5b0
          [<ffffffff810edd50>] SyS_futex+0x80/0x180
      
      This is because rt_mutex_enqueue_pi() and rt_mutex_dequeue_pi()
      are only protected by pi_lock when operating pi waiters, while
      rt_mutex_get_top_task(), will access them with rq lock held but
      not holding pi_lock.
      
      In order to tackle it, we introduce new "pi_top_task" pointer
      cached in task_struct, and add new rt_mutex_update_top_task()
      to update its value, it can be called by rt_mutex_setprio()
      which held both owner's pi_lock and rq lock. Thus "pi_top_task"
      can be safely accessed by enqueue_task_dl() under rq lock.
      
      Originally-From: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NXunlei Pang <xlpang@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: juri.lelli@arm.com
      Cc: bigeasy@linutronix.de
      Cc: mathieu.desnoyers@efficios.com
      Cc: jdesfossez@efficios.com
      Cc: bristot@redhat.com
      Link: http://lkml.kernel.org/r/20170323150216.157682758@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      e96a7705
  2. 23 3月, 2017 1 次提交
  3. 16 3月, 2017 8 次提交
  4. 02 3月, 2017 10 次提交
  5. 28 2月, 2017 1 次提交
  6. 24 2月, 2017 3 次提交
  7. 22 2月, 2017 1 次提交
  8. 10 2月, 2017 1 次提交
  9. 07 2月, 2017 4 次提交
  10. 01 2月, 2017 1 次提交
  11. 30 1月, 2017 5 次提交
  12. 14 1月, 2017 2 次提交
    • T
      sched/core: Separate out io_schedule_prepare() and io_schedule_finish() · 10ab5643
      Tejun Heo 提交于
      Now that IO schedule accounting is done inside __schedule(),
      io_schedule() can be split into three steps - prep, schedule, and
      finish - where the schedule part doesn't need any special annotation.
      This allows marking a sleep as iowait by simply wrapping an existing
      blocking function with io_schedule_prepare() and io_schedule_finish().
      
      Because task_struct->in_iowait is single bit, the caller of
      io_schedule_prepare() needs to record and the pass its state to
      io_schedule_finish() to be safe regarding nesting.  While this isn't
      the prettiest, these functions are mostly gonna be used by core
      functions and we don't want to use more space for ->in_iowait.
      
      While at it, as it's simple to do now, reimplement io_schedule()
      without unnecessarily going through io_schedule_timeout().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: adilger.kernel@dilger.ca
      Cc: jack@suse.com
      Cc: kernel-team@fb.com
      Cc: mingbo@fb.com
      Cc: tytso@mit.edu
      Link: http://lkml.kernel.org/r/1477673892-28940-3-git-send-email-tj@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      10ab5643
    • T
      sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler · e33a9bba
      Tejun Heo 提交于
      For an interface to support blocking for IOs, it must call
      io_schedule() instead of schedule().  This makes it tedious to add IO
      blocking to existing interfaces as the switching between schedule()
      and io_schedule() is often buried deep.
      
      As we already have a way to mark the task as IO scheduling, this can
      be made easier by separating out io_schedule() into multiple steps so
      that IO schedule preparation can be performed before invoking a
      blocking interface and the actual accounting happens inside the
      scheduler.
      
      io_schedule_timeout() does the following three things prior to calling
      schedule_timeout().
      
       1. Mark the task as scheduling for IO.
       2. Flush out plugged IOs.
       3. Account the IO scheduling.
      
      done close to the actual scheduling.  This patch moves #3 into the
      scheduler so that later patches can separate out preparation and
      finish steps from io_schedule().
      Patch-originally-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: adilger.kernel@dilger.ca
      Cc: akpm@linux-foundation.org
      Cc: axboe@kernel.dk
      Cc: jack@suse.com
      Cc: kernel-team@fb.com
      Cc: mingbo@fb.com
      Cc: tytso@mit.edu
      Link: http://lkml.kernel.org/r/20161207204841.GA22296@htj.duckdns.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e33a9bba