1. 06 Jan, 2016 1 commit
  2. 23 Oct, 2015 1 commit
  3. 20 Oct, 2015 1 commit
    • sched/deadline: Fix migration of SCHED_DEADLINE tasks · 5aa50507
      Committed by Luca Abeni
      Commit:
      
        9d514262 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target")
      
      broke select_task_rq_dl() and find_lock_later_rq(), because it introduced
      a comparison between the local task's deadline and dl.earliest_dl.curr of
      the remote queue.
      
      However, if the remote runqueue does not contain any SCHED_DEADLINE
      task, its earliest_dl.curr is 0 (always smaller than the deadline of
      the local task), so the remote runqueue is never selected for pushing.
      
      As a result, if an application creates multiple SCHED_DEADLINE
      threads, they will never be pushed to runqueues that do not already
      contain SCHED_DEADLINE tasks.
      
      This patch fixes the issue by checking if dl.dl_nr_running == 0.
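
      The shape of the corrected test, as a minimal stand-alone sketch (the
      struct, its fields and dl_time_before() are simplified stand-ins for
      the kernel's rq->dl bookkeeping, not the actual diff):

      	#include <stdbool.h>
      	#include <stdio.h>

      	/* Simplified stand-in for the relevant part of the deadline runqueue. */
      	struct dl_rq {
      		unsigned long dl_nr_running;
      		unsigned long long earliest_dl_curr;
      	};

      	static bool dl_time_before(unsigned long long a, unsigned long long b)
      	{
      		return (long long)(a - b) < 0;
      	}

      	/* Can a task with this deadline be pushed to the remote runqueue? */
      	static bool later_rq_feasible(const struct dl_rq *dl, unsigned long long deadline)
      	{
      		/* An empty deadline runqueue is always a valid target... */
      		if (dl->dl_nr_running == 0)
      			return true;
      		/* ...otherwise compare against its earliest deadline. */
      		return dl_time_before(deadline, dl->earliest_dl_curr);
      	}

      	int main(void)
      	{
      		struct dl_rq empty = { .dl_nr_running = 0, .earliest_dl_curr = 0 };

      		/* Without the dl_nr_running check this would wrongly report 0. */
      		printf("feasible: %d\n", later_rq_feasible(&empty, 1000));
      		return 0;
      	}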
      Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpeng.li@linux.intel.com>
      Fixes: 9d514262 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target")
      Link: http://lkml.kernel.org/r/1444982781-15608-1-git-send-email-luca.abeni@unitn.it
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5aa50507
  4. 12 Aug, 2015 4 commits
  5. 03 Aug, 2015 1 commit
  6. 19 Jun, 2015 9 commits
  7. 08 May, 2015 1 commit
  8. 22 Apr, 2015 1 commit
  9. 02 Apr, 2015 3 commits
  10. 27 Mar, 2015 1 commit
  11. 10 Mar, 2015 1 commit
  12. 18 Feb, 2015 3 commits
  13. 04 Feb, 2015 3 commits
  14. 31 Jan, 2015 1 commit
  15. 09 Jan, 2015 2 commits
    • sched/deadline: Avoid double-accounting in case of missed deadlines · 269ad801
      Committed by Luca Abeni
      The dl_runtime_exceeded() function is supposed to check whether
      a SCHED_DEADLINE task must be throttled, by checking whether its
      current runtime is <= 0. However, it also checks whether the
      scheduling deadline has been missed (the current time is
      past the current scheduling deadline), further
      decreasing the runtime if this happens.
      This "double accounting" is wrong:
      
      - In case of partitioned scheduling (or a single CPU), this
        happens if task_tick_dl() has been called later than expected
        (due to small HZ values). In this case, the current runtime is
        also negative, and replenish_dl_entity() can take care of the
        deadline miss by recharging the current runtime to a value smaller
        than dl_runtime.

      - In case of global scheduling on multiple CPUs, scheduling
        deadlines can be missed even if the task did not consume more
        runtime than expected, so penalizing the task is wrong.
      
      This patch fixes the problem by throttling a SCHED_DEADLINE task
      only when its runtime becomes negative, without further modifying the runtime.
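
      A minimal stand-alone sketch of the check after the fix (the struct and
      its fields are simplified stand-ins for the kernel's deadline entity;
      illustrative only, not the actual patch):

      	#include <stdbool.h>
      	#include <stdio.h>

      	/* Simplified stand-in for the deadline entity's runtime bookkeeping. */
      	struct dl_se {
      		long long runtime;		/* remaining runtime, in ns */
      		unsigned long long deadline;	/* absolute scheduling deadline */
      	};

      	/*
      	 * After the fix: a deadline task is throttled only when its runtime
      	 * has been consumed; a missed deadline alone neither throttles it
      	 * nor reduces the runtime any further.
      	 */
      	static bool dl_runtime_exceeded(const struct dl_se *dl_se)
      	{
      		return dl_se->runtime <= 0;
      	}

      	int main(void)
      	{
      		/* Deadline already missed, but runtime left: no throttling. */
      		struct dl_se se = { .runtime = 50000, .deadline = 0 };

      		printf("throttle: %d, runtime: %lld\n",
      		       dl_runtime_exceeded(&se), se.runtime);
      		return 0;
      	}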
      Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Juri Lelli <juri.lelli@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: Dario Faggioli <raistlin@linux.it>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1418813432-20797-3-git-send-email-luca.abeni@unitn.it
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      269ad801
    • sched/deadline: Fix migration of SCHED_DEADLINE tasks · 6a503c3b
      Committed by Luca Abeni
      According to global EDF, tasks should be migrated between runqueues
      without checking if their scheduling deadlines and runtimes are valid.
      However, SCHED_DEADLINE currently performs such a check:
      a migration is carried out by doing:
      
      	deactivate_task(rq, next_task, 0);
      	set_task_cpu(next_task, later_rq->cpu);
      	activate_task(later_rq, next_task, 0);
      
      which ends up calling dequeue_task_dl(), setting the new CPU, and then
      calling enqueue_task_dl().
      
      enqueue_task_dl() then calls enqueue_dl_entity(), which calls
      update_dl_entity(), which can modify scheduling deadline and runtime,
      breaking global EDF scheduling.
      
      As a result, some of the properties of global EDF are not respected:
      for example, a taskset {(30, 80), (40, 80), (120, 170)} scheduled on
      two cores can have unbounded response times for the third task even
      though 30/80 + 40/80 + 120/170 = 1.5809 < 2.
      
      This can be fixed by invoking update_dl_entity() only in case of
      wakeup, or if this is a new SCHED_DEADLINE task.
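
      A minimal stand-alone sketch of the intended enqueue behaviour (the
      struct, the dl_new flag and the ENQUEUE_WAKEUP value are simplified
      stand-ins for the kernel's data structures; illustrative only):

      	#include <stdbool.h>
      	#include <stdio.h>

      	#define ENQUEUE_WAKEUP	0x01	/* stand-in for the kernel's wakeup flag */

      	struct dl_se {
      		bool dl_new;			/* just admitted to SCHED_DEADLINE */
      		unsigned long long deadline;
      		long long runtime;
      	};

      	/* Stand-in for update_dl_entity(): pretend to set a fresh deadline. */
      	static void update_dl_entity(struct dl_se *dl_se)
      	{
      		dl_se->deadline += 100000;
      	}

      	/*
      	 * Shape of the fixed enqueue path: parameters are recomputed only on
      	 * wakeup or for a brand-new deadline task, so a push/pull migration
      	 * (plain dequeue + enqueue) leaves deadline and runtime untouched.
      	 */
      	static void enqueue_dl_entity(struct dl_se *dl_se, int flags)
      	{
      		if (dl_se->dl_new || (flags & ENQUEUE_WAKEUP))
      			update_dl_entity(dl_se);
      		/* ...insert the entity into the target runqueue's rb-tree... */
      	}

      	int main(void)
      	{
      		struct dl_se se = { .dl_new = false, .deadline = 500, .runtime = 100 };

      		enqueue_dl_entity(&se, 0);	/* migration: no wakeup flag */
      		printf("deadline after migration: %llu\n", se.deadline);	/* still 500 */
      		return 0;
      	}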
      Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Juri Lelli <juri.lelli@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: Dario Faggioli <raistlin@linux.it>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1418813432-20797-2-git-send-email-luca.abeni@unitn.it
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6a503c3b
  16. 16 Nov, 2014 4 commits
    • sched/deadline: Introduce start_hrtick_dl() for !CONFIG_SCHED_HRTICK · 36ce9881
      Committed by Wanpeng Li
      Introduce start_hrtick_dl() for !CONFIG_SCHED_HRTICK to align with
      the fair class.
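
      A compilable sketch of the resulting shape (struct rq, struct task_struct
      and hrtick_start() are reduced to toy stand-ins here; only the
      #ifdef/#else structure is the point):

      	#include <stdio.h>

      	/* Minimal stand-ins so the sketch builds outside the kernel. */
      	struct rq { int cpu; };
      	struct task_struct { long long dl_runtime; };

      	#ifdef CONFIG_SCHED_HRTICK
      	static void hrtick_start(struct rq *rq, long long delay)
      	{
      		printf("arming hrtick on cpu %d for %lld ns\n", rq->cpu, delay);
      	}

      	static void start_hrtick_dl(struct rq *rq, struct task_struct *p)
      	{
      		hrtick_start(rq, p->dl_runtime);
      	}
      	#else /* !CONFIG_SCHED_HRTICK */
      	static void start_hrtick_dl(struct rq *rq, struct task_struct *p)
      	{
      		/* No hrtick: callers need no #ifdef, just like the fair class. */
      		(void)rq;
      		(void)p;
      	}
      	#endif

      	int main(void)
      	{
      		struct rq rq = { .cpu = 0 };
      		struct task_struct p = { .dl_runtime = 100000 };

      		start_hrtick_dl(&rq, &p);
      		return 0;
      	}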
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Kirill Tkhai <ktkhai@parallels.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1415670747-58726-1-git-send-email-wanpeng.li@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      36ce9881
    • sched/deadline: Fix rq->dl.pushable_tasks bug in push_dl_task() · c51b8ab5
      Committed by Wanpeng Li
      Do not call dequeue_pushable_dl_task() when failing to push an eligible
      task, as it remains pushable, merely not at this particular moment.
      
      This mirrors the behavior of commit 311e800e ("sched,
      rt: Fix rq->rt.pushable_tasks bug in push_rt_task()") on the RT side.
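
      A toy sketch of the fixed control flow (the helpers and the counter are
      illustrative stand-ins, not the kernel's push_dl_task()):

      	#include <stdbool.h>
      	#include <stdio.h>

      	/* Toy bookkeeping: one task currently sitting on the pushable list. */
      	static int nr_pushable = 1;

      	static void dequeue_pushable_dl_task(void) { nr_pushable--; }
      	static bool find_lock_later_rq(void) { return false; /* push attempt fails */ }

      	/*
      	 * When no later runqueue can be locked, simply give up this attempt;
      	 * the task stays on the pushable list so a later push can still move it.
      	 */
      	static bool push_dl_task(void)
      	{
      		if (!find_lock_later_rq())
      			return false;	/* before the fix, the task was dequeued here */

      		dequeue_pushable_dl_task();
      		/* ...deactivate_task() / set_task_cpu() / activate_task()... */
      		return true;
      	}

      	int main(void)
      	{
      		push_dl_task();
      		printf("still pushable: %d\n", nr_pushable);	/* 1: the task is not lost */
      		return 0;
      	}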
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Kirill Tkhai <ktkhai@parallels.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1415258564-8573-1-git-send-email-wanpeng.li@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c51b8ab5
    • sched: Move p->nr_cpus_allowed check to select_task_rq() · 6c1d9410
      Committed by Wanpeng Li
      Move the p->nr_cpus_allowed check into kernel/sched/core.c: select_task_rq().
      This change will make fair.c, rt.c, and deadline.c all start with the
      same logic.
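
      A toy sketch of the hoisted check (the struct and the per-class hook are
      simplified stand-ins; the kernel dispatches through p->sched_class):

      	#include <stdio.h>

      	struct task { int nr_cpus_allowed; int cpu; };

      	/* Stand-in for a per-class hook such as select_task_rq_dl(). */
      	static int select_task_rq_dl(struct task *p)
      	{
      		(void)p;
      		return 1;	/* class-specific placement policy */
      	}

      	/*
      	 * The common check lives in the core now: the class hook is only
      	 * consulted when the task may run on more than one CPU, so fair.c,
      	 * rt.c and deadline.c no longer need to repeat the same guard.
      	 */
      	static int select_task_rq(struct task *p)
      	{
      		if (p->nr_cpus_allowed > 1)
      			return select_task_rq_dl(p);
      		return p->cpu;
      	}

      	int main(void)
      	{
      		struct task pinned = { .nr_cpus_allowed = 1, .cpu = 3 };

      		printf("%d\n", select_task_rq(&pinned));	/* 3: pinned task skips the hook */
      		return 0;
      	}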
      Suggested-and-Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "pang.xunlei" <pang.xunlei@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1415150077-59053-1-git-send-email-wanpeng.li@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6c1d9410
    • sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency · 6e998916
      Committed by Stanislaw Gruszka
      Commit d670ec13 "posix-cpu-timers: Cure SMP wobbles" fixes one glibc
      test case at the cost of breaking another one. After that commit, calling
      clock_nanosleep(TIMER_ABSTIME, X) and then clock_gettime(&Y) can result
      in Y being smaller than X.
      
      A reproducer/tester can be found further below; it can be compiled and run with:
      
      	gcc -o tst-cpuclock2 tst-cpuclock2.c -pthread
      	while ./tst-cpuclock2 ; do : ; done
      
      This reproducer, when running on a buggy kernel, will complain
      about "clock_gettime difference too small".
      
      The issue happens because, on start, thread_group_cputimer() initializes
      the cputimer's sum_exec_runtime with thread runtime that has not yet been
      accounted, and the scheduler tick then adds that thread runtime to the
      running cputimer again, making its sum_exec_runtime bigger than the
      actual thread runtime.
      
      KOSAKI Motohiro posted a fix for this problem, but that patch was never
      applied: https://lkml.org/lkml/2013/5/26/191 .
      
      This patch takes a different approach to cure the problem. It calls
      update_curr() when the cputimer starts, which ensures we have up-to-date
      stats for the running threads, so on the next scheduler tick we account
      only the runtime that elapsed since the cputimer started. That also
      ensures a consistent state between the CPU times of the individual
      threads and the CPU time of the process composed of those threads.
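
      A toy model of the accounting change (one running thread, one group
      timer; all names here are simplified stand-ins, not the kernel's
      structures):

      	#include <stdio.h>

      	struct thread { long long sum_exec_runtime; long long running_delta; };
      	struct cputimer { long long sum_exec_runtime; };

      	/* Stand-in for update_curr(): fold the in-flight delta into the thread. */
      	static void update_curr(struct thread *t)
      	{
      		t->sum_exec_runtime += t->running_delta;
      		t->running_delta = 0;
      	}

      	/*
      	 * Account the currently running thread before sampling it for the
      	 * group timer, so later ticks only add runtime that elapses after
      	 * the timer starts: no double accounting.
      	 */
      	static void thread_group_cputimer_start(struct cputimer *ct, struct thread *t)
      	{
      		update_curr(t);
      		ct->sum_exec_runtime = t->sum_exec_runtime;
      	}

      	int main(void)
      	{
      		struct thread t = { .sum_exec_runtime = 1000, .running_delta = 200 };
      		struct cputimer ct;

      		thread_group_cputimer_start(&ct, &t);
      		printf("%lld %lld\n", ct.sum_exec_runtime, t.sum_exec_runtime);	/* 1200 1200 */
      		return 0;
      	}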
      
      Full reproducer (tst-cpuclock2.c):
      
      	#define _GNU_SOURCE
      	#include <unistd.h>
      	#include <sys/syscall.h>
      	#include <stdio.h>
      	#include <time.h>
      	#include <pthread.h>
      	#include <stdint.h>
      	#include <inttypes.h>
      
      	/* Parameters for the Linux kernel ABI for CPU clocks.  */
      	#define CPUCLOCK_SCHED          2
      	#define MAKE_PROCESS_CPUCLOCK(pid, clock) \
      		((~(clockid_t) (pid) << 3) | (clockid_t) (clock))
      
      	static pthread_barrier_t barrier;
      
      	/* Help advance the clock.  */
      	static void *chew_cpu(void *arg)
      	{
      		pthread_barrier_wait(&barrier);
      		while (1) ;
      
      		return NULL;
      	}
      
      	/* Don't use the glibc wrapper.  */
      	static int do_nanosleep(int flags, const struct timespec *req)
      	{
      		clockid_t clock_id = MAKE_PROCESS_CPUCLOCK(0, CPUCLOCK_SCHED);
      
      		return syscall(SYS_clock_nanosleep, clock_id, flags, req, NULL);
      	}
      
      	static int64_t tsdiff(const struct timespec *before, const struct timespec *after)
      	{
      		int64_t before_i = before->tv_sec * 1000000000ULL + before->tv_nsec;
      		int64_t after_i = after->tv_sec * 1000000000ULL + after->tv_nsec;
      
      		return after_i - before_i;
      	}
      
      	int main(void)
      	{
      		int result = 0;
      		pthread_t th;
      
      		pthread_barrier_init(&barrier, NULL, 2);
      
      		if (pthread_create(&th, NULL, chew_cpu, NULL) != 0) {
      			perror("pthread_create");
      			return 1;
      		}
      
      		pthread_barrier_wait(&barrier);
      
      		/* The test.  */
      		struct timespec before, after, sleeptimeabs;
      		int64_t sleepdiff, diffabs;
      		const struct timespec sleeptime = {.tv_sec = 0,.tv_nsec = 100000000 };
      
      		/* The relative nanosleep.  Not sure why this is needed, but its presence
      		   seems to make it easier to reproduce the problem.  */
      		if (do_nanosleep(0, &sleeptime) != 0) {
      			perror("clock_nanosleep");
      			return 1;
      		}
      
      		/* Get the current time.  */
      		if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &before) < 0) {
      			perror("clock_gettime[2]");
      			return 1;
      		}
      
      		/* Compute the absolute sleep time based on the current time.  */
      		uint64_t nsec = before.tv_nsec + sleeptime.tv_nsec;
      		sleeptimeabs.tv_sec = before.tv_sec + nsec / 1000000000;
      		sleeptimeabs.tv_nsec = nsec % 1000000000;
      
      		/* Sleep for the computed time.  */
      		if (do_nanosleep(TIMER_ABSTIME, &sleeptimeabs) != 0) {
      			perror("absolute clock_nanosleep");
      			return 1;
      		}
      
      		/* Get the time after the sleep.  */
      		if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &after) < 0) {
      			perror("clock_gettime[3]");
      			return 1;
      		}
      
      		/* The time after sleep should always be equal to or after the absolute sleep
      		   time passed to clock_nanosleep.  */
      		sleepdiff = tsdiff(&sleeptimeabs, &after);
      		if (sleepdiff < 0) {
      			printf("absolute clock_nanosleep woke too early: %" PRId64 "\n", sleepdiff);
      			result = 1;
      
      			printf("Before %llu.%09llu\n", before.tv_sec, before.tv_nsec);
      			printf("After  %llu.%09llu\n", after.tv_sec, after.tv_nsec);
      			printf("Sleep  %llu.%09llu\n", sleeptimeabs.tv_sec, sleeptimeabs.tv_nsec);
      		}
      
      		/* The difference between the timestamps taken before and after the
      		   clock_nanosleep call should be equal to or more than the duration of the
      		   sleep.  */
      		diffabs = tsdiff(&before, &after);
      		if (diffabs < sleeptime.tv_nsec) {
      			printf("clock_gettime difference too small: %" PRId64 "\n", diffabs);
      			result = 1;
      		}
      
      		pthread_cancel(th);
      
      		return result;
      	}
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20141112155843.GA24803@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6e998916
  17. 04 Nov, 2014 3 commits