1. 16 1月, 2014 1 次提交
  2. 13 1月, 2014 8 次提交
    • P
      sched/deadline: Remove the sysctl_sched_dl knobs · 1724813d
      Peter Zijlstra 提交于
      Remove the deadline specific sysctls for now. The problem with them is
      that the interaction with the exisiting rt knobs is nearly impossible
      to get right.
      
      The current (as per before this patch) situation is that the rt and dl
      bandwidth is completely separate and we enforce rt+dl < 100%. This is
      undesirable because this means that the rt default of 95% leaves us
      hardly any room, even though dl tasks are saver than rt tasks.
      
      Another proposed solution was (a discarted patch) to have the dl
      bandwidth be a fraction of the rt bandwidth. This is highly
      confusing imo.
      
      Furthermore neither proposal is consistent with the situation we
      actually want; which is rt tasks ran from a dl server. In which case
      the rt bandwidth is a direct subset of dl.
      
      So whichever way we go, the introduction of dl controls at this point
      is painful. Therefore remove them and instead share the rt budget.
      
      This means that for now the rt knobs are used for dl admission control
      and the dl runtime is accounted against the rt runtime. I realise that
      this isn't entirely desirable either; but whatever we do we appear to
      need to change the interface later, so better have a small interface
      for now.
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-zpyqbqds1r0vyxtxza1e7rdc@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1724813d
    • J
      sched/deadline: speed up SCHED_DEADLINE pushes with a push-heap · 6bfd6d72
      Juri Lelli 提交于
      Data from tests confirmed that the original active load balancing
      logic didn't scale neither in the number of CPU nor in the number of
      tasks (as sched_rt does).
      
      Here we provide a global data structure to keep track of deadlines
      of the running tasks in the system. The structure is composed by
      a bitmask showing the free CPUs and a max-heap, needed when the system
      is heavily loaded.
      
      The implementation and concurrent access scheme are kept simple by
      design. However, our measurements show that we can compete with sched_rt
      on large multi-CPUs machines [1].
      
      Only the push path is addressed, the extension to use this structure
      also for pull decisions is straightforward. However, we are currently
      evaluating different (in order to decrease/avoid contention) data
      structures to solve possibly both problems. We are also going to re-run
      tests considering recent changes inside cpupri [2].
      
       [1] http://retis.sssup.it/~jlelli/papers/Ospert11Lelli.pdf
       [2] http://www.spinics.net/lists/linux-rt-users/msg06778.htmlSigned-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-14-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6bfd6d72
    • D
      sched/deadline: Add bandwidth management for SCHED_DEADLINE tasks · 332ac17e
      Dario Faggioli 提交于
      In order of deadline scheduling to be effective and useful, it is
      important that some method of having the allocation of the available
      CPU bandwidth to tasks and task groups under control.
      This is usually called "admission control" and if it is not performed
      at all, no guarantee can be given on the actual scheduling of the
      -deadline tasks.
      
      Since when RT-throttling has been introduced each task group have a
      bandwidth associated to itself, calculated as a certain amount of
      runtime over a period. Moreover, to make it possible to manipulate
      such bandwidth, readable/writable controls have been added to both
      procfs (for system wide settings) and cgroupfs (for per-group
      settings).
      
      Therefore, the same interface is being used for controlling the
      bandwidth distrubution to -deadline tasks and task groups, i.e.,
      new controls but with similar names, equivalent meaning and with
      the same usage paradigm are added.
      
      However, more discussion is needed in order to figure out how
      we want to manage SCHED_DEADLINE bandwidth at the task group level.
      Therefore, this patch adds a less sophisticated, but actually
      very sensible, mechanism to ensure that a certain utilization
      cap is not overcome per each root_domain (the single rq for !SMP
      configurations).
      
      Another main difference between deadline bandwidth management and
      RT-throttling is that -deadline tasks have bandwidth on their own
      (while -rt ones doesn't!), and thus we don't need an higher level
      throttling mechanism to enforce the desired bandwidth.
      
      This patch, therefore:
      
       - adds system wide deadline bandwidth management by means of:
          * /proc/sys/kernel/sched_dl_runtime_us,
          * /proc/sys/kernel/sched_dl_period_us,
         that determine (i.e., runtime / period) the total bandwidth
         available on each CPU of each root_domain for -deadline tasks;
      
       - couples the RT and deadline bandwidth management, i.e., enforces
         that the sum of how much bandwidth is being devoted to -rt
         -deadline tasks to stay below 100%.
      
      This means that, for a root_domain comprising M CPUs, -deadline tasks
      can be created until the sum of their bandwidths stay below:
      
          M * (sched_dl_runtime_us / sched_dl_period_us)
      
      It is also possible to disable this bandwidth management logic, and
      be thus free of oversubscribing the system up to any arbitrary level.
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-12-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      332ac17e
    • D
      sched/deadline: Add SCHED_DEADLINE inheritance logic · 2d3d891d
      Dario Faggioli 提交于
      Some method to deal with rt-mutexes and make sched_dl interact with
      the current PI-coded is needed, raising all but trivial issues, that
      needs (according to us) to be solved with some restructuring of
      the pi-code (i.e., going toward a proxy execution-ish implementation).
      
      This is under development, in the meanwhile, as a temporary solution,
      what this commits does is:
      
       - ensure a pi-lock owner with waiters is never throttled down. Instead,
         when it runs out of runtime, it immediately gets replenished and it's
         deadline is postponed;
      
       - the scheduling parameters (relative deadline and default runtime)
         used for that replenishments --during the whole period it holds the
         pi-lock-- are the ones of the waiting task with earliest deadline.
      
      Acting this way, we provide some kind of boosting to the lock-owner,
      still by using the existing (actually, slightly modified by the previous
      commit) pi-architecture.
      
      We would stress the fact that this is only a surely needed, all but
      clean solution to the problem. In the end it's only a way to re-start
      discussion within the community. So, as always, comments, ideas, rants,
      etc.. are welcome! :-)
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      [ Added !RT_MUTEXES build fix. ]
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2d3d891d
    • H
      sched/deadline: Add period support for SCHED_DEADLINE tasks · 755378a4
      Harald Gustafsson 提交于
      Make it possible to specify a period (different or equal than
      deadline) for -deadline tasks. Relative deadlines (D_i) are used on
      task arrivals to generate new scheduling (absolute) deadlines as "d =
      t + D_i", and periods (P_i) to postpone the scheduling deadlines as "d
      = d + P_i" when the budget is zero.
      
      This is in general useful to model (and schedule) tasks that have slow
      activation rates (long periods), but have to be scheduled soon once
      activated (short deadlines).
      Signed-off-by: NHarald Gustafsson <harald.gustafsson@ericsson.com>
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-7-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      755378a4
    • D
      sched/deadline: Add SCHED_DEADLINE avg_update accounting · 239be4a9
      Dario Faggioli 提交于
      Make the core scheduler and load balancer aware of the load
      produced by -deadline tasks, by updating the moving average
      like for sched_rt.
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-6-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      239be4a9
    • J
      sched/deadline: Add SCHED_DEADLINE SMP-related data structures & logic · 1baca4ce
      Juri Lelli 提交于
      Introduces data structures relevant for implementing dynamic
      migration of -deadline tasks and the logic for checking if
      runqueues are overloaded with -deadline tasks and for choosing
      where a task should migrate, when it is the case.
      
      Adds also dynamic migrations to SCHED_DEADLINE, so that tasks can
      be moved among CPUs when necessary. It is also possible to bind a
      task to a (set of) CPU(s), thus restricting its capability of
      migrating, or forbidding migrations at all.
      
      The very same approach used in sched_rt is utilised:
       - -deadline tasks are kept into CPU-specific runqueues,
       - -deadline tasks are migrated among runqueues to achieve the
         following:
          * on an M-CPU system the M earliest deadline ready tasks
            are always running;
          * affinity/cpusets settings of all the -deadline tasks is
            always respected.
      
      Therefore, this very special form of "load balancing" is done with
      an active method, i.e., the scheduler pushes or pulls tasks between
      runqueues when they are woken up and/or (de)scheduled.
      IOW, every time a preemption occurs, the descheduled task might be sent
      to some other CPU (depending on its deadline) to continue executing
      (push). On the other hand, every time a CPU becomes idle, it might pull
      the second earliest deadline ready task from some other CPU.
      
      To enforce this, a pull operation is always attempted before taking any
      scheduling decision (pre_schedule()), as well as a push one after each
      scheduling decision (post_schedule()). In addition, when a task arrives
      or wakes up, the best CPU where to resume it is selected taking into
      account its affinity mask, the system topology, but also its deadline.
      E.g., from the scheduling point of view, the best CPU where to wake
      up (and also where to push) a task is the one which is running the task
      with the latest deadline among the M executing ones.
      
      In order to facilitate these decisions, per-runqueue "caching" of the
      deadlines of the currently running and of the first ready task is used.
      Queued but not running tasks are also parked in another rb-tree to
      speed-up pushes.
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-5-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1baca4ce
    • D
      sched/deadline: Add SCHED_DEADLINE structures & implementation · aab03e05
      Dario Faggioli 提交于
      Introduces the data structures, constants and symbols needed for
      SCHED_DEADLINE implementation.
      
      Core data structure of SCHED_DEADLINE are defined, along with their
      initializers. Hooks for checking if a task belong to the new policy
      are also added where they are needed.
      
      Adds a scheduling class, in sched/dl.c and a new policy called
      SCHED_DEADLINE. It is an implementation of the Earliest Deadline
      First (EDF) scheduling algorithm, augmented with a mechanism (called
      Constant Bandwidth Server, CBS) that makes it possible to isolate
      the behaviour of tasks between each other.
      
      The typical -deadline task will be made up of a computation phase
      (instance) which is activated on a periodic or sporadic fashion. The
      expected (maximum) duration of such computation is called the task's
      runtime; the time interval by which each instance need to be completed
      is called the task's relative deadline. The task's absolute deadline
      is dynamically calculated as the time instant a task (better, an
      instance) activates plus the relative deadline.
      
      The EDF algorithms selects the task with the smallest absolute
      deadline as the one to be executed first, while the CBS ensures each
      task to run for at most its runtime every (relative) deadline
      length time interval, avoiding any interference between different
      tasks (bandwidth isolation).
      Thanks to this feature, also tasks that do not strictly comply with
      the computational model sketched above can effectively use the new
      policy.
      
      To summarize, this patch:
       - introduces the data structures, constants and symbols needed;
       - implements the core logic of the scheduling algorithm in the new
         scheduling class file;
       - provides all the glue code between the new scheduling class and
         the core scheduler and refines the interactions between sched/dl
         and the other existing scheduling classes.
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NMichael Trimarchi <michael@amarulasolutions.com>
      Signed-off-by: NFabio Checconi <fchecconi@gmail.com>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-4-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      aab03e05