1. 20 6月, 2017 1 次提交
    • I
      sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> · 5dd43ce2
      Ingo Molnar 提交于
      The wait_bit*() types and APIs are mixed into wait.h, but they
      are a pretty orthogonal extension of wait-queues.
      
      Furthermore, only about 50 kernel files use these APIs, while
      over 1000 use the regular wait-queue functionality.
      
      So clean up the main wait.h by moving the wait-bit functionality
      out of it, into a separate .h and .c file:
      
        include/linux/wait_bit.h  for types and APIs
        kernel/sched/wait_bit.c   for the implementation
      
      Update all header dependencies.
      
      This reduces the size of wait.h rather significantly, by about 30%.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5dd43ce2
  2. 08 6月, 2017 1 次提交
  3. 08 2月, 2017 1 次提交
    • I
      sched/autogroup: Rename auto_group.[ch] to autogroup.[ch] · 1051408f
      Ingo Molnar 提交于
      The names are all 'autogroup', not 'auto_group' - so rename
      the kernel/sched/auto_group.[ch] to match the existing
      nomenclature.
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      1051408f
  4. 07 2月, 2017 1 次提交
  5. 02 4月, 2016 1 次提交
    • R
      cpufreq: schedutil: New governor based on scheduler utilization data · 9bdcb44e
      Rafael J. Wysocki 提交于
      Add a new cpufreq scaling governor, called "schedutil", that uses
      scheduler-provided CPU utilization information as input for making
      its decisions.
      
      Doing that is possible after commit 34e2c555 (cpufreq: Add
      mechanism for registering utilization update callbacks) that
      introduced cpufreq_update_util() called by the scheduler on
      utilization changes (from CFS) and RT/DL task status updates.
      In particular, CPU frequency scaling decisions may be based on
      the the utilization data passed to cpufreq_update_util() by CFS.
      
      The new governor is relatively simple.
      
      The frequency selection formula used by it depends on whether or not
      the utilization is frequency-invariant.  In the frequency-invariant
      case the new CPU frequency is given by
      
      	next_freq = 1.25 * max_freq * util / max
      
      where util and max are the last two arguments of cpufreq_update_util().
      In turn, if util is not frequency-invariant, the maximum frequency in
      the above formula is replaced with the current frequency of the CPU:
      
      	next_freq = 1.25 * curr_freq * util / max
      
      The coefficient 1.25 corresponds to the frequency tipping point at
      (util / max) = 0.8.
      
      All of the computations are carried out in the utilization update
      handlers provided by the new governor.  One of those handlers is
      used for cpufreq policies shared between multiple CPUs and the other
      one is for policies with one CPU only (and therefore it doesn't need
      to use any extra synchronization means).
      
      The governor supports fast frequency switching if that is supported
      by the cpufreq driver in use and possible for the given policy.
      In the fast switching case, all operations of the governor take
      place in its utilization update handlers.  If fast switching cannot
      be used, the frequency switch operations are carried out with the
      help of a work item which only calls __cpufreq_driver_target()
      (under a mutex) to trigger a frequency update (to a value already
      computed beforehand in one of the utilization update handlers).
      
      Currently, the governor treats all of the RT and DL tasks as
      "unknown utilization" and sets the frequency to the allowed
      maximum when updated from the RT or DL sched classes.  That
      heavy-handed approach should be replaced with something more
      subtle and specifically targeted at RT and DL tasks.
      
      The governor shares some tunables management code with the
      "ondemand" and "conservative" governors and uses some common
      definitions from cpufreq_governor.h, but apart from that it
      is stand-alone.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      9bdcb44e
  6. 23 3月, 2016 1 次提交
    • D
      kernel: add kcov code coverage · 5c9a8750
      Dmitry Vyukov 提交于
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
      kcov does not aim to collect as much coverage as possible.  It aims to
      collect more or less stable coverage that is function of syscall inputs.
      To achieve this goal it does not collect coverage in soft/hard
      interrupts and instrumentation of some inherently non-deterministic or
      non-interesting parts of kernel is disbled (e.g.  scheduler, locking).
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
      This patch adds the necessary support on kernel side.  The complimentary
      compiler support was added in gcc revision 231296.
      
      We've used this support to build syzkaller system call fuzzer, which has
      found 90 kernel bugs in just 2 months:
      
        https://github.com/google/syzkaller/wiki/Found-Bugs
      
      We've also found 30+ bugs in our internal systems with syzkaller.
      Another (yet unexplored) direction where kcov coverage would greatly
      help is more traditional "blob mutation".  For example, mounting a
      random blob as a filesystem, or receiving a random blob over wire.
      
      Why not gcov.  Typical fuzzing loop looks as follows: (1) reset
      coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
      typical coverage can be just a dozen of basic blocks (e.g.  an invalid
      input).  In such context gcov becomes prohibitively expensive as
      reset/collect coverage steps depend on total number of basic
      blocks/edges in program (in case of kernel it is about 2M).  Cost of
      kcov depends only on number of executed basic blocks/edges.  On top of
      that, kernel requires per-thread coverage because there are always
      background threads and unrelated processes that also produce coverage.
      With inlined gcov instrumentation per-thread coverage is not possible.
      
      kcov exposes kernel PCs and control flow to user-space which is
      insecure.  But debugfs should not be mapped as user accessible.
      
      Based on a patch by Quentin Casasnovas.
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5c9a8750
  7. 11 3月, 2016 1 次提交
  8. 25 2月, 2016 1 次提交
    • P
      wait.[ch]: Introduce the simple waitqueue (swait) implementation · 13b35686
      Peter Zijlstra (Intel) 提交于
      The existing wait queue support has support for custom wake up call
      backs, wake flags, wake key (passed to call back) and exclusive
      flags that allow wakers to be tagged as exclusive, for limiting
      the number of wakers.
      
      In a lot of cases, none of these features are used, and hence we
      can benefit from a slimmed down version that lowers memory overhead
      and reduces runtime overhead.
      
      The concept originated from -rt, where waitqueues are a constant
      source of trouble, as we can't convert the head lock to a raw
      spinlock due to fancy and long lasting callbacks.
      
      With the removal of custom callbacks, we can use a raw lock for
      queue list manipulations, hence allowing the simple wait support
      to be used in -rt.
      
      [Patch is from PeterZ which is based on Thomas version. Commit message is
       written by Paul G.
       Daniel:  - Fixed some compile issues
       	  - Added non-lazy implementation of swake_up_locked as suggested
      	     by Boqun Feng.]
      Originally-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-rt-users@vger.kernel.org
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1455871601-27484-2-git-send-email-wagi@monom.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      13b35686
  9. 08 5月, 2015 1 次提交
    • P
      sched: Move the loadavg code to a more obvious location · 3289bdb4
      Peter Zijlstra 提交于
      I could not find the loadavg code.. turns out it was hidden in a file
      called proc.c. It further got mingled up with the cruft per rq load
      indexes (which we really want to get rid of).
      
      Move the per rq load indexes into the fair.c load-balance code (that's
      the only thing that uses them) and rename proc.c to loadavg.c so we
      can find it again.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      [ Did minor cleanups to the code. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      3289bdb4
  10. 29 1月, 2015 1 次提交
  11. 11 2月, 2014 1 次提交
  12. 13 1月, 2014 2 次提交
    • J
      sched/deadline: speed up SCHED_DEADLINE pushes with a push-heap · 6bfd6d72
      Juri Lelli 提交于
      Data from tests confirmed that the original active load balancing
      logic didn't scale neither in the number of CPU nor in the number of
      tasks (as sched_rt does).
      
      Here we provide a global data structure to keep track of deadlines
      of the running tasks in the system. The structure is composed by
      a bitmask showing the free CPUs and a max-heap, needed when the system
      is heavily loaded.
      
      The implementation and concurrent access scheme are kept simple by
      design. However, our measurements show that we can compete with sched_rt
      on large multi-CPUs machines [1].
      
      Only the push path is addressed, the extension to use this structure
      also for pull decisions is straightforward. However, we are currently
      evaluating different (in order to decrease/avoid contention) data
      structures to solve possibly both problems. We are also going to re-run
      tests considering recent changes inside cpupri [2].
      
       [1] http://retis.sssup.it/~jlelli/papers/Ospert11Lelli.pdf
       [2] http://www.spinics.net/lists/linux-rt-users/msg06778.htmlSigned-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-14-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6bfd6d72
    • D
      sched/deadline: Add SCHED_DEADLINE structures & implementation · aab03e05
      Dario Faggioli 提交于
      Introduces the data structures, constants and symbols needed for
      SCHED_DEADLINE implementation.
      
      Core data structure of SCHED_DEADLINE are defined, along with their
      initializers. Hooks for checking if a task belong to the new policy
      are also added where they are needed.
      
      Adds a scheduling class, in sched/dl.c and a new policy called
      SCHED_DEADLINE. It is an implementation of the Earliest Deadline
      First (EDF) scheduling algorithm, augmented with a mechanism (called
      Constant Bandwidth Server, CBS) that makes it possible to isolate
      the behaviour of tasks between each other.
      
      The typical -deadline task will be made up of a computation phase
      (instance) which is activated on a periodic or sporadic fashion. The
      expected (maximum) duration of such computation is called the task's
      runtime; the time interval by which each instance need to be completed
      is called the task's relative deadline. The task's absolute deadline
      is dynamically calculated as the time instant a task (better, an
      instance) activates plus the relative deadline.
      
      The EDF algorithms selects the task with the smallest absolute
      deadline as the one to be executed first, while the CBS ensures each
      task to run for at most its runtime every (relative) deadline
      length time interval, avoiding any interference between different
      tasks (bandwidth isolation).
      Thanks to this feature, also tasks that do not strictly comply with
      the computational model sketched above can effectively use the new
      policy.
      
      To summarize, this patch:
       - introduces the data structures, constants and symbols needed;
       - implements the core logic of the scheduling algorithm in the new
         scheduling class file;
       - provides all the glue code between the new scheduling class and
         the core scheduler and refines the interactions between sched/dl
         and the other existing scheduling classes.
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NMichael Trimarchi <michael@amarulasolutions.com>
      Signed-off-by: NFabio Checconi <fchecconi@gmail.com>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-4-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      aab03e05
  13. 06 11月, 2013 2 次提交
  14. 07 5月, 2013 1 次提交
  15. 10 4月, 2013 1 次提交
  16. 20 8月, 2012 1 次提交
    • F
      sched: Move cputime code to its own file · 73fbec60
      Frederic Weisbecker 提交于
      Extract cputime code from the giant sched/core.c and
      put it in its own file. This make it easier to deal with
      this particular area and de-bloat a bit more core.c
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      73fbec60
  17. 05 5月, 2012 1 次提交
    • T
      init_task: Create generic init_task instance · a4a2eb49
      Thomas Gleixner 提交于
      All archs define init_task in the same way (except ia64, but there is
      no particular reason why ia64 cannot use the common version). Create a
      generic instance so all archs can be converted over.
      
      The config switch is temporary and will be removed when all archs are
      converted over.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20120503085034.092585287@linutronix.de
      a4a2eb49
  18. 17 11月, 2011 1 次提交