1. March 23, 2016 (1 commit)
• kernel: add kcov code coverage · 5c9a8750
  Dmitry Vyukov authored
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
kcov does not aim to collect as much coverage as possible.  It aims to
collect more or less stable coverage that is a function of syscall
inputs.  To achieve this goal it does not collect coverage in soft/hard
interrupts, and instrumentation of some inherently non-deterministic or
non-interesting parts of the kernel (e.g. the scheduler, locking) is
disabled.
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
This patch adds the necessary support on the kernel side.  The
complementary compiler support was added in gcc revision 231296.
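
For illustration, here is a minimal user-space sketch of driving the
tracing mode (ioctl numbers as in the uapi header this patch adds;
error handling omitted for brevity):

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  #define KCOV_INIT_TRACE  _IOR('c', 1, unsigned long)
  #define KCOV_ENABLE      _IO('c', 100)
  #define KCOV_DISABLE     _IO('c', 101)
  #define COVER_SIZE       (64 << 10)  /* in sizeof(unsigned long) units */

  int main(void)
  {
          int fd = open("/sys/kernel/debug/kcov", O_RDWR);
          /* Size the per-thread coverage buffer, then map it. */
          ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);
          unsigned long *cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                                      PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
          /* Enable collection for this thread and reset the PC counter. */
          ioctl(fd, KCOV_ENABLE, 0);
          __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);

          read(-1, NULL, 0);           /* the syscall under test */

          /* cover[0] holds the number of recorded PCs; cover[1..n] the PCs. */
          unsigned long n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
          for (unsigned long i = 0; i < n; i++)
                  printf("0x%lx\n", cover[i + 1]);
          ioctl(fd, KCOV_DISABLE, 0);
          return 0;
  }

A fuzzer repeats the reset/execute/collect steps in a loop, which is
exactly the pattern that makes gcov too expensive (see below).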
      
We've used this support to build the syzkaller system call fuzzer, which has
      found 90 kernel bugs in just 2 months:
      
        https://github.com/google/syzkaller/wiki/Found-Bugs
      
      We've also found 30+ bugs in our internal systems with syzkaller.
Another (as yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation": for example, mounting a
random blob as a filesystem, or receiving a random blob over the wire.
      
Why not gcov?  A typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
typical coverage trace can be just a dozen basic blocks (e.g. an invalid
input).  In such a context gcov becomes prohibitively expensive, as the
reset/collect steps depend on the total number of basic blocks/edges in
the program (in the case of the kernel it is about 2M), whereas the cost
of kcov depends only on the number of executed basic blocks/edges.  On
top of that, the kernel requires per-thread coverage because there are
always background threads and unrelated processes that also produce
coverage.  With inlined gcov instrumentation, per-thread coverage is not
possible.
      
kcov exposes kernel PCs and control flow to user space, which is
insecure.  But debugfs should not be mapped as user-accessible.
      
      Based on a patch by Quentin Casasnovas.
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2. February 17, 2016 (16 commits)
3. February 9, 2016 (5 commits)
4. January 30, 2016 (1 commit)
5. October 6, 2015 (1 commit)
• perf/x86: Add Intel cstate PMUs support · 7ce1346a
  Kan Liang authored
This patch adds new PMUs to support cstate-related free-running
(read-only) counters. These counters may be used simultaneously by other
tools, such as turbostat. However, it still makes sense to implement
them in perf, because we can conveniently collect them together with
other events, and tools can use them without special MSR access
code.
      
      These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY.
According to the counters' scope and category, two PMUs are registered with
      the perf_event core subsystem.
      
       - 'cstate_core': The counter is available for each physical core. The
                        counters include CORE_C*_RESIDENCY.
      
       - 'cstate_pkg':  The counter is available for each physical package. The
                        counters include PKG_C*_RESIDENCY.
      
      The events are exposed in sysfs for use by perf stat and other tools.
      The files are:
      
        /sys/devices/cstate_core/events/c*-residency
        /sys/devices/cstate_pkg/events/c*-residency
      
These events support system-wide counting only.
      The /sys/devices/cstate_*/cpumask file can be used by tools to figure
      out which CPUs to monitor by default.
      
The PMU type (attr->type) is dynamically allocated and is available from
/sys/devices/core_misc/type and /sys/devices/cstate_*/type.
      
      Sampling is not supported.
      
      Here is an example.
      
- To calculate the fraction of time the core spends in the C6 state:
  CORE_C6_time% = CORE_C6_RESIDENCY / TSC
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5
      
         11838820015,,cstate_core/c6-residency/,5175919658,100.00
         11877130740,,msr/tsc/,5175922010,100.00
      
For sleep, we ran in the C6 state 99.7% of the time
(11838820015 / 11877130740).
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop
      
         1253316,,cstate_core/c6-residency/,4360969154,100.00
         10012635248,,msr/tsc/,4360972366,100.00
      
For busyloop, we ran in the C6 state only 0.01% of the time.
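
For tools that want the counter programmatically, a hedged sketch using
perf_event_open(2) is below; the "event=0x.." encoding in the sysfs
events file is an assumption (the standard perf PMU format), and error
handling is omitted:

  #include <inttypes.h>
  #include <linux/perf_event.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int main(void)
  {
          struct perf_event_attr attr;
          unsigned int type;
          unsigned long long config = 0;
          FILE *f;

          /* The PMU type is dynamically allocated; fetch it from sysfs. */
          f = fopen("/sys/devices/cstate_core/type", "r");
          fscanf(f, "%u", &type);
          fclose(f);

          /* Event encodings are assumed to be published as "event=0x.."
           * strings, per the usual perf sysfs format. */
          f = fopen("/sys/devices/cstate_core/events/c6-residency", "r");
          fscanf(f, "event=%llx", &config);
          fclose(f);

          memset(&attr, 0, sizeof(attr));
          attr.size = sizeof(attr);
          attr.type = type;
          attr.config = config;

          /* System-wide counting on CPU 0; per-task mode is not supported. */
          int fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
          sleep(1);
          uint64_t count;
          read(fd, &count, sizeof(count));
          printf("cpu0 c6-residency: %" PRIu64 "\n", count);
          return 0;
  }
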
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6. August 4, 2015 (1 commit)
7. April 2, 2015 (2 commits)
• perf/x86/intel/bts: Add BTS PMU driver · 8062382c
  Alexander Shishkin authored
Add support for Branch Trace Store (BTS) via the kernel perf event
infrastructure.  The difference from the existing implementation of BTS
support is that this one is a separate PMU that exports events' trace
buffers to user space by means of the AUX area of the perf buffer, which
is zero-copy mapped into user space.

The immediate benefit is that the buffer size can be much bigger,
resulting in fewer interrupts, no kernel-side copying, and little to no
trace data loss.  Also, kernel code can be traced with this driver.
      
      The old way of collecting BTS traces still works.
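
As a sketch of the consumer side (not part of this patch): assuming the
PMU is registered as "intel_bts" and using the aux_offset/aux_size
handshake that the AUX interface adds to struct perf_event_mmap_page;
buffer sizes are arbitrary and error handling is omitted:

  #include <linux/perf_event.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int main(void)
  {
          struct perf_event_attr attr;
          unsigned int type;
          FILE *f = fopen("/sys/devices/intel_bts/type", "r");
          fscanf(f, "%u", &type);
          fclose(f);

          memset(&attr, 0, sizeof(attr));
          attr.size = sizeof(attr);
          attr.type = type;

          /* Trace the current process on any CPU. */
          int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);

          /* Map the regular ring buffer first: 1 metadata page + 2^n
           * data pages. */
          long page = sysconf(_SC_PAGESIZE);
          struct perf_event_mmap_page *mp =
                  mmap(NULL, page * (1 + 8), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);

          /* Declare where the AUX area lives and how big it is, then map
           * it at that file offset; trace data lands in these pages
           * directly (zero-copy). */
          mp->aux_offset = page * 9;
          mp->aux_size   = page * 64;
          void *aux = mmap(NULL, mp->aux_size, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, mp->aux_offset);
          /* ... run the workload; consume bytes in [aux_tail, aux_head) ... */
          (void)aux;
          return 0;
  }

The Intel PT driver below exports its trace data through the same AUX
mechanism.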
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kaixu Xia <kaixu.xia@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Cc: kan.liang@intel.com
      Cc: markus.t.metzger@intel.com
      Cc: mathieu.poirier@linaro.org
Link: http://lkml.kernel.org/r/1422614435-114702-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel/pt: Add Intel PT PMU driver · 52ca9ced
  Alexander Shishkin authored
Add support for Intel Processor Trace (PT) to the kernel's perf events.
PT is an extension of the Intel architecture that collects information
about software execution, such as control flow, execution modes and
timings, and formats it into highly compressed binary packets.  Even
compressed, these packets are generated at hundreds of megabytes per
second per core, which makes it impractical to decode them on the fly in
the kernel.

This driver exports trace data through the AUX space in the perf ring
buffer, which is zero-copy mapped into user space for faster data
retrieval.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kaixu Xia <kaixu.xia@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Cc: kan.liang@intel.com
      Cc: markus.t.metzger@intel.com
      Cc: mathieu.poirier@linaro.org
Link: http://lkml.kernel.org/r/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8. March 23, 2015 (1 commit)
9. December 23, 2014 (1 commit)
10. October 28, 2014 (1 commit)
11. August 18, 2014 (2 commits)
• x86: Support compiling out human-friendly processor feature names · 9def39be
  Josh Triplett authored
      The table mapping CPUID bits to human-readable strings takes up a
      non-trivial amount of space, and only exists to support /proc/cpuinfo
      and a couple of kernel messages.  Since programs depend on the format of
      /proc/cpuinfo, force inclusion of the table when building with /proc
      support; otherwise, support omitting that table to save space, in which
      case the kernel messages will print features numerically instead.
      
      In addition to saving 1408 bytes out of vmlinux, this also saves 1373
      bytes out of the uncompressed setup code, which contributes directly to
      the size of bzImage.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
• x86: Drop support for /proc files when !CONFIG_PROC_FS · 39f838e0
  Josh Triplett authored
      arch/x86/kernel/cpu/proc.c only exists to support files in /proc; omit that
      file when compiling without CONFIG_PROC_FS.
      
      Saves 645 additional bytes on 32-bit x86 when !CONFIG_PROC_FS:
      
      add/remove: 0/5 grow/shrink: 0/0 up/down: 0/-645 (-645)
      function                                     old     new   delta
      c_stop                                         1       -      -1
      c_next                                        11       -     -11
      cpuinfo_op                                    16       -     -16
      c_start                                       24       -     -24
      show_cpuinfo                                 593       -    -593
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
12. August 13, 2014 (3 commits)
13. January 14, 2014 (1 commit)
14. November 27, 2013 (1 commit)
• perf/x86: Add Intel RAPL PMU support · 4788e5b4
  Stephane Eranian authored
This patch adds a new uncore PMU to expose the Intel
RAPL energy consumption counters. Up to 3 counters,
each counting a particular RAPL event, are exposed.

The RAPL counters are available on Intel SandyBridge,
IvyBridge and Haswell. The server SKUs add a 3rd counter.
      
      The following events are available and exposed in sysfs:
      
        - power/energy-cores: power consumption of all cores on socket
- power/energy-pkg: power consumption of all cores + LLC cache
        - power/energy-dram: power consumption of DRAM (servers only)
      
For each event, both the unit (Joules) and scale (2^-32 J)
are exposed in sysfs for use by perf stat and other tools.
      The files are:
      
      	/sys/devices/power/events/energy-*.unit
      	/sys/devices/power/events/energy-*.scale
      
      The RAPL PMU is uncore by nature and is implemented such
      that it only works in system-wide mode. Measuring only
      one CPU per socket is sufficient. The /sys/devices/power/cpumask
      file can be used by tools to figure out which CPUs to monitor
      by default. For instance, on a 2-socket system, 2 CPUs
      (one on each socket) will be shown.
      
All the counters measure in the same unit (exposed via sysfs).
The perf_events API exposes all RAPL counters as 64-bit integers
counting in units of 1/2^32 Joules (about 0.23 nJ). User-level tools
must convert the counts by multiplying them by 2^-32 to obtain
Joules. The reason for this is that the kernel avoids
doing floating-point math whenever possible, because it is
expensive (user floating-point state must be saved). The method
used avoids kernel floating-point usage. There is no loss of
precision. Thanks to PeterZ for suggesting this approach.
      
To convert a raw count C to Watts:
   W = C * 2^-32 / time   (2^-32 ≈ 2.3 / 1e10)
where ldexp(C, -32) gives the energy in Joules.
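
As a sketch, a user-level tool can simply read the scale published in
sysfs (path as listed above) instead of hard-coding 2^-32; the example
raw count is arbitrary and error handling is omitted:

  #include <stdio.h>

  int main(void)
  {
          double scale;           /* 2^-32, published as a decimal string */
          unsigned long long raw = 4294967296ULL;  /* example raw count */
          FILE *f = fopen("/sys/devices/power/events/energy-pkg.scale", "r");
          fscanf(f, "%lf", &scale);
          fclose(f);
          /* Equivalent to ldexp((double)raw, -32): 2^32 * 2^-32 = 1 J here. */
          printf("%f Joules\n", raw * scale);
          return 0;
  }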
      
The RAPL PMU is a new standalone PMU which registers with the
perf_event core subsystem. The PMU type (attr->type) is
dynamically allocated and is available from /sys/devices/power/type.
      
      Sampling is not supported by the RAPL PMU. There is no
      privilege level filtering either.
Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@redhat.com
      Cc: jolsa@redhat.com
      Cc: zheng.z.yan@intel.com
      Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1384275531-10892-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
15. June 19, 2013 (1 commit)
16. April 30, 2013 (1 commit)
17. April 21, 2013 (1 commit)