1. 23 3月, 2016 7 次提交
    • D
      kernel: add kcov code coverage · 5c9a8750
      Dmitry Vyukov 提交于
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
      kcov does not aim to collect as much coverage as possible.  It aims to
      collect more or less stable coverage that is function of syscall inputs.
      To achieve this goal it does not collect coverage in soft/hard
      interrupts and instrumentation of some inherently non-deterministic or
      non-interesting parts of kernel is disbled (e.g.  scheduler, locking).
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
      This patch adds the necessary support on kernel side.  The complimentary
      compiler support was added in gcc revision 231296.
      
      We've used this support to build syzkaller system call fuzzer, which has
      found 90 kernel bugs in just 2 months:
      
        https://github.com/google/syzkaller/wiki/Found-Bugs
      
      We've also found 30+ bugs in our internal systems with syzkaller.
      Another (yet unexplored) direction where kcov coverage would greatly
      help is more traditional "blob mutation".  For example, mounting a
      random blob as a filesystem, or receiving a random blob over wire.
      
      Why not gcov.  Typical fuzzing loop looks as follows: (1) reset
      coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
      typical coverage can be just a dozen of basic blocks (e.g.  an invalid
      input).  In such context gcov becomes prohibitively expensive as
      reset/collect coverage steps depend on total number of basic
      blocks/edges in program (in case of kernel it is about 2M).  Cost of
      kcov depends only on number of executed basic blocks/edges.  On top of
      that, kernel requires per-thread coverage because there are always
      background threads and unrelated processes that also produce coverage.
      With inlined gcov instrumentation per-thread coverage is not possible.
      
      kcov exposes kernel PCs and control flow to user-space which is
      insecure.  But debugfs should not be mapped as user accessible.
      
      Based on a patch by Quentin Casasnovas.
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5c9a8750
    • A
      rapidio: add global inbound port write interfaces · 9a0b0627
      Alexandre Bounine 提交于
      Add new Port Write handler registration interfaces that attach PW
      handlers to local mport device objects.  This is different from old
      interface that attaches PW callback to individual RapidIO device.  The
      new interfaces are intended for use for common event handling (e.g.
      hot-plug notifications) while the old interface is available for
      individual device drivers.
      
      This patch is based on patch proposed by Andre van Herk but preserves
      existing per-device interface and adds lock protection for list
      handling.
      Signed-off-by: NAlexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9a0b0627
    • A
      powerpc/fsl_rio: changes to mport registration · dd64f4fe
      Alexandre Bounine 提交于
      Change mport object initialization/registration sequence to match
      reworked version of rio_register_mport() in the core code.
      Signed-off-by: NAlexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd64f4fe
    • J
      fs/coredump: prevent fsuid=0 dumps into user-controlled directories · 378c6520
      Jann Horn 提交于
      This commit fixes the following security hole affecting systems where
      all of the following conditions are fulfilled:
      
       - The fs.suid_dumpable sysctl is set to 2.
       - The kernel.core_pattern sysctl's value starts with "/". (Systems
         where kernel.core_pattern starts with "|/" are not affected.)
       - Unprivileged user namespace creation is permitted. (This is
         true on Linux >=3.8, but some distributions disallow it by
         default using a distro patch.)
      
      Under these conditions, if a program executes under secure exec rules,
      causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
      namespace, changes its root directory and crashes, the coredump will be
      written using fsuid=0 and a path derived from kernel.core_pattern - but
      this path is interpreted relative to the root directory of the process,
      allowing the attacker to control where a coredump will be written with
      root privileges.
      
      To fix the security issue, always interpret core_pattern for dumps that
      are written under SUID_DUMP_ROOT relative to the root directory of init.
      Signed-off-by: NJann Horn <jann@thejh.net>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      378c6520
    • A
      x86/compat: remove is_compat_task() · f970165b
      Andy Lutomirski 提交于
      x86's is_compat_task always checked the current syscall type, not the
      task type.  It has no non-arch users any more, so just remove it to
      avoid confusion.
      
      On x86, nothing should really be checking the task ABI.  There are
      legitimate users for the syscall ABI and for the mm ABI.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f970165b
    • A
      sparc/syscall: fix syscall_get_arch · 203f7907
      Andy Lutomirski 提交于
      Sparc's syscall_get_arch was buggy: it returned the task arch, not the
      syscall arch.  This could confuse seccomp and audit.
      
      I don't think this is as bad for seccomp as it looks: sparc's 32-bit and
      64-bit syscalls are numbered the same.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      203f7907
    • A
      sparc/compat: provide an accurate in_compat_syscall implementation · 069923d8
      Andy Lutomirski 提交于
      On sparc64 compat-enabled kernels, any task can make 32-bit and 64-bit
      syscalls.  is_compat_task returns true in 32-bit tasks, which does not
      necessarily imply that the current syscall is 32-bit.
      
      Provide an in_compat_syscall implementation that checks whether the
      current syscall is compat.
      
      As far as I know, sparc is the only architecture on which is_compat_task
      checks the compat status of the task and on which the compat status of a
      syscall can differ from the compat status of the task.  On x86,
      is_compat_task checks the syscall type, not the task type.
      
      [akpm@linux-foundation.org: add comment, per Sam]
      [akpm@linux-foundation.org: update comment, per Andy]
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      069923d8
  2. 21 3月, 2016 1 次提交
    • A
      x86/kallsyms: fix GOLD link failure with new relative kallsyms table format · 142b9e6c
      Ard Biesheuvel 提交于
      Commit 2213e9a6 ("kallsyms: add support for relative offsets in
      kallsyms address table") changed the default kallsyms symbol table
      format to use relative references rather than absolute addresses.
      
      This reduces the size of the kallsyms symbol table by 50% on 64-bit
      architectures, and further reduces the size of the relocation tables
      used by relocatable kernels.  Since the memory footprint of the static
      kernel image is always much smaller than 4 GB, these relative references
      are assumed to be representable in 32 bits, even when the native word
      size is 64 bits.
      
      On 64-bit architectures, this obviously only works if the distance
      between each relative reference and the chosen anchor point is
      representable in 32 bits, and so the table generation code in
      scripts/kallsyms.c scans the table for the lowest value that is covered
      by the kernel text, and selects it as the anchor point.
      
      However, when using the GOLD linker rather than the default BFD linker
      to build the x86_64 kernel, the symbol phys_offset_64, which is the
      result of arithmetic defined in the linker script, is emitted as a 'T'
      rather than an 'A' type symbol, resulting in scripts/kallsyms.c to
      mistake it for a suitable anchor point, even though it is far away from
      the actual kernel image in the virtual address space.  This results in
      out-of-range warnings from scripts/kallsyms.c and a broken build.
      
      So let's align with the BFD linker, and emit the phys_offset_[32|64]
      symbols as absolute symbols explicitly.  Note that the out of range
      issue does not exist on 32-bit x86, but this patch changes both symbols
      for symmetry.
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      142b9e6c
  3. 19 3月, 2016 9 次提交
  4. 18 3月, 2016 23 次提交