1. 23 Mar, 2016 40 commits
    • Merge branch 'akpm' (patches from Andrew) · a24e3d41
      Linus Torvalds committed
      Merge third patch-bomb from Andrew Morton:
      
       - more ocfs2 changes
      
       - a few hotfixes
      
       - Andy's compat cleanups
      
       - misc fixes to fatfs, ptrace, coredump, cpumask, creds, eventfd,
         panic, ipmi, kgdb, profile, kfifo, ubsan, etc.
      
       - many rapidio updates: fixes, new drivers.
      
       - kcov: kernel code coverage feature.  Like gcov, but not
         "prohibitively expensive".
      
       - extable code consolidation for various archs
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits)
        ia64/extable: use generic search and sort routines
        x86/extable: use generic search and sort routines
        s390/extable: use generic search and sort routines
        alpha/extable: use generic search and sort routines
        kernel/...: convert pr_warning to pr_warn
        drivers: dma-coherent: use memset_io for DMA_MEMORY_IO mappings
        drivers: dma-coherent: use MEMREMAP_WC for DMA_MEMORY_MAP
        memremap: add MEMREMAP_WC flag
        memremap: don't modify flags
        kernel/signal.c: add compile-time check for __ARCH_SI_PREAMBLE_SIZE
        mm/mprotect.c: don't imply PROT_EXEC on non-exec fs
        ipc/sem: make semctl setting sempid consistent
        ubsan: fix tree-wide -Wmaybe-uninitialized false positives
        kfifo: fix sparse complaints
        scripts/gdb: account for changes in module data structure
        scripts/gdb: add cmdline reader command
        scripts/gdb: add version command
        kernel: add kcov code coverage
        profile: hide unused functions when !CONFIG_PROC_FS
        hpwdt: use nmi_panic() when kernel panics in NMI handler
        ...
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · b91d9c67
      Linus Torvalds committed
      Pull more KVM updates from Paolo Bonzini:
       "Second round of KVM changes for 4.6:
      
         - build fixes for PPC KVM
         - miscellaneous bugfixes for ARM KVM
         - cleanup of memory barrier and removal of redundant barriers
         - x86 fixes: page tracking oops, support for old buggy KVM nested on 4.5
         - support for protection keys in guests
         - lockdep fix
         - another conversion to simple wait queues and raw spinlocks,
           backported from PREEMPT_RT"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (27 commits)
        KVM: page_track: fix access to NULL slot
        KVM: PPC: do not compile in vfio.o unconditionally
        kvm, rt: change async pagefault code locking for PREEMPT_RT
        KVM/PPC: update the comment of memory barrier in the kvmppc_prepare_to_enter()
        KVM/x86: update the comment of memory barrier in the vcpu_enter_guest()
        KVM: Replace smp_mb() with smp_load_acquire() in the kvm_flush_remote_tlbs()
        KVM/x86: Call smp_wmb() before increasing tlbs_dirty
        KVM: Replace smp_mb() with smp_mb_after_atomic() in the kvm_make_all_cpus_request()
        KVM/x86: Replace smp_mb() with smp_store_mb/release() in the walk_shadow_page_lockless_begin/end()
        KVM: Remove redundant smp_mb() in the kvm_mmu_commit_zap_page()
        KVM, pkeys: expose CPUID/CR4 to guest
        KVM, pkeys: add pkeys support for permission_fault
        KVM, pkeys: introduce pkru_mask to cache conditions
        KVM, pkeys: save/restore PKRU when guest/host switches
        x86: pkey: introduce write_pkru() for KVM
        KVM, pkeys: add pkeys support for xsave state
        KVM, pkeys: disable pkeys for guests in non-paging mode
        KVM: x86: remove magic number with enum cpuid_leafs
        KVM: MMU: return page fault error code from permission_fault
        KVM: fix spin_lock_init order on x86
        ...
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · b8ba4526
      Linus Torvalds committed
      Pull more rdma updates from Doug Ledford:
       "Round two of 4.6 merge window patches.
      
        This is a monster pull request.  I held off on the hfi1 driver updates
        (the hfi1 driver is intimately tied to the qib driver and the new
        rdmavt software library that was created to help both of them) in my
        first pull request.  The hfi1/qib/rdmavt update is probably 90% of
        this pull request.  The hfi1 driver is being left in staging so that
        it can be fixed up in regards to the API that Al and yourself didn't
        like.  Intel has agreed to do the work, but in the meantime, this
        clears out 300+ patches in the backlog queue and brings my tree and
        their tree closer to sync.
      
        This also includes about 10 patches to the core and a few to mlx5 to
        create an infrastructure for configuring SRIOV ports on IB devices.
        That series includes one patch to the net core that we sent to netdev@
        and Dave Miller with each of the three revisions to the series.  We
        didn't get any response to the patch, so we took that as implicit
        approval.
      
        Finally, this series includes Intel's new iWARP driver for their x722
        cards.  It's not nearly the beast as the hfi1 driver.  It also has a
        linux-next merge issue, but that has been resolved and it now passes
        just fine.
      
        Summary:
      
         - A few minor core fixups needed for the next patch series
      
         - The IB SRIOV series.  This has bounced around for several versions.
           Of note is the fact that the first patch in this series affects the
           net core.  It was directed to netdev and DaveM for each iteration
           of the series (three versions total).  Dave did not object, but did
           not respond either.  I've taken this as permission to move forward
           with the series.
      
         - The new Intel X722 iWARP driver
      
         - A huge set of updates to the Intel hfi1 driver.  Of particular
           interest here is that we have left the driver in staging since it
           still has an API that people object to.  Intel is working on a fix,
           but getting these patches in now helps keep me sane as the upstream
           and Intel's trees were over 300 patches apart"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (362 commits)
        IB/ipoib: Allow mcast packets from other VFs
        IB/mlx5: Implement callbacks for manipulating VFs
        net/mlx5_core: Implement modify HCA vport command
        net/mlx5_core: Add VF param when querying vport counter
        IB/ipoib: Add ndo operations for configuring VFs
        IB/core: Add interfaces to control VF attributes
        IB/core: Support accessing SA in virtualized environment
        IB/core: Add subnet prefix to port info
        IB/mlx5: Fix decision on using MAD_IFC
        net/core: Add support for configuring VF GUIDs
        IB/{core, ulp} Support above 32 possible device capability flags
        IB/core: Replace setting the zero values in ib_uverbs_ex_query_device
        net/mlx5_core: Introduce offload arithmetic hardware capabilities
        net/mlx5_core: Refactor device capability function
        net/mlx5_core: Fix caching ATOMIC endian mode capability
        ib_srpt: fix a WARN_ON() message
        i40iw: Replace the obsolete crypto hash interface with shash
        IB/hfi1: Add SDMA cache eviction algorithm
        IB/hfi1: Switch to using the pin query function
        IB/hfi1: Specify mm when releasing pages
        ...
    • ia64/extable: use generic search and sort routines · 8fe9752e
      Ard Biesheuvel committed
      Replace the arch specific versions of search_extable() and
      sort_extable() with calls to the generic ones, which now support
      relative exception tables as well.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86/extable: use generic search and sort routines · 29934b0f
      Ard Biesheuvel committed
      Replace the arch specific versions of search_extable() and
      sort_extable() with calls to the generic ones, which now support
      relative exception tables as well.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • s390/extable: use generic search and sort routines · c352e8b6
      Ard Biesheuvel committed
      Replace the arch specific versions of search_extable() and
      sort_extable() with calls to the generic ones, which now support
      relative exception tables as well.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • alpha/extable: use generic search and sort routines · e77986b5
      Ard Biesheuvel committed
      Replace the arch specific versions of search_extable() and
      sort_extable() with calls to the generic ones, which now support
      relative exception tables as well.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
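The four extable commits above rely on "relative exception tables". As a rough user-space sketch of that idea (not the kernel's search_extable() itself): each entry stores 32-bit offsets from its own address instead of absolute pointers, and the lookup is a binary search over the decoded instruction addresses. The struct layout, the `rel_extable_*` helpers and the `demo_*` data are all hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relative entry: each field holds the distance from its own
 * address to the target, not an absolute pointer. */
struct rel_extable_entry {
	int32_t insn;   /* offset to the instruction that may fault */
	int32_t fixup;  /* offset to the recovery code */
};

/* Encode two absolute addresses as self-relative 32-bit offsets. */
void rel_extable_fill(struct rel_extable_entry *e, const void *insn,
		      const void *fixup)
{
	intptr_t d_insn = (intptr_t)insn - (intptr_t)&e->insn;
	intptr_t d_fixup = (intptr_t)fixup - (intptr_t)&e->fixup;

	/* The scheme only works if the targets are within +/- 2 GiB. */
	assert(d_insn >= INT32_MIN && d_insn <= INT32_MAX);
	assert(d_fixup >= INT32_MIN && d_fixup <= INT32_MAX);
	e->insn = (int32_t)d_insn;
	e->fixup = (int32_t)d_fixup;
}

/* Decode the absolute addresses again. */
uintptr_t rel_extable_insn(const struct rel_extable_entry *e)
{
	return (uintptr_t)&e->insn + (uintptr_t)(intptr_t)e->insn;
}

uintptr_t rel_extable_fixup(const struct rel_extable_entry *e)
{
	return (uintptr_t)&e->fixup + (uintptr_t)(intptr_t)e->fixup;
}

/* Binary search over a table sorted by decoded instruction address. */
const struct rel_extable_entry *
rel_extable_search(const struct rel_extable_entry *tbl, size_t n,
		   uintptr_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		uintptr_t cur = rel_extable_insn(&tbl[mid]);

		if (cur == addr)
			return &tbl[mid];
		if (cur < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Demo data: a fake "text" area and a three-entry table. */
static char demo_text[64];
static struct rel_extable_entry demo_table[3];

/* Returns the fixup offset inside demo_text for a fault at
 * demo_text[fault_off], or -1 if no entry covers that address. */
long demo_fixup_offset(size_t fault_off)
{
	const struct rel_extable_entry *hit;

	rel_extable_fill(&demo_table[0], &demo_text[4], &demo_text[40]);
	rel_extable_fill(&demo_table[1], &demo_text[12], &demo_text[44]);
	rel_extable_fill(&demo_table[2], &demo_text[20], &demo_text[48]);

	hit = rel_extable_search(demo_table, 3,
				 (uintptr_t)&demo_text[fault_off]);
	if (!hit)
		return -1;
	return (long)(rel_extable_fixup(hit) - (uintptr_t)demo_text);
}
```

Because the offsets are relative to the entry itself, the table stays valid wherever the image is loaded and takes half the space of two 64-bit pointers per entry.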
    • kernel/...: convert pr_warning to pr_warn · a395d6a7
      Joe Perches committed
      Use the more common logging method with the eventual goal of removing
      pr_warning altogether.
      
      Miscellanea:
      
       - Realign arguments
       - Coalesce formats
       - Add missing space between a few coalesced formats
      Signed-off-by: Joe Perches <joe@perches.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[kernel/power/suspend.c]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers: dma-coherent: use memset_io for DMA_MEMORY_IO mappings · 20d7a35b
      Brian Starkey committed
      Use memset_io() for DMA_MEMORY_IO mappings which are mapped as I/O
      memory, and regular memset() for DMA_MEMORY_MAP mappings.
      
      This fixes the below alignment fault on arm64 for DMA_MEMORY_IO
      mappings, where memset() uses the DC ZVA instruction which is invalid on
      device memory.
      
         Unhandled fault: alignment fault (0x96000061) at 0xffffff8000380000
         Internal error: : 96000061 [#1] PREEMPT SMP
         Modules linked in: hdlcd(+) clk_scpi
         CPU: 4 PID: 1355 Comm: systemd-udevd Not tainted 4.4.0-rc1+ #5
         Hardware name: ARM Juno development board (r0) (DT)
         task: ffffffc9763eee00 ti: ffffffc9758c4000 task.ti: ffffffc9758c4000
         PC is at __efistub_memset+0x1ac/0x200
         LR is at dma_alloc_from_coherent+0xb0/0x120
         pc : [<ffffffc00030ff2c>] lr : [<ffffffc00042a918>] pstate: 400001c5
         sp : ffffffc9758c79a0
         x29: ffffffc9758c79a0 x28: ffffffc000635cd0
         x27: 0000000000000124 x26: ffffffc000119ef4
         x25: 0000000000010000 x24: 0000000000000140
         x23: ffffffc07e9ac3a8 x22: ffffffc9758c7a58
         x21: ffffffc9758c7a68 x20: 0000000000000004
         x19: ffffffc07e9ac380 x18: 0000000000000001
         x17: 0000007fae1bbba8 x16: ffffffc0001b2d1c
         x15: ffffffffffffffff x14: 0ffffffffffffffe
         x13: 0000000000000010 x12: ffffff800837ffff
         x11: ffffff800837ffff x10: 0000000040000000
         x9 : 0000000000000000 x8 : ffffff8000380000
         x7 : 0000000000000000 x6 : 000000000000003f
         x5 : 0000000000000040 x4 : 0000000000000000
         x3 : 0000000000000004 x2 : 000000000000ffc0
         x1 : 0000000000000000 x0 : ffffff8000380000
      Signed-off-by: Brian Starkey <brian.starkey@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
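The flag-based choice described in the commit above can be modelled in user space roughly as follows. This is only a sketch: `demo_memset_io()` is a stand-in that writes byte by byte through a volatile pointer (so the compiler cannot turn it into a bulk zeroing primitive), and the `DEMO_DMA_MEMORY_*` values and `demo_zero_coherent()` are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag values; the kernel's own definitions may differ. */
#define DEMO_DMA_MEMORY_MAP 0x01u  /* region is directly accessible memory */
#define DEMO_DMA_MEMORY_IO  0x02u  /* region must be accessed as I/O memory */

/* Stand-in for memset_io(): plain byte stores, never a block-zero
 * instruction such as DC ZVA. */
void demo_memset_io(volatile void *dst, int c, size_t n)
{
	volatile unsigned char *p = dst;

	while (n--)
		*p++ = (unsigned char)c;
}

/* Zero a freshly allocated coherent buffer with the accessor that
 * matches how the region was mapped. */
void demo_zero_coherent(void *buf, size_t n, unsigned int flags)
{
	assert(buf != NULL);
	if (flags & DEMO_DMA_MEMORY_MAP)
		memset(buf, 0, n);          /* normal memory: any memset is fine */
	else
		demo_memset_io(buf, 0, n);  /* I/O memory: device-safe stores only */
}

/* Returns 1 if every byte of a scratch buffer is zero after the call. */
int demo_zero_roundtrip(unsigned int flags)
{
	unsigned char buf[32];
	size_t i;

	memset(buf, 0xAA, sizeof(buf));
	demo_zero_coherent(buf, sizeof(buf), flags);
	for (i = 0; i < sizeof(buf); i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}
```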
    • drivers: dma-coherent: use MEMREMAP_WC for DMA_MEMORY_MAP · 6b03ae0d
      Brian Starkey committed
      When the DMA_MEMORY_MAP flag is used, memory which can be accessed
      directly should be returned, so use memremap(..., MEMREMAP_WC) to
      provide a writecombine mapping.
      Signed-off-by: Brian Starkey <brian.starkey@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memremap: add MEMREMAP_WC flag · c907e0eb
      Brian Starkey committed
      Add a flag to memremap() for writecombine mappings.  Mappings satisfied
      by this flag will not be cached; however, writes may be delayed or
      combined into more efficient bursts.  This is most suitable for buffers
      written sequentially by the CPU for use by other DMA devices.
      Signed-off-by: Brian Starkey <brian.starkey@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memremap: don't modify flags · cf61e2a1
      Brian Starkey committed
      These patches implement a MEMREMAP_WC flag for memremap(), which can be
      used to obtain writecombine mappings.  This is then used for setting up
      dma_coherent_mem regions which use the DMA_MEMORY_MAP flag.
      
      The motivation is to fix an alignment fault on arm64, and the suggestion
      to implement MEMREMAP_WC for this case was made at [1].  That particular
      issue is handled in patch 4, which makes sure that the appropriate
      memset function is used when zeroing allocations mapped as IO memory.
      
      This patch (of 4):
      
      Don't modify the flags input argument to memremap(). MEMREMAP_WB is
      already a special case so we can check for it directly instead of
      clearing flag bits in each mapper.
      Signed-off-by: Brian Starkey <brian.starkey@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/signal.c: add compile-time check for __ARCH_SI_PREAMBLE_SIZE · 41b27154
      Helge Deller committed
      The value of __ARCH_SI_PREAMBLE_SIZE defines the size (including
      padding) of the part of the struct siginfo that is before the union, and
      it is then used to calculate the needed padding (SI_PAD_SIZE) to make
      the size of struct siginfo equal to 128 (SI_MAX_SIZE) bytes.
      
      Depending on the target architecture and word width it equals either
      3 or 4 times sizeof(int).
      
      Since the very beginning we had __ARCH_SI_PREAMBLE_SIZE wrong on the
      parisc architecture for the 64bit kernel build.  It's even more
      frustrating, because it can easily be checked at compile time if the
      value was defined correctly.
      
      This patch adds such a check for the correctness of
      __ARCH_SI_PREAMBLE_SIZE in the hope that it will prevent existing and
      future architectures from running into the same problem.
      
      I refrained from replacing __ARCH_SI_PREAMBLE_SIZE by offsetof() in
      copy_siginfo() in include/asm-generic/siginfo.h, because a) it doesn't
      make any difference and b) it's used in the Documentation/kmemcheck.txt
      example.
      
      I ran this patch through the 0-DAY kernel test infrastructure and only
      the parisc architecture triggered as expected.  That means that this
      patch should be OK for all major architectures.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
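The kind of compile-time check described in the commit above could look roughly like this in standalone C11. It is a sketch under assumptions, not the code added to kernel/signal.c: `struct demo_siginfo` is a simplified stand-in for struct siginfo, and the `DEMO_SI_*` macros are hypothetical equivalents of `__ARCH_SI_PREAMBLE_SIZE` and `SI_MAX_SIZE`.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SI_MAX_SIZE 128u

/* Hand-maintained constant, as an architecture header would define it:
 * three ints, plus one int of padding where the union is 8-byte aligned. */
#define DEMO_SI_PREAMBLE_SIZE \
	((_Alignof(void *) > _Alignof(int) ? 4 : 3) * sizeof(int))

/* Simplified stand-in for struct siginfo. */
struct demo_siginfo {
	int si_signo;
	int si_errno;
	int si_code;
	union {
		/* padding sized from the constant, as SI_PAD_SIZE is */
		int pad[(DEMO_SI_MAX_SIZE - DEMO_SI_PREAMBLE_SIZE) / sizeof(int)];
		void *addr;  /* pointer member forces the union's alignment */
	} fields;
};

/* If the hand-written constant is wrong, the build fails right here
 * instead of producing a struct with the wrong layout. */
static_assert(offsetof(struct demo_siginfo, fields) == DEMO_SI_PREAMBLE_SIZE,
	      "preamble size constant does not match the real layout");
static_assert(sizeof(struct demo_siginfo) == DEMO_SI_MAX_SIZE,
	      "siginfo stand-in must be exactly DEMO_SI_MAX_SIZE bytes");

/* Runtime view of the same quantity, handy for printing or testing. */
size_t demo_siginfo_union_offset(void)
{
	return offsetof(struct demo_siginfo, fields);
}
```

Changing `DEMO_SI_PREAMBLE_SIZE` to a fixed `3 * sizeof(int)` on a typical 64-bit target should make the first assertion fail at compile time, which mirrors the parisc mistake the commit message describes.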
    • mm/mprotect.c: don't imply PROT_EXEC on non-exec fs · f138556d
      Piotr Kwapulinski committed
      mprotect(PROT_READ) fails when called by a READ_IMPLIES_EXEC binary on
      a memory-mapped file located on a non-exec fs, because mprotect does
      not check whether the fs is _executable_ or not: the PROT_EXEC flag is
      set automatically even if the memory-mapped file is located on a
      non-exec fs.  Fix it by checking whether the memory-mapped file is
      located on a non-exec fs.  If so, PROT_EXEC is not implied by
      PROT_READ.  The implementation uses the VM_MAYEXEC flag, which is set
      properly in mmap, so mprotect is now consistent with mmap.
      
      I did the isolated tests (PT_GNU_STACK X/NX, multiple VMAs, X/NX fs).  I
      also patched the official 3.19.0-47-generic Ubuntu 14.04 kernel and it
      seems to work.
      Signed-off-by: Piotr Kwapulinski <kwapulinski.piotr@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
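The rule described in the commit above can be sketched as a small pure function. This is a user-space model only, assuming that the personality bit and the VMA flags are passed in by the caller; `demo_implied_prot()` and the `DEMO_*` constants are hypothetical names for this illustration and do not come from mm/mprotect.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values for the model. */
#define DEMO_PROT_READ  0x1u
#define DEMO_PROT_EXEC  0x4u
#define DEMO_VM_MAYEXEC 0x40u  /* mapping is allowed to become executable */

/* Compute the protection actually applied: PROT_READ implies PROT_EXEC
 * only for READ_IMPLIES_EXEC tasks, and only if the mapping may be
 * executable at all (false for files on a noexec mount). */
unsigned int demo_implied_prot(unsigned int prot, bool read_implies_exec,
			       unsigned int vm_flags)
{
	bool rier = read_implies_exec && (prot & DEMO_PROT_READ);

	if (rier && (vm_flags & DEMO_VM_MAYEXEC))
		prot |= DEMO_PROT_EXEC;
	return prot;
}

/* Usage example covering the two cases from the commit message. */
void demo_implied_prot_check(void)
{
	/* File on an exec-capable fs: PROT_EXEC is added implicitly. */
	assert(demo_implied_prot(DEMO_PROT_READ, true, DEMO_VM_MAYEXEC) ==
	       (DEMO_PROT_READ | DEMO_PROT_EXEC));
	/* File on a noexec fs: the request stays PROT_READ, so the
	 * permission check no longer rejects it. */
	assert(demo_implied_prot(DEMO_PROT_READ, true, 0) == DEMO_PROT_READ);
}
```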
    • ipc/sem: make semctl setting sempid consistent · a5f4db87
      Davidlohr Bueso committed
      As indicated by bug#112271, Linux sets the sempid value upon semctl, and
      not only for semop calls.  However, within semctl we only do this for
      SETVAL, leaving SETALL without updating the field, and therefore rather
      inconsistent behavior when compared to other Unices.
      
      There is really no documentation regarding this and therefore users
      should not make assumptions.  With this patch, along with updating
      semctl.2 manpages, this scenario should become less ambiguous.  As such,
      set sempid on SETALL cmd.
      
      Also update some in-code documentation, specifying where the sempid is
      set.
      
      Passes ltp and custom testcase where a child (fork) does SETALL to the
      set.
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Reported-by: Philip Semanchuk <linux_kernel.20.ick@spamgourmet.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Herton R. Krzesinski <herton@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ubsan: fix tree-wide -Wmaybe-uninitialized false positives · dde5cf39
      Andrey Ryabinin committed
      -fsanitize=* options make GCC less smart than usual and increase the
      number of 'maybe-uninitialized' false positives.  So this patch does two
      things:
      
       * Add -Wno-maybe-uninitialized to CFLAGS_UBSAN which will disable all
         such warnings for instrumented files.
      
       * Remove CONFIG_UBSAN_SANITIZE_ALL from all[yes|mod]config builds. So
         the all[yes|mod]config build goes without -fsanitize=* and still with
         -Wmaybe-uninitialized.
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kfifo: fix sparse complaints · 21b2f443
      Stefani Seibold committed
      This patch fixes complaints from the sparse tool when using kfifo_put()
      with non-scalar types like structures (e.g. in
      drivers/iio/industrialio-event.c).
      
      Casting a pointer to the value and reading through this pointer, instead
      of directly casting the value, fixes this.
      
      The generated code is equal.
      Signed-off-by: Stefani Seibold <stefani@seibold.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • scripts/gdb: account for changes in module data structure · ad4db3b2
      Jan Kiszka committed
      Commit 7523e4dc ("module: use a structure to encapsulate layout.")
      factored out the module_layout structure.  Adjust the symbol loader and
      the lsmod command to this.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Reviewed-by: Kieran Bingham <kieran.bingham@linaro.org>
      Tested-by: Kieran Bingham <kieran.bingham@linaro.org> (qemu-{ARM,x86})
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: <stable@vger.kernel.org>	[4.4+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • scripts/gdb: add cmdline reader command · 72bf92ec
      Kieran Bingham committed
      lx-cmdline: report the Linux command line used by the current kernel.
      
      [jan.kiszka@siemens.com: remove blank line from help output and fix pep8 warning]
      Signed-off-by: Kieran Bingham <kieran.bingham@linaro.org>
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • scripts/gdb: add version command · 2d061d99
      Kieran Bingham committed
      lx-version: report the Linux version of the current kernel.
      
      Add a command to identify the version specified by the banner in the
      debugged kernel.
      
      This lets the user identify the version of the running kernel, and will
      let later scripts compare the banner of the attached kernel against the
      banner in the vmlinux symbols files to verify that the files are
      correct.
      
      [jan.kiszka@siemens.com: remove blank line from help output and fix pep8 warning]
      Signed-off-by: Kieran Bingham <kieran.bingham@linaro.org>
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel: add kcov code coverage · 5c9a8750
      Dmitry Vyukov committed
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
      kcov does not aim to collect as much coverage as possible.  It aims to
      collect more or less stable coverage that is a function of syscall
      inputs.  To achieve this goal it does not collect coverage in soft/hard
      interrupts, and instrumentation of some inherently non-deterministic or
      non-interesting parts of the kernel is disabled (e.g.  scheduler, locking).
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
      This patch adds the necessary support on the kernel side.  The
      complementary compiler support was added in gcc revision 231296.
      
      We've used this support to build syzkaller system call fuzzer, which has
      found 90 kernel bugs in just 2 months:
      
        https://github.com/google/syzkaller/wiki/Found-Bugs
      
      We've also found 30+ bugs in our internal systems with syzkaller.
      Another (yet unexplored) direction where kcov coverage would greatly
      help is more traditional "blob mutation".  For example, mounting a
      random blob as a filesystem, or receiving a random blob over wire.
      
      Why not gcov?  A typical fuzzing loop looks as follows: (1) reset
      coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
      typical coverage can be just a dozen basic blocks (e.g.  for an invalid
      input).  In such a context gcov becomes prohibitively expensive, as the
      reset/collect coverage steps depend on the total number of basic
      blocks/edges in the program (in the case of the kernel it is about 2M).
      The cost of kcov depends only on the number of executed basic
      blocks/edges.  On top of that, the kernel requires per-thread coverage
      because there are always background threads and unrelated processes that
      also produce coverage.  With inlined gcov instrumentation per-thread
      coverage is not possible.
      
      kcov exposes kernel PCs and control flow to user-space which is
      insecure.  But debugfs should not be mapped as user accessible.
      
      Based on a patch by Quentin Casasnovas.
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
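The reset / execute / collect loop from the commit message above might be driven from user space roughly as follows. This is a sketch, not part of the commit text: it assumes the debugfs file `/sys/kernel/debug/kcov` and the `KCOV_INIT_TRACE` / `KCOV_ENABLE` / `KCOV_DISABLE` ioctls as the kernel's kcov documentation describes them, a kernel built with CONFIG_KCOV, and debugfs mounted. The function name `kcov_trace_one_syscall()`, the buffer size and the choice of `getppid()` as the traced system call are arbitrary choices for illustration. On systems without kcov the function simply returns -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* ioctl numbers as given in the kernel's kcov documentation. */
#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
#define KCOV_ENABLE     _IO('c', 100)
#define KCOV_DISABLE    _IO('c', 101)
#define COVER_WORDS     (64UL << 10)  /* buffer size in unsigned longs */

/* Trace one system call and copy up to max_pcs kernel PCs into pcs.
 * Returns the number of PCs copied, or -1 if kcov is not usable here. */
long kcov_trace_one_syscall(unsigned long *pcs, size_t max_pcs)
{
	unsigned long *cover;
	unsigned long n, i;
	long copied = -1;
	int fd;

	assert(pcs != NULL || max_pcs == 0);

	fd = open("/sys/kernel/debug/kcov", O_RDWR);
	if (fd < 0)
		return -1;
	/* Select trace mode and the size of the shared buffer. */
	if (ioctl(fd, KCOV_INIT_TRACE, COVER_WORDS) != 0)
		goto out_close;
	cover = mmap(NULL, COVER_WORDS * sizeof(unsigned long),
		     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (cover == MAP_FAILED)
		goto out_close;
	/* Start collecting coverage for the current thread only. */
	if (ioctl(fd, KCOV_ENABLE, 0) != 0)
		goto out_unmap;

	/* (1) reset: word 0 holds the number of recorded PCs. */
	__atomic_store_n(&cover[0], 0UL, __ATOMIC_RELAXED);
	/* (2) execute a bit of kernel code. */
	(void)getppid();
	/* (3) collect: cost depends only on how many PCs were recorded. */
	n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
	ioctl(fd, KCOV_DISABLE, 0);

	copied = 0;
	for (i = 0; i < n && i + 1 < COVER_WORDS && (size_t)copied < max_pcs; i++)
		pcs[copied++] = cover[i + 1];

out_unmap:
	munmap(cover, COVER_WORDS * sizeof(unsigned long));
out_close:
	close(fd);
	return copied;
}
```

The reset step is a single store and the collect step reads only as many words as were recorded, which is the cost property the commit message contrasts with gcov.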
    • profile: hide unused functions when !CONFIG_PROC_FS · ade356b9
      Arnd Bergmann committed
      A couple of functions and variables in the profile implementation are
      used only on SMP systems by the procfs code, but are unused if either
      procfs is disabled or in uniprocessor kernels.  gcc prints a harmless
      warning about the unused symbols:
      
        kernel/profile.c:243:13: error: 'profile_flip_buffers' defined but not used [-Werror=unused-function]
         static void profile_flip_buffers(void)
                     ^
        kernel/profile.c:266:13: error: 'profile_discard_flip_buffers' defined but not used [-Werror=unused-function]
         static void profile_discard_flip_buffers(void)
                     ^
        kernel/profile.c:330:12: error: 'profile_cpu_callback' defined but not used [-Werror=unused-function]
         static int profile_cpu_callback(struct notifier_block *info,
                    ^
      
      This adds further #ifdef to the file, to annotate exactly in which cases
      they are used.  I have done several thousand ARM randconfig kernels with
      this patch applied and no longer get any warnings in this file.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hpwdt: use nmi_panic() when kernel panics in NMI handler · abc514c5
      Hidehiro Kawai committed
      Commit 1717f209 ("panic, x86: Fix re-entrance problem due to panic
      on NMI") introduced nmi_panic() which prevents concurrent and recursive
      execution of panic().  It also saves registers for the crash dump on x86
      by later commit 58c5661f ("panic, x86: Allow CPUs to save registers
      even if looping in NMI context").
      
      The hpwdt driver can call panic() from its NMI handler, so replace it
      with nmi_panic().  Also, do some cleanups.
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Thomas Mingarelli <thomas.mingarelli@hpe.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ipmi/watchdog: use nmi_panic() when kernel panics in NMI handler · 73cbf4a1
      Hidehiro Kawai committed
      Commit 1717f209 ("panic, x86: Fix re-entrance problem due to panic
      on NMI") introduced nmi_panic() which prevents concurrent and recursive
      execution of panic().  It also saves registers for the crash dump on x86
      by later commit 58c5661f ("panic, x86: Allow CPUs to save registers
      even if looping in NMI context").
      
      The ipmi_watchdog driver can call panic() from its NMI handler, so
      replace it with nmi_panic().
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Corey Minyard <cminyard@mvista.com>
      Acked-by: Guenter Roeck <linux@roeck-us.net>
      Reviewed-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • panic: change nmi_panic from macro to function · ebc41f20
      Hidehiro Kawai committed
      Commit 1717f209 ("panic, x86: Fix re-entrance problem due to panic
      on NMI") and commit 58c5661f ("panic, x86: Allow CPUs to save
      registers even if looping in NMI context") introduced nmi_panic() which
      prevents concurrent/recursive execution of panic().  It also saves
      registers for the crash dump on x86.
      
      However, there are some cases where NMI handlers still use panic().
      This patch set partially replaces them with nmi_panic() in those cases.
      
      Even with this patchset applied, some NMI or similar handlers (e.g. the
      MCE handler) continue to use panic().  This is because I can't test them
      well and actual problems are unlikely to occur.  For example, the
      possibility that a normal panic and a panic on MCE happen simultaneously
      is very low.
      
      This patch (of 3):
      
      Convert nmi_panic() to a proper function and export it instead of
      exporting internal implementation details to modules, for obvious
      reasons.
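
The way nmi_panic() prevents concurrent and recursive panics can be modelled in userspace roughly as below. This is a minimal sketch, not the kernel implementation: `panic_cpu` and `PANIC_CPU_INVALID` follow the names used in the kernel source, while `nmi_panic_decide()` and its enum are hypothetical helpers that return the decision instead of calling panic() or stopping the CPU.

```c
#include <stdatomic.h>

#define PANIC_CPU_INVALID (-1)

/* CPU that owns the panic path; PANIC_CPU_INVALID means nobody yet. */
static atomic_int panic_cpu = PANIC_CPU_INVALID;

/* Hypothetical result type: the real function acts instead of returning. */
enum nmi_panic_action {
	DO_PANIC,	/* this CPU won the race and should run panic() */
	SELF_STOP,	/* another CPU is already panicking: park this one */
	JUST_RETURN	/* this CPU is already inside panic(): do nothing */
};

/*
 * Userspace model of the decision taken on entry from an NMI handler.
 * A single compare-and-swap elects exactly one panicking CPU.
 */
static enum nmi_panic_action nmi_panic_decide(int cpu)
{
	int expected = PANIC_CPU_INVALID;

	if (atomic_compare_exchange_strong(&panic_cpu, &expected, cpu))
		return DO_PANIC;
	/* On failure, 'expected' holds the CPU that already owns the panic. */
	if (expected != cpu)
		return SELF_STOP;
	return JUST_RETURN;
}
```

Keeping this logic behind one exported function means a module only needs the nmi_panic() symbol, not the state it manipulates.
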
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Javi Merino <javi.merino@arm.com>
      Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
      Cc: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ebc41f20
    • P
      eventfd: document lockless access in eventfd_poll · a484c3dd
      Paolo Bonzini committed
      Since commit e22553e2 ("eventfd: don't take the spinlock in
      eventfd_poll", 2015-02-17), eventfd is reading ctx->count outside
      ctx->wqh.lock.
      
      However, things aren't as simple as the read barrier in eventfd_poll
      would suggest.  In fact, the read barrier, besides lacking a comment, is
      not paired in any obvious manner with another read barrier, and it is
      pointless because it is sitting between a write (deep in poll_wait) and
      the read of ctx->count.  The read barrier is acting just as a compiler
      barrier, for which we can use READ_ONCE instead.  This is what the code
      change in this patch does.
      
      The documentation change is just as important, however.  The question,
      posed by Andrea Arcangeli, is then why the thing is safe on
      architectures where spin_unlock does not imply a store-load memory
      barrier.  The answer is that it's safe because writes of ctx->count use
      the same lock as poll_wait, and hence an acquire barrier implicit in
      poll_wait provides the necessary synchronization between eventfd_poll
      and callers of wake_up_locked_poll.  This is sort of mentioned in the
      commit message with respect to eventfd_ctx_read ("eventfd_read is
      similar, it will do a single decrement with the lock held") but it
      applies to all other callers too.  It's tricky enough that it should be
      documented in the code.
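
The READ_ONCE part of the change can be illustrated with a small userspace sketch. It only models the single, non-cached load of the counter followed by the event-mask computation; it does not model the locking or the acquire barrier discussed above. The `read_once_u64()` helper, the `eventfd_ctx_model` struct, and the `MODEL_*` constants are hypothetical stand-ins, not kernel definitions.

```c
#include <stdint.h>

/* Illustrative event bits; the values are assumptions for this sketch. */
#define MODEL_POLLIN  0x1u
#define MODEL_POLLOUT 0x4u
#define MODEL_POLLERR 0x8u

/* Hypothetical stand-in for the context; only the counter is modelled. */
struct eventfd_ctx_model {
	uint64_t count;
};

/*
 * Simplified equivalent of READ_ONCE() for one type: the volatile access
 * forces exactly one load, so the compiler cannot re-read or cache it.
 */
static inline uint64_t read_once_u64(const uint64_t *p)
{
	return *(const volatile uint64_t *)p;
}

/* Compute the event mask from a single snapshot of the counter. */
static unsigned int eventfd_poll_model(const struct eventfd_ctx_model *ctx)
{
	unsigned int events = 0;
	uint64_t count = read_once_u64(&ctx->count);

	if (count > 0)
		events |= MODEL_POLLIN;		/* something to read */
	if (count == UINT64_MAX)
		events |= MODEL_POLLERR;	/* counter overflowed */
	if (UINT64_MAX - 1 > count)
		events |= MODEL_POLLOUT;	/* room for another write */
	return events;
}
```

Taking one snapshot matters because all three tests must see the same value of the counter, even if a writer updates it concurrently.
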
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a484c3dd
    • A
      cred/userns: define current_user_ns() as a function · 0335695d
      Arnd Bergmann committed
      The current_user_ns() macro currently returns &init_user_ns when user
      namespaces are disabled, and that causes several warnings when building
      with gcc-6.0 in code that compares the result of the macro to
      &init_user_ns itself:
      
        fs/xfs/xfs_ioctl.c: In function 'xfs_ioctl_setattr_check_projid':
        fs/xfs/xfs_ioctl.c:1249:22: error: self-comparison always evaluates to true [-Werror=tautological-compare]
          if (current_user_ns() == &init_user_ns)
      
      This is a legitimate warning in principle, but here it isn't really
      helpful, so I'm rephrasing the definition in a way that shuts up the
      warning.  Apparently gcc only warns when comparing identical literals,
      but it can figure out that the result of an inline function can be
      identical to a constant expression in order to optimize a condition yet
      not warn about the fact that the condition is known at compile time.
      This is exactly what we want here, and it looks reasonable because we
      generally prefer inline functions over macros anyway.
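
The difference between the two definitions can be sketched in plain C as follows. This is an illustration under assumptions, not the kernel header: the struct body is a placeholder, and `current_user_ns_macro()` and `in_init_user_ns()` are hypothetical names introduced here to show the old form and a typical caller.

```c
/* Placeholder type; the real structure has many more members. */
struct user_namespace {
	int level;
};

struct user_namespace init_user_ns = { 0 };

/*
 * Old form (shown for comparison only): the macro expands to the literal
 * &init_user_ns, so "current_user_ns_macro() == &init_user_ns" becomes
 * "&init_user_ns == &init_user_ns" and triggers -Wtautological-compare.
 */
#define current_user_ns_macro() (&init_user_ns)

/*
 * New form: an inline function.  The optimizer can still fold the
 * comparison to a constant, but the operands are no longer textually
 * identical, so the compiler does not warn.
 */
static inline struct user_namespace *current_user_ns(void)
{
	return &init_user_ns;
}

/* Hypothetical caller in the style of the xfs check quoted above. */
static int in_init_user_ns(void)
{
	return current_user_ns() == &init_user_ns;
}
```
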
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0335695d
    • A
      rapidio: add mport char device driver · e8de3701
      Alexandre Bounine committed
      Add mport character device driver to provide user space interface to
      basic RapidIO subsystem operations.
      
      See included Documentation/rapidio/mport_cdev.txt for more details.
      
      [akpm@linux-foundation.org: fix printk warning on i386]
      [dan.carpenter@oracle.com: mport_cdev: fix some error codes]
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Tested-by: Barry Wood <barry.wood@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Cc: Barry Wood <barry.wood@idt.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e8de3701
    • A
      rapidio/tsi721_dma: fix hardware error handling · 458bdf6e
      Alexandre Bounine committed
      Add DMA channel re-initialization after an error to avoid termination of
      all pending transfer requests.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Reported-by: Barry Wood <barry.wood@idt.com>
      Tested-by: Barry Wood <barry.wood@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Cc: Barry Wood <barry.wood@idt.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      458bdf6e
    • A
      rapidio/tsi721_dma: fix synchronization issues · e680b672
      Alexandre Bounine committed
      Fix synchronization issues found during testing using multiple DMA
      transfer requests to the same channel:
      
       - lost MSI-X interrupt notifications
       - non-synchronized attempts to start DMA channel HW resulting in error
         message from the driver
       - cookie tracking/update race conditions resulting in incorrect DMA
         transfer status report
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Reported-by: Barry Wood <barry.wood@idt.com>
      Tested-by: Barry Wood <barry.wood@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Cc: Barry Wood <barry.wood@idt.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e680b672
    • A
      rapidio/tsi721_dma: update error reporting from prep_sg callback · 83472457
      Alexandre Bounine committed
      Switch to returning an error-valued pointer instead of a plain NULL
      pointer.  This makes it possible to identify the case where the request
      queue is full, which gives the upper layer the option to retry the
      operation later.
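
The error-valued pointer convention can be sketched in userspace as below. The `ERR_PTR()`, `PTR_ERR()`, and `IS_ERR()` helpers mirror the kernel helpers of the same names in simplified form; `struct tx_desc`, `prep_request()`, the queue depth, and the choice of `-EBUSY` are assumptions made for illustration and are not taken from the tsi721 driver.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error-valued pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer encodes an error rather than a real address. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical descriptor and fixed-size request queue. */
struct tx_desc {
	int id;
};

#define QUEUE_DEPTH 2
static struct tx_desc queue[QUEUE_DEPTH];
static int queued;

/*
 * Hypothetical prep routine: a full queue is reported as -EBUSY so the
 * caller can tell "retry later" apart from other failures, which a bare
 * NULL return could not express.
 */
static struct tx_desc *prep_request(void)
{
	if (queued == QUEUE_DEPTH)
		return ERR_PTR(-EBUSY);
	queue[queued].id = queued;
	return &queue[queued++];
}
```

A caller would check `IS_ERR()` on the result and, for a queue-full code, resubmit the request once earlier transfers have completed.
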
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      83472457
    • A
      rapidio/tsi721: add filtered debug output · 72d8a0d2
      Alexandre Bounine committed
      Replace "all-or-nothing" debug output with controlled debug output using
      functional block masks.  This allows run time control of debug messages
      through 'dbg_level' module parameter.
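
Mask-based debug filtering of this kind can be sketched as follows. Only the `dbg_level` name comes from the commit message; the `DBG_*` block bits and the `dbg_print()` helper are hypothetical, and in this sketch `dbg_level` is an ordinary variable rather than a module parameter.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical functional-block bits; one bit per area of the driver. */
enum {
	DBG_NONE = 0,
	DBG_INIT = 1u << 0,
	DBG_DMA  = 1u << 1,
	DBG_MSG  = 1u << 2
};

/* Run-time filter: a bitwise OR of the blocks whose messages are wanted. */
static unsigned int dbg_level = DBG_NONE;

/*
 * Print the message only if its block is enabled in dbg_level.
 * Returns 1 if the message was emitted, 0 if it was filtered out.
 */
static int dbg_print(unsigned int block, const char *fmt, ...)
{
	va_list ap;

	if (!(dbg_level & block))
		return 0;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	return 1;
}
```

Setting `dbg_level = DBG_DMA | DBG_MSG` would then enable DMA and messaging traces while leaving initialization messages silent.
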
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      72d8a0d2
    • A
      rapidio/tsi721: add outbound windows mapping support · 1679e8da
      Alexandre Bounine committed
      Add device-specific callback functions to support outbound windows
      mapping and release.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1679e8da
    • A
      rapidio: add outbound window support · 93bdaca5
      Alexandre Bounine committed
      Add RapidIO controller (mport) outbound window configuration operations.
      
      This patch is a part of the original patch submitted by Li Yang:
      
         https://lists.ozlabs.org/pipermail/linuxppc-dev/2009-April/071210.html
      
      For some reason that part of the original patch was not applied to the
      mainline tree.  The inbound window mapping part was applied later,
      during the tsi721 mport driver submission.  This patch adds the second
      (outbound) part together with the corresponding HW support.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Li Yang <leoli@freescale.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      93bdaca5
    • A
      rapidio/tsi721: fix locking in OB_MSG processing · 2ece1caf
      Alexandre Bounine committed
      - Add spinlock protection into outbound message queuing routine.
      
      - Change outbound message interrupt handler to avoid deadlock when
        calling registered callback routine.
      
      - Allow infinite retries for outbound messages to avoid retry threshold
        error signaling in systems with nodes that have slow message receive
        queue processing.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2ece1caf
    • A
      rapidio: add global inbound port write interfaces · 9a0b0627
      Alexandre Bounine committed
      Add new Port Write handler registration interfaces that attach PW
      handlers to local mport device objects.  This differs from the old
      interface, which attaches a PW callback to an individual RapidIO device.
      The new interfaces are intended for common event handling (e.g.
      hot-plug notifications), while the old interface remains available for
      individual device drivers.
      
      This patch is based on a patch proposed by Andre van Herk but preserves
      the existing per-device interface and adds lock protection for list
      handling.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a0b0627
    • A
      rapidio: move rio_pw_enable into core code · b6cb95e8
      Alexandre Bounine committed
      Make the rio_pw_enable() routine available to other RapidIO drivers.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b6cb95e8
    • A
      rapidio: move rio_local_set_device_id function to the common core · 5024622f
      Alexandre Bounine committed
      Make the function rio_local_set_device_id() common to all components of
      the RapidIO subsystem.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5024622f
    • A
      rapidio: add lock protection for doorbell list · a7b4c636
      Alexandre Bounine committed
      Add lock protection around doorbell list handling to prevent list
      corruption on SMP platforms.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a7b4c636
    • A
      rapidio/rionet: add mport removal handling · b7dfca8b
      Alexandre Bounine committed
      Add handling of a local mport device removal.
      
      The RIONET driver registers itself as a class interface that supports
      only removal notification.  An 'add_device' callback is not provided
      because the RIONET network device can be initialized only after
      enumeration is completed, and the existing method (using remote peer
      addition) satisfies this condition.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b7dfca8b