1. 16 Apr, 2018 5 commits
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 174e7194
      Linus Torvalds committed
      Pull more perf updates from Thomas Gleixner:
       "A rather large set of perf updates:
      
        Kernel:
      
         - Fix various initialization issues
      
         - Prevent users without CAP_SYS_ADMIN from creating [ku]probes
      
        Tooling:
      
         - Show only failing syscalls with 'perf trace --failure' (Arnaldo
           Carvalho de Melo)
      
                  e.g: See what 'openat' syscalls are failing:
      
              # perf trace --failure -e openat
               762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory
               <SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? >
               790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory
              ^C#
      
         - Show information about the event (freq, nr_samples, total
           period/nr_events) in the annotate --tui and --stdio2 'perf
           annotate' output, similar to the first line in the 'perf report
           --tui', but just for the samples for the annotated symbol
           (Arnaldo Carvalho de Melo)
      
         - Introduce 'perf version --build-options' to show what features were
           linked, aliased as well as a shorter 'perf -vv' (Jin Yao)
      
         - Add a "dso_size" sort order (Kim Phillips)
      
         - Remove redundant ')' in the tracepoint output in 'perf trace'
           (Changbin Du)
      
         - Synchronize x86's cpufeatures.h, no effect on tools (Arnaldo
           Carvalho de Melo)
      
         - Show group details on the title line in the annotate browser and
           'perf annotate --stdio2' output, so that the per-event columns can
           have headers (Arnaldo Carvalho de Melo)
      
         - Fixup vertical line separating metrics from instructions and
           cleaning unused lines at the bottom, both in the annotate TUI
           browser (Arnaldo Carvalho de Melo)
      
         - Remove duplicated 'samples' in lost samples warning in
           'perf report' (Arnaldo Carvalho de Melo)
      
         - Synchronize i915_drm.h, silencing the perf build process,
           automagically adding support for the new DRM_I915_QUERY ioctl
           (Arnaldo Carvalho de Melo)
      
         - Make auxtrace_queues__add_buffer() allocate struct buffer, from a
           patchkit already applied (Adrian Hunter)
      
         - Fix the --stdio2/TUI annotate output to include group details, be
           it for a recorded '{a,b,f}' explicit event group or when forcing
           group display using 'perf report --group' for a set of events not
           recorded as a group (Arnaldo Carvalho de Melo)
      
         - Fix display artifacts in the ui browser (base class for the
           annotate and main report/top TUI browser) related to the extra
           title lines work (Arnaldo Carvalho de Melo)
      
         - perf auxtrace refactorings, leftovers from a previously partially
           processed patchset (Adrian Hunter)
      
         - Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de
           Melo)
      
         - Synchronize i915_drm.h, silencing a perf build warning and in the
           process automagically adding support for a new ioctl command
           (Arnaldo Carvalho de Melo)
      
         - Fix a strncpy issue in uprobe tracing"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
        perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()
        tracing/uprobe_event: Fix strncpy corner case
        perf/core: Fix perf_uprobe_init()
        perf/core: Fix perf_kprobe_init()
        perf/core: Fix use-after-free in uprobe_perf_close()
        perf tests clang: Fix function name for clang IR test
        perf clang: Add support for recent clang versions
        perf tools: Fix perf builds with clang support
        perf tools: No need to include namespaces.h in util.h
        perf hists browser: Remove leftover from row returned from refresh
        perf hists browser: Show extra_title_lines in the 'D' debug hotkey
        perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
        tools headers uapi: Synchronize i915_drm.h
        perf report: Remove duplicated 'samples' in lost samples warning
        perf ui browser: Fixup cleaning unused lines at the bottom
        perf annotate browser: Fixup vertical line separating metrics from instructions
        perf annotate: Show group details on the title line
        perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
        perf/x86/intel: Move regs->flags EXACT bit init
        perf trace: Remove redundant ')'
        ...
    • L
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 19ca90de
      Linus Torvalds committed
      Pull x86 EFI bootup fixlet from Thomas Gleixner:
       "A single fix for an early boot warning caused by invoking
        this_cpu_has() before SMP initialization"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Fix bogus warning during EFI bootup, use boot_cpu_has() instead of this_cpu_has() in build_cr3_noflush()
    • L
      Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 68d54d3f
      Linus Torvalds committed
      Pull irq affinity fixes from Thomas Gleixner:
      
        - Fix error path handling in the affinity spreading code
      
        - Make affinity spreading smarter to avoid issues on systems which
          claim to have hotpluggable CPUs while in fact they can't hotplug
          anything.
      
          So instead of trying to spread the vectors (and thereby the
          associated device queues) to all possible CPUs, spread them on all
          present CPUs first. If there are leftover vectors after that first
          step, they are spread among the possible but not present CPUs, which
          keeps the code backwards compatible for virtual devices and NVMe,
          which allocate a queue per possible CPU, but makes the spreading
          smarter for devices which have fewer queues than possible or present
          CPUs.
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/affinity: Spread irq vectors among present CPUs as far as possible
        genirq/affinity: Allow irq spreading from a given starting point
        genirq/affinity: Move actual irq vector spreading into a helper function
        genirq/affinity: Rename *node_to_possible_cpumask as *node_to_cpumask
        genirq/affinity: Don't return with empty affinity masks on error
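The two-step spreading described in the pull message above can be sketched in plain C. This is only an illustration under simplifying assumptions: CPUs are modelled as bits in a 64-bit mask, NUMA nodes are ignored, and `sketch_spread()` and `sketch_build_affinity()` are hypothetical names, not the functions in kernel/irq/affinity.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): hand out the CPUs set in
 * `cpus` to the vectors round-robin, starting at vector `start`.
 * Returns the index of the vector that would receive the next CPU.
 */
static unsigned int sketch_spread(uint64_t cpus, unsigned int nvecs,
                                  unsigned int start, uint64_t *masks)
{
    unsigned int vec = start % nvecs;

    for (unsigned int cpu = 0; cpu < 64; cpu++) {
        if (!(cpus & (UINT64_C(1) << cpu)))
            continue;
        masks[vec] |= UINT64_C(1) << cpu;
        vec = (vec + 1) % nvecs;
    }
    return vec;
}

/* Spread over present CPUs first, then over possible-but-not-present CPUs. */
void sketch_build_affinity(uint64_t present, uint64_t possible,
                           unsigned int nvecs, uint64_t *masks)
{
    unsigned int next;

    assert(nvecs > 0);
    for (unsigned int i = 0; i < nvecs; i++)
        masks[i] = 0;

    /* Step 1: every present CPU is placed before any non-present one. */
    next = sketch_spread(present, nvecs, 0, masks);
    /* Step 2: continue where step 1 stopped, so leftover vectors fill up. */
    sketch_spread(possible & ~present, nvecs, next, masks);
}
```

With two present CPUs out of four possible and only two vectors, each vector in this sketch ends up with one present CPU plus one non-present CPU, which is the behaviour the message asks for on devices with fewer queues than CPUs.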
    • L
      Merge tag 'for-linus' of git://github.com/openrisc/linux · 9dceab89
      Linus Torvalds committed
      Pull OpenRISC fixlet from Stafford Horne:
       "Just one small thing here, it came in a while back but I didn't have
        anything in my 4.16 queue, still it's the only thing for 4.17 so
        sending it alone.
      
        Small cleanup: remove unused __ARCH_HAVE_MMU define"
      
      * tag 'for-linus' of git://github.com/openrisc/linux:
        openrisc: remove unused __ARCH_HAVE_MMU define
    • L
      Merge tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · b1cb4f93
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix crashes when loading modules built with a different
         CONFIG_RELOCATABLE value by adding CONFIG_RELOCATABLE to vermagic.
      
       - Fix busy loops in the OPAL NVRAM driver if we get certain error
         conditions from firmware.
      
       - Remove tlbie trace points from KVM code that's called in real mode,
         because it causes crashes.
      
       - Fix checkstops caused by invalid tlbiel on Power9 Radix.
      
       - Ensure the set of CPU features we "know" are always enabled is
         actually the minimal set when we build with support for firmware
         supplied CPU features.
      
      Thanks to: Aneesh Kumar K.V, Anshuman Khandual, Nicholas Piggin.
      
      * tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: Fix CPU_FTRS_ALWAYS vs DT CPU features
        powerpc/mm/radix: Fix checkstops caused by invalid tlbiel
        KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
        powerpc/8xx: Fix build with hugetlbfs enabled
        powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops
        powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops
        powerpc/fscr: Enable interrupts earlier before calling get_user()
        powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
        powerpc/modules: Fix crashes by adding CONFIG_RELOCATABLE to vermagic
  2. 14 Apr, 2018 35 commits
    • L
      Merge branch 'akpm' (patches from Andrew) · 18b7fd1c
      Linus Torvalds committed
      Merge yet more updates from Andrew Morton:
      
       - various hotfixes
      
       - kexec_file updates and feature work
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits)
        kernel/kexec_file.c: move purgatories sha256 to common code
        kernel/kexec_file.c: allow archs to set purgatory load address
        kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load
        kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs
        kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs
        kernel/kexec_file.c: split up __kexec_load_purgatory
        kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*
        kernel/kexec_file.c: search symbols in read-only kexec_purgatory
        kernel/kexec_file.c: make purgatory_info->ehdr const
        kernel/kexec_file.c: remove checks in kexec_purgatory_load
        include/linux/kexec.h: silence compile warnings
        kexec_file, x86: move re-factored code to generic side
        x86: kexec_file: clean up prepare_elf64_headers()
        x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer
        x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()
        x86: kexec_file: purge system-ram walking from prepare_elf64_headers()
        kexec_file,x86,powerpc: factor out kexec_file_ops functions
        kexec_file: make use of purgatory optional
        proc: revalidate misc dentries
        mm, slab: reschedule cache_reap() on the same CPU
        ...
    • P
      kernel/kexec_file.c: move purgatories sha256 to common code · df6f2801
      Philipp Rudo committed
      The code to verify the new kernel's SHA digest is applicable to all
      architectures.  Move it to common code.
      
      One problem is the string.c implementation on x86.  Currently sha256
      includes x86/boot/string.h which defines memcpy and memset to be gcc
      builtins.  By moving the sha256 implementation to common code and
      changing the include to linux/string.h both functions are no longer
      defined.  Thus definitions have to be provided in x86/purgatory/string.c.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-12-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      kernel/kexec_file.c: allow archs to set purgatory load address · 3be3f61d
      Philipp Rudo committed
      For s390 new kernels are loaded to fixed addresses in memory before they
      are booted.  With the current code this is a problem as it assumes the
      kernel will be loaded to an 'arbitrary' address.  In particular,
      kexec_locate_mem_hole searches for a large enough memory region and sets
      the load address (kexec_buf->mem) to it.
      
      Luckily there is a simple workaround for this problem.  By returning 1
      in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off.  This
      allows the architecture to set kbuf->mem by hand.  While the trick works
      fine for the kernel, it does not work for the purgatory, because there
      the architectures don't have access to its kexec_buffer.
      
      Give architectures access to the purgatory's kexec_buffer by changing
      kexec_load_purgatory to take a pointer to it.  With this change
      architectures have access to the buffer and can edit it as they need.
      
      A nice side effect of this change is that we can get rid of the
      purgatory_info->purgatory_load_address field, as the information
      stored there can now be accessed directly from kbuf->mem.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load · 8da0b724
      Philipp Rudo committed
      The current code uses the sh_offset field in purgatory_info->sechdrs to
      store a pointer to the current load address of the section.  Depending
      on whether the section will be loaded or not, this is either a pointer into
      purgatory_info->purgatory_buf or kexec_purgatory.  This is not only a
      violation of the ELF standard but also makes the code very hard to
      understand as you cannot tell if the memory you are using is read-only
      or not.
      
      Remove this misuse and store the offset of the section in
      purgatory_info->purgatory_buf in sh_offset.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-10-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
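As a rough illustration of the idea in the commit above, the sketch below keeps a real offset in `sh_offset` and derives the section address from the buffer base on demand. The type `struct sketch_shdr` and the function `sketch_section_addr()` are hypothetical stand-ins, not kernel code; the bounds check is an addition for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for an ELF section header; only the fields used here. */
struct sketch_shdr {
    uint64_t sh_offset; /* offset of the section inside the buffer */
    uint64_t sh_size;   /* size of the section in bytes */
};

/*
 * Compute the address of a section from the buffer base plus sh_offset,
 * instead of stashing a pointer in sh_offset itself.
 * Returns NULL if the section does not fit inside the buffer.
 */
void *sketch_section_addr(void *buf, size_t buf_len,
                          const struct sketch_shdr *sh)
{
    assert(buf != NULL);
    if (sh->sh_offset > buf_len || sh->sh_size > buf_len - sh->sh_offset)
        return NULL;
    return (unsigned char *)buf + sh->sh_offset;
}
```

Because the caller always passes the base buffer explicitly, it stays obvious at each call site whether the memory being addressed is the writable copy or the read-only image.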
    • P
      kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs · 620f697c
      Philipp Rudo committed
      The main loop currently uses quite a lot of variables to update the
      section headers.  Some of them are unnecessary.  So clean them up a
      little.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-9-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs · f1b1cca3
      Philipp Rudo committed
      To update the entry point there is an extra loop over all section
      headers although this can be done in the main loop.  So move it there
      and eliminate the extra loop and variable to store the 'entry section
      index'.
      
      Also, in the main loop, move the usual case, i.e.  non-bss section, out
      of the extra if-block.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-8-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
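The single-loop approach described above might look roughly like the following. This is a simplified sketch, not the code in kernel/kexec_file.c: `struct sketch_sec` and `sketch_layout()` are hypothetical, and bss handling and section flags are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the section header fields used in this sketch. */
struct sketch_sec {
    uint64_t sh_addr;      /* link-time address, rewritten to the load address */
    uint64_t sh_size;      /* size in bytes */
    uint64_t sh_addralign; /* required alignment (0 or 1 means none) */
};

/*
 * Assign load addresses to all sections in one pass and relocate the
 * entry point in the same pass, so no second loop and no saved
 * "entry section index" are needed. Returns the relocated entry point.
 */
uint64_t sketch_layout(struct sketch_sec *secs, size_t n,
                       uint64_t load_base, uint64_t entry)
{
    uint64_t offset = 0;
    uint64_t new_entry = entry;

    assert(secs != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t align = secs[i].sh_addralign ? secs[i].sh_addralign : 1;

        /* Round the running offset up to the section's alignment. */
        offset = (offset + align - 1) / align * align;

        /* Entry point lies in this section: move it along with the section. */
        if (entry >= secs[i].sh_addr &&
            entry < secs[i].sh_addr + secs[i].sh_size)
            new_entry = entry - secs[i].sh_addr + load_base + offset;

        secs[i].sh_addr = load_base + offset;
        offset += secs[i].sh_size;
    }
    return new_entry;
}
```

The entry check has to run before `sh_addr` is overwritten, since it compares against the link-time address of the section.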
    • P
      kernel/kexec_file.c: split up __kexec_load_purgatory · 93045705
      Philipp Rudo committed
      When inspecting __kexec_load_purgatory you find that it has two tasks
      
      	1) setting up the kexec_buffer for the new kernel and,
      	2) setting up pi->sechdrs for the final load address.
      
      The two tasks are independent of each other.  To improve readability
      split up __kexec_load_purgatory into two functions, one for each task,
      and call them directly from kexec_load_purgatory.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-7-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations* · 8aec395b
      Philipp Rudo committed
      When the relocations are applied to the purgatory only the section the
      relocations are applied to is writable.  The other sections, i.e.  the
      symtab and .rel/.rela, are in read-only kexec_purgatory.  Highlight this
      by marking the corresponding variables as 'const'.
      
      While at it also change the signatures of arch_kexec_apply_relocations* to
      take section pointers instead of just the index of the relocation section.
      This removes the second lookup and sanity check of the sections in arch
      code.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      kernel/kexec_file.c: search symbols in read-only kexec_purgatory · 961d921a
      Philipp Rudo committed
      The stripped purgatory does not contain a symtab.  So when looking for
      symbols this is done in read-only kexec_purgatory.  Highlight this by
      marking the corresponding variables as 'const'.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-5-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      kernel/kexec_file.c: make purgatory_info->ehdr const · 65c225d3
      Philipp Rudo committed
      The kexec_purgatory buffer is read-only.  Thus all pointers into
      kexec_purgatory are read-only, too.  Point this out by explicitly
      marking purgatory_info->ehdr as 'const' and update the comments in
      purgatory_info.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      kernel/kexec_file.c: remove checks in kexec_purgatory_load · d2b8178c
      Philipp Rudo committed
      Before the purgatory is loaded several checks are done whether the ELF
      file in kexec_purgatory is valid or not.  These checks are incomplete.
      For example they don't check for the total size of the sections defined
      in the section header table or if the entry point actually points into
      the purgatory.
      
      On the other hand the purgatory, although an ELF file on its own, is
      part of the kernel.  Thus not trusting the purgatory means not trusting
      the kernel build itself.
      
      So remove all validity checks on the purgatory and just trust the kernel
      build.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-3-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      include/linux/kexec.h: silence compile warnings · ee6ebeda
      Philipp Rudo committed
      Patch series "kexec_file: Clean up purgatory load", v2.
      
      Following the discussion with Dave and AKASHI, here are the common code
      patches extracted from my recent patch set (Add kexec_file_load support
      to s390) [1].  The patches were extracted to allow upstream integration
      together with AKASHI's common code patches before the arch code gets
      adjusted to the new base.
      
      The reason for this series is to prepare common code for adding
      kexec_file_load to s390 as well as cleaning up the mis-use of the
      sh_offset field during purgatory load.  In detail this series contains:
      
      Patch #1&2: Minor cleanups/fixes.
      
      Patch #3-9: Clean up the purgatory load/relocation code.  Especially
      remove the mis-use of the purgatory_info->sechdrs->sh_offset field,
      currently holding a pointer into either kexec_purgatory (ro) or
      purgatory_buf (rw) depending on the section.  With these patches the
      section address will be calculated verbosely and sh_offset will contain
      the offset of the section in the stripped purgatory binary
      (purgatory_buf).
      
      Patch #10: Allows architectures to set the purgatory load address.  This
      patch is important for s390 as the kernel and purgatory have to be
      loaded to fixed addresses.  In current code this is impossible as the
      purgatory load is opaque to the architecture.
      
      Patch #11: Moves the x86 purgatory's SHA implementation to the common
      lib/ directory to allow reuse in other architectures.
      
      This patch (of 11)
      
      When building the kernel with CONFIG_KEXEC_FILE enabled gcc prints a
      compile warning multiple times.
      
        In file included from <path>/linux/init/initramfs.c:526:0:
        <path>/include/linux/kexec.h:120:9: warning: `struct kimage' declared inside parameter list [enabled by default]
                 unsigned long cmdline_len);
                 ^
      
      This is because the typedefs for kexec_file_load use struct kimage
      before it is declared.  Fix this by simply forward declaring struct
      kimage.
      
      Link: http://lkml.kernel.org/r/20180321112751.22196-2-prudo@linux.vnet.ibm.com
      Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
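The forward-declaration fix described above can be shown in isolation. In this sketch the typedef and function names (`sketch_load_t`, `sketch_load`) and the single `nr_segments` member are illustrative assumptions, not the declarations from include/linux/kexec.h; only the pattern is the point.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration: without this line, the first mention of
 * `struct kimage` would be inside the parameter list below, giving the
 * type prototype-only scope and triggering the gcc warning quoted above.
 */
struct kimage;

/* Function type that takes a pointer to the (still incomplete) struct. */
typedef void *(sketch_load_t)(struct kimage *image, char *kernel_buf,
                              unsigned long kernel_len);

/* The full definition may follow later, as it does in a real header. */
struct kimage {
    unsigned long nr_segments;
};

/* A function matching the typedef; here it only records one segment. */
void *sketch_load(struct kimage *image, char *kernel_buf,
                  unsigned long kernel_len)
{
    assert(image != NULL);
    (void)kernel_len;
    image->nr_segments = 1;
    return kernel_buf;
}
```

A pointer to an incomplete struct type is all the typedef needs, so the one-line forward declaration is enough and no header reordering is required.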
    • A
      kexec_file, x86: move re-factored code to generic side · babac4a8
      AKASHI Takahiro committed
      In the previous patches, commonly-used routines, exclude_mem_range() and
      prepare_elf64_headers(), were carved out.  Now place them in kexec
      common code.  A prefix "crash_" is given to each of their names to avoid
      possible name collisions.
      
      Link: http://lkml.kernel.org/r/20180306102303.9063-8-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Acked-by: Dave Young <dyoung@redhat.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      x86: kexec_file: clean up prepare_elf64_headers() · eb7dae94
      AKASHI Takahiro committed
      Removing the bufp variable in prepare_elf64_headers() makes the code
      simpler and more understandable.
      
      Link: http://lkml.kernel.org/r/20180306102303.9063-7-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Acked-by: Dave Young <dyoung@redhat.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer · 8d5f894a
      AKASHI Takahiro committed
      While CRASH_MAX_RANGES (== 16) seems to be good enough, a fixed-size
      array is not a good idea in general.
      
      In this patch, the size of the crash_mem buffer is calculated as before
      and the buffer is now dynamically allocated.  This change also allows
      removing the crash_elf_data structure.
      
      Link: http://lkml.kernel.org/r/20180306102303.9063-6-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Acked-by: Dave Young <dyoung@redhat.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
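Replacing a fixed-size range array with a dynamically sized buffer, as the commit above describes, is commonly done in C with a flexible array member. The sketch below is a userspace approximation using calloc(); the `sketch_` names are hypothetical and the layout is an assumption for illustration, not a copy of the kernel's crash_mem definition.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_range {
    unsigned long long start, end;
};

/* Range list whose capacity is chosen at allocation time. */
struct sketch_crash_mem {
    unsigned int max_nr_ranges;   /* capacity of ranges[] */
    unsigned int nr_ranges;       /* entries currently in use */
    struct sketch_range ranges[]; /* flexible array member */
};

/* Allocate a list with room for exactly nr_ranges entries. */
struct sketch_crash_mem *sketch_alloc_crash_mem(unsigned int nr_ranges)
{
    struct sketch_crash_mem *cmem;

    cmem = calloc(1, sizeof(*cmem) + nr_ranges * sizeof(cmem->ranges[0]));
    if (!cmem)
        return NULL;
    cmem->max_nr_ranges = nr_ranges;
    return cmem;
}

/* Append one range; returns 0 on success, -1 if the list is full. */
int sketch_add_range(struct sketch_crash_mem *cmem,
                     unsigned long long start, unsigned long long end)
{
    assert(cmem != NULL);
    if (cmem->nr_ranges >= cmem->max_nr_ranges)
        return -1;
    cmem->ranges[cmem->nr_ranges].start = start;
    cmem->ranges[cmem->nr_ranges].end = end;
    cmem->nr_ranges++;
    return 0;
}
```

The caller counts the ranges first and then allocates once, so the capacity check in `sketch_add_range()` only guards against miscounting rather than imposing a compile-time limit.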
    • A
      x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers() · c72c7e67
      AKASHI Takahiro committed
      The code guarded by CONFIG_X86_64 is necessary on some architectures
      which have a dedicated kernel mapping outside of linear memory mapping.
      (arm64 is among those.)
      
      In this patch, an additional argument, kernel_map, is added to enable or
      disable that code, which removes the #ifdef.
      
      Link: http://lkml.kernel.org/r/20180306102303.9063-5-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Acked-by: Dave Young <dyoung@redhat.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      x86: kexec_file: purge system-ram walking from prepare_elf64_headers() · cbe66016
      AKASHI Takahiro committed
      While prepare_elf64_headers() in x86 looks pretty generic for other
      architectures' use, it contains some code which tries to list crash
      memory regions by walking through system resources, which is not always
      architecture agnostic.  To make this function more generic, the related
      code should be purged.
      
      In this patch, prepare_elf64_headers() simply scans the crash_mem buffer
      passed to it and adds each listed region to the ELF header as a PT_LOAD
      segment.  So walk_system_ram_res(prepare_elf64_headers_callback) has
      been moved ahead of prepare_elf64_headers(), and the callback,
      prepare_elf64_headers_callback(), is now responsible for filling up
      the crash_mem buffer.
      
      Meanwhile, exclude_elf_header_ranges() used to be called every time in
      this callback, which is rather redundant; it is now called only once, in
      prepare_elf_headers().
      
      Link: http://lkml.kernel.org/r/20180306102303.9063-4-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Acked-by: Dave Young <dyoung@redhat.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cbe66016
    • A
      kexec_file,x86,powerpc: factor out kexec_file_ops functions · 9ec4ecef
      AKASHI Takahiro committed
      As arch_kexec_kernel_image_{probe,load}(),
      arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig()
      are almost duplicated among architectures, they can be commonalized with
      an architecture-defined kexec_file_ops array.  So let's factor them out.
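
      As an illustration of the ops-array pattern described above, here is a
      userspace sketch in which generic code walks a NULL-terminated,
      architecture-provided table and picks the first loader whose probe
      accepts the image.  All names (sketch_file_ops, sketch_loaders,
      sketch_image_probe) and the two toy probes are hypothetical, not the
      kernel's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-format operations; only probe is modelled here. */
struct sketch_file_ops {
	const char *name;
	int (*probe)(const char *buf, size_t len);
};

static int probe_elf(const char *buf, size_t len)
{
	return (len >= 4 && memcmp(buf, "\177ELF", 4) == 0) ? 0 : -ENOEXEC;
}

static int probe_raw(const char *buf, size_t len)
{
	(void)buf;
	return len > 0 ? 0 : -ENOEXEC;
}

static const struct sketch_file_ops elf_ops = { "elf", probe_elf };
static const struct sketch_file_ops raw_ops = { "raw", probe_raw };

/* The part an architecture would supply: its list of supported loaders. */
static const struct sketch_file_ops *const sketch_loaders[] = {
	&elf_ops, &raw_ops, NULL,
};

/* The part that can be shared: return the first loader accepting the image. */
static const struct sketch_file_ops *sketch_image_probe(const char *buf, size_t len)
{
	assert(buf != NULL);
	for (size_t i = 0; sketch_loaders[i] != NULL; i++)
		if (sketch_loaders[i]->probe(buf, len) == 0)
			return sketch_loaders[i];
	return NULL;
}
```

      With this shape, only the table differs between architectures, while
      the lookup loop exists once.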
      
      Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Acked-by: Dave Young <dyoung@redhat.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9ec4ecef
    • A
      kexec_file: make use of purgatory optional · b799a09f
      AKASHI Takahiro committed
      Patch series "kexec_file, x86, powerpc: refactoring for other
      architectures", v2.
      
      This is a preparatory patchset for adding kexec_file support on arm64.
      
      It was originally included in an arm64 patch set[1], but Philipp is also
      working on their kexec_file support on s390[2] and some changes are now
      conflicting.
      
      So these common parts were extracted and put into a separate patch set
      for better integration.  What's more, my original patch#4 was split into
      a few small chunks for easier review after Dave's comment.
      
      As such, the resulting code is basically identical with my original, and
      the only *visible* differences are:
      
       - renaming of _kexec_kernel_image_probe() and  _kimage_file_post_load_cleanup()
      
       - change the type of one of the arguments of prepare_elf64_headers()
      
      Those, unfortunately, require a couple of trivial changes on the rest
      (#1, #6 to #13) of my arm64 kexec_file patch set[1].
      
      Patch #1 allows making a use of purgatory optional, particularly useful
      for arm64.
      
      Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load,
      verify_sig}() and arch_kimage_file_post_load_cleanup() across
      architectures.
      
      Patches #3-#7 are also intended to generalize parse_elf64_headers(),
      along with exclude_mem_range(), to be made best re-use of.
      
      [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html
      [2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html
      
      This patch (of 7):
      
      On arm64, the crash dump kernel's usable memory is protected by
      *unmapping* it from kernel virtual space, unlike other architectures
      where the region is just made read-only.  It is therefore highly
      unlikely that the region is accidentally corrupted, which justifies
      dropping the digest check code from purgatory as well.  The resulting
      code is simple enough that it does not need the rather ugly
      re-linking/relocation step, i.e.  arch_kexec_apply_relocations_add().
      
      Please see:
      
         http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
      
      All that the purgatory does is to shuffle arguments and jump into a new
      kernel, while we still need to have some space for a hash value
      (purgatory_sha256_digest) which is never checked against.
      
      As such, it doesn't make sense to have trampoline code between the old
      kernel and the new kernel on arm64.
      
      This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and
      allows related code to be compiled in only if necessary.
      
      [takahiro.akashi@linaro.org: fix trivial screwup]
        Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org
      Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Acked-by: Dave Young <dyoung@redhat.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b799a09f
    • A
      proc: revalidate misc dentries · 1da4d377
      Alexey Dobriyan committed
      If a module removes a proc directory while another process pins it by
      chdir'ing into it, then a subsequently recreated proc entry, and all
      entries below it in the tree, will not be visible to any process until
      the pinning process leaves the directory and unpins everything.
      
      Steps to reproduce:
      
      	proc_mkdir("aaa", NULL);
      	proc_create("aaa/bbb", ...);
      
      		chdir("/proc/aaa");
      
      	remove_proc_entry("aaa/bbb", NULL);
      	remove_proc_entry("aaa", NULL);
      
      	proc_mkdir("aaa", NULL);
      	# inaccessible because "aaa" dentry still points
      	# to the original "aaa".
      	proc_create("aaa/bbb", ...);
      
      Fix is to implement ->d_revalidate and ->d_delete.
      
      Link: http://lkml.kernel.org/r/20180312201938.GA4871@avx2
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1da4d377
    • V
      mm, slab: reschedule cache_reap() on the same CPU · a9f2a846
      Vlastimil Babka committed
      cache_reap() is initially scheduled in start_cpu_timer() via
      schedule_delayed_work_on(). But then the next iterations are scheduled
      via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.
      
      Thus since commit ef557180 ("workqueue: schedule WORK_CPU_UNBOUND
      work on wq_unbound_cpumask CPUs") there is no guarantee the future
      iterations will run on the originally intended cpu, although it's still
      preferred.  I was able to demonstrate this with
      /sys/module/workqueue/parameters/debug_force_rr_cpu.  IIUC, it may also
      happen due to migrating timers in nohz context.  As a result, some cpu's
      would be calling cache_reap() more frequently and others never.
      
      This patch uses schedule_delayed_work_on() with the current cpu when
      scheduling the next iteration.
      
      Link: http://lkml.kernel.org/r/20180411070007.32225-1-vbabka@suse.cz
      Fixes: ef557180 ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Stephen Boyd <sboyd@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a9f2a846
    • P
      kexec: export PG_swapbacked to VMCOREINFO · 1cbf29da
      Petr Tesarik committed
      Since commit 6326fec1 ("mm: Use owner_priv bit for PageSwapCache,
      valid when PageSwapBacked"), PG_swapcache is an alias for
      PG_owner_priv_1, which may be also used for other purposes.
      
      To know whether the bit indeed has the PG_swapcache meaning, it is
      necessary to check PG_swapbacked, hence this bit must be exported.
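
      For a dump-analysis tool, the rule above amounts to testing two flag
      bits instead of one.  A minimal sketch, assuming the tool has already
      read both bit numbers from VMCOREINFO; the struct, the function name
      and any concrete bit numbers used with it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers as a tool would read them from VMCOREINFO (values vary). */
struct sketch_vmcoreinfo {
	unsigned int pg_swapbacked;
	unsigned int pg_swapcache;	/* shares its bit with PG_owner_priv_1 */
};

/* The shared bit only means "swap cache" when the page is swap-backed. */
static bool sketch_page_in_swap_cache(uint64_t flags,
				      const struct sketch_vmcoreinfo *vi)
{
	assert(vi->pg_swapbacked < 64 && vi->pg_swapcache < 64);

	bool swapbacked = (flags >> vi->pg_swapbacked) & 1;
	bool shared_bit = (flags >> vi->pg_swapcache) & 1;

	return swapbacked && shared_bit;
}
```

      Without the exported PG_swapbacked value, the first test cannot be
      written, and a page that uses the shared bit for another purpose would
      be misclassified.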
      
      Link: http://lkml.kernel.org/r/20180410161345.142e142d@ezekiel.suse.cz
      Signed-off-by: Petr Tesarik <ptesarik@suse.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Xunlei Pang <xlpang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1cbf29da
    • E
      ipc/shm: fix use-after-free of shm file via remap_file_pages() · 3f05317d
      Eric Biggers committed
      syzbot reported a use-after-free of shm_file_data(file)->file->f_op in
      shm_get_unmapped_area(), called via sys_remap_file_pages().
      
      Unfortunately it couldn't generate a reproducer, but I found a bug which
      I think caused it.  When remap_file_pages() is passed a full System V
      shared memory segment, the memory is first unmapped, then a new map is
      created using the ->vm_file.  Between these steps, the shm ID can be
      removed and reused for a new shm segment.  But, shm_mmap() only checks
      whether the ID is currently valid before calling the underlying file's
      ->mmap(); it doesn't check whether it was reused.  Thus it can use the
      wrong underlying file, one that was already freed.
      
      Fix this by making the "outer" shm file (the one that gets put in
      ->vm_file) hold a reference to the real shm file, and by making
      __shm_open() require that the file associated with the shm ID matches
      the one associated with the "outer" file.
      
      Taking the reference to the real shm file is needed to fully solve the
      problem, since otherwise sfd->file could point to a freed file, which
      then could be reallocated for the reused shm ID, causing the wrong shm
      segment to be mapped (and without the required permission checks).
      
      Commit 1ac0b6de ("ipc/shm: handle removed segments gracefully in
      shm_mmap()") almost fixed this bug, but it didn't go far enough because
      it didn't consider the case where the shm ID is reused.
      
      The following program usually reproduces this bug:
      
      	#include <stdlib.h>
      	#include <sys/shm.h>
      	#include <sys/syscall.h>
      	#include <unistd.h>
      
      	int main()
      	{
      		int is_parent = (fork() != 0);
      		srand(getpid());
      		for (;;) {
      			int id = shmget(0xF00F, 4096, IPC_CREAT|0700);
      			if (is_parent) {
      				void *addr = shmat(id, NULL, 0);
      				usleep(rand() % 50);
      				while (!syscall(__NR_remap_file_pages, addr, 4096, 0, 0, 0));
      			} else {
      				usleep(rand() % 50);
      				shmctl(id, IPC_RMID, NULL);
      			}
      		}
      	}
      
      It causes the following NULL pointer dereference due to a 'struct file'
      being used while it's being freed.  (I couldn't actually get a KASAN
      use-after-free splat like in the syzbot report.  But I think it's
      possible with this bug; it would just take a more extraordinary race...)
      
      	BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
      	PGD 0 P4D 0
      	Oops: 0000 [#1] SMP NOPTI
      	CPU: 9 PID: 258 Comm: syz_ipc Not tainted 4.16.0-05140-gf8cf2f16 #189
      	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
      	RIP: 0010:d_inode include/linux/dcache.h:519 [inline]
      	RIP: 0010:touch_atime+0x25/0xd0 fs/inode.c:1724
      	[...]
      	Call Trace:
      	 file_accessed include/linux/fs.h:2063 [inline]
      	 shmem_mmap+0x25/0x40 mm/shmem.c:2149
      	 call_mmap include/linux/fs.h:1789 [inline]
      	 shm_mmap+0x34/0x80 ipc/shm.c:465
      	 call_mmap include/linux/fs.h:1789 [inline]
      	 mmap_region+0x309/0x5b0 mm/mmap.c:1712
      	 do_mmap+0x294/0x4a0 mm/mmap.c:1483
      	 do_mmap_pgoff include/linux/mm.h:2235 [inline]
      	 SYSC_remap_file_pages mm/mmap.c:2853 [inline]
      	 SyS_remap_file_pages+0x232/0x310 mm/mmap.c:2769
      	 do_syscall_64+0x64/0x1a0 arch/x86/entry/common.c:287
      	 entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [ebiggers@google.com: add comment]
        Link: http://lkml.kernel.org/r/20180410192850.235835-1-ebiggers3@gmail.com
      Link: http://lkml.kernel.org/r/20180409043039.28915-1-ebiggers3@gmail.com
      Reported-by: syzbot+d11f321e7f1923157eac80aa990b446596f46439@syzkaller.appspotmail.com
      Fixes: c8d78c18 ("mm: replace remap_file_pages() syscall with emulation")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3f05317d
    • A
      mm/filemap.c: provide dummy filemap_page_mkwrite() for NOMMU · 45397228
      Arnd Bergmann committed
      Building orangefs on MMU-less machines now results in a link error
      because of the newly introduced use of the filemap_page_mkwrite()
      function:
      
        ERROR: "filemap_page_mkwrite" [fs/orangefs/orangefs.ko] undefined!
      
      This adds a dummy version for it, similar to the existing
      generic_file_mmap and generic_file_readonly_mmap stubs in the same file,
      to avoid the link error without adding #ifdefs in each file system that
      uses these.
      
      Link: http://lkml.kernel.org/r/20180409105555.2439976-1-arnd@arndb.de
      Fixes: a5135eea ("orangefs: implement vm_ops->fault")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Martin Brandenburg <martin@omnibond.com>
      Cc: Mike Marshall <hubcap@omnibond.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      45397228
    • M
      mm/gup.c: document return value · d0811078
      Michael S. Tsirkin committed
      __get_user_pages_fast handles errors differently from
      get_user_pages_fast: the former always returns the number of pages
      pinned, the latter might return a negative error code.
      
      Link: http://lkml.kernel.org/r/1522962072-182137-6-git-send-email-mst@redhat.com
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thorsten Leemhuis <regressions@leemhuis.info>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d0811078
    • M
      get_user_pages_fast(): return -EFAULT on access_ok failure · c61611f7
      Michael S. Tsirkin committed
      get_user_pages_fast is supposed to be a faster drop-in equivalent of
      get_user_pages.  As such, callers expect it to return a negative return
      code when passed an invalid address, and never expect it to return 0
      when passed a positive number of pages, since its documentation says:
      
       * Returns number of pages pinned. This may be fewer than the number
       * requested. If nr_pages is 0 or negative, returns 0. If no pages
       * were pinned, returns -errno.
      
      When get_user_pages_fast falls back on get_user_pages this is exactly
      what happens.  Unfortunately the implementation is inconsistent: it
      returns 0 if passed a kernel address, confusing callers: for example,
      the following is pretty common but does not appear to do the right thing
      with a kernel address:
      
              ret = get_user_pages_fast(addr, 1, writeable, &page);
              if (ret < 0)
                      return ret;
      
      Change get_user_pages_fast to return -EFAULT when supplied a kernel
      address to make it match expectations.
      
      All callers have been audited for consistency with the documented
      semantics.
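
      The effect on the caller pattern quoted above can be modelled in
      userspace.  This is only a sketch: the address-space boundary, the
      stand-in for access_ok() and every sketch_* name are hypothetical, and
      the mock "pins" pages simply by returning the requested count.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical boundary between user and kernel addresses. */
#define SKETCH_USER_TOP  UINT64_C(0x0000800000000000)
#define SKETCH_PAGE_SIZE UINT64_C(4096)

/* Stand-in for access_ok(): true if [addr, addr + len) is a user range. */
static bool sketch_access_ok(uint64_t addr, uint64_t len)
{
	return len <= SKETCH_USER_TOP && addr <= SKETCH_USER_TOP - len;
}

/* Model of the inconsistent behaviour: a kernel address yields 0. */
static int sketch_gup_fast_old(uint64_t addr, int nr_pages)
{
	if (nr_pages <= 0)
		return 0;
	if (!sketch_access_ok(addr, (uint64_t)nr_pages * SKETCH_PAGE_SIZE))
		return 0;
	return nr_pages;	/* pretend every page was pinned */
}

/* Model of the changed behaviour: a kernel address yields -EFAULT. */
static int sketch_gup_fast_new(uint64_t addr, int nr_pages)
{
	if (nr_pages <= 0)
		return 0;
	if (!sketch_access_ok(addr, (uint64_t)nr_pages * SKETCH_PAGE_SIZE))
		return -EFAULT;
	return nr_pages;
}

/* The common caller pattern: only negative values are treated as errors. */
static int sketch_caller(int (*gup)(uint64_t, int), uint64_t addr)
{
	int ret = gup(addr, 1);

	assert(ret <= 1);
	if (ret < 0)
		return ret;
	return 0;	/* the caller goes on to use the page */
}
```

      With the old model the caller reports success for a kernel address
      even though nothing was pinned; with the new model it sees -EFAULT.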
      
      Link: http://lkml.kernel.org/r/1522962072-182137-4-git-send-email-mst@redhat.com
      Fixes: 5b65c467 ("mm, x86/mm: Fix performance regression in get_user_pages_fast()")
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thorsten Leemhuis <regressions@leemhuis.info>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c61611f7
    • M
      mm/gup_benchmark: handle gup failures · 09e35a4a
      Michael S. Tsirkin committed
      Patch series "mm/get_user_pages_fast fixes, cleanups", v2.
      
      Turns out get_user_pages_fast and __get_user_pages_fast return different
      values on error when given a single page: __get_user_pages_fast returns
      0.  get_user_pages_fast returns either 0 or an error.
      
      Callers of get_user_pages_fast expect an error so fix it up to return an
      error consistently.
      
      Stress the difference between get_user_pages_fast and
      __get_user_pages_fast to make sure callers aren't confused.
      
      This patch (of 3):
      
      __gup_benchmark_ioctl does not handle the case where get_user_pages_fast
      fails:
      
       - a negative return code will cause a buffer overrun
      
       - returning with partial success will cause use of uninitialized
         memory.
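
      The two failure modes listed above both come from advancing through
      the page array without checking the return value.  A userspace sketch
      of a loop that stops on an error or on zero progress and reports how
      many entries are valid; the mock pinning function and all sketch_*
      names are hypothetical and this is not the benchmark's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock state: how many more pages the fake pinning call can still pin. */
struct sketch_gup { long available; };

/* Pins up to nr pages (marks them 1); fails once nothing is left to pin. */
static long sketch_gup_fast(struct sketch_gup *g, long nr, int *pages)
{
	if (g->available == 0)
		return -EFAULT;
	if (nr > g->available)
		nr = g->available;	/* partial success */
	for (long k = 0; k < nr; k++)
		pages[k] = 1;
	g->available -= nr;
	return nr;
}

/* Pin in batches; stop on failure so the index never moves backwards or
 * past entries that were not filled in.  Returns the count of valid entries. */
static long sketch_pin_range(struct sketch_gup *g, long nr_pages, long batch,
			     int *pages)
{
	long i = 0;

	assert(batch > 0);
	while (i < nr_pages) {
		long want = nr_pages - i < batch ? nr_pages - i : batch;
		long nr = sketch_gup_fast(g, want, pages + i);

		if (nr <= 0)
			break;
		i += nr;
	}
	return i;
}
```

      Cleanup code can then release exactly the returned number of entries
      and never touches the uninitialized tail of the array.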
      
      [akpm@linux-foundation.org: simplification]
      Link: http://lkml.kernel.org/r/1522962072-182137-3-git-send-email-mst@redhat.com
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thorsten Leemhuis <regressions@leemhuis.info>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09e35a4a
    • T
      resource: fix integer overflow at reallocation · 60bb83b8
      Takashi Iwai committed
      We've got a bug report indicating a kernel panic at booting on an x86-32
      system, and it turned out to be the invalid PCI resource assigned after
      reallocation.  __find_resource() first aligns the resource start address
      and resets the end address with start+size-1 accordingly, then checks
      whether it's contained.  Here the end address may overflow the integer,
      although resource_contains() still returns true because the function
      validates only start and end address.  So this ends up with returning an
      invalid resource (start > end).
      
      There was already an attempt to cover such a problem in the commit
      47ea91b4 ("Resource: fix wrong resource window calculation"), but
      this case was overlooked.
      
      This patch adds the validity check of the newly calculated resource for
      avoiding the integer overflow problem.
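
      The wraparound can be reproduced with 32-bit arithmetic in userspace.
      The sketch below uses hypothetical names (range32, fits_unchecked,
      fits_checked) and only models the containment test; it is not the
      kernel's __find_resource().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit address range with an inclusive end, as on an x86-32 system. */
struct range32 { uint32_t start, end; };

/* Same comparison the text describes: only start and end are compared. */
static bool range_contains(const struct range32 *outer,
			   const struct range32 *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* end = start + size - 1, which silently wraps modulo 2^32. */
static struct range32 place(uint32_t start, uint32_t size)
{
	assert(size > 0);
	struct range32 r = { start, (uint32_t)(start + size - 1) };
	return r;
}

/* Without a validity check, a wrapped range can pass the containment test. */
static bool fits_unchecked(const struct range32 *avail, uint32_t start,
			   uint32_t size)
{
	struct range32 alloc = place(start, size);

	return range_contains(avail, &alloc);
}

/* With the extra start <= end check, the wrapped range is rejected. */
static bool fits_checked(const struct range32 *avail, uint32_t start,
			 uint32_t size)
{
	struct range32 alloc = place(start, size);

	return alloc.start <= alloc.end && range_contains(avail, &alloc);
}
```

      For a window of 0xffe00000-0xffffffff and a 0x400000-byte request at
      0xffe00000, the end wraps to 0x001fffff, so the unchecked test accepts
      a range whose start is greater than its end.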
      
      Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739
      Link: http://lkml.kernel.org/r/s5hpo37d5l8.wl-tiwai@suse.de
      Fixes: 23c570a6 ("resource: ability to resize an allocated resource")
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Reported-by: Michael Henders <hendersm@shaw.ca>
      Tested-by: Michael Henders <hendersm@shaw.ca>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ram Pai <linuxram@us.ibm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      60bb83b8
    • L
      Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 48023102
      Linus Torvalds committed
      Pull overlayfs updates from Miklos Szeredi:
       "In addition to bug fixes and cleanups there are two new features from
        Amir:
      
         - Consistent inode number support for the case when layers are not
           all on the same filesystem (feature is dubbed "xino").
      
         - Optimize overlayfs file handle decoding. This one touches the
           exportfs interface to allow detecting the disconnected directory
           case"
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: update documentation w.r.t "xino" feature
        ovl: add support for "xino" mount and config options
        ovl: consistent d_ino for non-samefs with xino
        ovl: consistent i_ino for non-samefs with xino
        ovl: constant st_ino for non-samefs with xino
        ovl: allocate anon bdev per unique lower fs
        ovl: factor out ovl_map_dev_ino() helper
        ovl: cleanup ovl_update_time()
        ovl: add WARN_ON() for non-dir redirect cases
        ovl: cleanup setting OVL_INDEX
        ovl: set d->is_dir and d->opaque for last path element
        ovl: Do not check for redirect if this is last layer
        ovl: lookup in inode cache first when decoding lower file handle
        ovl: do not try to reconnect a disconnected origin dentry
        ovl: disambiguate ovl_encode_fh()
        ovl: set lower layer st_dev only if setting lower st_ino
        ovl: fix lookup with middle layer opaque dir and absolute path redirects
        ovl: Set d->last properly during lookup
        ovl: set i_ino to the value of st_ino for NFS export
      48023102
    • L
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · ba2b137d
      Linus Torvalds committed
      Pull thermal management update from Zhang Rui:
      
       - Fix race condition in imx_thermal_probe() (Mikhail Lappo)
      
       - Add cooling device's statistics in sysfs (Viresh Kumar)
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        thermal: Add cooling device's statistics in sysfs
        thermal: imx: Fix race condition in imx_thermal_probe()
      ba2b137d
    • L
      Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · 71893f11
      Linus Torvalds committed
      Pull dmi updates from Jean Delvare.
      
      * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        firmware: dmi_scan: Use lowercase letters for UUID
        firmware: dmi_scan: Add DMI_OEM_STRING support to dmi_matches
        firmware: dmi_scan: Fix UUID length safety check
      71893f11
    • L
      Merge tag 'chrome-platform-for-linus-4.17' of... · f6811370
      Linus Torvalds committed
      Merge tag 'chrome-platform-for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
      
      Pull chrome platform updates from Benson Leung:
      
       - a series from Dmitry to remove platform data from chromeos_laptop.c,
         which was the only user of platform data for the atmel_mxt_ts driver.
      
       - a series to clean up sysfs and debugfs for cros_ec
      
       - other misc cleanups
      
      * tag 'chrome-platform-for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: (22 commits)
        platform/chrome: mfd/cros_ec_dev: Add sysfs entry to set keyboard wake lid angle
        platform/chrome: cros_ec_debugfs: Add PD port info to debugfs
        platform/chrome: cros_ec_debugfs: Use octal permissions '0444'
        platform/chrome: cros_ec_sysfs: use permission-specific DEVICE_ATTR variants
        platform/chrome: cros_ec_sysfs: introduce to_cros_ec_dev define.
        platform/chrome: cros_ec_sysfs: Modify error handling
        platform/chrome: cros_ec_lpc: Add support for Google devices using custom coreboot firmware
        platform/chrome: cros_ec_lpc: wake up from s2idle on Chrome EC
        Input: atmel_mxt_ts - remove platform data support
        platform/chrome: chromeos_laptop - discard data for unneeded boards
        platform/chrome: chromeos_laptop - use device properties for Pixel
        platform/chrome: chromeos_laptop - rely on I2C to set up interrupt trigger
        platform/chrome: chromeos_laptop - use I2C notifier to create devices
        platform/chrome: chromeos_laptop - parse DMI IRQ data once
        platform/chrome: chromeos_laptop - rework i2c peripherals initialization
        platform/chrome: chromeos_laptop - factor out getting IRQ from DMI
        platform/chrome: chromeos_laptop - introduce pr_fmt()
        platform/chrome: chromeos_laptop - stop setting suspend mode for Atmel devices
        platform/chrome: chromeos_laptop - add SPDX identifier
        Input: atmel_mxt_ts - switch ChromeOS ACPI devices to generic props
        ...
      f6811370
    • L
      Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · ca4e7c51
      Linus Torvalds committed
      Pull clk updates from Stephen Boyd:
       "The large diff this time around is from the addition of a new clk
        driver for the TI Davinci family of SoCs. So far those clks have been
        supported with a custom implementation of the clk API in the arch port
        instead of in the CCF. With this driver merged we're one step closer
        to having a single clk API implementation.
      
        The other large diff is from the Amlogic clk driver that underwent
        some major surgery to use regmap. Beyond that, the biggest hitter is
        Samsung which needed some reworks to properly handle clk provider
        power domains and a bunch of PLL rate updates.
      
        The core framework was fairly quiet this round, just getting some
        cleanups and small fixes for some of the more esoteric features. And
        the usual set of driver non-critical fixes, cleanups, and minor
        additions are here as well.
      
        Core:
         - Rejig clk_ops::init() to be a little earlier for phase/accuracy ops
         - debugfs ops macroized to shave some lines of boilerplate code
         - Always calculate the phase instead of caching it in clk_get_phase()
         - More __must_check on bulk clk APIs
      
        New Drivers:
         - TI's Davinci family of SoCs
         - Intel's Stratix10 SoC
         - stm32mp157 SoC
         - Allwinner H6 CCU
         - Silicon Labs SI544 clock generator chip
         - Renesas R-Car M3-N and V3H SoCs
         - i.MX6SLL SoCs
      
        Removed Drivers:
         - ST-Ericsson AB8540/9540
      
        Updates:
         - Mediatek MT2701 and MT7622 audsys support and MT2712 updates
         - STM32F469 DSI and STM32F769 sdmmc2 support
         - GPIO clks can sleep now
         - Spreadtrum SC9860 RTC clks
         - Nvidia Tegra MBIST workarounds and various minor fixes
         - Rockchip phase handling fixes and a memory leak plugged
         - Renesas drivers switch to readl/writel from clk_readl/clk_writel
         - Renesas gained CPU (Z/Z2) and watchdog support
         - Rockchip rk3328 display clks and rk3399 1.6GHz PLL support
         - Qualcomm PM8921 PMIC XO buffers
         - Amlogic migrates to regmap APIs
         - TI Keystone clk latching support
         - Allwinner H3 and H5 video clk fixes
         - Broadcom BCM2835 PLLs needed another bit to enable
         - i.MX6SX CKO mux fix and i.MX7D Video PLL divider fix
         - i.MX6UL/ULL epdc_podf support
         - Hi3798CV200 COMBPHY0 and USB2_OTG_UTMI and phase support for eMMC"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (233 commits)
        clk: davinci: add a reset lookup table for psc0
        clk: imx: add clock driver for imx6sll
        dt-bindings: imx: update clock doc for imx6sll
        clk: imx: add new gate/gate2 wrapper funtion
        clk: imx: Add CLK_IS_CRITICAL flag for busy divider and busy mux
        clk: cs2000: set pm_ops in hibernate-compatible way
        clk: bcm2835: De-assert/assert PLL reset signal when appropriate
        clk: imx7d: Move clks_init_on before any clock operations
        clk: imx7d: Correct ahb clk parent select
        clk: imx7d: Correct dram pll type
        clk: imx7d: Add USB clock information
        clk: socfpga: stratix10: add clock driver for Stratix10 platform
        dt-bindings: documentation: add clock bindings information for Stratix10
        clk: ti: fix flag space conflict with clkctrl clocks
        clk: uniphier: add additional ethernet clock lines for Pro4
        clk: uniphier: add SATA clock control support
        clk: uniphier: add PCIe clock control support
        clk: Add driver for the si544 clock generator chip
        clk: davinci: Remove redundant dev_err calls
        clk: uniphier: add ethernet clock control support for PXs3
        ...
      ca4e7c51
    • L
      Merge tag 'pwm/for-4.17-rc1' of... · daf3ef6e
      Linus Torvalds committed
      Merge tag 'pwm/for-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "This set of changes adds support for more generations of the RCar
        controller as well as runtime PM support. The JZ4740 driver gains
        support for device tree and can now be used on all Ingenic SoCs.
      
        Rounding things off is a random assortment of fixes and cleanups all
        across the board"
      
      * tag 'pwm/for-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (29 commits)
        pwm: rcar: Add suspend/resume support
        pwm: rcar: Use PM Runtime to control module clock
        dt-bindings: pwm: rcar: Add bindings for R-Car M3N support
        pwm: rcar: Fix a condition to prevent mismatch value setting to duty
        pwm: sysfs: Use put_device() instead of kfree()
        dt-bindings: pwm: sunxi: Add new compatible strings
        pwm: sun4i: Simplify controller mapping
        pwm: sun4i: Drop unused .has_rdy member
        pwm: sun4i: Properly check current state
        pwm: Remove depends on AVR32
        pwm: stm32: LPTimer: Use 3 cells ->of_xlate()
        dt-bindings: pwm-stm32-lp: Add #pwm-cells
        pwm: stm32: Protect common prescaler for all channels
        pwm: stm32: Remove unused struct device
        pwm: mediatek: Improve precision in rate calculation
        pwm: mediatek: Remove redundant MODULE_ALIAS entries
        pwm: mediatek: Fix up PWM4 and PWM5 malfunction on MT7623
        pwm: jz4740: Enable for all Ingenic SoCs
        pwm: jz4740: Add support for devicetree
        pwm: jz4740: Implement ->set_polarity()
        ...
      daf3ef6e
    • L
      Merge tag 'linux-watchdog-4.17-rc1' of git://www.linux-watchdog.org/linux-watchdog · 41531f58
      Linus Torvalds committed
      Pull watchdog updates from Wim Van Sebroeck:
      
       - Add Nuvoton NPCM watchdog driver
      
       - renesas_wdt: Add R-Car Gen2 support
      
       - renesas_wdt: add suspend/resume and restart handler support
      
       - hpwdt: convert to watchdog core and improve NMI
      
       - improve timeout setting/handling in various drivers
      
       - coh901327: make license text and module licence match
      
       - fix error handling in asm9260_wdt, sprd_wdt and davinci_wdt
      
       - aspeed improvements
      
       - dw improvements (for control register & suspend/resume)
      
       - add SPDX identifiers for watchdog subsystem
      
      * tag 'linux-watchdog-4.17-rc1' of git://www.linux-watchdog.org/linux-watchdog: (35 commits)
        watchdog: davinci_wdt: fix error handling in davinci_wdt_probe()
        watchdog: add SPDX identifiers for watchdog subsystem
        watchdog: aspeed: Allow configuring for alternate boot
        watchdog: Add Nuvoton NPCM watchdog driver
        dt-bindings: watchdog: Add Nuvoton NPCM description
        watchdog: dw: save/restore control and timeout across suspend/resume
        watchdog: dw: RMW the control register
        watchdog: sprd_wdt: Fix error handling in sprd_wdt_enable()
        watchdog: aspeed: Fix translation of reset mode to ctrl register
        watchdog: renesas_wdt: Add restart handler
        watchdog: renesas_wdt: Add R-Car Gen2 support
        watchdog: renesas_wdt: Add suspend/resume support
        watchdog: f71808e_wdt: Fix WD_EN register read
        watchdog: hpwdt: Update driver version.
        watchdog: hpwdt: Add dynamic debug
        watchdog: hpwdt: Programable Pretimeout NMI
        watchdog: hpwdt: remove allow_kdump module parameter.
        watchdog: hpwdt: condition early return of NMI handler on iLO5
        watchdog: hpwdt: Modify to use watchdog core.
        watchdog: hpwdt: Update nmi_panic message.
        ...
      41531f58