1. 16 Feb, 2018 3 commits
  2. 15 Feb, 2018 18 commits
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e525de3a
      Linus Torvalds committed
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes all across the map:
      
         - /proc/kcore vsyscall related fixes
         - LTO fix
         - build warning fix
         - CPU hotplug fix
         - Kconfig NR_CPUS cleanups
         - cpu_has() cleanups/robustification
         - .gitignore fix
         - memory-failure unmapping fix
         - UV platform fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
        x86/error_inject: Make just_return_func() globally visible
        x86/platform/UV: Fix GAM Range Table entries less than 1GB
        x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore
        x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
        x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally
        vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page
        x86/Kconfig: Further simplify the NR_CPUS config
        x86/Kconfig: Simplify NR_CPUS config
        x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()"
        x86/cpufeature: Update _static_cpu_has() to use all named variables
        x86/cpufeature: Reindent _static_cpu_has()
      e525de3a
    • L
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d4667ca1
      Linus Torvalds committed
      Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar:
       "Here's the latest set of Spectre and PTI related fixes and updates:
      
        Spectre:
         - Add entry code register clearing to reduce the Spectre attack
           surface
         - Update the Spectre microcode blacklist
         - Inline the KVM Spectre helpers to get close to v4.14 performance
           again.
         - Fix indirect_branch_prediction_barrier()
         - Fix/improve Spectre related kernel messages
         - Fix array_index_nospec_mask() asm constraint
         - KVM: fix two MSR handling bugs
      
        PTI:
         - Fix a paranoid entry PTI CR3 handling bug
         - Fix comments
      
        objtool:
         - Fix paranoid_entry() frame pointer warning
         - Annotate WARN()-related UD2 as reachable
         - Various fixes
         - Add Peter Zijlstra as objtool co-maintainer
      
        Misc:
         - Various x86 entry code self-test fixes
         - Improve/simplify entry code stack frame generation and handling
           after recent heavy-handed PTI and Spectre changes. (There's two
           more WIP improvements expected here.)
         - Type fix for cache entries
      
        There's also some low risk non-fix changes I've included in this
        branch to reduce backporting conflicts:
      
         - rename a confusing x86_cpu field name
         - de-obfuscate the naming of single-TLB flushing primitives"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
        x86/entry/64: Fix CR3 restore in paranoid_exit()
        x86/cpu: Change type of x86_cache_size variable to unsigned int
        x86/spectre: Fix an error message
        x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
        selftests/x86/mpx: Fix incorrect bounds with old _sigfault
        x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()
        x86/speculation: Add <asm/msr-index.h> dependency
        nospec: Move array_index_nospec() parameter checking into separate macro
        x86/speculation: Fix up array_index_nospec_mask() asm constraint
        x86/debug: Use UD2 for WARN()
        x86/debug, objtool: Annotate WARN()-related UD2 as reachable
        objtool: Fix segfault in ignore_unreachable_insn()
        selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems
        selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
        selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c
        selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory
        selftests/x86/pkeys: Remove unused functions
        selftests/x86: Clean up and document sscanf() usage
        selftests/x86: Fix vDSO selftest segfault for vsyscall=none
        x86/entry/64: Remove the unused 'icebp' macro
        ...
      d4667ca1
    • I
      x86/entry/64: Fix CR3 restore in paranoid_exit() · e4865757
      Ingo Molnar committed
      Josh Poimboeuf noticed the following bug:
      
       "The paranoid exit code only restores the saved CR3 when it switches back
        to the user GS.  However, even in the kernel GS case, it's possible that
        it needs to restore a user CR3, if for example, the paranoid exception
        occurred in the syscall exit path between SWITCH_TO_USER_CR3_STACK and
        SWAPGS."
      
      Josh also confirmed via targeted testing that it's possible to hit this bug.
      
      Fix the bug by also restoring CR3 in the paranoid_exit_no_swapgs branch.
      
      The reason we haven't seen this bug reported by users yet is probably because
      "paranoid" entry points are limited to the following cases:
      
       idtentry double_fault       do_double_fault  has_error_code=1  paranoid=2
       idtentry debug              do_debug         has_error_code=0  paranoid=1 shift_ist=DEBUG_STACK
       idtentry int3               do_int3          has_error_code=0  paranoid=1 shift_ist=DEBUG_STACK
       idtentry machine_check      do_mce           has_error_code=0  paranoid=1
      
      Amongst those entry points only machine_check is one that will interrupt an
      IRQS-off critical section asynchronously - and machine check events are rare.
      
      The other main asynchronous entries are NMI entries, which can be very high-freq
      with perf profiling, but they are special: they don't use the 'idtentry' macro but
      are open coded and restore user CR3 unconditionally so don't have this bug.
      Reported-and-tested-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180214073910.boevmg65upbk3vqb@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e4865757
    • G
      x86/cpu: Change type of x86_cache_size variable to unsigned int · 24dbc600
      Gustavo A. R. Silva committed
      Currently, x86_cache_size is of type int, which makes no sense as we
      will never have a valid cache size equal to or less than 0. So instead
      of initializing this variable to -1, initialize it to 0 and make it an
      unsigned variable.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Addresses-Coverity-ID: 1464429
      Link: http://lkml.kernel.org/r/20180213192208.GA26414@embeddedor.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      24dbc600
    • D
      x86/spectre: Fix an error message · 9de29eac
      Dan Carpenter committed
      If i == ARRAY_SIZE(mitigation_options) then we accidentally print
      garbage from one element beyond the end of the mitigation_options[] array.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: KarimAllah Ahmed <karahmed@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-janitors@vger.kernel.org
      Fixes: 9005c683 ("x86/spectre: Simplify spectre_v2 command line parsing")
      Link: http://lkml.kernel.org/r/20180214071416.GA26677@mwanda
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9de29eac
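The out-of-bounds read described in "x86/spectre: Fix an error message" follows a common pattern: a lookup loop ends with `i == ARRAY_SIZE(...)` when nothing matches, and later code indexes the table with that `i` anyway. The following is a minimal userspace sketch of the safe pattern, not the kernel's actual code. The table `options[]`, the struct `option_entry`, and the function `lookup_option_name()` are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical option table, loosely modelled on a command-line option list. */
struct option_entry {
	const char *name;
	int cmd;
};

static const struct option_entry options[] = {
	{ "off",  0 },
	{ "on",   1 },
	{ "auto", 2 },
};

/*
 * Return the name of the matching table entry, or NULL if arg matches
 * nothing. When the loop runs to completion, i equals ARRAY_SIZE(options),
 * so options[i] would be one element past the end and must not be read.
 */
const char *lookup_option_name(const char *arg)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(options); i++) {
		if (strcmp(arg, options[i].name) == 0)
			break;
	}

	if (i >= ARRAY_SIZE(options))
		return NULL;	/* caller should report arg itself, not options[i] */

	return options[i].name;
}
```

In the not-found case, an error message should print the string the user supplied rather than anything read through the table index.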
    • J
      x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping · b399151c
      Jia Zhang committed
      x86_mask is a confusing name which is hard to associate with the
      processor's stepping.
      
      Additionally, correct an indent issue in lib/cpu.c.
      Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com>
      [ Updated it to more recent kernels. ]
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: tony.luck@intel.com
      Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b399151c
    • R
      selftests/x86/mpx: Fix incorrect bounds with old _sigfault · 961888b1
      Rui Wang committed
      For distributions with old userspace header files, the _sigfault
      structure is different. mpx-mini-test fails with the following
      error:
      
        [root@Purley]# mpx-mini-test_64 tabletest
        XSAVE is supported by HW & OS
        XSAVE processor supported state mask: 0x2ff
        XSAVE OS supported state mask: 0x2ff
         BNDREGS: size: 64 user: 1 supervisor: 0 aligned: 0
          BNDCSR: size: 64 user: 1 supervisor: 0 aligned: 0
        starting mpx bounds table test
        ERROR: siginfo bounds do not match shadow bounds for register 0
      
      Fix it by using the correct offset of _lower/_upper in _sigfault.
      RHEL needs this patch to work.
      Signed-off-by: Rui Wang <rui.y.wang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave.hansen@linux.intel.com
      Fixes: e754aedc ("x86/mpx, selftests: Add MPX self test")
      Link: http://lkml.kernel.org/r/1513586050-1641-1-git-send-email-rui.y.wang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      961888b1
    • A
      x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]() · 1299ef1d
      Andy Lutomirski committed
      flush_tlb_single() and flush_tlb_one() sound almost identical, but
      they really mean "flush one user translation" and "flush one kernel
      translation".  Rename them to flush_tlb_one_user() and
      flush_tlb_one_kernel() to make the semantics more obvious.
      
      [ I was looking at some PTI-related code, and the flush-one-address code
        is unnecessarily hard to understand because the names of the helpers are
        uninformative.  This came up during PTI review, but no one got around to
        doing it. ]
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linux-MM <linux-mm@kvack.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/3303b02e3c3d049dc5235d5651e0ae6d29a34354.1517414378.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1299ef1d
    • P
      x86/speculation: Add <asm/msr-index.h> dependency · ea00f301
      Peter Zijlstra committed
      Joe Konno reported a compile failure resulting from using an MSR
      without inclusion of <asm/msr-index.h>, and while the current code builds
      fine (by accident) this needs fixing for future patches.
      Reported-by: Joe Konno <joe.konno@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arjan@linux.intel.com
      Cc: bp@alien8.de
      Cc: dan.j.williams@intel.com
      Cc: dave.hansen@linux.intel.com
      Cc: dwmw2@infradead.org
      Cc: dwmw@amazon.co.uk
      Cc: gregkh@linuxfoundation.org
      Cc: hpa@zytor.com
      Cc: jpoimboe@redhat.com
      Cc: linux-tip-commits@vger.kernel.org
      Cc: luto@kernel.org
      Fixes: 20ffa1ca ("x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support")
      Link: http://lkml.kernel.org/r/20180213132819.GJ25201@hirez.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ea00f301
    • W
      nospec: Move array_index_nospec() parameter checking into separate macro · 8fa80c50
      Will Deacon committed
      For architectures providing their own implementation of
      array_index_mask_nospec() in asm/barrier.h, attempting to use WARN_ONCE() to
      complain about out-of-range parameters results in a mess of
      mutually-dependent include files.
      
      Rather than unpick the dependencies, simply have the core code in nospec.h
      perform the checking for us.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1517840166-15399-1-git-send-email-will.deacon@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8fa80c50
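The idea in "nospec: Move array_index_nospec() parameter checking into separate macro" is to keep the branch-free mask computation free of any warning machinery and to do the range check in one common wrapper. The following is a userspace sketch of that split, under stated assumptions. The function names `index_mask_nospec()`, `index_mask_nospec_checked()`, and `clamp_index_nospec()` are hypothetical, the kernel's warning is replaced by simply returning a zero mask, and the shift relies on arithmetic right shift of a negative `long` (true for GCC and Clang, implementation-defined in ISO C).

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * Branch-free mask: ~0UL when index < size, 0 otherwise.
 * Only meaningful when both arguments are <= LONG_MAX. This stands in
 * for the part an architecture might provide in its own header.
 */
unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
}

/*
 * Common wrapper that owns the parameter check, so the mask helper
 * above needs no diagnostics of its own. Out-of-range parameters
 * yield a zero mask (the kernel would additionally warn once).
 */
unsigned long index_mask_nospec_checked(unsigned long index, unsigned long size)
{
	if (index > (unsigned long)LONG_MAX || size > (unsigned long)LONG_MAX)
		return 0;

	return index_mask_nospec(index, size);
}

/* Clamp index to 0 when it is out of bounds, without a conditional branch on index. */
size_t clamp_index_nospec(size_t index, size_t size)
{
	return index & index_mask_nospec_checked(index, size);
}
```

With this layout, an architecture-specific mask helper can be swapped in without pulling warning-related headers into low-level barrier code.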
    • D
      x86/speculation: Fix up array_index_nospec_mask() asm constraint · be3233fb
      Dan Williams committed
      Allow the compiler to handle @size as an immediate value or memory
      directly rather than allocating a register.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/151797010204.1289.1510000292250184993.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      be3233fb
    • P
      x86/debug: Use UD2 for WARN() · 3b3a371c
      Peter Zijlstra committed
      Since the Intel SDM added a ModR/M byte to UD0 and binutils followed
      that specification, we now cannot disassemble our kernel anymore.
      
      This now means Intel and AMD disagree on the encoding of UD0. And instead
      of playing games with additional bytes that are valid ModR/M and single
      byte instructions (0xd6 for instance), simply use UD2 for both WARN() and
      BUG().
      Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180208194406.GD25181@hirez.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3b3a371c
    • J
      x86/debug, objtool: Annotate WARN()-related UD2 as reachable · 2b5db668
      Josh Poimboeuf committed
      By default, objtool assumes that a UD2 is a dead end.  This is mainly
      because GCC 7+ sometimes inserts a UD2 when it detects a divide-by-zero
      condition.
      
      Now that WARN() is moving back to UD2, annotate the code after it as
      reachable so objtool can follow the code flow.
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Link: http://lkml.kernel.org/r/0e483379275a42626ba8898117f918e1bf661e40.1518130694.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2b5db668
    • J
      objtool: Fix segfault in ignore_unreachable_insn() · fe24e271
      Josh Poimboeuf committed
      Peter Zijlstra's patch for converting WARN() to use UD2 triggered a
      bunch of false "unreachable instruction" warnings, which then triggered
      a seg fault in ignore_unreachable_insn().
      
      The seg fault happened when it tried to dereference a NULL 'insn->func'
      pointer.  Thanks to static_cpu_has(), some functions can jump to a
      non-function area in the .altinstr_aux section.  That breaks
      ignore_unreachable_insn()'s assumption that it's always inside the
      original function.
      
      Make sure ignore_unreachable_insn() only follows jumps within the
      current function.
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Link: http://lkml.kernel.org/r/bace77a60d5af9b45eddb8f8fb9c776c8de657ef.1518130694.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      fe24e271
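The fix described in "objtool: Fix segfault in ignore_unreachable_insn()" comes down to one guard: follow a jump only when its destination belongs to the same function as the jump itself. The following is a simplified sketch of that guard and not objtool's real data model. The structs `func` and `insn` and the helper `should_follow_jump()` are hypothetical and carry only the fields needed for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for objtool's bookkeeping. */
struct func {
	const char *name;
};

struct insn {
	struct func *func;	/* NULL for code outside any function */
	struct insn *jump_dest;	/* NULL when this is not a jump */
};

/*
 * Decide whether a walker that is analysing insn->func may continue at
 * the jump destination. A destination in a non-function area (func is
 * NULL) or in a different function is rejected, so later code never
 * dereferences a NULL func pointer.
 */
bool should_follow_jump(const struct insn *insn)
{
	if (insn->func == NULL || insn->jump_dest == NULL)
		return false;

	return insn->jump_dest->func == insn->func;
}
```

A caller would stop walking as soon as this returns false instead of assuming that every destination lies inside the original function.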
    • D
      selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems · 9279ddf2
      Dominik Brodowski committed
      The ldt_gdt and ptrace_syscall selftests, even in their 64-bit variant, use
      hard-coded 32-bit syscall numbers and call "int $0x80".
      
      This will fail on 64-bit systems with CONFIG_IA32_EMULATION disabled.
      
      Therefore, do not build these tests if we cannot build 32-bit binaries
      (which should be a good approximation for CONFIG_IA32_EMULATION=y being enabled).
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kselftest@vger.kernel.org
      Cc: shuah@kernel.org
      Link: http://lkml.kernel.org/r/20180211111013.16888-6-linux@dominikbrodowski.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9279ddf2
    • D
      selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c · 4105c697
      Dominik Brodowski committed
      On 64-bit builds, we should not rely on "int $0x80" working (it only does if
      CONFIG_IA32_EMULATION=y is enabled). To keep the "Set TF and check int80"
      test running on 64-bit installs with CONFIG_IA32_EMULATION=y enabled, build
      this test only if we can also build 32-bit binaries (which should be a
      good approximation for that).
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kselftest@vger.kernel.org
      Cc: shuah@kernel.org
      Link: http://lkml.kernel.org/r/20180211111013.16888-5-linux@dominikbrodowski.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4105c697
    • L
      Merge tag 'gfs2-4.16.rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 6556677a
      Linus Torvalds committed
      Pull gfs2 fix from Bob Peterson:
       "Fix regressions in the gfs2 iomap for block_map implementation we
        recently discovered in commit 3974320c"
      
      * tag 'gfs2-4.16.rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Fixes to "Implement iomap for block_map"
      6556677a
    • L
      Merge tag 'powerpc-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 694a20da
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
       "A larger batch of fixes than we'd like. Roughly 1/3 fixes for new
        code, 1/3 fixes for stable and 1/3 minor things.
      
        There's four commits fixing bugs when using 16GB huge pages on hash,
        caused by some of the preparatory changes for pkeys.
      
        Two fixes for bugs in the enhanced IRQ soft masking for local_t, one
        of which broke KVM in some circumstances.
      
        Four fixes for Power9. The most bizarre being a bug where futexes
        stopped working because a NULL pointer dereference didn't trap during
        early boot (it aliased the kernel mapping). A fix for memory hotplug
        when using the Radix MMU, and a fix for live migration of guests using
        the Radix MMU.
      
        Two fixes for hotplug on pseries machines. One where we weren't
        correctly updating NUMA info when CPUs are added and removed. And the
        other fixes crashes/hangs seen when doing memory hot remove during
        boot, which is apparently a thing people do.
      
        Finally a handful of build fixes for obscure configs and other minor
        fixes.
      
        Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Balbir Singh, Colin
        Ian King, Daniel Henrique Barboza, Florian Weimer, Guenter Roeck,
        Harish, Laurent Vivier, Madhavan Srinivasan, Mauricio Faria de
        Oliveira, Nathan Fontenot, Nicholas Piggin, Sam Bobroff"
      
      * tag 'powerpc-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        selftests/powerpc: Fix to use ucontext_t instead of struct ucontext
        powerpc/kdump: Fix powernv build break when KEXEC_CORE=n
        powerpc/pseries: Fix build break for SPLPAR=n and CPU hotplug
        powerpc/mm/hash64: Zero PGD pages on allocation
        powerpc/mm/hash64: Store the slot information at the right offset for hugetlb
        powerpc/mm/hash64: Allocate larger PMD table if hugetlb config is enabled
        powerpc/mm: Fix crashes with 16G huge pages
        powerpc/mm: Flush radix process translations when setting MMU type
        powerpc/vas: Don't set uses_vas for kernel windows
        powerpc/pseries: Enable RAS hotplug events later
        powerpc/mm/radix: Split linear mapping on hot-unplug
        powerpc/64s/radix: Boot-time NULL pointer protection using a guard-PID
        ocxl: fix signed comparison with less than zero
        powerpc/64s: Fix may_hard_irq_enable() for PMI soft masking
        powerpc/64s: Fix MASKABLE_RELON_EXCEPTION_HV_OOL macro
        powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove
      694a20da
  3. 14 Feb, 2018 2 commits
    • A
      gfs2: Fixes to "Implement iomap for block_map" · 49edd5bf
      Andreas Gruenbacher committed
      It turns out that commit 3974320c "Implement iomap for block_map"
      introduced a few bugs that trigger occasional failures with xfstest
      generic/476:
      
      In gfs2_iomap_begin, we jump to do_alloc when we determine that we are
      beyond the end of the allocated metadata (height > ip->i_height).
      There, we can end up calling hole_size with a metapath that doesn't
      match the current metadata tree, which doesn't make sense.  After
      untangling the code at do_alloc, fix this by checking if the block we
      are looking for is within the range of allocated metadata.
      
      In addition, add a BUG() in case gfs2_iomap_begin is accidentally called
      for reading stuffed files: this is handled separately.  Make sure we
      don't truncate iomap->length for reads beyond the end of the file; in
      that case, the entire range counts as a hole.
      
      Finally, revert to taking a bitmap write lock when doing allocations.
      It's unclear why that change didn't lead to any failures during testing.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      49edd5bf
    • L
      Merge tag 'mips_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips · 61f14c01
      Linus Torvalds committed
      Pull MIPS fix from James Hogan:
       "A single change (and associated DT binding update) to allow the
        address of the MIPS Cluster Power Controller (CPC) to be chosen by DT,
        which allows SMP to work on generic MIPS kernels where the bootloader
        hasn't configured the CPC address (i.e. the new Ranchu platform)"
      
      * tag 'mips_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
        MIPS: CPC: Map registers using DT in mips_cpc_default_phys_base()
        dt-bindings: Document mti,mips-cpc binding
      61f14c01
  4. 13 Feb, 2018 17 commits
    • T
      x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages · fd0e786d
      Tony Luck committed
      In the following commit:
      
        ce0fa3e5 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
      
      ... we added code to memory_failure() to unmap the page from the
      kernel 1:1 virtual address space to avoid speculative access to the
      page logging additional errors.
      
      But memory_failure() may not always succeed in taking the page offline,
      especially if the page belongs to the kernel.  This can happen if
      there are too many corrected errors on a page and either mcelog(8)
      or drivers/ras/cec.c asks to take a page offline.
      
      Since we remove the 1:1 mapping early in memory_failure(), we can
      end up with the page unmapped, but still in use. On the next access
      the kernel crashes :-(
      
      There are also various debug paths that call memory_failure() to simulate
      occurrence of an error. Since there is no actual error in memory, we
      don't need to map out the page for those cases.
      
      Revert most of the previous attempt and keep the solution local to
      arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:
      
      	1) there is a real error
      	2) memory_failure() succeeds.
      
      All of this only applies to 64-bit systems. 32-bit kernel doesn't map
      all of memory into kernel space. It isn't worth adding the code to unmap
      the piece that is mapped because nobody would run a 32-bit kernel on a
      machine that has recoverable machine checks.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert (Persistent Memory) <elliott@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org #v4.14
      Fixes: ce0fa3e5 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      fd0e786d
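The ordering that "x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages" settles on can be shown with a toy model: first try to take the page offline, and drop the 1:1 mapping only if that succeeded and the error was real. This is an illustrative sketch under stated assumptions, not the kernel code. The struct `page_state` and the functions `offline_page()` and `handle_memory_error()` are hypothetical, with `offline_page()` standing in for a call that returns 0 on success.

```c
#include <stdbool.h>

/* Hypothetical model of the two facts that matter for one page. */
struct page_state {
	bool in_use_by_kernel;	/* if true, the page cannot be taken offline */
	bool mapped;		/* present in the kernel 1:1 mapping */
};

/* Stand-in for the offlining step: returns 0 on success, nonzero on failure. */
int offline_page(struct page_state *page)
{
	return page->in_use_by_kernel ? -1 : 0;
}

/*
 * Unmap only when (1) there is a real error and (2) offlining succeeded.
 * A page that stays in use keeps its mapping, so the next access does
 * not fault; a simulated error never removes the mapping.
 */
void handle_memory_error(struct page_state *page, bool real_error)
{
	if (offline_page(page) != 0)
		return;

	if (real_error)
		page->mapped = false;
}
```

Unmapping before the offlining attempt, as the earlier change did, corresponds to clearing `mapped` ahead of the `offline_page()` check, which leaves a page that is still in use without a mapping.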
    • A
      x86/error_inject: Make just_return_func() globally visible · 01684e72
      Arnd Bergmann committed
      With link time optimizations enabled, I get a link failure:
      
        ./ccLbOEHX.ltrans19.ltrans.o: In function `override_function_with_return':
        <artificial>:(.text+0x7f3): undefined reference to `just_return_func'
      
      Marking the symbol .globl makes it work as expected.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 540adea3 ("error-injection: Separate error-injection from kprobe")
      Link: http://lkml.kernel.org/r/20180202145634.200291-3-arnd@arndb.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      01684e72
    • M
      x86/platform/UV: Fix GAM Range Table entries less than 1GB · c25d99d2
      mike.travis@hpe.com committed
      The latest UV platforms include the new ApachePass NVDIMMs into the
      UV address space.  This has introduced address ranges in the Global
      Address Map Table that are less than the previous lowest range, which
      was 2GB.  Fix the address calculation so it accommodates address ranges
      from bytes to exabytes.
      Signed-off-by: Mike Travis <mike.travis@hpe.com>
      Reviewed-by: Andrew Banman <andrew.banman@hpe.com>
      Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <russ.anderson@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180205221503.190219903@stormcage.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c25d99d2
    • P
      x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore · 74eb816b
      Progyan Bhattacharya committed
      The file was generated by the make command and should not be in the source tree.
      Signed-off-by: Progyan Bhattacharya <progyanb@acm.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      74eb816b
    • M
      x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU · 295cc7eb
      Masayoshi Mizuma committed
      When a physical CPU is hot-removed, the following warning messages
      are shown while the uncore device is removed in uncore_pci_remove():
      
        WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988
        uncore_pci_remove+0xf1/0x110
        ...
        CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1
        Workqueue: kacpi_hotplug acpi_hotplug_work_fn
        ...
        Call Trace:
        pci_device_remove+0x36/0xb0
        device_release_driver_internal+0x145/0x210
        pci_stop_bus_device+0x76/0xa0
        pci_stop_root_bus+0x44/0x60
        acpi_pci_root_remove+0x1f/0x80
        acpi_bus_trim+0x54/0x90
        acpi_bus_trim+0x2e/0x90
        acpi_device_hotplug+0x2bc/0x4b0
        acpi_hotplug_work_fn+0x1a/0x30
        process_one_work+0x141/0x340
        worker_thread+0x47/0x3e0
        kthread+0xf5/0x130
      
      When uncore_pci_remove() runs, it tries to get the package ID to
      clear the value of uncore_extra_pci_dev[].dev[] by using
      topology_phys_to_logical_pkg(). The warning messages are
      shown because topology_phys_to_logical_pkg() returns -1.
      
        arch/x86/events/intel/uncore.c:
        static void uncore_pci_remove(struct pci_dev *pdev)
        {
        ...
                phys_id = uncore_pcibus_to_physid(pdev->bus);
        ...
                        pkg = topology_phys_to_logical_pkg(phys_id); // returns -1
                        for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
                                if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
                                        uncore_extra_pci_dev[pkg].dev[i] = NULL;
                                        break;
                                }
                        }
                        WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // <=========== HERE!!
      
      topology_phys_to_logical_pkg() tries to find
      cpuinfo_x86->phys_proc_id that matches the phys_pkg argument.
      
        arch/x86/kernel/smpboot.c:
        int topology_phys_to_logical_pkg(unsigned int phys_pkg)
        {
                int cpu;
      
                for_each_possible_cpu(cpu) {
                        struct cpuinfo_x86 *c = &cpu_data(cpu);
      
                        if (c->initialized && c->phys_proc_id == phys_pkg)
                                return c->logical_proc_id;
                }
                return -1;
        }
      
      However, the phys_proc_id was already set to 0 by remove_siblinginfo()
      when the CPU was offlined.
      
      So, topology_phys_to_logical_pkg() cannot find the correct
      logical_proc_id and always returns -1.
      
      As a result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning
      messages are shown.
      
      What is worse is that the bogus 'pkg' index results in two bugs:
      
       - We dereference uncore_extra_pci_dev[] with a negative index
       - We fail to clean up a stale pointer in uncore_extra_pci_dev[][]
      
      To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo().
      
      This should not cause any problems, because ->phys_proc_id is not
      used after it is hot-removed and it is re-set while hot-adding.
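The lookup failure described above can be reproduced in a small user-space model. This is only a sketch under stated assumptions: `struct cpu_model`, `NR_CPUS_MODEL`, `phys_to_logical_pkg()` and `offline_cpu()` are hypothetical stand-ins written for illustration, not the kernel's definitions, and the CPU table contents are made up.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4   /* hypothetical, tiny on purpose */

/* Simplified stand-in for struct cpuinfo_x86: only the fields used here. */
struct cpu_model {
    bool initialized;
    unsigned int phys_proc_id;
    int logical_proc_id;
};

/* Two packages: physical id 0 -> logical 0, physical id 3 -> logical 1. */
static struct cpu_model cpus[NR_CPUS_MODEL] = {
    { true, 0, 0 }, { true, 0, 0 },
    { true, 3, 1 }, { true, 3, 1 },
};

/* Same search shape as topology_phys_to_logical_pkg(), over the model array. */
static int phys_to_logical_pkg(unsigned int phys_pkg)
{
    for (size_t cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
        const struct cpu_model *c = &cpus[cpu];

        if (c->initialized && c->phys_proc_id == phys_pkg)
            return c->logical_proc_id;
    }
    return -1;
}

/*
 * Model of taking a CPU offline. clear_phys_id == true mimics the old
 * behaviour of remove_siblinginfo(); false mimics the behaviour after the fix.
 */
static void offline_cpu(size_t cpu, bool clear_phys_id)
{
    if (clear_phys_id)
        cpus[cpu].phys_proc_id = 0;
}
```

In this model, offlining both CPUs of physical package 3 with `clear_phys_id` set makes `phys_to_logical_pkg(3)` return -1, which is the value that was then used as an array index. Leaving the id in place keeps the lookup working.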
      Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: yasu.isimatu@gmail.com
      Cc: <stable@vger.kernel.org>
      Fixes: 30bb9811 ("x86/topology: Avoid wasting 128k for package id array")
      Link: http://lkml.kernel.org/r/ed738d54-0f01-b38b-b794-c31dc118c207@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      295cc7eb
    • H
      selftests/powerpc: Fix to use ucontext_t instead of struct ucontext · ecdf06e1
      Harish committed
      With glibc 2.26 'struct ucontext' is removed to improve POSIX
      compliance, which breaks powerpc/alignment_handler selftest. Fix the
      test by using ucontext_t. Tested on ppc, works with older glibc
      versions as well.
      
      Fixes the following:
        alignment_handler.c: In function ‘sighandler’:
        alignment_handler.c:68:5: error: dereferencing pointer to incomplete type ‘struct ucontext’
          ucp->uc_mcontext.gp_regs[PT_NIP] += 4;
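For readers unfamiliar with the type change, here is a minimal, architecture-neutral sketch of a signal handler that receives its context as `ucontext_t`. It is not the selftest's code: `got_context` and `install_and_raise()` are hypothetical names, and the powerpc-specific register access from the error message above is deliberately left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t got_context;

/* SA_SIGINFO handler: the third argument points to a ucontext_t. */
static void sighandler(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *ucp = ctx;   /* portable spelling, not 'struct ucontext *' */

    (void)sig;
    (void)info;
    got_context = (ucp != NULL);
}

/* Install the handler, deliver SIGUSR1 to ourselves, report what it saw. */
static int install_and_raise(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sighandler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (raise(SIGUSR1) != 0)
        return -1;
    return got_context;
}
```

The only point of the sketch is the cast target: code that names the typedef keeps compiling whether or not the C library still exposes a `struct ucontext` tag.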
      Signed-off-by: Harish <harish@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ecdf06e1
    • G
      powerpc/kdump: Fix powernv build break when KEXEC_CORE=n · 91096175
      Guenter Roeck committed
      If KEXEC_CORE is not enabled, powernv builds fail as follows.
      
        arch/powerpc/platforms/powernv/smp.c: In function 'pnv_smp_cpu_kill_self':
        arch/powerpc/platforms/powernv/smp.c:236:4: error:
        	implicit declaration of function 'crash_ipi_callback'
      
      Add dummy function calls, similar to kdump_in_progress(), to solve the
      problem.
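The "dummy function" approach follows a common kernel header pattern, sketched below. The return type and the stub's return value are assumptions made for illustration and may not match the actual patch; `cpu_kill_self_model()` is a hypothetical caller.

```c
struct pt_regs;   /* opaque here; the kernel defines the real layout */

#ifdef CONFIG_KEXEC_CORE
/* With kexec/kdump built in, the real implementation is linked in. */
int crash_ipi_callback(struct pt_regs *regs);
#else
/* Assumed stub: without kdump there is nothing to do, so report "not handled". */
static inline int crash_ipi_callback(struct pt_regs *regs)
{
    (void)regs;
    return 0;
}
#endif

/* Hypothetical caller: compiles in both configurations without an #ifdef. */
static int cpu_kill_self_model(struct pt_regs *regs)
{
    return crash_ipi_callback(regs);
}
```

Keeping the conditional in the header means call sites stay free of `#ifdef` blocks, and the compiler can discard the empty stub entirely.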
      
      Fixes: 4145f358 ("powernv/kdump: Fix cases where the kdump kernel can get HMI's")
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      91096175
    • G
      powerpc/pseries: Fix build break for SPLPAR=n and CPU hotplug · 82343484
      Guenter Roeck committed
      Commit e67e02a5 ("powerpc/pseries: Fix cpu hotplug crash with
      memoryless nodes") adds an unconditional call to
      find_and_online_cpu_nid(), which is only declared if CONFIG_PPC_SPLPAR
      is enabled. This results in the following build error if this is not
      the case.
      
        arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
        arch/powerpc/platforms/pseries/hotplug-cpu.c:369:
        			undefined reference to `.find_and_online_cpu_nid'
      
      Follow the guideline provided by similar functions and provide a dummy
      function if CONFIG_PPC_SPLPAR is not enabled. This also moves the
      external function declaration into an include file where it should be.
      
      Fixes: e67e02a5 ("powerpc/pseries: Fix cpu hotplug crash with memoryless nodes")
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      [mpe: Change subject to emphasise the build fix]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      82343484
    • A
      powerpc/mm/hash64: Zero PGD pages on allocation · fc5c2f4a
      Aneesh Kumar K.V committed
      On powerpc we allocate page table pages from slab caches of different
      sizes. Currently we have a constructor that zeroes out the objects when
      we allocate them for the first time.
      
      We expect the objects to be zeroed out when we free the object
      back to slab cache. This happens in the unmap path. For hugetlb pages
      we call huge_pte_get_and_clear() to do that.
      
      With the current configuration of page table size, both PUD and PGD
      level tables are allocated from the same slab cache. At the PUD level,
      we use the second half of the table to store the slot information. But
      we never clear that when unmapping.
      
      When such a freed object is then allocated for a PGD page, the second
      half of the page table page will not be zeroed as expected. This
      results in a kernel crash.
      
      Fix it by always clearing PGD pages when they're allocated.
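The failure mode can be illustrated with a toy cache whose "constructor" zeroes an object only when it is first created. This is a user-space sketch, not the kernel slab allocator: `struct table_cache`, `cache_alloc()`, `cache_free()`, `pgd_alloc_model()` and `TABLE_BYTES` are all hypothetical names and sizes chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define TABLE_BYTES 64   /* hypothetical table size */

/* Toy cache with a single-slot free list. */
struct table_cache {
    unsigned char *free_obj;
};

/* Constructor semantics: zero only on first creation, never on reuse. */
static unsigned char *cache_alloc(struct table_cache *c)
{
    unsigned char *obj = c->free_obj;

    if (obj) {
        c->free_obj = NULL;   /* handed back exactly as it was freed */
        return obj;
    }
    return calloc(1, TABLE_BYTES);
}

/* Return an object to the cache; the caller is expected to have cleared it. */
static void cache_free(struct table_cache *c, unsigned char *obj)
{
    free(c->free_obj);        /* drop any older cached object */
    c->free_obj = obj;
}

/* The fix in this model: clear explicitly on every PGD allocation. */
static unsigned char *pgd_alloc_model(struct table_cache *c)
{
    unsigned char *pgd = cache_alloc(c);

    if (pgd)
        memset(pgd, 0, TABLE_BYTES);
    return pgd;
}

/* Helper: returns 1 if every byte of the table is zero. */
static int is_all_zero(const unsigned char *p)
{
    for (size_t i = 0; i < TABLE_BYTES; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

If a "PUD" object is freed with data still in its second half, a plain `cache_alloc()` returns that stale data, while `pgd_alloc_model()` always returns a zeroed table regardless of what the previous user left behind.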
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [mpe: Change log wording and formatting, add whitespace]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      fc5c2f4a
    • A
      powerpc/mm/hash64: Store the slot information at the right offset for hugetlb · ff31e105
      Aneesh Kumar K.V committed
      The hugetlb pte entries are at the PMD and PUD level, so we can't use
      PTRS_PER_PTE to find the second half of the page table. Use the right
      offset for PUD/PMD to get to the second half of the table.
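The offset arithmetic can be sketched as follows. The entry counts are hypothetical placeholders (real values depend on the kernel configuration), and `entries_at()` and `slot_index()` are illustrative helpers, not kernel functions.

```c
#include <stddef.h>

/* Hypothetical entry counts per table level, for illustration only. */
enum {
    MODEL_PTRS_PER_PTE = 256,
    MODEL_PTRS_PER_PMD = 1024,
    MODEL_PTRS_PER_PUD = 512
};

enum table_level { LEVEL_PTE, LEVEL_PMD, LEVEL_PUD };

/* Number of entries in the first half of a table at the given level. */
static size_t entries_at(enum table_level level)
{
    switch (level) {
    case LEVEL_PMD:
        return MODEL_PTRS_PER_PMD;
    case LEVEL_PUD:
        return MODEL_PTRS_PER_PUD;
    default:
        return MODEL_PTRS_PER_PTE;
    }
}

/*
 * Slot information for entry i is kept in the second half of the table,
 * so its index is i plus the entry count of that table's own level.
 */
static size_t slot_index(enum table_level level, size_t entry_index)
{
    return entry_index + entries_at(level);
}
```

With these placeholder sizes, using the PTE count for a PMD-level entry would land inside the first half of the PMD table rather than in its second half, which is the mistake the commit describes.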
      
      Fixes: bf9a95f9 ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ff31e105
    • A
      powerpc/mm/hash64: Allocate larger PMD table if hugetlb config is enabled · 4a7aa4fe
      Aneesh Kumar K.V committed
      We use the second half of the page table to store slot information, so we must
      always allocate it when hugetlb is possible.
      
      Fixes: bf9a95f9 ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      4a7aa4fe
    • A
      powerpc/mm: Fix crashes with 16G huge pages · fae22116
      Aneesh Kumar K.V committed
      To support memory keys, we moved the hash pte slot information to the
      second half of the page table. This was ok with PTE entries at level
      4 (PTE page) and level 3 (PMD). We already allocate larger page table
      pages at those levels to accommodate extra details. For level 4 we
      already have the extra space which was used to track 4k hash page
      table entry details and at level 3 the extra space was allocated to
      track the THP details.
      
      With hugetlbfs PTE, we used this extra space at the PMD level to store
      the slot details. But we also support hugetlbfs PTE at PUD level for
      16GB pages and PUD level page didn't allocate extra space. This
      resulted in memory corruption.
      
      Fix this by allocating extra space at PUD level when HUGETLB is
      enabled.
      
      Fixes: bf9a95f9 ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      fae22116
    • A
      powerpc/mm: Flush radix process translations when setting MMU type · 62e984dd
      Alexey Kardashevskiy committed
      Radix guests normally invalidate process-scoped translations when a
      new pid is allocated, but migrated guests do not, so migrated guests
      sometimes crash. This is especially easy to reproduce when migrating
      on the same machine within the first 10 seconds after the guest
      starts booting.
      
      This adds the "Invalidate process-scoped translations" flush to fix
      radix guests migration.
      
      Fixes: 2ee13be3 ("KVM: PPC: Book3S HV: Update kvmppc_set_arch_compat() for ISA v3.00")
      Cc: stable@vger.kernel.org # v4.10+
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Tested-by: Laurent Vivier <lvivier@redhat.com>
      Tested-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      62e984dd
    • N
      powerpc/vas: Don't set uses_vas for kernel windows · b00b6289
      Nicholas Piggin committed
      cp_abort is only required for user windows, because kernel context
      must not be preempted between a copy/paste pair.
      
      Without this patch, the init task gets used_vas set when it runs the
      nx842_powernv_init initcall, which opens windows for kernel usage.
      
      used_vas is then never cleared anywhere, so it gets propagated into
      all other tasks. It's a property of the address space, so it should
      really be cleared when a new mm is created (or in dup_mmap if the
      mmaps are marked as VM_DONTCOPY). For now we seem to have no such
      driver, so leave that for another patch.
      
      Fixes: 6c8e6bb2 ("powerpc/vas: Add support for user receive window")
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      b00b6289
    • S
      powerpc/pseries: Enable RAS hotplug events later · c9dccf1d
      Sam Bobroff committed
      Currently if the kernel receives a memory hot-unplug event early
      enough, it may get stuck in an infinite loop in
      dissolve_free_huge_pages(). This appears as a stall just after:
      
        pseries-hotplug-mem: Attempting to hot-remove XX LMB(s) at YYYYYYYY
      
      It appears to be caused by "minimum_order" being uninitialized, due to
      init_ras_IRQ() executing before hugetlb_init().
      
      To correct this, extract the part of init_ras_IRQ() that enables
      hotplug event processing and place it in the machine_late_initcall
      phase, which is guaranteed to be after hugetlb_init() is called.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      [mpe: Reorder the functions to make the diff readable]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      c9dccf1d
    • J
      x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally · cd026ca2
      Jia Zhang committed
      When dumping /proc/kcore, the vsyscall page should be visible only if
      vsyscall=emulate or vsyscall=native is set.
      Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jolsa@redhat.com
      Link: http://lkml.kernel.org/r/1518446694-21124-3-git-send-email-zhang.jia@linux.alibaba.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cd026ca2
    • J
      vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page · 595dd46e
      Jia Zhang committed
      Commit:
      
        df04abfd ("fs/proc/kcore.c: Add bounce buffer for ktext data")
      
      ... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y.
      However, accessing the vsyscall user page will cause an SMAP fault.
      
      Replacing memcpy() with copy_from_user() fixes this bug, but adding
      a common way to handle this sort of user page may be useful in the future.
      
      Currently, only vsyscall page requires KCORE_USER.
      Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jolsa@redhat.com
      Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      595dd46e