1. 02 May, 2017 1 commit
  2. 01 May, 2017 1 commit
  3. 30 April, 2017 1 commit
  4. 28 April, 2017 1 commit
    • x86/KASLR: Fix kexec kernel boot crash when KASLR randomization fails · da63b6b2
      Baoquan He committed
      Dave found that a kdump kernel with KASLR enabled will reset to the BIOS
      immediately if physical randomization failed to find a new position for
      the kernel. A kernel with the 'nokaslr' option works in this case.
      
      The reason is that KASLR will install a new page table for the identity
      mapping, while it missed building it for the original kernel location
      if KASLR physical randomization fails.
      
      This only happens in the kexec/kdump kernel, because the identity mapping
      has been built for kexec/kdump in the 1st kernel for the whole memory by
      calling init_pgtable(). Here if physical randomization fails, it won't build
      the identity mapping for the original area of the kernel but still switches
      to a new page table '_pgtable'. Then the kernel will triple fault immediately
      because there is no identity mapping for it.
      
      The normal kernel won't see this bug, because it comes here via startup_32()
      and CR3 will be set to _pgtable already. In startup_32() the identity
      mapping is built for the 0~4G area. In KASLR we just append to the existing
      area instead of entirely overwriting it for on-demand identity mapping
      building. So the identity mapping for the original area of kernel is still
      there.
      
      To fix it we just switch to the new identity mapping page table when physical
      KASLR succeeds. Otherwise we keep the old page table unchanged just like
      "nokaslr" does.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1493278940-5885-1-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
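The control flow described in the x86/KASLR commit message above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct boot_state`, `pick_physical_addr()` and `choose_page_table()` are hypothetical names that do not appear in the commit, and the page tables are modeled as plain enum values rather than real paging structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of which page table the boot code is running on. */
enum page_table {
    PGTABLE_INHERITED, /* identity mapping that existed before KASLR ran */
    PGTABLE_KASLR      /* new table ('_pgtable') built on demand by KASLR */
};

/* Hypothetical boot state; none of these fields are taken from the kernel. */
struct boot_state {
    uint64_t orig_addr;     /* where the kernel was loaded */
    uint64_t chosen_addr;   /* where the kernel will run */
    enum page_table active; /* page table currently in use */
};

/*
 * Stand-in for physical randomization. 'candidate' is the slot that was
 * found, or 0 when no slot was found. Returns true only when the kernel
 * really moves to a new location.
 */
bool pick_physical_addr(struct boot_state *st, uint64_t candidate)
{
    assert(st != NULL);
    if (candidate == 0 || candidate == st->orig_addr) {
        st->chosen_addr = st->orig_addr; /* randomization failed */
        return false;
    }
    st->chosen_addr = candidate;
    return true;
}

/* Switch page tables only on success; otherwise behave like 'nokaslr'. */
void choose_page_table(struct boot_state *st, bool randomized)
{
    assert(st != NULL);
    if (randomized)
        st->active = PGTABLE_KASLR;
    /* On failure the inherited table, which still maps orig_addr, is kept. */
}
```

Calling `pick_physical_addr(&st, 0)` followed by `choose_page_table(&st, false)` leaves `st.active` at `PGTABLE_INHERITED`, which corresponds to the failure case the commit describes.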
  5. 27 April, 2017 3 commits
  6. 26 April, 2017 2 commits
    • x86/unwind: Dump all stacks in unwind_dump() · 262fa734
      Josh Poimboeuf committed
      Currently unwind_dump() dumps only the most recently accessed stack.
      But it has a few issues.
      
      In some cases, 'first_sp' can get out of sync with 'stack_info', causing
      unwind_dump() to start from the wrong address, flood the printk buffer,
      and eventually read a bad address.
      
      In other cases, dumping only the most recently accessed stack doesn't
      give enough data to diagnose the error.
      
      Fix both issues by dumping *all* stacks involved in the trace, not just
      the last one.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 8b5e99f0 ("x86/unwind: Dump stack data on warnings")
      Link: http://lkml.kernel.org/r/016d6a9810d7d1bfc87ef8c0e6ee041c6744c909.1493171120.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Silence more entry-code related warnings · b0d50c7b
      Josh Poimboeuf committed
      Borislav Petkov reported the following unwinder warning:
      
        WARNING: kernel stack regs at ffffc9000024fea8 in udevadm:92 has bad 'bp' value 00007fffc4614d30
        unwind stack type:0 next_sp:          (null) mask:0x6 graph_idx:0
        ffffc9000024fea8: 000055a6100e9b38 (0x55a6100e9b38)
        ffffc9000024feb0: 000055a6100e9b35 (0x55a6100e9b35)
        ffffc9000024feb8: 000055a6100e9f68 (0x55a6100e9f68)
        ffffc9000024fec0: 000055a6100e9f50 (0x55a6100e9f50)
        ffffc9000024fec8: 00007fffc4614d30 (0x7fffc4614d30)
        ffffc9000024fed0: 000055a6100eaf50 (0x55a6100eaf50)
        ffffc9000024fed8: 0000000000000000 ...
        ffffc9000024fee0: 0000000000000100 (0x100)
        ffffc9000024fee8: ffff8801187df488 (0xffff8801187df488)
        ffffc9000024fef0: 00007ffffffff000 (0x7ffffffff000)
        ffffc9000024fef8: 0000000000000000 ...
        ffffc9000024ff10: ffffc9000024fe98 (0xffffc9000024fe98)
        ffffc9000024ff18: 00007fffc4614d00 (0x7fffc4614d00)
        ffffc9000024ff20: ffffffffffffff10 (0xffffffffffffff10)
        ffffc9000024ff28: ffffffff811c6c1f (SyS_newlstat+0xf/0x10)
        ffffc9000024ff30: 0000000000000010 (0x10)
        ffffc9000024ff38: 0000000000000296 (0x296)
        ffffc9000024ff40: ffffc9000024ff50 (0xffffc9000024ff50)
        ffffc9000024ff48: 0000000000000018 (0x18)
        ffffc9000024ff50: ffffffff816b2e6a (entry_SYSCALL_64_fastpath+0x18/0xa8)
        ...
      
      It unwound from an interrupt which came in right after entry code
      called into a C syscall handler, before it had a chance to set up the
      frame pointer, so regs->bp still had its user space value.
      
      Add a check to silence warnings in such a case, where an interrupt
      has occurred and regs->sp is almost at the end of the stack.
      Reported-by: Borislav Petkov <bp@suse.de>
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: c32c47c6 ("x86/unwind: Warn on bad frame pointer")
      Link: http://lkml.kernel.org/r/c695f0d0d4c2cfe6542b90e2d0520e11eb901eb5.1493171120.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 25 April, 2017 1 commit
  8. 24 April, 2017 2 commits
  9. 21 April, 2017 2 commits
    • x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder · dc912c30
      Steven Rostedt (VMware) committed
      Fengguang Wu's zero day bot triggered a stack unwinder dump. This can
      be easily triggered when CONFIG_FRAME_POINTERS is enabled and -mfentry
      is in use on x86_32.
      
       ># cd /sys/kernel/debug/tracing
       ># echo 'p:schedule schedule' > kprobe_events
       ># echo stacktrace > events/kprobes/schedule/trigger
      
      This is because the code that implemented fentry in the ftrace_regs_caller
      tried to use the least amount of #ifdefs, and modified ebp when
      CC_USING_FENTRY was defined to point to the parent ip as it does when
      CC_USING_FENTRY is not defined. But when CONFIG_FRAME_POINTERS is set, it
      corrupts the ebp register for this frame while doing the tracing.
      
      NOTE, it does not corrupt ebp in any other way. It is just a bad frame
      pointer when calling into the tracing infrastructure. The original ebp is
      restored before returning from the fentry call. But if a stack trace is
      performed inside the tracing, the unwinder will notice the bad ebp.
      
      Instead of toying with ebp with CC_USING_FENTRY, just slap the parent ip
      into the second parameter (%edx), and have an #else that does it the
      original way.
      
      The unwinder will unfortunately miss the function being traced, as the
      stack frame is not set up yet for it, as it is for x86_64. But fixing that
      is a bit more complex and did not work before anyway.
      
      This has been tested with and without FRAME_POINTERS being set while using
      -mfentry, as well as using an older compiler that uses mcount.
      Analyzed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Fixes: 644e0e8d ("x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set")
      Reported-by: kernel test robot <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: https://lists.01.org/pipermail/lkp/2017-April/006165.html
      Link: http://lkml.kernel.org/r/20170420172236.7af7f6e5@gandalf.local.home
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • ARCv2: entry: save Accumulator register pair (r58:59) if present · 3d5e8012
      Vineet Gupta committed
      The accumulator is present in configs with FPU and/or DSP MPY (mpy > 6).
      
      Instead of doing this in pt_regs (and thus on every kernel entry/exit),
      this could have been done in context switch (and for user tasks only), as
      currently the kernel doesn't clobber these registers of its own accord.
      However, we will soon start using 64-bit multiply instructions in the kernel,
      which can clobber these. The gcc folks also plan to start using these
      as GPRs, hence it is better to always save/restore them.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  10. 20 April, 2017 7 commits
  11. 19 April, 2017 9 commits
  12. 18 April, 2017 5 commits
    • powerpc/64: Fix HMI exception on LE with CONFIG_RELOCATABLE=y · be5c5e84
      Michael Ellerman committed
      Prior to commit 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi
      interrupts"), the branch from hmi_exception_early() to hmi_exception_realmode()
      was just a bl hmi_exception_realmode, which the linker would turn into a bl to
      the local entry point of hmi_exception_realmode. This was broken when
      CONFIG_RELOCATABLE=y because hmi_exception_realmode() is not in the low part of
      the kernel text that is copied down to 0x0.
      
      But in fixing that, we added a new bug on little endian kernels. Because the
      branch is now a bctrl when CONFIG_RELOCATABLE=y, we branch to the global entry
      point of hmi_exception_realmode(). The global entry point must be called with
      r12 containing the address of hmi_exception_realmode(), because it uses that
      value to calculate the TOC value (r2).
      
      This may manifest as a checkstop, because we take a junk value from r12 which
      came from HSRR1, add a small constant to it and then use that as the TOC
      pointer. The HSRR1 value will have 0x9 as the top nibble, which puts it above
      RAM and somewhere in MMIO space.
      
      Fix it by changing the BRANCH_LINK_TO_FAR() macro to always use r12 to load the
      label we're branching to. This means r12 will be setup correctly on LE, fixing
      this bug, and r12 is also volatile across function calls on BE so it's a good
      choice anyway.
      
      Fixes: 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts")
      Reported-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Acked-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction · 9e1ba4f2
      Ravi Bangoria committed
      If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
      OOPS:
      
        Bad kernel stack pointer cd93c840 at c000000000009868
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        ...
        GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
        ...
        NIP [c000000000009868] resume_kernel+0x2c/0x58
        LR [c000000000006208] program_check_common+0x108/0x180
      
      On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
      not emulate actual store in emulate_step() because it may corrupt the exception
      frame. So the kernel does the actual store operation in exception return code
      i.e. resume_kernel().
      
      resume_kernel() loads the saved stack pointer from memory using lwz, which only
      loads the low 32-bits of the address, causing the kernel crash.
      
      Fix this by loading the 64-bit value instead.
      
      Fixes: be96f633 ("powerpc: Split out instruction analysis part of emulate_step()")
      Cc: stable@vger.kernel.org # v3.18+
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      [mpe: Change log massage, add stable tag]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
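The truncation described in the powerpc/kprobe commit message above can be illustrated in plain C. This is a minimal sketch, not the kernel's code: `struct saved_frame`, `load_sp_32()` and `load_sp_64()` are hypothetical names, and the 32-bit load is modeled as the commit message describes it (only the low 32 bits survive, zero-extended).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exception-frame slot that holds the saved stack pointer. */
struct saved_frame {
    uint64_t saved_sp;
};

/*
 * Models the 32-bit load described in the commit message: only the low
 * 32 bits of the stored value survive, and the result is zero-extended.
 */
uint64_t load_sp_32(const struct saved_frame *f)
{
    assert(f != NULL);
    return (uint64_t)(uint32_t)f->saved_sp;
}

/* Models a full 64-bit load: the stored value is returned unchanged. */
uint64_t load_sp_64(const struct saved_frame *f)
{
    assert(f != NULL);
    return f->saved_sp;
}
```

With an assumed stored value of `0xc000001fcd93c840` (chosen so that its low half matches the `cd93c840` in the oops above; the real value is not given in the commit), `load_sp_32()` returns `0x00000000cd93c840`, which is no longer a valid kernel address, while `load_sp_64()` returns the value intact.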
    • x86: Enable KASLR by default · 6807c846
      Ingo Molnar committed
      KASLR is mature (and important) enough to be enabled by default on x86.
      
      Also enable it by default in the defconfigs.
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: dan.j.williams@intel.com
      Cc: dave.jiang@intel.com
      Cc: dyoung@redhat.com
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Ensure stack pointer is aligned · e335bb51
      Josh Poimboeuf committed
      With frame pointers disabled, on some older versions of GCC (like
      4.8.3), it's possible for the stack pointer to get aligned at a
      half-word boundary:
      
        00000000000004d0 <fib_table_lookup>:
             4d0:       41 57                   push   %r15
             4d2:       41 56                   push   %r14
             4d4:       41 55                   push   %r13
             4d6:       41 54                   push   %r12
             4d8:       55                      push   %rbp
             4d9:       53                      push   %rbx
             4da:       48 83 ec 24             sub    $0x24,%rsp
      
      In such a case, the unwinder ends up reading the entire stack at the
      wrong alignment.  Then the last read goes past the end of the stack,
      hitting the stack guard page:
      
        BUG: stack guard page was hit at ffffc900217c4000 (stack is ffffc900217c0000..ffffc900217c3fff)
        kernel stack overflow (page fault): 0000 [#1] SMP
        ...
      
      Fix it by ensuring the stack pointer is properly aligned before
      unwinding.
      Reported-by: Jirka Hladky <jhladky@redhat.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 7c7900f8 ("x86/unwind: Add new unwind interface and implementations")
      Link: http://lkml.kernel.org/r/cff33847cc9b02fa548625aa23268ac574460d8d.1492436590.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
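The misalignment in the x86/unwind commit message above follows from simple arithmetic: subtracting `0x24` from a stack pointer that is a multiple of 8 leaves it 4 bytes off a word boundary. The sketch below shows a generic alignment check and round-up in C. The helper names `is_aligned()` and `align_up()` are hypothetical, and this is not claimed to be how the kernel patch implements the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when 'addr' is a multiple of 'align' (which must be a power of two). */
bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Rounds 'addr' up to the next multiple of 'align' (a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

For an assumed stack pointer of `0xffffc900217c3f00`, the `sub $0x24,%rsp` in the disassembly above yields `0xffffc900217c3edc`, for which `is_aligned(sp, 8)` is false. `align_up(sp, 8)` gives `0xffffc900217c3ee0`, from which word-sized reads stay on word boundaries.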
    • x86/mce: Update notifier priority check · 415601b1
      Borislav Petkov committed
      Update the check which enforces the registration of MCE decoder notifier
      callbacks with valid priority only, to include mcelog's priority.
      Reported-by: kernel test robot <xiaolong.ye@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: lkp@01.org
      Link: http://lkml.kernel.org/r/20170418073820.i6kl5tggcntwlisa@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  13. 17 April, 2017 1 commit
  14. 16 April, 2017 1 commit
  15. 15 April, 2017 3 commits
    • parisc: fix bugs in pa_memcpy · 409c1b25
      Mikulas Patocka committed
      The patch 554bfece ("parisc: Fix access fault handling in pa_memcpy()")
      reimplements the pa_memcpy function. Unfortunately, it makes the kernel
      unbootable. The crash happens in the function ide_complete_cmd where
      memcpy is called with the same source and destination address.
      
      This patch fixes a few bugs in pa_memcpy:
      
      * When jumping to .Lcopy_loop_16 for the first time, don't skip the
        instruction "ldi 31,t0" (this bug made the kernel unbootable)
      * Use the COND macro when comparing length, so that the comparison is
        64-bit (a theoretical issue, in case the length is greater than
        0xffffffff)
      * Don't use the COND macro after the "extru" instruction (the PA-RISC
        specification says that the upper 32-bits of extru result are undefined,
        although they are set to zero in practice)
      * Fix exception addresses in .Lcopy16_fault and .Lcopy8_fault
      * Rename .Lcopy_loop_4 to .Lcopy_loop_8 (so that it is consistent with
        .Lcopy8_fault)
      
      Cc: <stable@vger.kernel.org> # v4.9+
      Fixes: 554bfece ("parisc: Fix access fault handling in pa_memcpy()")
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Helge Deller <deller@gmx.de>
    • sparc/sysfs: Replace racy task affinity logic · ea875ec9
      Thomas Gleixner committed
      The mmustat_enable sysfs file accessor functions must run code on the
      target CPU. This is achieved by temporarily setting the affinity of the
      calling user space thread to the requested CPU and resetting it to the
      original affinity afterwards.
      
      That's racy vs. concurrent affinity settings for that thread resulting in
      code executing on the wrong CPU and overwriting the new affinity setting.
      
      Replace it by using work_on_cpu() which guarantees to run the code on the
      requested CPU.
      
      Protection against CPU hotplug is not required as the open sysfs file
      already prevents the removal from the CPU offline callback. Using the
      hotplug protected version would actually be wrong because it would deadlock
      against a CPU hotplug operation of the CPU associated to the sysfs file in
      progress.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: fenghua.yu@intel.com
      Cc: tony.luck@intel.com
      Cc: herbert@gondor.apana.org.au
      Cc: rjw@rjwysocki.net
      Cc: peterz@infradead.org
      Cc: benh@kernel.crashing.org
      Cc: bigeasy@linutronix.de
      Cc: jiangshanlai@gmail.com
      Cc: sparclinux@vger.kernel.org
      Cc: viresh.kumar@linaro.org
      Cc: mpe@ellerman.id.au
      Cc: tj@kernel.org
      Cc: lenb@kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131001270.2408@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • powerpc/smp: Replace open coded task affinity logic · 6d11b87d
      Thomas Gleixner committed
      Init task invokes smp_ops->setup_cpu() from smp_cpus_done(). Init task can
      run on any online CPU at this point, but the setup_cpu() callback must
      be invoked on the boot CPU. This is achieved by temporarily setting the
      affinity of the calling user space thread to the requested CPU and resetting
      it to the original affinity afterwards.
      
      That's racy vs. CPU hotplug and concurrent affinity settings for that
      thread resulting in code executing on the wrong CPU and overwriting the
      new affinity setting.
      
      That's actually not a problem in this context as neither CPU hotplug nor
      affinity settings can happen, but the access to task_struct::cpus_allowed
      is about to be restricted.
      
      Replace it with a call to work_on_cpu_safe() which achieves the same result.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Len Brown <lenb@kernel.org>
      Link: http://lkml.kernel.org/r/20170412201042.518053336@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>