1. 05 May 2017 (3 commits)
    • x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang incompatibility · 121843eb
      Authored by Matthias Kaehlcke
      The constraint "rm" allows the compiler to put mix_const into memory.
      When the input operand is a memory location then MUL needs an operand
      size suffix, since Clang can't infer the multiplication width from the
      operand.
      
      Add and use the _ASM_MUL macro, which determines the operand size and
      resolves to the MUL instruction with the corresponding suffix.
      
      This fixes the following error when building with clang:
      
        CC      arch/x86/lib/kaslr.o
        /tmp/kaslr-dfe1ad.s: Assembler messages:
        /tmp/kaslr-dfe1ad.s:182: Error: no instruction mnemonic suffix given and no register operands; can't size instruction
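      
      A minimal sketch of the macro and its use (the asm.h definition is
      the real one; the kaslr.c operands below are illustrative, not the
      exact diff):
      
        /* arch/x86/include/asm/asm.h: __ASM_SIZE() appends the l/q
         * suffix for 32-bit/64-bit builds, so _ASM_MUL expands to
         * "mull" or "mulq". */
        #define _ASM_MUL	__ASM_SIZE(mul)
      
        /* With the explicit suffix, Clang can assemble the multiply
         * even when "rm" places mix_const in memory. */
        asm(_ASM_MUL " %3"
            : "=a" (low), "=d" (high)	/* MUL writes (E/R)DX:(E/R)AX */
            : "0" (raw), "rm" (mix_const));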
      Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
      Cc: Grant Grundler <grundler@chromium.org>
      Cc: Greg Hackmann <ghackmann@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Davidson <md@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170501224741.133938-1-mka@chromium.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds() · fc5f9d5f
      Authored by Baoquan He
      Jeff Moyer reported that on his system with two memory regions, 0~64G
      and 1T~1T+192G, and the kernel option "memmap=192G!1024G" added,
      enabling KASLR makes the system hang intermittently during boot, while
      booting with 'nokaslr' does not.
      
      The back trace is:
      
       Oops: 0000 [#1] SMP
      
       RIP: memcpy_erms()
       [ .... ]
       Call Trace:
        pmem_rw_page()
        bdev_read_page()
        do_mpage_readpage()
        mpage_readpages()
        blkdev_readpages()
        __do_page_cache_readahead()
        force_page_cache_readahead()
        page_cache_sync_readahead()
        generic_file_read_iter()
        blkdev_read_iter()
        __vfs_read()
        vfs_read()
        SyS_read()
        entry_SYSCALL_64_fastpath()
      
      This crash happens because the for-loop count calculation in
      sync_global_pgds() is not correct. When a mapping area crosses PGD
      entries, we should calculate the starting address of the region that
      the next PGD covers and continue the loop from there, rather than
      simply adding PGDIR_SIZE. The old code works correctly only if the
      mapping area is an exact multiple of PGDIR_SIZE; otherwise the end
      region can be skipped, so it never gets synchronized from the kernel
      PGD init_mm.pgd to all other processes.
      
      In Jeff's system, the emulated pmem area [1024G, 1216G) is smaller than
      PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is 1T-aligned, which
      makes this area map inside a single PGD entry. With KASLR enabled, the
      area can cross two PGD entries, and then the second PGD entry is never
      synced to the other processes. That is why we saw an empty PGD.
      
      Fix it.
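      
      A sketch of the loop change (arch/x86/mm/init_64.c; sync_one_pgd()
      below is a hypothetical stand-in for the per-entry sync body):
      
        /* before: misses the last PGD entry when 'start' is not
         * PGDIR_SIZE-aligned and the area straddles two entries */
        for (addr = start; addr <= end; addr += PGDIR_SIZE)
                sync_one_pgd(addr);
      
        /* after: step to the start of the range covered by the *next*
         * PGD entry, whatever the alignment of 'start' */
        for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE))
                sync_one_pgd(addr);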
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jinbum Park <jinb.park7@gmail.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic() · 42fc6c6c
      Authored by Josh Poimboeuf
      Andrey Konovalov reported the following warning while fuzzing the kernel
      with syzkaller:
      
        WARNING: kernel stack regs at ffff8800686869f8 in a.out:4933 has bad 'bp' value c3fc855a10167ec0
      
      The unwinder dump revealed that RBP had a bad value when an interrupt
      occurred in csum_partial_copy_generic().
      
      That function saves RBP on the stack and then overwrites it, using it as
      a scratch register.  That's problematic because it breaks stack traces
      if an interrupt occurs in the middle of the function.
      
      Replace the usage of RBP with another callee-saved register (R15) so
      stack traces are no longer affected.
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Tested-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: linux-sctp@vger.kernel.org
      Cc: netdev <netdev@vger.kernel.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Link: http://lkml.kernel.org/r/4b03a961efda5ec9bfe46b7b9c9ad72d1efad343.1493909486.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 02 May 2017 (1 commit)
  3. 01 May 2017 (1 commit)
  4. 30 Apr 2017 (1 commit)
  5. 28 Apr 2017 (1 commit)
    • x86/KASLR: Fix kexec kernel boot crash when KASLR randomization fails · da63b6b2
      Authored by Baoquan He
      Dave found that a kdump kernel with KASLR enabled will reset to the BIOS
      immediately if physical randomization failed to find a new position for
      the kernel. A kernel with the 'nokaslr' option works in this case.
      
      The reason is that KASLR installs a new page table for the identity
      mapping, but misses building a mapping for the original kernel
      location when physical randomization fails.
      
      This only happens in the kexec/kdump kernel, because for kexec/kdump
      the identity mapping has already been built for the whole of memory in
      the 1st kernel by calling init_pgtable(). Here, if physical
      randomization fails, the code doesn't build the identity mapping for
      the original kernel area, but still switches to the new page table
      '_pgtable'. The kernel then triple faults immediately because the
      identity mappings are missing.
      
      The normal kernel doesn't see this bug, because it comes in via
      startup_32(), where CR3 has already been set to _pgtable and the
      identity mapping has been built for the 0~4G area. For on-demand
      identity-mapping building, KASLR just appends to the existing area
      instead of entirely overwriting it, so the identity mapping for the
      original kernel area is still there.
      
      To fix it, switch to the new identity-mapping page table only when
      physical KASLR succeeds. Otherwise keep the old page table unchanged,
      just like "nokaslr" does.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1493278940-5885-1-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 27 Apr 2017 (3 commits)
  7. 26 Apr 2017 (7 commits)
    • x86/mm: Fix flush_tlb_page() on Xen · dbd68d8e
      Authored by Andy Lutomirski
      flush_tlb_page() passes a bogus range to flush_tlb_others() and
      expects the latter to fix it up.  native_flush_tlb_others() has the
      fixup but Xen's version doesn't.  Move the fixup to
      flush_tlb_others().
      
      AFAICS the only real effect is that, without this fix, Xen would
      flush everything instead of just the one page on remote vCPUs
      when flush_tlb_page() was called.
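      
      Conceptually, the common wrapper now does the normalization for
      every backend (a sketch of the idea, not the exact diff; the
      dispatch helper name is made up):
      
        static void flush_tlb_others(const struct cpumask *cpumask,
                                     struct mm_struct *mm,
                                     unsigned long start, unsigned long end)
        {
                /* Fix up the bogus one-page encoding here, so both the
                 * native and the Xen backend see a sane (start, end). */
                if (end == 0)
                        end = start + PAGE_SIZE;
      
                dispatch_flush_tlb_others(cpumask, mm, start, end); /* hypothetical */
        }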
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: e7b52ffd ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
      Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Make flush_tlb_mm_range() more predictable · ce27374f
      Authored by Andy Lutomirski
      I'm about to rewrite the function almost completely, but first I
      want to get a functional change out of the way.  Currently, if
      flush_tlb_mm_range() does not flush the local TLB at all, it will
      never do individual page flushes on remote CPUs.  This seems to be
      an accident, and preserving it will be awkward.  Let's change it
      first so that any regressions in the rewrite will be easier to
      bisect and so that the rewrite can attempt to change no visible
      behavior at all.
      
      The fix is simple: avoid short-circuiting the calculation of
      base_pages_to_flush.
      
      As a side effect, this also eliminates a potential corner case: if
      tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
      could have ended up flushing the entire address space one page at a
      time.
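      
      A sketch of the reordering in flush_tlb_mm_range()
      (arch/x86/mm/tlb.c, simplified from memory):
      
        preempt_disable();
      
        /* Size the flush up front, before any early exit, so remote
         * CPUs can still get individual page flushes even when the
         * local TLB is not flushed at all. */
        if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
                base_pages_to_flush = (end - start) >> PAGE_SHIFT;
        if (base_pages_to_flush > tlb_single_page_flush_ceiling)
                base_pages_to_flush = TLB_FLUSH_ALL;
      
        if (current->active_mm != mm) {
                /* Synchronize with switch_mm. */
                smp_mb();
                goto out;	/* skips only the local flush now */
        }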
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Dave Hansen <dave.hansen@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Remove flush_tlb() and flush_tlb_current_task() · 29961b59
      Authored by Andy Lutomirski
      I was trying to figure out how flush_tlb_current_task() would
      possibly work correctly if current->mm != current->active_mm, but I
      realized I could spare myself the effort: it has no callers except
      the unused flush_tlb() macro.
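      
      For reference, the macro being removed along with it was just a
      thin alias in arch/x86/include/asm/tlbflush.h:
      
        #define flush_tlb() flush_tlb_current_task()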
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() · 9ccee237
      Authored by Andy Lutomirski
      mark_screen_rdonly() is the last remaining caller of flush_tlb().
      flush_tlb_mm_range() is potentially faster and isn't obsolete.
      
      Compile-tested only because I don't know whether software that uses
      this mechanism even exists.
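      
      The replacement call looks roughly like this (the VGA window is
      32 pages at 0xA0000; the exact arguments are from memory):
      
        /* was: flush_tlb(); */
        flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);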
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm/64: Fix crash in remove_pagetable() · e6ab9c4d
      Authored by Kirill A. Shutemov
      remove_pagetable() does its page walk using p*d_page_vaddr() plus casts.
      That's not the canonical approach -- we usually use p*d_offset() for that.
      
      It works fine as long as all page table levels are present. We broke
      that invariant by introducing the folded p4d page table level.
      
      As a result, remove_pagetable() interprets a PMD as a PUD, and that
      leads to a crash:
      
      	BUG: unable to handle kernel paging request at ffff880300000000
      	IP: memchr_inv+0x60/0x110
      	PGD 317d067
      	P4D 317d067
      	PUD 3180067
      	PMD 33f102067
      	PTE 8000000300000060
      
      Let's fix this by using p*d_offset() instead of p*d_page_vaddr() for
      the page walk.
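      
      For example, at the PUD-to-PMD step the walk changes roughly
      like this:
      
        /* before: cast-based walk, wrong once p4d is folded */
        pmd_base = (pmd_t *)pud_page_vaddr(*pud);
      
        /* after: the canonical helper, which handles folded levels */
        pmd_base = pmd_offset(pud, 0);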
      Reported-by: Dan Williams <dan.j.williams@intel.com>
      Tested-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Fixes: f2a6a705 ("x86: Convert the rest of the code to support p4d_t")
      Link: http://lkml.kernel.org/r/20170425092557.21852-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Dump all stacks in unwind_dump() · 262fa734
      Authored by Josh Poimboeuf
      Currently unwind_dump() dumps only the most recently accessed stack,
      but that approach has a few issues.
      
      In some cases, 'first_sp' can get out of sync with 'stack_info', causing
      unwind_dump() to start from the wrong address, flood the printk buffer,
      and eventually read a bad address.
      
      In other cases, dumping only the most recently accessed stack doesn't
      give enough data to diagnose the error.
      
      Fix both issues by dumping *all* stacks involved in the trace, not just
      the last one.
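      
      The dump loop in unwind_dump() becomes, roughly (simplified from
      memory; dump_word() is a hypothetical stand-in for the printk):
      
        struct stack_info stack_info = {0};
        unsigned long visit_mask = 0;
        unsigned long *sp;
      
        /* Hop from stack to stack via stack_info.next_sp until
         * get_stack_info() runs out of known stacks. */
        for (sp = state->orig_sp; sp;
             sp = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
                if (get_stack_info(sp, state->task, &stack_info, &visit_mask))
                        break;
      
                for (; sp < stack_info.end; sp++)
                        dump_word(sp);
        }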
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 8b5e99f0 ("x86/unwind: Dump stack data on warnings")
      Link: http://lkml.kernel.org/r/016d6a9810d7d1bfc87ef8c0e6ee041c6744c909.1493171120.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Silence more entry-code related warnings · b0d50c7b
      Authored by Josh Poimboeuf
      Borislav Petkov reported the following unwinder warning:
      
        WARNING: kernel stack regs at ffffc9000024fea8 in udevadm:92 has bad 'bp' value 00007fffc4614d30
        unwind stack type:0 next_sp:          (null) mask:0x6 graph_idx:0
        ffffc9000024fea8: 000055a6100e9b38 (0x55a6100e9b38)
        ffffc9000024feb0: 000055a6100e9b35 (0x55a6100e9b35)
        ffffc9000024feb8: 000055a6100e9f68 (0x55a6100e9f68)
        ffffc9000024fec0: 000055a6100e9f50 (0x55a6100e9f50)
        ffffc9000024fec8: 00007fffc4614d30 (0x7fffc4614d30)
        ffffc9000024fed0: 000055a6100eaf50 (0x55a6100eaf50)
        ffffc9000024fed8: 0000000000000000 ...
        ffffc9000024fee0: 0000000000000100 (0x100)
        ffffc9000024fee8: ffff8801187df488 (0xffff8801187df488)
        ffffc9000024fef0: 00007ffffffff000 (0x7ffffffff000)
        ffffc9000024fef8: 0000000000000000 ...
        ffffc9000024ff10: ffffc9000024fe98 (0xffffc9000024fe98)
        ffffc9000024ff18: 00007fffc4614d00 (0x7fffc4614d00)
        ffffc9000024ff20: ffffffffffffff10 (0xffffffffffffff10)
        ffffc9000024ff28: ffffffff811c6c1f (SyS_newlstat+0xf/0x10)
        ffffc9000024ff30: 0000000000000010 (0x10)
        ffffc9000024ff38: 0000000000000296 (0x296)
        ffffc9000024ff40: ffffc9000024ff50 (0xffffc9000024ff50)
        ffffc9000024ff48: 0000000000000018 (0x18)
        ffffc9000024ff50: ffffffff816b2e6a (entry_SYSCALL_64_fastpath+0x18/0xa8)
        ...
      
      The unwind started from an interrupt which came in right after entry
      code called into a C syscall handler, before it had a chance to set up
      the frame pointer, so regs->bp still held its user-space value.
      
      Add a check to silence warnings in such a case, where an interrupt
      has occurred and regs->sp is almost at the end of the stack.
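      
      Illustratively, the check amounts to something like this (the
      helper name and the exact window size are hypothetical):
      
        /* If the interrupted regs->sp sits within a few words of the
         * pt_regs at the top of the task stack, the interrupt hit
         * entry code before the frame was set up, so a user-space bp
         * value is expected and not worth a warning. */
        static bool interrupted_entry_code(struct task_struct *task,
                                           struct pt_regs *regs)
        {
                unsigned long top = (unsigned long)task_pt_regs(task);
      
                return regs->sp <= top && regs->sp >= top - 3*sizeof(long);
        }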
      Reported-by: Borislav Petkov <bp@suse.de>
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: c32c47c6 ("x86/unwind: Warn on bad frame pointer")
      Link: http://lkml.kernel.org/r/c695f0d0d4c2cfe6542b90e2d0520e11eb901eb5.1493171120.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 25 Apr 2017 (1 commit)
  9. 24 Apr 2017 (2 commits)
  10. 23 Apr 2017 (1 commit)
    • Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation" · 6dd29b3d
      Authored by Ingo Molnar
      This reverts commit 2947ba05.
      
      Dan Williams reported dax-pmem kernel warnings with the following signature:
      
         WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
         percpu ref (dax_pmem_percpu_release [dax_pmem]) <= 0 (0) after switching to atomic
      
      ... and bisected it to this commit, which suggests possible memory corruption
      caused by the x86 fast-GUP conversion.
      
      He also pointed out:
      
       "
        This is similar to the backtrace when we were not properly handling
        pud faults and was fixed with this commit: 220ced16 "mm: fix
        get_user_pages() vs device-dax pud mappings"
      
        I've found some missing _devmap checks in the generic
        get_user_pages_fast() path, but this does not fix the regression
        [...]
       "
      
      So given that there are known bugs, and that a pretty robust-looking
      bisection points to this commit, suggesting there are unknown bugs in
      the conversion as well, revert it for the time being - we'll re-try in
      v4.13.
      Reported-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: aneesh.kumar@linux.vnet.ibm.com
      Cc: dann.frazier@canonical.com
      Cc: dave.hansen@intel.com
      Cc: steve.capper@linaro.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  11. 21 Apr 2017 (2 commits)
    • x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder · dc912c30
      Authored by Steven Rostedt (VMware)
      Fengguang Wu's zero day bot triggered a stack unwinder dump. This can
      be easily triggered when CONFIG_FRAME_POINTERS is enabled and -mfentry
      is in use on x86_32.
      
       ># cd /sys/kernel/debug/tracing
       ># echo 'p:schedule schedule' > kprobe_events
       ># echo stacktrace > events/kprobes/schedule/trigger
      
      This is because the code that implemented fentry in ftrace_regs_caller
      tried to use the least amount of #ifdefs, and modified ebp when
      CC_USING_FENTRY was defined to point to the parent ip, as it does when
      CC_USING_FENTRY is not defined. But when CONFIG_FRAME_POINTERS is set,
      this corrupts the ebp register for the frame while the tracing runs.
      
      NOTE, it does not corrupt ebp in any other way. It is just a bad frame
      pointer when calling into the tracing infrastructure. The original ebp is
      restored before returning from the fentry call. But if a stack trace is
      performed inside the tracing, the unwinder will notice the bad ebp.
      
      Instead of toying with ebp with CC_USING_FENTRY, just slap the parent ip
      into the second parameter (%edx), and have an #else that does it the
      original way.
      
      The unwinder will unfortunately miss the function being traced, as the
      stack frame is not set up yet for it, as it is for x86_64. But fixing that
      is a bit more complex and did not work before anyway.
      
      This has been tested with and without FRAME_POINTERS being set while using
      -mfentry, as well as using an older compiler that uses mcount.
      Analyzed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Fixes: 644e0e8d ("x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set")
      Reported-by: kernel test robot <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: https://lists.01.org/pipermail/lkp/2017-April/006165.html
      Link: http://lkml.kernel.org/r/20170420172236.7af7f6e5@gandalf.local.home
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • ARCv2: entry: save Accumulator register pair (r58:59) if present · 3d5e8012
      Authored by Vineet Gupta
      The accumulator is present in configs with an FPU and/or DSP MPY (mpy > 6).
      
      Instead of doing this in pt_regs (and thus on every kernel entry/exit),
      it could have been done at context switch (and for user tasks only), as
      the kernel currently doesn't clobber these registers of its own accord.
      However, we will soon start using 64-bit multiply instructions in the
      kernel, which can clobber them. The gcc folks also plan to start using
      them as GPRs, so it is better to always save/restore them.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  12. 20 Apr 2017 (8 commits)
  13. 19 Apr 2017 (9 commits)