1. 17 December 2017 (12 commits)
    • x86/entry/64: Allocate and enable the SYSENTER stack · 1a79797b
      Committed by Andy Lutomirski
      This will simplify future changes that want scratch variables early in
      the SYSENTER handler -- they'll be able to spill registers to the
      stack.  It also lets us get rid of a SWAPGS_UNSAFE_STACK user.
      
      This does not depend on CONFIG_IA32_EMULATION=y because we'll want the
      stack space even without IA32 emulation.
      
      As far as I can tell, the reason that this wasn't done from day 1 is
      that we use IST for #DB and #BP, which is IMO rather nasty and causes
      a lot more problems than it solves.  But, since #DB uses IST, we don't
      actually need a real stack for SYSENTER (because SYSENTER with TF set
      will invoke #DB on the IST stack rather than the SYSENTER stack).
      
      I want to remove IST usage from these vectors some day, and this patch
      is a prerequisite for that as well.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1a79797b
    • x86/irq/64: Print the offending IP in the stack overflow warning · 4f3789e7
      Committed by Andy Lutomirski
      In case something goes wrong with unwind (not unlikely in case of
      overflow), print the offending IP where we detected the overflow.
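
      Illustratively, the warning ends up carrying the instruction pointer alongside
      the stack information, roughly like this (message format and fields here are an
      assumption, not the literal patch):

        /* Sketch only: report regs->ip together with the stack/sp info so the
         * culprit is visible even when the unwind itself fails. */
        WARN_ONCE(1, "do_IRQ(): %s has overflown the kernel stack (sp:%lx, ip:%pS)\n",
                  current->comm, regs->sp, (void *)regs->ip);
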
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.231677119@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4f3789e7
    • x86/irq: Remove an old outdated comment about context tracking races · 6669a692
      Committed by Andy Lutomirski
      That race has been fixed and code cleaned up for a while now.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.150551639@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6669a692
    • x86/unwinder: Handle stack overflows more gracefully · b02fcf9b
      Committed by Josh Poimboeuf
      There are at least two unwinder bugs hindering the debugging of
      stack-overflow crashes:
      
      - It doesn't deal gracefully with the case where the stack overflows and
        the stack pointer itself isn't on a valid stack but the
        to-be-dereferenced data *is*.
      
      - The ORC oops dump code doesn't know how to print partial pt_regs, for the
        case where we get an interrupt/exception in *early* entry code
        before the full pt_regs have been saved.
      
      Fix both issues.
      
      http://lkml.kernel.org/r/20171126024031.uxi4numpbjm5rlbr@treble
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bpetkov@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.071425003@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b02fcf9b
    • x86/unwinder/orc: Dont bail on stack overflow · d3a09104
      Committed by Andy Lutomirski
      If the stack overflows into a guard page, the ORC unwinder should work
      well: by construction, there can't be any meaningful data in the guard page
      because no writes to the guard page will have succeeded.
      
      But there is a bug that prevents unwinding from working correctly: if the
      starting register state has RSP pointing into a stack guard page, the ORC
      unwinder bails out immediately.
      
      Instead of bailing out immediately, check whether the next page up is a
      valid stack page and, if so, analyze that. As a result the ORC unwinder
      will start the unwind.
      
      Tested by intentionally overflowing the task stack.  The result is an
      accurate call trace instead of a trace consisting purely of '?' entries.
      
      There are a few other bugs that are triggered if the unwinder encounters a
      stack overflow after the first step, but they are outside the scope of this
      fix.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150604.991389777@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d3a09104
    • x86/entry/64/paravirt: Use paravirt-safe macro to access eflags · e17f8234
      Committed by Boris Ostrovsky
      Commit 1d3e53e8 ("x86/entry/64: Refactor IRQ stacks and make them
      NMI-safe") added the DEBUG_ENTRY_ASSERT_IRQS_OFF macro, which accesses
      eflags using the 'pushfq' instruction when testing for the IF bit. On PV
      Xen guests, looking at the IF flag directly will always see it set,
      resulting in 'ud2'.
      
      Introduce a SAVE_FLAGS() macro that will use the appropriate save_fl pv op
      when running paravirtualized.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e17f8234
    • x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow · 2aeb0736
      Committed by Andrey Ryabinin
      [ Note, this is a Git cherry-pick of the following commit:
      
          d17a1d97 ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      The KASAN shadow is currently mapped using vmemmap_populate() since that
      provides a semi-convenient way to map pages into init_top_pgt.  However,
      since that no longer zeroes the mapped pages, it is not suitable for
      KASAN, which requires zeroed shadow memory.
      
      Add kasan_populate_shadow() interface and use it instead of
      vmemmap_populate().  Besides, this allows us to take advantage of
      gigantic pages and use them to populate the shadow, which should save us
      some memory wasted on page tables and reduce TLB pressure.
      
      Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Bob Picco <bob.picco@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2aeb0736
    • locking/barriers: Convert users of lockless_dereference() to READ_ONCE() · 3382290e
      Committed by Will Deacon
      [ Note, this is a Git cherry-pick of the following commit:
      
          506458ef ("locking/barriers: Convert users of lockless_dereference() to READ_ONCE()")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it
      can be used instead of lockless_dereference() without any change in
      semantics.
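
      For illustration, the conversions are of this shape ('gp' and 'p' are made-up
      names, not taken from the patch):

        struct foo *p;

        p = lockless_dereference(gp);   /* before */

        p = READ_ONCE(gp);              /* after: same ordering guarantees, since
                                         * READ_ONCE() now implies
                                         * smp_read_barrier_depends() */
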
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3382290e
    • bpf: fix build issues on um due to mising bpf_perf_event.h · ab95477e
      Committed by Daniel Borkmann
      [ Note, this is a Git cherry-pick of the following commit:
      
          a23f06f0 ("bpf: fix build issues on um due to mising bpf_perf_event.h")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      Since c895f6f7 ("bpf: correct broken uapi for
      BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build
      on i386 or x86_64:
      
        [...]
          CC      init/main.o
        In file included from ../include/linux/perf_event.h:18:0,
                         from ../include/linux/trace_events.h:10,
                         from ../include/trace/syscall.h:7,
                         from ../include/linux/syscalls.h:82,
                         from ../init/main.c:20:
        ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error:
        asm/bpf_perf_event.h: No such file or directory #include
        <asm/bpf_perf_event.h>
        [...]
      
      Let's add the missing bpf_perf_event.h to the um arch as well. This seems
      to be the only one still missing.
      
      Fixes: c895f6f7 ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type")
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Suggested-by: Richard Weinberger <richard@sigma-star.at>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Richard Weinberger <richard@sigma-star.at>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ab95477e
    • perf/x86: Enable free running PEBS for REGS_USER/INTR · 2fe1bc1f
      Committed by Andi Kleen
      [ Note, this is a Git cherry-pick of the following commit:
      
          a47ba4d7 ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      Currently free running PEBS is disabled when user or interrupt
      registers are requested. Most of the registers are actually
      available in the PEBS record and can be supported.
      
      So we just need to check for the supported registers and then
      allow it: it is all except for the segment register.
      
      For user registers this only works when the counter is limited
      to ring 3 only, so this also needs to be checked.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170831214630.21892-1-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2fe1bc1f
    • x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD · f2dbad36
      Committed by Rudolf Marek
      [ Note, this is a Git cherry-pick of the following commit:
      
          2b67799bdf25 ("x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      The latest AMD AMD64 Architecture Programmer's Manual
      adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]).
      
      If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES
      / FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers,
      thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs.
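
      A minimal sketch of how such a CPUID bit is typically consumed (the
      feature-flag name below is an assumption, not quoted from the patch):

        /* e.g. in the AMD CPU init path: only keep the workaround on parts that
         * do not advertise CPUID Fn8000_0008 EBX[2] (XSaveErPtr). */
        if (!cpu_has(c, X86_FEATURE_XSAVEERPTR))
                set_cpu_bug(c, X86_BUG_FXSAVE_LEAK);
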
      Signed-off-by: Rudolf Marek <r.marek@assembler.cz>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Tested-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.cz
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f2dbad36
    • x86/cpufeature: Add User-Mode Instruction Prevention definitions · a8b4db56
      Committed by Ricardo Neri
      [ Note, this is a Git cherry-pick of the following commit: (limited to the cpufeatures.h file)
      
          3522c2a6 ("x86/cpufeature: Add User-Mode Instruction Prevention definitions")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      User-Mode Instruction Prevention is a security feature present in new
      Intel processors that, when set, prevents the execution of a subset of
      instructions if such instructions are executed in user mode (CPL > 0).
      Attempting to execute such instructions causes a general protection
      exception.
      
      The subset of instructions comprises:
      
       * SGDT - Store Global Descriptor Table
       * SIDT - Store Interrupt Descriptor Table
       * SLDT - Store Local Descriptor Table
       * SMSW - Store Machine Status Word
       * STR  - Store Task Register
      
      This feature is also added to the list of disabled-features to allow
      a cleaner handling of build-time configuration.
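
      A sketch of the kind of definitions this adds; the word/bit encoding and the
      config symbol below are assumptions based on CPUID leaf 7 ECX bit 2, not
      copied from the patch:

        /* arch/x86/include/asm/cpufeatures.h (assumed encoding) */
        #define X86_FEATURE_UMIP        (16*32 + 2) /* User Mode Instruction Protection */

        /* arch/x86/include/asm/disabled-features.h (assumed config symbol) */
        #ifdef CONFIG_X86_INTEL_UMIP
        # define DISABLE_UMIP           0
        #else
        # define DISABLE_UMIP           (1 << (X86_FEATURE_UMIP & 31))
        #endif
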
      Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chen Yucong <slaoub@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: ricardo.neri@intel.com
      Link: http://lkml.kernel.org/r/1509935277-22138-7-git-send-email-ricardo.neri-calderon@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a8b4db56
  2. 11 November 2017 (1 commit)
  3. 10 November 2017 (4 commits)
  4. 09 November 2017 (2 commits)
    • x86/mm: Unbreak modules that rely on external PAGE_KERNEL availability · 87df2617
      Committed by Jiri Kosina
      Commit 7744ccdb ("x86/mm: Add Secure Memory Encryption (SME)
      support") as a side-effect made PAGE_KERNEL all of a sudden unavailable
      to modules which can't make use of EXPORT_SYMBOL_GPL() symbols.
      
      This is because once SME is enabled, sme_me_mask (which is introduced as
      EXPORT_SYMBOL_GPL) makes its way to PAGE_KERNEL through _PAGE_ENC,
      causing imminent build failure for all the modules which make use of all
      the EXPORT_SYMBOL()-exported API (such as vmap(), __vmalloc(),
      remap_pfn_range(), ...).
      
      Exporting (as EXPORT_SYMBOL()) interfaces (and having done so for ages)
      that take a pgprot_t argument, while making it impossible to -- all of a
      sudden -- pass PAGE_KERNEL to them, feels rather inconsistent.
      
      Restore the original behavior and make it possible to pass PAGE_KERNEL
      to all its EXPORT_SYMBOL() consumers.
      
      [ This is all so not wonderful. We shouldn't need that "sme_me_mask"
        access at all in all those places that really don't care about that
        level of detail, and just want _PAGE_KERNEL or whatever.
      
        We have some similar issues with _PAGE_CACHE_WP and _PAGE_NOCACHE,
        both of which hide a "cachemode2protval()" call, and which also ends
        up using another EXPORT_SYMBOL(), but at least that only triggers for
        the much more rare cases.
      
        Maybe we could move these dynamic page table bits to be generated much
        deeper down in the VM layer, instead of hiding them in the macros that
        everybody uses.
      
        So this all would merit some cleanup. But not today.   - Linus ]
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Despised-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      87df2617
    • x86/idt: Remove X86_TRAP_BP initialization in idt_setup_traps() · d0cd64b0
      Committed by Yonghong Song
      Commit b70543a0 ("x86/idt: Move regular trap init to tables") moves
      regular trap init for each trap vector into a table based
      initialization. It introduced the initialization for vector X86_TRAP_BP
      which was not in the code which it replaced. This breaks uprobe
      functionality for x86_32; the probed program segfaults instead of handling
      the probe proper.
      
      The reason for this is that TRAP_BP is set up as system interrupt gate
      (DPL3) in the early IDT and then replaced by a regular interrupt gate
      (DPL0) in idt_setup_traps(). The DPL0 restriction causes the int3 trap
      to fail with a #GP resulting in a SIGSEGV of the probed program.
      
      On 64bit this does not cause a problem because the IDT entry is replaced
      with a system interrupt gate (DPL3) with interrupt stack afterwards.
      
      Remove X86_TRAP_BP from the def_idts table which is used in
      idt_setup_traps(). Remove a redundant entry for X86_TRAP_NMI in def_idts
      while at it. Tested on both x86_64 and x86_32.
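
      For illustration, the trap-init table looks roughly like this after the change
      (entries abridged and partly assumed):

        static const __initconst struct idt_data def_idts[] = {
                INTG(X86_TRAP_DE,       divide_error),
                INTG(X86_TRAP_NMI,      nmi),
                /* X86_TRAP_BP intentionally absent: int3 keeps the DPL3 system
                 * gate installed by the early IDT setup. */
                SYSG(X86_TRAP_OF,       overflow),
                /* ... remaining entries unchanged ... */
        };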
      
      [ tglx: Amended changelog with a description of the root cause ]
      
      Fixes: b70543a0 ("x86/idt: Move regular trap init to tables")
      Reported-and-tested-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: a.p.zijlstra@chello.nl
      Cc: ast@fb.com
      Cc: oleg@redhat.com
      Cc: luto@kernel.org
      Cc: kernel-team@fb.com
      Link: https://lkml.kernel.org/r/20171108192845.552709-1-yhs@fb.com
      d0cd64b0
  5. 08 November 2017 (6 commits)
    • MIPS: AR7: Ensure that serial ports are properly set up · b084116f
      Committed by Oswald Buddenhagen
      Without UPF_FIXED_TYPE, the data from the PORT_AR7 uart_config entry is
      never copied, resulting in a dead port.
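
      A sketch of the shape of the fix (names and field values here are illustrative
      assumptions):

        static struct uart_port ar7_uart_port = {
                .type   = PORT_AR7,
                /* UPF_FIXED_TYPE makes the serial core trust .type and copy the
                 * matching uart_config[] data instead of probing the port. */
                .flags  = UPF_BOOT_AUTOCONF | UPF_IOREMAP | UPF_FIXED_TYPE,
        };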
      
      Fixes: 154615d5 ("MIPS: AR7: Use correct UART port type")
      Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
      [jonas.gorski: add Fixes tag]
      Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
      Cc: Nicolas Schichan <nschichan@freebox.fr>
      Cc: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
      Cc: linux-mips@linux-mips.org
      Cc: linux-serial@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Patchwork: https://patchwork.linux-mips.org/patch/17543/
      Signed-off-by: James Hogan <jhogan@kernel.org>
      b084116f
    • MIPS: AR7: Defer registration of GPIO · e6b03ab6
      Committed by Jonas Gorski
      When called from prom init code, ar7_gpio_init() will fail because it
      calls gpiochip_add(), which relies on a working kmalloc() to allocate
      the gpio_desc array, and kmalloc is not usable yet at prom init time.
      
      Move ar7_gpio_init() to ar7_register_devices() (a device_initcall)
      where kmalloc works.
      
      Fixes: 14e85c0e ("gpio: remove gpio_descs global array")
      Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
      Cc: Nicolas Schichan <nschichan@freebox.fr>
      Cc: linux-mips@linux-mips.org
      Cc: linux-serial@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.19+
      Patchwork: https://patchwork.linux-mips.org/patch/17542/
      Signed-off-by: James Hogan <jhogan@kernel.org>
      e6b03ab6
    • x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context · a743bbee
      Committed by Borislav Petkov
      The warning below says it all:
      
        BUG: using __this_cpu_read() in preemptible [00000000] code: swapper/0/1
        caller is __this_cpu_preempt_check
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc8 #4
        Call Trace:
         dump_stack
         check_preemption_disabled
         ? do_early_param
         __this_cpu_preempt_check
         arch_perfmon_init
         op_nmi_init
         ? alloc_pci_root_info
         oprofile_arch_init
         oprofile_init
         do_one_initcall
         ...
      
      These accessors should not have been used in the first place: it is a
      PPro, so there are no mixed silicon revisions and the code can simply use
      boot_cpu_data.
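
      In other words, the change is of this shape (a sketch; the exact fields
      touched are assumed):

        u8 model;

        model = __this_cpu_read(cpu_info.x86_model);    /* before: preemptible context */
        model = boot_cpu_data.x86_model;                /* after: boot CPU data suffices */
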
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Fix-creation-mandated-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Robert Richter <rric@kernel.org>
      Cc: x86@kernel.org
      Cc: stable@vger.kernel.org
      a743bbee
    • x86/unwind: Disable KASAN checking in the ORC unwinder · 881125bf
      Committed by Josh Poimboeuf
      Fengguang reported a KASAN warning:
      
        Kprobe smoke test: started
        ==================================================================
        BUG: KASAN: stack-out-of-bounds in deref_stack_reg+0xb5/0x11a
        Read of size 8 at addr ffff8800001c7cd8 by task swapper/1
      
        CPU: 0 PID: 1 Comm: swapper Not tainted 4.14.0-rc8 #26
        Call Trace:
         <#DB>
         ...
         save_trace+0xd9/0x1d3
         mark_lock+0x5f7/0xdc3
         __lock_acquire+0x6b4/0x38ef
         lock_acquire+0x1a1/0x2aa
         _raw_spin_lock_irqsave+0x46/0x55
         kretprobe_table_lock+0x1a/0x42
         pre_handler_kretprobe+0x3f5/0x521
         kprobe_int3_handler+0x19c/0x25f
         do_int3+0x61/0x142
         int3+0x30/0x60
        [...]
      
      The ORC unwinder got confused by some kprobes changes, which isn't
      surprising since the runtime code no longer matches vmlinux and the
      stack was modified for kretprobes.
      
      Until we have a way for generated code to register changes with the
      unwinder, these types of warnings are inevitable.  So just disable KASAN
      checks for stack accesses in the ORC unwinder.
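
      A plausible shape of the fix, assuming the unwinder's stack reads go through
      the _NOCHECK accessor (not verified against the actual diff):

        static bool deref_stack_reg(struct unwind_state *state, unsigned long addr,
                                    unsigned long *val)
        {
                if (!stack_access_ok(state, addr, sizeof(long)))
                        return false;

                /* READ_ONCE_NOCHECK() bypasses KASAN instrumentation for this read */
                *val = READ_ONCE_NOCHECK(*(unsigned long *)addr);
                return true;
        }
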
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20171108021934.zbl6unh5hpugybc5@treble
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      881125bf
    • KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates · 38c53af8
      Committed by Paul Mackerras
      Commit 5e985969 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing
      implementation", 2016-12-20) added code that tries to exclude any use
      or update of the hashed page table (HPT) while the HPT resizing code
      is iterating through all the entries in the HPT.  It does this by
      taking the kvm->lock mutex, clearing the kvm->arch.hpte_setup_done
      flag and then sending an IPI to all CPUs in the host.  The idea is
      that any VCPU task that tries to enter the guest will see that the
      hpte_setup_done flag is clear and therefore call kvmppc_hv_setup_htab_rma,
      which also takes the kvm->lock mutex and will therefore block until
      we release kvm->lock.
      
      However, any VCPU that is already in the guest, or is handling a
      hypervisor page fault or hypercall, can re-enter the guest without
      rechecking the hpte_setup_done flag.  The IPI will cause a guest exit
      of any VCPUs that are currently in the guest, but does not prevent
      those VCPU tasks from immediately re-entering the guest.
      
      The result is that after resize_hpt_rehash_hpte() has made a HPTE
      absent, a hypervisor page fault can occur and make that HPTE present
      again.  This includes updating the rmap array for the guest real page,
      meaning that we now have a pointer in the rmap array which connects
      with pointers in the old rev array but not the new rev array.  In
      fact, if the HPT is being reduced in size, the pointer in the rmap
      array could point outside the bounds of the new rev array.  If that
      happens, we can get a host crash later on such as this one:
      
      [91652.628516] Unable to handle kernel paging request for data at address 0xd0000000157fb10c
      [91652.628668] Faulting instruction address: 0xc0000000000e2640
      [91652.628736] Oops: Kernel access of bad area, sig: 11 [#1]
      [91652.628789] LE SMP NR_CPUS=1024 NUMA PowerNV
      [91652.628847] Modules linked in: binfmt_misc vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas i2c_opal ipmi_powernv ipmi_devintf i2c_core ipmi_msghandler powernv_op_panel nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm_hv kvm_pr kvm scsi_dh_alua dm_service_time dm_multipath tg3 ptp pps_core [last unloaded: stap_552b612747aec2da355051e464fa72a1_14259]
      [91652.629566] CPU: 136 PID: 41315 Comm: CPU 21/KVM Tainted: G           O    4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le #1
      [91652.629684] task: c0000007a419e400 task.stack: c0000000028d8000
      [91652.629750] NIP:  c0000000000e2640 LR: d00000000c36e498 CTR: c0000000000e25f0
      [91652.629829] REGS: c0000000028db5d0 TRAP: 0300   Tainted: G           O     (4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le)
      [91652.629932] MSR:  900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022422  XER: 00000000
      [91652.630034] CFAR: d00000000c373f84 DAR: d0000000157fb10c DSISR: 40000000 SOFTE: 1
      [91652.630034] GPR00: d00000000c36e498 c0000000028db850 c000000001403900 c0000007b7960000
      [91652.630034] GPR04: d0000000117fb100 d000000007ab00d8 000000000033bb10 0000000000000000
      [91652.630034] GPR08: fffffffffffffe7f 801001810073bb10 d00000000e440000 d00000000c373f70
      [91652.630034] GPR12: c0000000000e25f0 c00000000fdb9400 f000000003b24680 0000000000000000
      [91652.630034] GPR16: 00000000000004fb 00007ff7081a0000 00000000000ec91a 000000000033bb10
      [91652.630034] GPR20: 0000000000010000 00000000001b1190 0000000000000001 0000000000010000
      [91652.630034] GPR24: c0000007b7ab8038 d0000000117fb100 0000000ec91a1190 c000001e6a000000
      [91652.630034] GPR28: 00000000033bb100 000000000073bb10 c0000007b7960000 d0000000157fb100
      [91652.630735] NIP [c0000000000e2640] kvmppc_add_revmap_chain+0x50/0x120
      [91652.630806] LR [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
      [91652.630884] Call Trace:
      [91652.630913] [c0000000028db850] [c0000000028db8b0] 0xc0000000028db8b0 (unreliable)
      [91652.630996] [c0000000028db8b0] [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
      [91652.631091] [c0000000028db9e0] [d00000000c36a078] kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv]
      [91652.631179] [c0000000028dbb30] [d00000000c2248c4] kvmppc_vcpu_run+0x34/0x50 [kvm]
      [91652.631266] [c0000000028dbb50] [d00000000c220d54] kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
      [91652.631351] [c0000000028dbbd0] [d00000000c2139d8] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
      [91652.631433] [c0000000028dbd40] [c0000000003832e0] do_vfs_ioctl+0xd0/0x8c0
      [91652.631501] [c0000000028dbde0] [c000000000383ba4] SyS_ioctl+0xd4/0x130
      [91652.631569] [c0000000028dbe30] [c00000000000b8e0] system_call+0x58/0x6c
      [91652.631635] Instruction dump:
      [91652.631676] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffa1 2fa70000 793d0020 e9432110
      [91652.631814] 7bbf26e4 7c7e1b78 7feafa14 409e0094 <807f000c> 786326e4 7c6a1a14 93a40008
      [91652.631959] ---[ end trace ac85ba6db72e5b2e ]---
      
      To fix this, we tighten up the way that the hpte_setup_done flag is
      checked to ensure that it does provide the guarantee that the resizing
      code needs.  In kvmppc_run_core(), we check the hpte_setup_done flag
      after disabling interrupts and refuse to enter the guest if it is
      clear (for a HPT guest).  The code that checks hpte_setup_done and
      calls kvmppc_hv_setup_htab_rma() is moved from kvmppc_vcpu_run_hv()
      to a point inside the main loop in kvmppc_run_vcpu(), ensuring that
      we don't just spin endlessly calling kvmppc_run_core() while
      hpte_setup_done is clear, but instead have a chance to block on the
      kvm->lock mutex.
      
      Finally we also check hpte_setup_done inside the region in
      kvmppc_book3s_hv_page_fault() where the HPTE is locked and we are about
      to update the HPTE, and bail out if it is clear.  If another CPU is
      inside kvm_vm_ioctl_resize_hpt_commit() and has cleared hpte_setup_done,
      then we know that either we are looking at a HPTE
      that resize_hpt_rehash_hpte() has not yet processed, which is OK,
      or else we will see hpte_setup_done clear and refuse to update it,
      because of the full barrier formed by the unlock of the HPTE in
      resize_hpt_rehash_hpte() combined with the locking of the HPTE
      in kvmppc_book3s_hv_page_fault().
      
      Fixes: 5e985969 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation")
      Cc: stable@vger.kernel.org # v4.10+
      Reported-by: Satheesh Rajendran <satheera@in.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      38c53af8
    • MIPS: BMIPS: Fix missing cbr address · ea4b3afe
      Committed by Jaedon Shin
      Fix NULL pointer access in BMIPS3300 RAC flush.
      
      Fixes: 738a3f79 ("MIPS: BMIPS: Add early CPU initialization code")
      Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Kevin Cernekee <cernekee@gmail.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 4.7+
      Patchwork: https://patchwork.linux-mips.org/patch/16423/
      Signed-off-by: James Hogan <jhogan@kernel.org>
      ea4b3afe
  6. 07 November 2017 (4 commits)
  7. 06 November 2017 (2 commits)
  8. 05 November 2017 (1 commit)
  9. 04 November 2017 (3 commits)
    • Revert "x86/mm: Stop calling leave_mm() in idle code" · 67535736
      Committed by Andy Lutomirski
      This reverts commit 43858b4f.
      
      The reason I removed the leave_mm() calls in question is because the
      heuristic wasn't needed after that patch.  With the original version
      of my PCID series, we never flushed a "lazy cpu" (i.e. a CPU running a
      kernel thread) due to a flush on the loaded mm.
      
      Unfortunately, that caused architectural issues, so now I've
      reinstated these flushes on non-PCID systems in:
      
          commit b956575b ("x86/mm: Flush more aggressively in lazy TLB mode").
      
      That, in turn, gives us a power management and occasionally
      performance regression as compared to old kernels: a process that
      goes into a deep idle state on a given CPU and gets its mm flushed
      due to activity on a different CPU will wake the idle CPU.
      
      Reinstate the old ugly heuristic: a CPU that goes into ACPI C3 or an
      intel_idle state that is likely to cause a TLB flush gets its mm
      switched to init_mm before going idle.
      
      FWIW, this heuristic is lousy.  Whether we should change CR3 before
      idle isn't a good hint except insofar as the performance hit is a bit
      lower if the TLB is getting flushed by the idle code anyway.  What we
      really want to know is whether we anticipate being idle long enough
      that the mm is likely to be flushed before we wake up.  This is more a
      matter of the expected latency than the idle state that gets chosen.
      This heuristic also completely fails on systems that don't know
      whether the TLB will be flushed (e.g. AMD systems?).  OTOH it may be a
      bit obsolete anyway -- PCID systems don't presently benefit from this
      heuristic at all.
      
      We also shouldn't do this callback from the innermost bit of the idle
      code due to the RCU nastiness it causes. All the information needed is
      available before rcu_idle_enter() needs to happen.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 43858b4f "x86/mm: Stop calling leave_mm() in idle code"
      Link: http://lkml.kernel.org/r/c513bbd4e653747213e05bc7062de000bf0202a5.1509793738.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      67535736
    • arch/tile: Implement ->set_state_oneshot_stopped() · 777a45b4
      Committed by Chris Metcalf
      set_state_oneshot_stopped() is called by the clkevt core, when the
      next event is required at an expiry time of 'KTIME_MAX'. This normally
      happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.
      
      This patch makes the clockevent device stop on such an event, to avoid
      spurious interrupts, as explained by commit 8fff52fd
      ("clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state").
      Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
      777a45b4
    • Update MIPS email addresses · fb615d61
      Committed by Paul Burton
      MIPS will soon not be a part of Imagination Technologies, and as such
      many @imgtec.com email addresses will no longer be valid. This patch
      updates the addresses for those who:
      
       - Have 10 or more patches in mainline authored using an @imgtec.com
         email address, or any patches dated within the past year.
      
       - Are still with Imagination but leaving as part of the MIPS business
         unit, as determined from an internal email address list.
      
       - Haven't already updated their email address (ie. JamesH) or expressed
         a desire to be excluded (ie. Maciej).
      
       - Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt &
         myself.
      
      New addresses are of the form firstname.lastname@mips.com, and all
      verified against an internal email address list.  An entry is added to
      .mailmap for each person such that get_maintainer.pl will report the new
      addresses rather than @imgtec.com addresses which will soon be dead.
      
      Instances of the affected addresses throughout the tree are then
      mechanically replaced with the new @mips.com address.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@mips.com>
      Acked-by: Dengcheng Zhu <dengcheng.zhu@mips.com>
      Cc: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: Matt Redfearn <matt.redfearn@mips.com>
      Acked-by: Matt Redfearn <matt.redfearn@mips.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: trivial@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fb615d61
  10. 03 November 2017 (5 commits)
    • x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo · 941f5f0f
      Committed by Rafael J. Wysocki
      Commit 890da9cf (Revert "x86: do not use cpufreq_quick_get() for
      /proc/cpuinfo "cpu MHz"") is not sufficient to restore the previous
      behavior of "cpu MHz" in /proc/cpuinfo on x86 due to some changes
      made after the commit it has reverted.
      
      To address this, make the code in question use arch_freq_get_on_cpu()
      which also is used by cpufreq for reporting the current frequency of
      CPUs and since that function doesn't really depend on cpufreq in any
      way, drop the CONFIG_CPU_FREQ dependency for the object file
      containing it.
      
      Also refactor arch_freq_get_on_cpu() somewhat to avoid IPIs and
      return cached values right away if it is called very often over a
      short time (to prevent user space from triggering IPI storms through
      it).
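
      Roughly, the /proc/cpuinfo reporting path described above ends up looking
      like this (a sketch, not the exact patch):

        unsigned int freq = arch_freq_get_on_cpu(cpu);

        if (!freq)
                freq = cpufreq_quick_get(cpu);
        if (!freq)
                freq = cpu_khz;

        seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));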
      
      Fixes: 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
      Cc: stable@kernel.org   # 4.13 - together with 890da9cf
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      941f5f0f
    • crypto: x86/sha1-mb - fix panic due to unaligned access · d041b557
      Committed by Andrey Ryabinin
      struct sha1_ctx_mgr is allocated in sha1_mb_mod_init() via kzalloc()
      and later passed to the sha1_mb_flusher_mgr_flush_avx2() function, where
      the vmovdqa instruction is used to access the struct. vmovdqa requires a
      16-byte aligned argument, but nothing guarantees that struct
      sha1_ctx_mgr will have that alignment. An unaligned vmovdqa will
      generate a GP fault.
      
      Fix this by replacing vmovdqa with vmovdqu which doesn't have alignment
      requirements.
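
      The alignment difference can be seen with the corresponding compiler
      intrinsics (an illustrative user-space sketch, not the kernel assembly
      itself):

        #include <immintrin.h>

        __m128i load_lanes(const void *p)
        {
                /* _mm_load_si128() compiles to (v)movdqa and faults if 'p' is not
                 * 16-byte aligned; _mm_loadu_si128() compiles to (v)movdqu and has
                 * no alignment requirement. */
                return _mm_loadu_si128((const __m128i *)p);
        }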
      
      Fixes: 2249cbb5 ("crypto: sha-mb - SHA1 multibuffer submit and flush routines for AVX2")
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      d041b557
    • crypto: x86/sha256-mb - fix panic due to unaligned access · 5dfeaac1
      Committed by Andrey Ryabinin
      struct sha256_ctx_mgr is allocated in sha256_mb_mod_init() via kzalloc()
      and later passed to the sha256_mb_flusher_mgr_flush_avx2() function, where
      the vmovdqa instruction is used to access the struct. vmovdqa requires a
      16-byte aligned argument, but nothing guarantees that struct
      sha256_ctx_mgr will have that alignment. An unaligned vmovdqa will
      generate a GP fault.
      
      Fix this by replacing vmovdqa with vmovdqu which doesn't have alignment
      requirements.
      
      Fixes: a377c6b1 ("crypto: sha256-mb - submit/flush routines for AVX2")
      Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: <stable@vger.kernel.org>
      Acked-by: Tim Chen
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      5dfeaac1
    • powerpc/perf: Fix core-imc hotplug callback failure during imc initialization · 7ecb37f6
      Committed by Madhavan Srinivasan
      Call trace observed during boot:
      
        nest_capp0_imc performance monitor hardware support registered
        nest_capp1_imc performance monitor hardware support registered
        core_imc memory allocation for cpu 56 failed
        Unable to handle kernel paging request for data at address 0xffa400010
        Faulting instruction address: 0xc000000000bf3294
        0:mon> e
        cpu 0x0: Vector: 300 (Data Access) at [c000000ff38ff8d0]
            pc: c000000000bf3294: mutex_lock+0x34/0x90
            lr: c000000000bf3288: mutex_lock+0x28/0x90
            sp: c000000ff38ffb50
           msr: 9000000002009033
           dar: ffa400010
         dsisr: 80000
          current = 0xc000000ff383de00
          paca    = 0xc000000007ae0000	 softe: 0	 irq_happened: 0x01
            pid   = 13, comm = cpuhp/0
        Linux version 4.11.0-39.el7a.ppc64le (mockbuild@ppc-058.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Tue Oct 3 07:42:44 EDT 2017
        0:mon> t
        [c000000ff38ffb80] c0000000002ddfac perf_pmu_migrate_context+0xac/0x470
        [c000000ff38ffc40] c00000000011385c ppc_core_imc_cpu_offline+0x1ac/0x1e0
        [c000000ff38ffc90] c000000000125758 cpuhp_invoke_callback+0x198/0x5d0
        [c000000ff38ffd00] c00000000012782c cpuhp_thread_fun+0x8c/0x3d0
        [c000000ff38ffd60] c0000000001678d0 smpboot_thread_fn+0x290/0x2a0
        [c000000ff38ffdc0] c00000000015ee78 kthread+0x168/0x1b0
        [c000000ff38ffe30] c00000000000b368 ret_from_kernel_thread+0x5c/0x74
      
      While registering the cpuhotplug callbacks for core-imc, if we fail
      in the cpuhotplug online path for any random core (either because the opal
      call to initialize the core-imc counters fails or because memory allocation
      fails for that core), ppc_core_imc_cpu_offline() will get invoked for other
      cpus which successfully returned from the cpuhotplug online path.
      
      But in the ppc_core_imc_cpu_offline() path we are trying to migrate the event
      context, when core-imc counters are not even initialized. Thus creating the
      above stack dump.
      
      Add a check to see if core-imc counters are enabled or not in the cpuhotplug
      offline path before migrating the context to handle this failing scenario.
      
      Fixes: 885dcd70 ("powerpc/perf: Add nest IMC PMU support")
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7ecb37f6
    • Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"" · 890da9cf
      Committed by Linus Torvalds
      This reverts commit 51204e06.
      
      There wasn't really any good reason for it, and people are complaining
      (rightly) that it broke existing practice.
      
      Cc: Len Brown <len.brown@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      890da9cf