1. 28 1月, 2018 1 次提交
  2. 26 1月, 2018 2 次提交
  3. 25 1月, 2018 1 次提交
    • P
      perf/x86: Fix perf,x86,cpuhp deadlock · efe951d3
      Peter Zijlstra 提交于
      More lockdep gifts, a 5-way lockup race:
      
      	perf_event_create_kernel_counter()
      	  perf_event_alloc()
      	    perf_try_init_event()
      	      x86_pmu_event_init()
      		__x86_pmu_event_init()
      		  x86_reserve_hardware()
       #0		    mutex_lock(&pmc_reserve_mutex);
      		    reserve_ds_buffer()
       #1		      get_online_cpus()
      
      	perf_event_release_kernel()
      	  _free_event()
      	    hw_perf_event_destroy()
      	      x86_release_hardware()
       #0		mutex_lock(&pmc_reserve_mutex)
      		release_ds_buffer()
       #1		  get_online_cpus()
      
       #1	do_cpu_up()
      	  perf_event_init_cpu()
       #2	    mutex_lock(&pmus_lock)
       #3	    mutex_lock(&ctx->mutex)
      
      	sys_perf_event_open()
      	  mutex_lock_double()
       #3	    mutex_lock(ctx->mutex)
       #4	    mutex_lock_nested(ctx->mutex, 1);
      
      	perf_try_init_event()
       #4	  mutex_lock_nested(ctx->mutex, 1)
      	  x86_pmu_event_init()
      	    intel_pmu_hw_config()
      	      x86_add_exclusive()
       #0		mutex_lock(&pmc_reserve_mutex)
      
      Fix it by using ordering constructs instead of locking.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      efe951d3
  4. 24 1月, 2018 5 次提交
    • B
      x86/microcode: Fix again accessing initrd after having been freed · 1d080f09
      Borislav Petkov 提交于
      Commit 24c25032 ("x86/microcode: Do not access the initrd after it has
      been freed") fixed attempts to access initrd from the microcode loader
      after it has been freed. However, a similar KASAN warning was reported
      (stack trace edited):
      
        smpboot: Booting Node 0 Processor 1 APIC 0x11
        ==================================================================
        BUG: KASAN: use-after-free in find_cpio_data+0x9b5/0xa50
        Read of size 1 at addr ffff880035ffd000 by task swapper/1/0
      
        CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.8-slack #7
        Hardware name: System manufacturer System Product Name/A88X-PLUS, BIOS 3003 03/10/2016
        Call Trace:
         dump_stack
         print_address_description
         kasan_report
         ? find_cpio_data
         __asan_report_load1_noabort
         find_cpio_data
         find_microcode_in_initrd
         __load_ucode_amd
         load_ucode_amd_ap
            load_ucode_ap
      
      After some investigation, it turned out that a merge was done using the
      wrong side to resolve, leading to picking up the previous state, before
      the 24c25032 fix. Therefore the Fixes tag below contains a merge
      commit.
      
      Revert the mismerge by catching the save_microcode_in_initrd_amd()
      retval and thus letting the function exit with the last return statement
      so that initrd_gone can be set to true.
      
      Fixes: f26483ea ("Merge branch 'x86/urgent' into x86/microcode, to resolve conflicts")
      Reported-by: <higuita@gmx.net>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=198295
      Link: https://lkml.kernel.org/r/20180123104133.918-2-bp@alien8.de
      1d080f09
    • jia zhang's avatar
      x86/microcode/intel: Extend BDW late-loading further with LLC size check · 7e702d17
      jia zhang 提交于
      Commit b94b7373 ("x86/microcode/intel: Extend BDW late-loading with a
      revision check") reduced the impact of erratum BDF90 for Broadwell model
      79.
      
      The impact can be reduced further by checking the size of the last level
      cache portion per core.
      
      Tony: "The erratum says the problem only occurs on the large-cache SKUs.
      So we only need to avoid the update if we are on a big cache SKU that is
      also running old microcode."
      
      For more details, see erratum BDF90 in document #334165 (Intel Xeon
      Processor E7-8800/4800 v4 Product Family Specification Update) from
      September 2017.
      
      Fixes: b94b7373 ("x86/microcode/intel: Extend BDW late-loading with a revision check")
      Signed-off-by: jia zhang's avatarJia Zhang <zhang.jia@linux.alibaba.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang.jia@linux.alibaba.com
      7e702d17
    • X
      perf/x86/amd/power: Do not load AMD power module on !AMD platforms · 40d4071c
      Xiao Liang 提交于
      The AMD power module can be loaded on non AMD platforms, but unload fails
      with the following Oops:
      
       BUG: unable to handle kernel NULL pointer dereference at           (null)
       IP: __list_del_entry_valid+0x29/0x90
       Call Trace:
        perf_pmu_unregister+0x25/0xf0
        amd_power_pmu_exit+0x1c/0xd23 [power]
        SyS_delete_module+0x1a8/0x2b0
        ? exit_to_usermode_loop+0x8f/0xb0
        entry_SYSCALL_64_fastpath+0x20/0x83
      
      Return -ENODEV instead of 0 from the module init function if the CPU does
      not match.
      
      Fixes: c7ab62bf ("perf/x86/amd/power: Add AMD accumulated power reporting mechanism")
      Signed-off-by: NXiao Liang <xiliang@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180122061252.6394-1-xiliang@redhat.com
      40d4071c
    • S
      ftrace, orc, x86: Handle ftrace dynamically allocated trampolines · 6be7fa3c
      Steven Rostedt (VMware) 提交于
      The function tracer can create a dynamically allocated trampoline that is
      called by the function mcount or fentry hook that is used to call the
      function callback that is registered. The problem is that the orc undwinder
      will bail if it encounters one of these trampolines. This breaks the stack
      trace of function callbacks, which include the stack tracer and setting the
      stack trace for individual functions.
      
      Since these dynamic trampolines are basically copies of the static ftrace
      trampolines defined in ftrace_*.S, we do not need to create new orc entries
      for the dynamic trampolines. Finding the return address on the stack will be
      identical as the functions that were copied to create the dynamic
      trampolines. When encountering a ftrace dynamic trampoline, we can just use
      the orc entry of the ftrace static function that was copied for that
      trampoline.
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      6be7fa3c
    • J
      x86/ftrace: Fix ORC unwinding from ftrace handlers · e2ac83d7
      Josh Poimboeuf 提交于
      Steven Rostedt discovered that the ftrace stack tracer is broken when
      it's used with the ORC unwinder.  The problem is that objtool is
      instructed by the Makefile to ignore the ftrace_64.S code, so it doesn't
      generate any ORC data for it.
      
      Fix it by making the asm code objtool-friendly:
      
      - Objtool doesn't like the fact that save_mcount_regs pushes RBP at the
        beginning, but it's never restored (directly, at least).  So just skip
        the original RBP push, which is only needed for frame pointers anyway.
      
      - Annotate some functions as normal callable functions with
        ENTRY/ENDPROC.
      
      - Add an empty unwind hint to return_to_handler().  The return address
        isn't on the stack, so there's nothing ORC can do there.  It will just
        punt in the unlikely case it tries to unwind from that code.
      
      With all that fixed, remove the OBJECT_FILES_NON_STANDARD Makefile
      annotation so objtool can read the file.
      
      Link: http://lkml.kernel.org/r/20180123040746.ih4ep3tk4pbjvg7c@trebleReported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      e2ac83d7
  5. 21 1月, 2018 1 次提交
  6. 19 1月, 2018 5 次提交
  7. 18 1月, 2018 1 次提交
    • T
      x86/mm: Rework wbinvd, hlt operation in stop_this_cpu() · f23d74f6
      Tom Lendacky 提交于
      Some issues have been reported with the for loop in stop_this_cpu() that
      issues the 'wbinvd; hlt' sequence.  Reverting this sequence to halt()
      has been shown to resolve the issue.
      
      However, the wbinvd is needed when running with SME.  The reason for the
      wbinvd is to prevent cache flush races between encrypted and non-encrypted
      entries that have the same physical address.  This can occur when
      kexec'ing from memory encryption active to inactive or vice-versa.  The
      important thing is to not have outside of kernel text memory references
      (such as stack usage), so the usage of the native_*() functions is needed
      since these expand as inline asm sequences.  So instead of reverting the
      change, rework the sequence.
      
      Move the wbinvd instruction outside of the for loop as native_wbinvd()
      and make its execution conditional on X86_FEATURE_SME.  In the for loop,
      change the asm 'wbinvd; hlt' sequence back to a halt sequence but use
      the native_halt() call.
      
      Fixes: bba4ed01 ("x86/mm, kexec: Allow kexec to be used with SME")
      Reported-by: NDave Young <dyoung@redhat.com>
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NDave Young <dyoung@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Yu Chen <yu.c.chen@intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: kexec@lists.infradead.org
      Cc: ebiederm@redhat.com
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180117234141.21184.44067.stgit@tlendack-t1.amdoffice.net
      f23d74f6
  8. 17 1月, 2018 5 次提交
  9. 16 1月, 2018 5 次提交
  10. 15 1月, 2018 5 次提交
    • T
      x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros · 28d437d5
      Tom Lendacky 提交于
      The PAUSE instruction is currently used in the retpoline and RSB filling
      macros as a speculation trap.  The use of PAUSE was originally suggested
      because it showed a very, very small difference in the amount of
      cycles/time used to execute the retpoline as compared to LFENCE.  On AMD,
      the PAUSE instruction is not a serializing instruction, so the pause/jmp
      loop will use excess power as it is speculated over waiting for return
      to mispredict to the correct target.
      
      The RSB filling macro is applicable to AMD, and, if software is unable to
      verify that LFENCE is serializing on AMD (possible when running under a
      hypervisor), the generic retpoline support will be used and, so, is also
      applicable to AMD.  Keep the current usage of PAUSE for Intel, but add an
      LFENCE instruction to the speculation trap for AMD.
      
      The same sequence has been adopted by GCC for the GCC generated retpolines.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Acked-by: NDavid Woodhouse <dwmw@amazon.co.uk>
      Acked-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Kees Cook <keescook@google.com>
      Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net
      28d437d5
    • D
      x86/retpoline: Fill RSB on context switch for affected CPUs · c995efd5
      David Woodhouse 提交于
      On context switch from a shallow call stack to a deeper one, as the CPU
      does 'ret' up the deeper side it may encounter RSB entries (predictions for
      where the 'ret' goes to) which were populated in userspace.
      
      This is problematic if neither SMEP nor KPTI (the latter of which marks
      userspace pages as NX for the kernel) are active, as malicious code in
      userspace may then be executed speculatively.
      
      Overwrite the CPU's return prediction stack with calls which are predicted
      to return to an infinite loop, to "capture" speculation if this
      happens. This is required both for retpoline, and also in conjunction with
      IBRS for !SMEP && !KPTI.
      
      On Skylake+ the problem is slightly different, and an *underflow* of the
      RSB may cause errant branch predictions to occur. So there it's not so much
      overwrite, as *filling* the RSB to attempt to prevent it getting
      empty. This is only a partial solution for Skylake+ since there are many
      other conditions which may result in the RSB becoming empty. The full
      solution on Skylake+ is to use IBRS, which will prevent the problem even
      when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
      required on context switch.
      
      [ tglx: Added missing vendor check and slighty massaged comments and
        	changelog ]
      Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk
      c995efd5
    • A
      x86/kasan: Panic if there is not enough memory to boot · 0d39e266
      Andrey Ryabinin 提交于
      Currently KASAN doesn't panic in case it don't have enough memory
      to boot. Instead, it crashes in some random place:
      
       kernel BUG at arch/x86/mm/physaddr.c:27!
      
       RIP: 0010:__phys_addr+0x268/0x276
       Call Trace:
        kasan_populate_shadow+0x3f2/0x497
        kasan_init+0x12e/0x2b2
        setup_arch+0x2825/0x2a2c
        start_kernel+0xc8/0x15f4
        x86_64_start_reservations+0x2a/0x2c
        x86_64_start_kernel+0x72/0x75
        secondary_startup_64+0xa5/0xb0
      
      Use memblock_virt_alloc_try_nid() for allocations without failure
      fallback. It will panic with an out of memory message.
      Reported-by: Nkernel test robot <xiaolong.ye@intel.com>
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: kasan-dev@googlegroups.com
      Cc: Alexander Potapenko <glider@google.com>
      Cc: lkp@01.org
      Link: https://lkml.kernel.org/r/20180110153602.18919-1-aryabinin@virtuozzo.com
      0d39e266
    • T
      x86/retpoline: Remove compile time warning · b8b9ce4b
      Thomas Gleixner 提交于
      Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
      does not have retpoline support. Linus rationale for this is:
      
        It's wrong because it will just make people turn off RETPOLINE, and the
        asm updates - and return stack clearing - that are independent of the
        compiler are likely the most important parts because they are likely the
        ones easiest to target.
      
        And it's annoying because most people won't be able to do anything about
        it. The number of people building their own compiler? Very small. So if
        their distro hasn't got a compiler yet (and pretty much nobody does), the
        warning is just annoying crap.
      
        It is already properly reported as part of the sysfs interface. The
        compile-time warning only encourages bad things.
      
      Fixes: 76b04384 ("x86/retpoline: Add initial retpoline support")
      Requested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com
      b8b9ce4b
    • A
      x86/idt: Mark IDT tables __initconst · 327867fa
      Andi Kleen 提交于
      const variables must use __initconst, not __initdata.
      
      Fix this up for the IDT tables, which got it consistently wrong.
      
      Fixes: 16bc18d8 ("x86/idt: Move 32-bit idt_descr to C code")
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20171222001821.2157-7-andi@firstfloor.org
      327867fa
  11. 14 1月, 2018 7 次提交
  12. 12 1月, 2018 2 次提交