1. 08 12月, 2018 5 次提交
  2. 06 12月, 2018 35 次提交
    • H
      ARM: dts: rockchip: Remove @0 from the veyron memory node · 6e74fc22
      Heiko Stuebner 提交于
      commit 672e60b72bbe7aace88721db55b380b6a51fb8f9 upstream.
      
      The Coreboot version on veyron ChromeOS devices seems to ignore
      memory@0 nodes when updating the available memory and instead
      inserts another memory node without the address.
      
      This leads to 4GB systems only ever be using 2GB as the memory@0
      node takes precedence. So remove the @0 for veyron devices.
      
      Fixes: 0b639b81 ("ARM: dts: rockchip: Add missing unit name to memory nodes in rk3288 boards")
      Cc: stable@vger.kernel.org
      Reported-by: NHeikki Lindholm <holin@iki.fi>
      Signed-off-by: NHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6e74fc22
    • S
      MIPS: function_graph: Simplify with function_graph_enter() · 35aa93cb
      Steven Rostedt (VMware) 提交于
      commit 8712b27c5723c26400a2b350faf1d6d9fd7ffaad upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have MIPS use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35aa93cb
    • S
      arm64: function_graph: Simplify with function_graph_enter() · bdfd01cf
      Steven Rostedt (VMware) 提交于
      commit 01e0ab2c4ff12358f15a856fd1a7bbea0670972b upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have arm64 use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bdfd01cf
    • S
      s390/function_graph: Simplify with function_graph_enter() · ef9326a1
      Steven Rostedt (VMware) 提交于
      commit 18588e1487b19e45bd90bd55ec8d3a1d44f3257f upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have s390 use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Julian Wiedmann <jwi@linux.ibm.com>
      Cc: linux-s390@vger.kernel.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef9326a1
    • S
      riscv/function_graph: Simplify with function_graph_enter() · 84d2023c
      Steven Rostedt (VMware) 提交于
      commit e949b6db51dc172a35c962bc4414ca148315fe21 upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have riscv use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Greentime Hu <greentime@andestech.com>
      Cc: Alan Kao <alankao@andestech.com>
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      84d2023c
    • S
      parisc: function_graph: Simplify with function_graph_enter() · 87352d62
      Steven Rostedt (VMware) 提交于
      commit a87532c78d291265efadc4b20a8c7a70cd59ea29 upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have parisc use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: linux-parisc@vger.kernel.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87352d62
    • S
      sparc/function_graph: Simplify with function_graph_enter() · 34773b2f
      Steven Rostedt (VMware) 提交于
      commit 9c4bf5e0db164f330a2d3e128e9832661f69f0e9 upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have sparc use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: sparclinux@vger.kernel.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34773b2f
    • S
      sh/function_graph: Simplify with function_graph_enter() · 56c1dd92
      Steven Rostedt (VMware) 提交于
      commit bc715ee4dbc5db462c59b9cfba92d31b3274fe3a upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have superh use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-sh@vger.kernel.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56c1dd92
    • S
      powerpc/function_graph: Simplify with function_graph_enter() · 5478648e
      Steven Rostedt (VMware) 提交于
      commit fe60522ec60082a1dd735691b82c64f65d4ad15e upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have powerpc use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5478648e
    • S
      nds32: function_graph: Simplify with function_graph_enter() · 25ac02d0
      Steven Rostedt (VMware) 提交于
      commit d48ebb24866edea2c35be02a878f25bc65529370 upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have nds32 use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Greentime Hu <greentime@andestech.com>
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      25ac02d0
    • S
      x86/function_graph: Simplify with function_graph_enter() · 21761499
      Steven Rostedt (VMware) 提交于
      commit 07f7175b43827640d1e69c9eded89aa089a234b4 upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have x86 use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21761499
    • S
      microblaze: function_graph: Simplify with function_graph_enter() · e7deeabe
      Steven Rostedt (VMware) 提交于
      commit 556763e5a500d71879d632867b75826551acd49c upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have microblaze use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7deeabe
    • S
      ARM: function_graph: Simplify with function_graph_enter() · fbbee0cf
      Steven Rostedt (VMware) 提交于
      commit f1f5b14afd7cce39e6a9b25c685e1ea34c231096 upstream.
      
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have ARM use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fbbee0cf
    • J
      perf/x86/intel: Disallow precise_ip on BTS events · 205af59e
      Jiri Olsa 提交于
      commit 472de49fdc53365c880ab81ae2b5cfdd83db0b06 upstream.
      
      Vince reported a crash in the BTS flush code when touching the callchain
      data, which was supposed to be initialized as an 'early' callchain,
      but intel_pmu_drain_bts_buffer() does not do that:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        ...
        Call Trace:
         <IRQ>
         intel_pmu_drain_bts_buffer+0x151/0x220
         ? intel_get_event_constraints+0x219/0x360
         ? perf_assign_events+0xe2/0x2a0
         ? select_idle_sibling+0x22/0x3a0
         ? __update_load_avg_se+0x1ec/0x270
         ? enqueue_task_fair+0x377/0xdd0
         ? cpumask_next_and+0x19/0x20
         ? load_balance+0x134/0x950
         ? check_preempt_curr+0x7a/0x90
         ? ttwu_do_wakeup+0x19/0x140
         x86_pmu_stop+0x3b/0x90
         x86_pmu_del+0x57/0x160
         event_sched_out.isra.106+0x81/0x170
         group_sched_out.part.108+0x51/0xc0
         __perf_event_disable+0x7f/0x160
         event_function+0x8c/0xd0
         remote_function+0x3c/0x50
         flush_smp_call_function_queue+0x35/0xe0
         smp_call_function_single_interrupt+0x3a/0xd0
         call_function_single_interrupt+0xf/0x20
         </IRQ>
      
      It was triggered by fuzzer but can be easily reproduced by:
      
        # perf record -e cpu/branch-instructions/pu -g -c 1
      
      Peter suggested not to allow branch tracing for precise events:
      
       > Now arguably, this is really stupid behaviour. Who in his right mind
       > wants callchain output on BTS entries. And even if they do, BTS +
       > precise_ip is nonsensical.
       >
       > So in my mind disallowing precise_ip on BTS would be the simplest fix.
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 6cbc304f ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
      Link: http://lkml.kernel.org/r/20181121101612.16272-3-jolsa@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      205af59e
    • J
      perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() · be0e2e24
      Jiri Olsa 提交于
      commit 67266c1080ad56c31af72b9c18355fde8ccc124a upstream.
      
      Currently we check the branch tracing only by checking for the
      PERF_COUNT_HW_BRANCH_INSTRUCTIONS event of PERF_TYPE_HARDWARE
      type. But we can define the same event with the PERF_TYPE_RAW
      type.
      
      Changing the intel_pmu_has_bts() code to check on event's final
      hw config value, so both HW types are covered.
      
      Adding unlikely to intel_pmu_has_bts() condition calls, because
      it was used in the original code in intel_bts_constraints.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20181121101612.16272-2-jolsa@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be0e2e24
    • J
      perf/x86/intel: Move branch tracing setup to the Intel-specific source file · ad65b548
      Jiri Olsa 提交于
      commit ed6101bbf6266ee83e620b19faa7c6ad56bb41ab upstream.
      
      Moving branch tracing setup to Intel core object into separate
      intel_pmu_bts_config function, because it's Intel specific.
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20181121101612.16272-1-jolsa@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad65b548
    • S
      x86/fpu: Disable bottom halves while loading FPU registers · 33448a8b
      Sebastian Andrzej Siewior 提交于
      commit 68239654acafe6aad5a3c1dc7237e60accfebc03 upstream.
      
      The sequence
      
        fpu->initialized = 1;		/* step A */
        preempt_disable();		/* step B */
        fpu__restore(fpu);
        preempt_enable();
      
      in __fpu__restore_sig() is racy in regard to a context switch.
      
      For 32bit frames, __fpu__restore_sig() prepares the FPU state within
      fpu->state. To ensure that a context switch (switch_fpu_prepare() in
      particular) does not modify fpu->state it uses fpu__drop() which sets
      fpu->initialized to 0.
      
      After fpu->initialized is cleared, the CPU's FPU state is not saved
      to fpu->state during a context switch. The new state is loaded via
      fpu__restore(). It gets loaded into fpu->state from userland and
      ensured it is sane. fpu->initialized is then set to 1 in order to avoid
      fpu__initialize() doing anything (overwrite the new state) which is part
      of fpu__restore().
      
      A context switch between step A and B above would save CPU's current FPU
      registers to fpu->state and overwrite the newly prepared state. This
      looks like a tiny race window but the Kernel Test Robot reported this
      back in 2016 while we had lazy FPU support. Borislav Petkov made the
      link between that report and another patch that has been posted. Since
      the removal of the lazy FPU support, this race goes unnoticed because
      the warning has been removed.
      
      Disable bottom halves around the restore sequence to avoid the race. BH
      need to be disabled because BH is allowed to run (even with preemption
      disabled) and might invoke kernel_fpu_begin() by doing IPsec.
      
       [ bp: massage commit message a bit. ]
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: kvm ML <kvm@vger.kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: stable@vger.kernel.org
      Cc: x86-ml <x86@kernel.org>
      Link: http://lkml.kernel.org/r/20181120102635.ddv3fvavxajjlfqk@linutronix.de
      Link: https://lkml.kernel.org/r/20160226074940.GA28911@pd.tnicSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      33448a8b
    • B
      x86/MCE/AMD: Fix the thresholding machinery initialization order · 00f91adf
      Borislav Petkov 提交于
      commit 60c8144afc287ef09ce8c1230c6aa972659ba1bb upstream.
      
      Currently, the code sets up the thresholding interrupt vector and only
      then goes about initializing the thresholding banks. Which is wrong,
      because an early thresholding interrupt would cause a NULL pointer
      dereference when accessing those banks and prevent the machine from
      booting.
      
      Therefore, set the thresholding interrupt vector only *after* having
      initialized the banks successfully.
      
      Fixes: 18807ddb ("x86/mce/AMD: Reset Threshold Limit after logging error")
      Reported-by: NRafał Miłecki <rafal@milecki.pl>
      Reported-by: NJohn Clemens <clemej@gmail.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NRafał Miłecki <rafal@milecki.pl>
      Tested-by: NJohn Clemens <john@deater.net>
      Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
      Cc: linux-edac@vger.kernel.org
      Cc: stable@vger.kernel.org
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86@kernel.org
      Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
      Link: https://lkml.kernel.org/r/20181127101700.2964-1-zajec5@gmail.com
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201291Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      00f91adf
    • C
      arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou. · 8af02415
      Christoph Muellner 提交于
      commit c1d91f86a1b4c9c05854d59c6a0abd5d0f75b849 upstream.
      
      This patch fixes the wrong polarity setting for the PCIe host driver's
      pre-reset pin for rk3399-puma-haikou. Without this patch link training
      will most likely fail.
      
      Fixes: 60fd9f72 ("arm64: dts: rockchip: add Haikou baseboard with RK3399-Q7 SoM")
      Cc: stable@vger.kernel.org
      Signed-off-by: NChristoph Muellner <christoph.muellner@theobroma-systems.com>
      Signed-off-by: NHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8af02415
    • M
      xtensa: fix coprocessor part of ptrace_{get,set}xregs · 01fb21bf
      Max Filippov 提交于
      commit 38a35a78c5e270cbe53c4fef6b0d3c2da90dd849 upstream.
      
      Layout of coprocessor registers in the elf_xtregs_t and
      xtregs_coprocessor_t may be different due to alignment. Thus it is not
      always possible to copy data between the xtregs_coprocessor_t structure
      and the elf_xtregs_t and get correct values for all registers.
      Use a table of offsets and sizes of individual coprocessor register
      groups to do coprocessor context copying in the ptrace_getxregs and
      ptrace_setxregs.
      This fixes incorrect coprocessor register values reading from the user
      process by the native gdb on an xtensa core with multiple coprocessors
      and registers with high alignment requirements.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01fb21bf
    • M
      xtensa: fix coprocessor context offset definitions · 5f84a996
      Max Filippov 提交于
      commit 03bc996af0cc71c7f30c384d8ce7260172423b34 upstream.
      
      Coprocessor context offsets are used by the assembly code that moves
      coprocessor context between the individual fields of the
      thread_info::xtregs_cp structure and coprocessor registers.
      This fixes coprocessor context clobbering on flushing and reloading
      during normal user code execution and user process debugging in the
      presence of more than one coprocessor in the core configuration.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5f84a996
    • M
      xtensa: enable coprocessors that are being flushed · 4ec1039f
      Max Filippov 提交于
      commit 2958b66694e018c552be0b60521fec27e8d12988 upstream.
      
      coprocessor_flush_all may be called from a context of a thread that is
      different from the thread being flushed. In that case contents of the
      cpenable special register may not match ti->cpenable of the target
      thread, resulting in unhandled coprocessor exception in the kernel
      context.
      Set cpenable special register to the ti->cpenable of the target register
      for the duration of the flush and restore it afterwards.
      This fixes the following crash caused by coprocessor register inspection
      in native gdb:
      
        (gdb) p/x $w0
        Illegal instruction in kernel: sig: 9 [#1] PREEMPT
        Call Trace:
          ___might_sleep+0x184/0x1a4
          __might_sleep+0x41/0xac
          exit_signals+0x14/0x218
          do_exit+0xc9/0x8b8
          die+0x99/0xa0
          do_illegal_instruction+0x18/0x6c
          common_exception+0x77/0x77
          coprocessor_flush+0x16/0x3c
          arch_ptrace+0x46c/0x674
          sys_ptrace+0x2ce/0x3b4
          system_call+0x54/0x80
          common_exception+0x77/0x77
        note: gdb[100] exited with preempt_count 1
        Killed
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4ec1039f
    • L
      KVM: VMX: re-add ple_gap module parameter · bbe23c4b
      Luiz Capitulino 提交于
      commit a87c99e61236ba8ca962ce97a19fab5ebd588d35 upstream.
      
      Apparently, the ple_gap parameter was accidentally removed
      by commit c8e88717. Add it
      back.
      Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: c8e88717Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbe23c4b
    • W
      KVM: X86: Fix scan ioapic use-before-initialization · 61c42d65
      Wanpeng Li 提交于
      commit e97f852fd4561e77721bb9a4e0ea9d98305b1e93 upstream.
      
      Reported by syzkaller:
      
       BUG: unable to handle kernel NULL pointer dereference at 00000000000001c8
       PGD 80000003ec4da067 P4D 80000003ec4da067 PUD 3f7bfa067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP PTI
       CPU: 7 PID: 5059 Comm: debug Tainted: G           OE     4.19.0-rc5 #16
       RIP: 0010:__lock_acquire+0x1a6/0x1990
       Call Trace:
        lock_acquire+0xdb/0x210
        _raw_spin_lock+0x38/0x70
        kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
        vcpu_enter_guest+0x167e/0x1910 [kvm]
        kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm]
        kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm]
        do_vfs_ioctl+0xa5/0x690
        ksys_ioctl+0x6d/0x80
        __x64_sys_ioctl+0x1a/0x20
        do_syscall_64+0x83/0x6e0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT6 msr
      and triggers scan ioapic logic to load synic vectors into EOI exit bitmap.
      However, irqchip is not initialized by this simple testcase, ioapic/apic
      objects should not be accessed.
      This can be triggered by the following program:
      
          #define _GNU_SOURCE
      
          #include <endian.h>
          #include <stdint.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <sys/syscall.h>
          #include <sys/types.h>
          #include <unistd.h>
      
          uint64_t r[3] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff};
      
          int main(void)
          {
          	syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
          	long res = 0;
          	memcpy((void*)0x20000040, "/dev/kvm", 9);
          	res = syscall(__NR_openat, 0xffffffffffffff9c, 0x20000040, 0, 0);
          	if (res != -1)
          		r[0] = res;
          	res = syscall(__NR_ioctl, r[0], 0xae01, 0);
          	if (res != -1)
          		r[1] = res;
          	res = syscall(__NR_ioctl, r[1], 0xae41, 0);
          	if (res != -1)
          		r[2] = res;
          	memcpy(
          			(void*)0x20000080,
          			"\x01\x00\x00\x00\x00\x5b\x61\xbb\x96\x00\x00\x40\x00\x00\x00\x00\x01\x00"
          			"\x08\x00\x00\x00\x00\x00\x0b\x77\xd1\x78\x4d\xd8\x3a\xed\xb1\x5c\x2e\x43"
          			"\xaa\x43\x39\xd6\xff\xf5\xf0\xa8\x98\xf2\x3e\x37\x29\x89\xde\x88\xc6\x33"
          			"\xfc\x2a\xdb\xb7\xe1\x4c\xac\x28\x61\x7b\x9c\xa9\xbc\x0d\xa0\x63\xfe\xfe"
          			"\xe8\x75\xde\xdd\x19\x38\xdc\x34\xf5\xec\x05\xfd\xeb\x5d\xed\x2e\xaf\x22"
          			"\xfa\xab\xb7\xe4\x42\x67\xd0\xaf\x06\x1c\x6a\x35\x67\x10\x55\xcb",
          			106);
          	syscall(__NR_ioctl, r[2], 0x4008ae89, 0x20000080);
          	syscall(__NR_ioctl, r[2], 0xae80, 0);
          	return 0;
          }
      
      This patch fixes it by bailing out scan ioapic if ioapic is not initialized in
      kernel.
      Reported-by: NWei Wu <ww9210@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Wei Wu <ww9210@gmail.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      61c42d65
    • W
      KVM: LAPIC: Fix pv ipis use-before-initialization · ffb01e73
      Wanpeng Li 提交于
      commit 38ab012f109caf10f471db1adf284e620dd8d701 upstream.
      
      Reported by syzkaller:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
       PGD 800000040410c067 P4D 800000040410c067 PUD 40410d067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP PTI
       CPU: 3 PID: 2567 Comm: poc Tainted: G           OE     4.19.0-rc5 #16
       RIP: 0010:kvm_pv_send_ipi+0x94/0x350 [kvm]
       Call Trace:
        kvm_emulate_hypercall+0x3cc/0x700 [kvm]
        handle_vmcall+0xe/0x10 [kvm_intel]
        vmx_handle_exit+0xc1/0x11b0 [kvm_intel]
        vcpu_enter_guest+0x9fb/0x1910 [kvm]
        kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm]
        kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm]
        do_vfs_ioctl+0xa5/0x690
        ksys_ioctl+0x6d/0x80
        __x64_sys_ioctl+0x1a/0x20
        do_syscall_64+0x83/0x6e0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The reason is that the apic map has not yet been initialized, the testcase
      triggers pv_send_ipi interface by vmcall which results in kvm->arch.apic_map
      is dereferenced. This patch fixes it by checking whether or not apic map is
      NULL and bailing out immediately if that is the case.
      
      Fixes: 4180bf1b (KVM: X86: Implement "send IPI" hypercall)
      Reported-by: NWei Wu <ww9210@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Wei Wu <ww9210@gmail.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ffb01e73
    • L
      KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall · 6d772df4
      Liran Alon 提交于
      commit bcbfbd8ec21096027f1ee13ce6c185e8175166f6 upstream.
      
      kvm_pv_clock_pairing() allocates local var
      "struct kvm_clock_pairing clock_pairing" on stack and initializes
      all it's fields besides padding (clock_pairing.pad[]).
      
      Because clock_pairing var is written completely (including padding)
      to guest memory, failure to init struct padding results in kernel
      info-leak.
      
      Fix the issue by making sure to also init the padding with zeroes.
      
      Fixes: 55dd00a7 ("KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall")
      Reported-by: syzbot+a8ef68d71211ba264f56@syzkaller.appspotmail.com
      Reviewed-by: NMark Kanda <mark.kanda@oracle.com>
      Signed-off-by: NLiran Alon <liran.alon@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d772df4
    • L
      KVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset · 76c8476c
      Leonid Shatz 提交于
      commit 326e742533bf0a23f0127d8ea62fb558ba665f08 upstream.
      
      Since commit e79f245d ("X86/KVM: Properly update 'tsc_offset' to
      represent the running guest"), vcpu->arch.tsc_offset meaning was
      changed to always reflect the tsc_offset value set on active VMCS.
      Regardless if vCPU is currently running L1 or L2.
      
      However, above mentioned commit failed to also change
      kvm_vcpu_write_tsc_offset() to set vcpu->arch.tsc_offset correctly.
      This is because vmx_write_tsc_offset() could set the tsc_offset value
      in active VMCS to given offset parameter *plus vmcs12->tsc_offset*.
      However, kvm_vcpu_write_tsc_offset() just sets vcpu->arch.tsc_offset
      to given offset parameter. Without taking into account the possible
      addition of vmcs12->tsc_offset. (Same is true for SVM case).
      
      Fix this issue by changing kvm_x86_ops->write_tsc_offset() to return
      actually set tsc_offset in active VMCS and modify
      kvm_vcpu_write_tsc_offset() to set returned value in
      vcpu->arch.tsc_offset.
      In addition, rename write_tsc_offset() callback to write_l1_tsc_offset()
      to make it clear that it is meant to set L1 TSC offset.
      
      Fixes: e79f245d ("X86/KVM: Properly update 'tsc_offset' to represent the running guest")
      Reviewed-by: NLiran Alon <liran.alon@oracle.com>
      Reviewed-by: NMihai Carabas <mihai.carabas@oracle.com>
      Reviewed-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Signed-off-by: NLeonid Shatz <leonid.shatz@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76c8476c
    • J
      kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb · b8b0c871
      Jim Mattson 提交于
      commit fd65d3142f734bc4376053c8d75670041903134d upstream.
      
      Previously, we only called indirect_branch_prediction_barrier on the
      logical CPU that freed a vmcb. This function should be called on all
      logical CPUs that last loaded the vmcb in question.
      
      Fixes: 15d45071 ("KVM/x86: Add IBPB support")
      Reported-by: NNeel Natu <neelnatu@google.com>
      Signed-off-by: NJim Mattson <jmattson@google.com>
      Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b8b0c871
    • J
      kvm: mmu: Fix race in emulated page table writes · 471aca57
      Junaid Shahid 提交于
      commit 0e0fee5c539b61fdd098332e0e2cc375d9073706 upstream.
      
      When a guest page table is updated via an emulated write,
      kvm_mmu_pte_write() is called to update the shadow PTE using the just
      written guest PTE value. But if two emulated guest PTE writes happened
      concurrently, it is possible that the guest PTE and the shadow PTE end
      up being out of sync. Emulated writes do not mark the shadow page as
      unsync-ed, so this inconsistency will not be resolved even by a guest TLB
      flush (unless the page was marked as unsync-ed at some other point).
      
      This is fixed by re-reading the current value of the guest PTE after the
      MMU lock has been acquired instead of just using the value that was
      written prior to calling kvm_mmu_pte_write().
      Signed-off-by: NJunaid Shahid <junaids@google.com>
      Reviewed-by: NWanpeng Li <wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      471aca57
    • T
      x86/speculation: Provide IBPB always command line options · 9f3baace
      Thomas Gleixner 提交于
      commit 55a974021ec952ee460dc31ca08722158639de72 upstream
      
      Provide the possibility to enable IBPB always in combination with 'prctl'
      and 'seccomp'.
      
      Add the extra command line options and rework the IBPB selection to
      evaluate the command instead of the mode selected by the STIPB switch case.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185006.144047038@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f3baace
    • T
      x86/speculation: Add seccomp Spectre v2 user space protection mode · d1ec2354
      Thomas Gleixner 提交于
      commit 6b3e64c237c072797a9ec918654a60e3a46488e2 upstream
      
      If 'prctl' mode of user space protection from spectre v2 is selected
      on the kernel command-line, STIBP and IBPB are applied on tasks which
      restrict their indirect branch speculation via prctl.
      
      SECCOMP enables the SSBD mitigation for sandboxed tasks already, so it
      makes sense to prevent spectre v2 user space to user space attacks as
      well.
      
      The Intel mitigation guide documents how STIPB works:
          
         Setting bit 1 (STIBP) of the IA32_SPEC_CTRL MSR on a logical processor
         prevents the predicted targets of indirect branches on any logical
         processor of that core from being controlled by software that executes
         (or executed previously) on another logical processor of the same core.
      
      Ergo setting STIBP protects the task itself from being attacked from a task
      running on a different hyper-thread and protects the tasks running on
      different hyper-threads from being attacked.
      
      While the document suggests that the branch predictors are shielded between
      the logical processors, the observed performance regressions suggest that
      STIBP simply disables the branch predictor more or less completely. Of
      course the document wording is vague, but the fact that there is also no
      requirement for issuing IBPB when STIBP is used points clearly in that
      direction. The kernel still issues IBPB even when STIBP is used until Intel
      clarifies the whole mechanism.
      
      IBPB is issued when the task switches out, so malicious sandbox code cannot
      mistrain the branch predictor for the next user space task on the same
      logical processor.
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185006.051663132@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1ec2354
    • T
      x86/speculation: Enable prctl mode for spectre_v2_user · 7b62ef14
      Thomas Gleixner 提交于
      commit 7cc765a67d8e04ef7d772425ca5a2a1e2b894c15 upstream
      
      Now that all prerequisites are in place:
      
       - Add the prctl command line option
      
       - Default the 'auto' mode to 'prctl'
      
       - When SMT state changes, update the static key which controls the
         conditional STIBP evaluation on context switch.
      
       - At init update the static key which controls the conditional IBPB
         evaluation on context switch.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.958421388@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b62ef14
    • T
      x86/speculation: Add prctl() control for indirect branch speculation · 238ba6e7
      Thomas Gleixner 提交于
      commit 9137bb27e60e554dab694eafa4cca241fa3a694f upstream
      
      Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and
      PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of
      indirect branch speculation via STIBP and IBPB.
      
      Invocations:
       Check indirect branch speculation status with
       - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
      
       Enable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
      
       Disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
      
       Force disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
      
      See Documentation/userspace-api/spec_ctrl.rst.
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      238ba6e7
    • T
      x86/speculation: Prepare arch_smt_update() for PRCTL mode · f67fafb8
      Thomas Gleixner 提交于
      commit 6893a959d7fdebbab5f5aa112c277d5a44435ba1 upstream
      
      The upcoming fine grained per task STIBP control needs to be updated on CPU
      hotplug as well.
      
      Split out the code which controls the strict mode so the prctl control code
      can be added later. Mark the SMP function call argument __unused while at it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.759457117@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f67fafb8
    • T
      x86/speculation: Prevent stale SPEC_CTRL msr content · e8412401
      Thomas Gleixner 提交于
      commit 6d991ba509ebcfcc908e009d1db51972a4f7a064 upstream
      
      The seccomp speculation control operates on all tasks of a process, but
      only the current task of a process can update the MSR immediately. For the
      other threads the update is deferred to the next context switch.
      
      This creates the following situation with Process A and B:
      
      Process A task 2 and Process B task 1 are pinned on CPU1. Process A task 2
      does not have the speculation control TIF bit set. Process B task 1 has the
      speculation control TIF bit set.
      
      CPU0					CPU1
      					MSR bit is set
      					ProcB.T1 schedules out
      					ProcA.T2 schedules in
      					MSR bit is cleared
      ProcA.T1
        seccomp_update()
        set TIF bit on ProcA.T2
      					ProcB.T1 schedules in
      					MSR is not updated  <-- FAIL
      
      This happens because the context switch code tries to avoid the MSR update
      if the speculation control TIF bits of the incoming and the outgoing task
      are the same. In the worst case ProcB.T1 and ProcA.T2 are the only tasks
      scheduling back and forth on CPU1, which keeps the MSR stale forever.
      
      In theory this could be remedied by IPIs, but chasing the remote task which
      could be migrated is complex and full of races.
      
      The straight forward solution is to avoid the asychronous update of the TIF
      bit and defer it to the next context switch. The speculation control state
      is stored in task_struct::atomic_flags by the prctl and seccomp updates
      already.
      
      Add a new TIF_SPEC_FORCE_UPDATE bit and set this after updating the
      atomic_flags. Check the bit on context switch and force a synchronous
      update of the speculation control if set. Use the same mechanism for
      updating the current task.
      Reported-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811272247140.1875@nanos.tec.linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8412401