1. 19 Jun, 2019 6 commits
  2. 15 Jun, 2019 1 commit
    • x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback · 78f4e932
      Borislav Petkov authored
      Adric Blake reported the following warning during suspend-resume:
      
        Enabling non-boot CPUs ...
        x86: Booting SMP configuration:
        smpboot: Booting Node 0 Processor 1 APIC 0x2
        unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0000000000000000) \
         at rIP: 0xffffffff8d267924 (native_write_msr+0x4/0x20)
        Call Trace:
         intel_set_tfa
         intel_pmu_cpu_starting
         ? x86_pmu_dead_cpu
         x86_pmu_starting_cpu
         cpuhp_invoke_callback
         ? _raw_spin_lock_irqsave
         notify_cpu_starting
         start_secondary
         secondary_startup_64
        microcode: sig=0x806ea, pf=0x80, revision=0x96
        microcode: updated to revision 0xb4, date = 2019-04-01
        CPU1 is up
      
      The MSR in question is MSR_TFA_RTM_FORCE_ABORT and that MSR is emulated
      by microcode. The log above shows that the microcode loader callback
      happens after the PMU restoration, leading to the conjecture that
      because the microcode hasn't been updated yet, that MSR is not present
      yet, leading to the #GP.
      
      Add a microcode loader-specific hotplug vector which comes before
      the PERF vectors and thus executes earlier and makes sure the MSR is
      present.
      
      Fixes: 400816f6 ("perf/x86/intel: Implement support for TSX Force Abort")
      Reported-by: Adric Blake <promarbler14@gmail.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: x86@kernel.org
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=203637
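The ordering argument in the x86/microcode commit above can be illustrated with a toy model. This is a minimal sketch, not kernel code: the `toy_*` names, the two-entry state enum, and the `msr_fault` flag are all hypothetical stand-ins, and the only assumption carried over from the commit message is that callbacks attached to earlier hotplug states run before those attached to later ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hotplug states: lower values run earlier during bring-up. */
enum toy_state { TOY_MICROCODE_LOADER, TOY_PERF_STARTING, TOY_NR_STATES };

struct toy_cpu {
    bool microcode_loaded; /* the emulated MSR exists only once this is set */
    bool msr_fault;        /* set when the MSR is written before it exists */
};

typedef void (*toy_cb)(struct toy_cpu *);

/* Models the microcode loader callback. */
void toy_load_microcode(struct toy_cpu *c)
{
    c->microcode_loaded = true;
}

/* Models the PMU callback that writes the microcode-emulated MSR. */
void toy_perf_starting(struct toy_cpu *c)
{
    if (!c->microcode_loaded)
        c->msr_fault = true;
}

/* Invoke the registered callbacks in ascending state order. */
void toy_bring_up(struct toy_cpu *c, const toy_cb cbs[TOY_NR_STATES])
{
    assert(c != NULL && cbs != NULL);
    for (int s = 0; s < TOY_NR_STATES; s++) {
        if (cbs[s])
            cbs[s](c);
    }
}
```

With a loader callback registered in the earlier slot, the PMU callback finds the MSR present; with that slot left empty, the toy model records the fault.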
  3. 14 Jun, 2019 2 commits
  4. 13 Jun, 2019 1 commit
    • x86/kgdb: Return 0 from kgdb_arch_set_breakpoint() · 71ab8323
      Matt Mullins authored
      err must be nonzero in order to reach text_poke(), which caused kgdb to
      fail to set breakpoints:
      
        (gdb) break __x64_sys_sync
        Breakpoint 1 at 0xffffffff81288910: file ../fs/sync.c, line 124.
        (gdb) c
        Continuing.
        Warning:
        Cannot insert breakpoint 1.
        Cannot access memory at address 0xffffffff81288910
      
        Command aborted.
      
      Fixes: 86a22057 ("x86/kgdb: Avoid redundant comparison of patched code")
      Signed-off-by: Matt Mullins <mmullins@fb.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Nadav Amit <namit@vmware.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Douglas Anderson <dianders@chromium.org>
      Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190531194755.6320-1-mmullins@fb.com
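The return-value bug described in the x86/kgdb commit above follows a common pattern: a fallback path is reached only when an earlier step failed, so returning the saved error code afterwards reports failure even though the fallback succeeded. The sketch below is a simplified illustration under that reading. All names are hypothetical, `plain_write()` stands in for the first write attempt, and the fallback is assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EFAULT 14 /* illustrative error number */

/* Hypothetical first attempt: fails when the text is read-only. */
int plain_write(bool text_read_only)
{
    return text_read_only ? -TOY_EFAULT : 0;
}

/* Buggy shape: err is necessarily nonzero when the fallback runs. */
int set_breakpoint_buggy(bool text_read_only, bool *poked)
{
    assert(poked != NULL);
    int err = plain_write(text_read_only);
    if (!err)
        return err;
    *poked = true; /* fallback (models the text_poke() path) succeeded */
    return err;    /* bug: still reports the earlier failure */
}

/* Fixed shape: report success once the fallback has been applied. */
int set_breakpoint_fixed(bool text_read_only, bool *poked)
{
    assert(poked != NULL);
    int err = plain_write(text_read_only);
    if (!err)
        return err;
    *poked = true;
    return 0;
}
```

In the buggy variant the caller sees a nonzero result and treats the breakpoint as not inserted, which matches the "Cannot insert breakpoint" symptom quoted in the commit message.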
  5. 12 Jun, 2019 2 commits
    • x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled · c7563e62
      Prarit Bhargava authored
      Booting with kernel parameter "rdt=cmt,mbmtotal,memlocal,l3cat,mba" and
      executing "mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl" results in
      a NULL pointer dereference on systems which do not have local MBM support
      enabled.
      
      BUG: kernel NULL pointer dereference, address: 0000000000000020
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP PTI
      CPU: 0 PID: 722 Comm: kworker/0:3 Not tainted 5.2.0-0.rc3.git0.1.el7_UNSUPPORTED.x86_64 #2
      Workqueue: events mbm_handle_overflow
      RIP: 0010:mbm_handle_overflow+0x150/0x2b0
      
      Only enter the bandwidth update loop if the system has local MBM enabled.
      
      Fixes: de73f38f ("x86/intel_rdt/mba_sc: Feedback loop to dynamically update mem bandwidth")
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190610171544.13474-1-prarit@redhat.com
    • x86/resctrl: Don't stop walking closids when a locksetup group is found · 87d3aa28
      James Morse authored
      When a new control group is created __init_one_rdt_domain() walks all
      the other closids to calculate the sets of used and unused bits.
      
      If it discovers a pseudo_locksetup group, it breaks out of the loop.  This
      means any later closid doesn't get its used bits added to used_b.  These
      bits will then get set in unused_b, and added to the new control group's
      configuration, even if they were marked as exclusive for a later closid.
      
      When encountering a pseudo_locksetup group, we should continue. This is
      because "a resource group enters 'pseudo-locked' mode after the schemata is
      written while the resource group is in 'pseudo-locksetup' mode." When we
      find a pseudo_locksetup group, its configuration is expected to be
      overwritten, so we can skip it.
      
      Fixes: dfe9674b ("x86/intel_rdt: Enable entering of pseudo-locksetup mode")
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20190603172531.178830-1-james.morse@arm.com
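The difference between breaking out of the closid walk and continuing past a pseudo-locksetup group, as described in the x86/resctrl commit above, can be sketched as follows. This is a simplified model rather than the kernel's `__init_one_rdt_domain()`: the `closid_cfg` structure and the `stop_at_locksetup` switch are hypothetical and exist only to show both behaviours side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-closid state. */
struct closid_cfg {
    bool     allocated;        /* closid is in use */
    bool     pseudo_locksetup; /* configuration is about to be overwritten */
    uint32_t ctrl_val;         /* capacity bitmask currently configured */
};

/*
 * OR together the bitmasks of every allocated closid other than self.
 * stop_at_locksetup = true models the old behaviour (break);
 * stop_at_locksetup = false models the fixed behaviour (continue).
 */
uint32_t used_bits(const struct closid_cfg *cfg, size_t n, size_t self,
                   bool stop_at_locksetup)
{
    uint32_t used = 0;

    assert(cfg != NULL);
    for (size_t i = 0; i < n; i++) {
        if (i == self || !cfg[i].allocated)
            continue;
        if (cfg[i].pseudo_locksetup) {
            if (stop_at_locksetup)
                break;    /* later closids are never examined */
            continue;     /* skip only this group */
        }
        used |= cfg[i].ctrl_val;
    }
    return used;
}
```

With the early break, the bits of any closid that comes after the pseudo-locksetup group are missing from the result, so a caller that derives the unused set as the complement would hand those bits to the new group.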
  6. 08 Jun, 2019 3 commits
    • x86/fpu: Update kernel's FPU state before using for the fsave header · aab8445c
      Sebastian Andrzej Siewior authored
      In commit
      
        39388e80 ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()")
      
      I removed the statement
      
      |       if (ia32_fxstate)
      |               copy_fxregs_to_kernel(fpu);
      
      and argued that it was wrongly merged because the content was already
      saved in kernel's state.
      
      This was wrong: It is required to write it back because it is only
      saved on the user-stack and save_fsave_header() reads it from task's
      FPU-state. I missed that part…
      
      Save x87 FPU state unless thread's FPU registers are already up to date.
      
      Fixes: 39388e80 ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()")
      Reported-by: Eric Biggers <ebiggers@kernel.org>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Tested-by: Eric Biggers <ebiggers@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: kvm ML <kvm@vger.kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190607142915.y52mfmgk5lvhll7n@linutronix.de
    • x86/mm/KASLR: Compute the size of the vmemmap section properly · 00e5a2bb
      Baoquan He authored
      The size of the vmemmap section is hardcoded to 1 TB to support the
      maximum amount of system RAM in 4-level paging mode - 64 TB.
      
      However, 1 TB is not enough for vmemmap in 5-level paging mode. Assuming
      the size of struct page is 64 Bytes, to support 4 PB system RAM in 5-level,
      64 TB of vmemmap area is needed:
      
        4 * 1000^5 PB / 4096 bytes page size * 64 bytes per page struct / 1000^4 TB = 62.5 TB.
      
      This hardcoding may cause vmemmap to corrupt the following
      cpu_entry_area section, if KASLR puts vmemmap very close to it and the
      actual vmemmap size is bigger than 1 TB.
      
      So calculate the actual size of the vmemmap region needed and then align
      it up to 1 TB boundary.
      
      In 4-level paging mode it is always 1 TB. In 5-level it's adjusted on
      demand. The current code reserves 0.5 PB for vmemmap on 5-level. With
      this change, the space can be saved and thus used to increase entropy
      for the randomization.
      
       [ bp: Spell out how the 64 TB needed for vmemmap is computed and massage commit
         message. ]
      
      Fixes: eedb92ab ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y")
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Acked-by: Kirill A. Shutemov <kirill@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: kirill.shutemov@linux.intel.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190523025744.3756-1-bhe@redhat.com
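The vmemmap sizing arithmetic in the x86/mm/KASLR commit above can be checked with a few lines of C. This is a worked example, not the kernel's implementation: it assumes 4096-byte pages and a 64-byte struct page as stated in the commit message, and the helper names are made up for illustration. In decimal units 4 PB of RAM needs 62.5 TB of vmemmap. In binary units 4 PiB needs exactly 64 TiB, and 64 TiB of RAM needs exactly 1 TiB.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096ULL     /* base page size from the commit text */
#define STRUCT_PAGE_BYTES 64ULL       /* assumed sizeof(struct page) */
#define TIB               (1ULL << 40)

/* Bytes of vmemmap needed to describe ram_bytes of RAM. */
uint64_t vmemmap_bytes(uint64_t ram_bytes)
{
    return ram_bytes / PAGE_SIZE_BYTES * STRUCT_PAGE_BYTES;
}

/* Round size up to the next multiple of 1 TiB. */
uint64_t align_up_tib(uint64_t size)
{
    assert(size <= UINT64_MAX - (TIB - 1)); /* guard against overflow */
    return (size + TIB - 1) & ~(TIB - 1);
}
```

For example, `vmemmap_bytes(4ULL << 50)` yields `64 * TIB`, while the decimal figure of 62.5 TB rounds up to 57 TiB once aligned to a 1 TiB boundary.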
    • x86/insn-eval: Fix use-after-free access to LDT entry · de9f8696
      Jann Horn authored
      get_desc() computes a pointer into the LDT while holding a lock that
      protects the LDT from being freed, but then drops the lock and returns the
      (now potentially dangling) pointer to its caller.
      
      Fix it by giving the caller a copy of the LDT entry instead.
      
      Fixes: 670f928b ("x86/insn-eval: Add utility function to get segment descriptor")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
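The fix described in the x86/insn-eval commit above, handing the caller a copy made under the lock instead of a pointer into the protected table, can be sketched in isolation. The code below is a single-threaded illustration with hypothetical names. The `locked` flag only stands in for a real mutex, and the structures are not the kernel's descriptor types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct desc {
    uint32_t base;
    uint32_t limit;
};

struct desc_table {
    struct desc *entries; /* may be freed or replaced once unlocked */
    size_t       nr_entries;
    bool         locked;  /* stand-in for a real mutex */
};

static void table_lock(struct desc_table *t)
{
    assert(!t->locked);
    t->locked = true;
}

static void table_unlock(struct desc_table *t)
{
    assert(t->locked);
    t->locked = false;
}

/*
 * Copy entry idx into *out while the lock is held.
 * Returns false if the table is empty or idx is out of range.
 */
bool get_desc_copy(struct desc_table *t, size_t idx, struct desc *out)
{
    bool ok = false;

    assert(t != NULL && out != NULL);
    table_lock(t);
    if (t->entries && idx < t->nr_entries) {
        *out = t->entries[idx]; /* the copy stays valid after unlock */
        ok = true;
    }
    table_unlock(t);
    return ok;
}
```

Because the caller owns the copy, later changes to the table (or freeing it) cannot invalidate what was returned, at the cost of copying one small structure per lookup.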
  7. 07 Jun, 2019 1 commit
  8. 06 Jun, 2019 1 commit
  9. 05 Jun, 2019 20 commits
  10. 03 Jun, 2019 1 commit
    • x86/power: Fix 'nosmt' vs hibernation triple fault during resume · ec527c31
      Jiri Kosina authored
      As explained in
      
      	0cc3cd21 ("cpu/hotplug: Boot HT siblings at least once")
      
      we always, no matter what, have to bring up x86 HT siblings during boot at
      least once in order to avoid first MCE bringing the system to its knees.
      
      That means that whenever 'nosmt' is supplied on the kernel command-line,
      all the HT siblings are as a result sitting in mwait or cpuidle after
      going through the online-offline cycle at least once.
      
      This causes a serious issue though when a kernel, which saw 'nosmt' on its
      commandline, is going to perform resume from hibernation: if the resume
      from the hibernated image is successful, cr3 is flipped in order to point
      to the address space of the kernel that is being resumed, which in turn
      means that all the HT siblings are all of a sudden mwaiting on address
      which is no longer valid.
      
      That results in triple fault shortly after cr3 is switched, and machine
      reboots.
      
      Fix this by always waking up all the SMT siblings before initiating the
      'restore from hibernation' process; this guarantees that all the HT
      siblings will be properly carried over to the resumed kernel waiting in
      resume_play_dead(), and acted upon accordingly afterwards, based on the
      target kernel configuration.
      
      Symmetrically, the resumed kernel has to push the SMT siblings to mwait
      again in case it has SMT disabled; this means it has to online all
      the siblings when resuming (so that they come out of hlt) and offline
      them again to let them reach mwait.
      
      Cc: 4.19+ <stable@vger.kernel.org> # v4.19+
      Debugged-by: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 0cc3cd21 ("cpu/hotplug: Boot HT siblings at least once")
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  11. 31 May, 2019 2 commits