1. 21 9月, 2018 1 次提交
  2. 19 9月, 2018 10 次提交
  3. 16 9月, 2018 2 次提交
    • B
      x86/kvm: Use __bss_decrypted attribute in shared variables · 6a1cac56
      Brijesh Singh 提交于
      The recent removal of the memblock dependency from kvmclock caused a SEV
      guest regression because the wall_clock and hv_clock_boot variables are
      no longer mapped decrypted when SEV is active.
      
      Use the __bss_decrypted attribute to put the static wall_clock and
      hv_clock_boot in the .bss..decrypted section so that they are mapped
      decrypted during boot.
      
      In the preparatory stage of CPU hotplug, the per-cpu pvclock data pointer
      assigns either an element of the static array or dynamically allocated
      memory for the pvclock data pointer. The static array are now mapped
      decrypted but the dynamically allocated memory is not mapped decrypted.
      However, when SEV is active this memory range must be mapped decrypted.
      
      Add a function which is called after the page allocator is up, and
      allocate memory for the pvclock data pointers for the all possible cpus.
      Map this memory range as decrypted when SEV is active.
      
      Fixes: 368a540e ("x86/kvmclock: Remove memblock dependency")
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: kvm@vger.kernel.org
      Link: https://lkml.kernel.org/r/1536932759-12905-3-git-send-email-brijesh.singh@amd.com
      6a1cac56
    • B
      x86/mm: Add .bss..decrypted section to hold shared variables · b3f0907c
      Brijesh Singh 提交于
      kvmclock defines few static variables which are shared with the
      hypervisor during the kvmclock initialization.
      
      When SEV is active, memory is encrypted with a guest-specific key, and
      if the guest OS wants to share the memory region with the hypervisor
      then it must clear the C-bit before sharing it.
      
      Currently, we use kernel_physical_mapping_init() to split large pages
      before clearing the C-bit on shared pages. But it fails when called from
      the kvmclock initialization (mainly because the memblock allocator is
      not ready that early during boot).
      
      Add a __bss_decrypted section attribute which can be used when defining
      such shared variable. The so-defined variables will be placed in the
      .bss..decrypted section. This section will be mapped with C=0 early
      during boot.
      
      The .bss..decrypted section has a big chunk of memory that may be unused
      when memory encryption is not active, free it when memory encryption is
      not active.
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Radim Krčmář<rkrcmar@redhat.com>
      Cc: kvm@vger.kernel.org
      Link: https://lkml.kernel.org/r/1536932759-12905-2-git-send-email-brijesh.singh@amd.com
      b3f0907c
  4. 15 9月, 2018 1 次提交
  5. 13 9月, 2018 1 次提交
  6. 12 9月, 2018 1 次提交
  7. 08 9月, 2018 1 次提交
  8. 06 9月, 2018 2 次提交
  9. 02 9月, 2018 2 次提交
  10. 31 8月, 2018 1 次提交
  11. 30 8月, 2018 2 次提交
  12. 27 8月, 2018 2 次提交
  13. 24 8月, 2018 2 次提交
  14. 21 8月, 2018 2 次提交
    • D
      x86/memory_failure: Introduce {set, clear}_mce_nospec() · 284ce401
      Dan Williams 提交于
      Currently memory_failure() returns zero if the error was handled. On
      that result mce_unmap_kpfn() is called to zap the page out of the kernel
      linear mapping to prevent speculative fetches of potentially poisoned
      memory. However, in the case of dax mapped devmap pages the page may be
      in active permanent use by the device driver, so it cannot be unmapped
      from the kernel.
      
      Instead of marking the page not present, marking the page UC should
      be sufficient for preventing poison from being pre-fetched into the
      cache. Convert mce_unmap_pfn() to set_mce_nospec() remapping the page as
      UC, to hide it from speculative accesses.
      
      Given that that persistent memory errors can be cleared by the driver,
      include a facility to restore the page to cacheable operation,
      clear_mce_nospec().
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: <linux-edac@vger.kernel.org>
      Cc: <x86@kernel.org>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Acked-by: NIngo Molnar <mingo@redhat.com>
      Signed-off-by: NDave Jiang <dave.jiang@intel.com>
      284ce401
    • R
      x86/process: Re-export start_thread() · dc76803e
      Rian Hunter 提交于
      The consolidation of the start_thread() functions removed the export
      unintentionally. This breaks binfmt handlers built as a module.
      
      Add it back.
      
      Fixes: e634d8fc ("x86-64: merge the standard and compat start_thread() functions")
      Signed-off-by: NRian Hunter <rian@alum.mit.edu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Dmitry Safonov <dima@arista.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180819230854.7275-1-rian@alum.mit.edu
      dc76803e
  15. 17 8月, 2018 1 次提交
  16. 16 8月, 2018 1 次提交
  17. 15 8月, 2018 2 次提交
  18. 10 8月, 2018 1 次提交
  19. 08 8月, 2018 1 次提交
    • P
      x86/paravirt: Fix spectre-v2 mitigations for paravirt guests · 5800dc5c
      Peter Zijlstra 提交于
      Nadav reported that on guests we're failing to rewrite the indirect
      calls to CALLEE_SAVE paravirt functions. In particular the
      pv_queued_spin_unlock() call is left unpatched and that is all over the
      place. This obviously wrecks Spectre-v2 mitigation (for paravirt
      guests) which relies on not actually having indirect calls around.
      
      The reason is an incorrect clobber test in paravirt_patch_call(); this
      function rewrites an indirect call with a direct call to the _SAME_
      function, there is no possible way the clobbers can be different
      because of this.
      
      Therefore remove this clobber check. Also put WARNs on the other patch
      failure case (not enough room for the instruction) which I've not seen
      trigger in my (limited) testing.
      
      Three live kernel image disassemblies for lock_sock_nested (as a small
      function that illustrates the problem nicely). PRE is the current
      situation for guests, POST is with this patch applied and NATIVE is with
      or without the patch for !guests.
      
      PRE:
      
      (gdb) disassemble lock_sock_nested
      Dump of assembler code for function lock_sock_nested:
         0xffffffff817be970 <+0>:     push   %rbp
         0xffffffff817be971 <+1>:     mov    %rdi,%rbp
         0xffffffff817be974 <+4>:     push   %rbx
         0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
         0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
         0xffffffff817be981 <+17>:    mov    %rbx,%rdi
         0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
         0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
         0xffffffff817be98f <+31>:    test   %eax,%eax
         0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
         0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
         0xffffffff817be99d <+45>:    mov    %rbx,%rdi
         0xffffffff817be9a0 <+48>:    callq  *0xffffffff822299e8
         0xffffffff817be9a7 <+55>:    pop    %rbx
         0xffffffff817be9a8 <+56>:    pop    %rbp
         0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
         0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
         0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
         0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
         0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
         0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
      End of assembler dump.
      
      POST:
      
      (gdb) disassemble lock_sock_nested
      Dump of assembler code for function lock_sock_nested:
         0xffffffff817be970 <+0>:     push   %rbp
         0xffffffff817be971 <+1>:     mov    %rdi,%rbp
         0xffffffff817be974 <+4>:     push   %rbx
         0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
         0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
         0xffffffff817be981 <+17>:    mov    %rbx,%rdi
         0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
         0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
         0xffffffff817be98f <+31>:    test   %eax,%eax
         0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
         0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
         0xffffffff817be99d <+45>:    mov    %rbx,%rdi
         0xffffffff817be9a0 <+48>:    callq  0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock>
         0xffffffff817be9a5 <+53>:    xchg   %ax,%ax
         0xffffffff817be9a7 <+55>:    pop    %rbx
         0xffffffff817be9a8 <+56>:    pop    %rbp
         0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
         0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
         0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063aa0 <__local_bh_enable_ip>
         0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
         0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
         0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
      End of assembler dump.
      
      NATIVE:
      
      (gdb) disassemble lock_sock_nested
      Dump of assembler code for function lock_sock_nested:
         0xffffffff817be970 <+0>:     push   %rbp
         0xffffffff817be971 <+1>:     mov    %rdi,%rbp
         0xffffffff817be974 <+4>:     push   %rbx
         0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
         0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
         0xffffffff817be981 <+17>:    mov    %rbx,%rdi
         0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
         0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
         0xffffffff817be98f <+31>:    test   %eax,%eax
         0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
         0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
         0xffffffff817be99d <+45>:    mov    %rbx,%rdi
         0xffffffff817be9a0 <+48>:    movb   $0x0,(%rdi)
         0xffffffff817be9a3 <+51>:    nopl   0x0(%rax)
         0xffffffff817be9a7 <+55>:    pop    %rbx
         0xffffffff817be9a8 <+56>:    pop    %rbp
         0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
         0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
         0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
         0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
         0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
         0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
      End of assembler dump.
      
      
      Fixes: 63f70270 ("[PATCH] i386: PARAVIRT: add common patching machinery")
      Fixes: 3010a066 ("x86/paravirt, objtool: Annotate indirect calls")
      Reported-by: NNadav Amit <namit@vmware.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: stable@vger.kernel.org
      5800dc5c
  20. 07 8月, 2018 2 次提交
    • T
      cpu/hotplug: Fix SMT supported evaluation · bc2d8d26
      Thomas Gleixner 提交于
      Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
      cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied
      on the kernel command line as it cannot differentiate between SMT disabled
      by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and
      makes the sysfs interface unusable.
      
      Rework this so that during bringup of the non boot CPUs the availability of
      SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
      'primary' thread then set the local cpu_smt_available marker and evaluate
      this explicitely right after the initial SMP bringup has finished.
      
      SMT evaulation on x86 is a trainwreck as the firmware has all the
      information _before_ booting the kernel, but there is no interface to query
      it.
      
      Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      Reported-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      bc2d8d26
    • M
      xen/pv: Call get_cpu_address_sizes to set x86_virt/phys_bits · 405c018a
      M. Vefa Bicakci 提交于
      Commit d94a155c ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits
      adjustment corruption") has moved the query and calculation of the
      x86_virt_bits and x86_phys_bits fields of the cpuinfo_x86 struct
      from the get_cpu_cap function to a new function named
      get_cpu_address_sizes.
      
      One of the call sites related to Xen PV VMs was unfortunately missed
      in the aforementioned commit. This prevents successful boot-up of
      kernel versions 4.17 and up in Xen PV VMs if CONFIG_DEBUG_VIRTUAL
      is enabled, due to the following code path:
      
        enlighten_pv.c::xen_start_kernel
          mmu_pv.c::xen_reserve_special_pages
            page.h::__pa
              physaddr.c::__phys_addr
                physaddr.h::phys_addr_valid
      
      phys_addr_valid uses boot_cpu_data.x86_phys_bits to validate physical
      addresses. boot_cpu_data.x86_phys_bits is no longer populated before
      the call to xen_reserve_special_pages due to the aforementioned commit
      though, so the validation performed by phys_addr_valid fails, which
      causes __phys_addr to trigger a BUG, preventing boot-up.
      Signed-off-by: NM. Vefa Bicakci <m.v.b@runbox.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: x86@kernel.org
      Cc: stable@vger.kernel.org # for v4.17 and up
      Fixes: d94a155c ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption")
      Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      405c018a
  21. 06 8月, 2018 2 次提交