1. 27 6月, 2018 1 次提交
    • K
      x86/mm: Drop unneeded __always_inline for p4d page table helpers · b8c1e429
      Kirill A. Shutemov 提交于
      This reverts the following commits:
      
        1ea66554 ("x86/mm: Mark p4d_offset() __always_inline")
        046c0dbe ("x86: Mark native_set_p4d() as __always_inline")
      
      p4d_offset(), native_set_p4d() and native_p4d_clear() were marked
      __always_inline in attempt to move __pgtable_l5_enabled into __initdata
      section.
      
      It was required as KASAN initialization code is a user of
      USE_EARLY_PGTABLE_L5, so all pgtable_l5_enabled() translated to
      __pgtable_l5_enabled there. This includes pgtable_l5_enabled() called
      from inline p4d helpers.
      
      If compiler would decided to not inline these p4d helpers, but leave
      them standalone, we end up with section mismatch.
      
      We don't need __always_inline here anymore. __pgtable_l5_enabled moved
      back to be __ro_after_init. See the following commit:
      
        51be1335 ("Revert "x86/mm: Mark __pgtable_l5_enabled __initdata"")
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180626100341.49910-1-kirill.shutemov@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b8c1e429
  2. 26 6月, 2018 1 次提交
  3. 22 6月, 2018 1 次提交
    • M
      kvm: vmx: Nested VM-entry prereqs for event inj. · 0447378a
      Marc Orr 提交于
      This patch extends the checks done prior to a nested VM entry.
      Specifically, it extends the check_vmentry_prereqs function with checks
      for fields relevant to the VM-entry event injection information, as
      described in the Intel SDM, volume 3.
      
      This patch is motivated by a syzkaller bug, where a bad VM-entry
      interruption information field is generated in the VMCS02, which causes
      the nested VM launch to fail. Then, KVM fails to resume L1.
      
      While KVM should be improved to correctly resume L1 execution after a
      failed nested launch, this change is justified because the existing code
      to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
      sparse.
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NMarc Orr <marcorr@google.com>
      [Removed comment whose parts were describing previous revisions and the
       rest was obvious from function/variable naming. - Radim]
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      0447378a
  4. 21 6月, 2018 1 次提交
  5. 14 6月, 2018 1 次提交
    • L
      Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables · 050e9baa
      Linus Torvalds 提交于
      The changes to automatically test for working stack protector compiler
      support in the Kconfig files removed the special STACKPROTECTOR_AUTO
      option that picked the strongest stack protector that the compiler
      supported.
      
      That was all a nice cleanup - it makes no sense to have the AUTO case
      now that the Kconfig phase can just determine the compiler support
      directly.
      
      HOWEVER.
      
      It also meant that doing "make oldconfig" would now _disable_ the strong
      stackprotector if you had AUTO enabled, because in a legacy config file,
      the sane stack protector configuration would look like
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_NONE is not set
        # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_STACKPROTECTOR_AUTO=y
      
      and when you ran this through "make oldconfig" with the Kbuild changes,
      it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
      been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
      CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
      used to be disabled (because it was really enabled by AUTO), and would
      disable it in the new config, resulting in:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      That's dangerously subtle - people could suddenly find themselves with
      the weaker stack protector setup without even realizing.
      
      The solution here is to just rename not just the old RECULAR stack
      protector option, but also the strong one.  This does that by just
      removing the CC_ prefix entirely for the user choices, because it really
      is not about the compiler support (the compiler support now instead
      automatially impacts _visibility_ of the options to users).
      
      This results in "make oldconfig" actually asking the user for their
      choice, so that we don't have any silent subtle security model changes.
      The end result would generally look like this:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_STACKPROTECTOR=y
        CONFIG_STACKPROTECTOR_STRONG=y
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      where the "CC_" versions really are about internal compiler
      infrastructure, not the user selections.
      Acked-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      050e9baa
  6. 12 6月, 2018 1 次提交
    • P
      kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access · 3c9fa24c
      Paolo Bonzini 提交于
      The functions that were used in the emulation of fxrstor, fxsave, sgdt and
      sidt were originally meant for task switching, and as such they did not
      check privilege levels.  This is very bad when the same functions are used
      in the emulation of unprivileged instructions.  This is CVE-2018-10853.
      
      The obvious fix is to add a new argument to ops->read_std and ops->write_std,
      which decides whether the access is a "system" access or should use the
      processor's CPL.
      
      Fixes: 129a72a0 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3c9fa24c
  7. 08 6月, 2018 1 次提交
    • L
      mm: introduce ARCH_HAS_PTE_SPECIAL · 3010a5ea
      Laurent Dufour 提交于
      Currently the PTE special supports is turned on in per architecture
      header files.  Most of the time, it is defined in
      arch/*/include/asm/pgtable.h depending or not on some other per
      architecture static definition.
      
      This patch introduce a new configuration variable to manage this
      directly in the Kconfig files.  It would later replace
      __HAVE_ARCH_PTE_SPECIAL.
      
      Here notes for some architecture where the definition of
      __HAVE_ARCH_PTE_SPECIAL is not obvious:
      
      arm
       __HAVE_ARCH_PTE_SPECIAL which is currently defined in
      arch/arm/include/asm/pgtable-3level.h which is included by
      arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set.
      So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE.
      
      powerpc
      __HAVE_ARCH_PTE_SPECIAL is defined in 2 files:
       - arch/powerpc/include/asm/book3s/64/pgtable.h
       - arch/powerpc/include/asm/pte-common.h
      The first one is included if (PPC_BOOK3S & PPC64) while the second is
      included in all the other cases.
      So select ARCH_HAS_PTE_SPECIAL all the time.
      
      sparc:
      __HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) &&
      defined(__arch64__) which are defined through the compiler in
      sparc/Makefile if !SPARC32 which I assume to be if SPARC64.
      So select ARCH_HAS_PTE_SPECIAL if SPARC64
      
      There is no functional change introduced by this patch.
      
      Link: http://lkml.kernel.org/r/1523433816-14460-2-git-send-email-ldufour@linux.vnet.ibm.comSigned-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
      Suggested-by: NJerome Glisse <jglisse@redhat.com>
      Reviewed-by: NJerome Glisse <jglisse@redhat.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <albert@sifive.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Christophe LEROY <christophe.leroy@c-s.fr>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3010a5ea
  8. 06 6月, 2018 6 次提交
  9. 02 6月, 2018 1 次提交
  10. 28 5月, 2018 1 次提交
  11. 26 5月, 2018 2 次提交
  12. 25 5月, 2018 1 次提交
  13. 23 5月, 2018 2 次提交
    • J
      KVM: nVMX: Restore the VMCS12 offsets for v4.0 fields · b348e793
      Jim Mattson 提交于
      Changing the VMCS12 layout will break save/restore compatibility with
      older kvm releases once the KVM_{GET,SET}_NESTED_STATE ioctls are
      accepted upstream. Google has already been using these ioctls for some
      time, and we implore the community not to disturb the existing layout.
      
      Move the four most recently added fields to preserve the offsets of
      the previously defined fields and reserve locations for the vmread and
      vmwrite bitmaps, which will be used in the virtualization of VMCS
      shadowing (to improve the performance of double-nesting).
      Signed-off-by: NJim Mattson <jmattson@google.com>
      Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      [Kept the SDM order in vmcs_field_to_offset_table. - Radim]
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      b348e793
    • D
      x86, nfit_test: Add unit test for memcpy_mcsafe() · 5d8beee2
      Dan Williams 提交于
      Given the fact that the ACPI "EINJ" (error injection) facility is not
      universally available, implement software infrastructure to validate the
      memcpy_mcsafe() exception handling implementation.
      
      For each potential read exception point in memcpy_mcsafe(), inject a
      emulated exception point at the address identified by 'mcsafe_inject'
      variable. With this infrastructure implement a test to validate that the
      'bytes remaining' calculation is correct for a range of various source
      buffer alignments.
      
      This code is compiled out by default. The CONFIG_MCSAFE_DEBUG
      configuration symbol needs to be manually enabled by editing
      Kconfig.debug. I.e. this functionality can not be accidentally enabled
      by a user / distro, it's only for development.
      
      Cc: <x86@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Reported-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      5d8beee2
  14. 20 5月, 2018 1 次提交
    • T
      x86/Hyper-V/hv_apic: Build the Hyper-V APIC conditionally · 2d2ccf24
      Thomas Gleixner 提交于
      The Hyper-V APIC code is built when CONFIG_HYPERV is enabled but the actual
      code in that file is guarded with CONFIG_X86_64. There is no point in doing
      this. Neither is there a point in having the CONFIG_HYPERV guard in there
      because the containing directory is not built when CONFIG_HYPERV=n.
      
      Further for the hv_init_apic() function a stub is provided only for
      CONFIG_HYPERV=n, which is pointless as the callsite is not compiled at
      all. But for X86_32 the stub is missing and the build fails.
      
      Clean that up:
      
        - Compile hv_apic.c only when CONFIG_X86_64=y
        - Make the stub for hv_init_apic() available when CONFG_X86_64=n
      
      Fixes: 6b48cb5f ("X86/Hyper-V: Enlighten APIC access")
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Michael Kelley <mikelley@microsoft.com>
      2d2ccf24
  15. 19 5月, 2018 8 次提交
  16. 18 5月, 2018 3 次提交
    • K
      x86/bugs: Rename SSBD_NO to SSB_NO · 240da953
      Konrad Rzeszutek Wilk 提交于
      The "336996 Speculative Execution Side Channel Mitigations" from
      May defines this as SSB_NO, hence lets sync-up.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      240da953
    • A
      x86/io: Define readq()/writeq() to use 64-bit type · 6469a0ee
      Andy Shevchenko 提交于
      Since non atomic readq() and writeq() were added some of the drivers
      would like to use it in a manner of:
      
       #include <io-64-nonatomic-lo-hi.h>
       ...
       pr_debug("Debug value of some register: %016llx\n", readq(addr));
      
      However, lo_hi_readq() always returns __u64 data, while readq()
      on x86_64 defines it as unsigned long. and thus compiler warns
      about type mismatch, although they are both 64-bit on x86_64.
      
      Convert readq() and writeq() on x86 to operate on deterministic
      64-bit type. The most of architectures in the kernel already are using
      either unsigned long long, or u64 type for readq() / writeq().
      This change propagates consistency in that sense.
      
      While this is not an issue per se, though if someone wants to address it,
      the anchor could be the commit:
      
        797a796a ("asm-generic: architecture independent readq/writeq for 32bit environment")
      
      where non-atomic variants had been introduced.
      
      Note, there are only few users of above pattern and they will not be
      affected because they do cast returned value. The actual warning has
      been issued on not-yet-upstreamed code.
      
      Potentially we might get a new warnings if some 64-bit only code
      assigns returned value to unsigned long type of variable. This is
      assumed to be addressed on case-by-case basis.
      Reported-by: Nlkp <lkp@intel.com>
      Tested-by: NSohil Mehta <sohil.mehta@intel.com>
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180515115211.55050-1-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6469a0ee
    • M
      kvm: rename KVM_HINTS_DEDICATED to KVM_HINTS_REALTIME · 633711e8
      Michael S. Tsirkin 提交于
      KVM_HINTS_DEDICATED seems to be somewhat confusing:
      
      Guest doesn't really care whether it's the only task running on a host
      CPU as long as it's not preempted.
      
      And there are more reasons for Guest to be preempted than host CPU
      sharing, for example, with memory overcommit it can get preempted on a
      memory access, post copy migration can cause preemption, etc.
      
      Let's call it KVM_HINTS_REALTIME which seems to better
      match what guests expect.
      
      Also, the flag most be set on all vCPUs - current guests assume this.
      Note so in the documentation.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      633711e8
  17. 17 5月, 2018 8 次提交