1. 28 Sep, 2018 · 1 commit
    • x86/boot: Fix kexec booting failure in the SEV bit detection code · bdec8d7f
      Kairui Song committed
      Commit
      
        1958b5fc ("x86/boot: Add early boot support when running with SEV active")
      
      can occasionally cause system resets when kexec-ing a second kernel even
      if SEV is not active.
      
      That's because get_sev_encryption_bit() uses 32-bit rIP-relative
      addressing to read the value of enc_bit - a variable which caches a
      previously detected encryption bit position - but kexec may allocate
      the early boot code to a higher location, beyond the 32-bit addressing
      limit.
      
      In this case, garbage will be read and get_sev_encryption_bit() will
      return the wrong value, leading to accessing memory with the wrong
      encryption setting.
      
      Therefore, remove enc_bit, and thus get rid of the need to do 32-bit
      rIP-relative addressing in the first place.
      
       [ bp: massage commit message heavily. ]
      
      Fixes: 1958b5fc ("x86/boot: Add early boot support when running with SEV active")
      Suggested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Kairui Song <kasong@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: tglx@linutronix.de
      Cc: mingo@redhat.com
      Cc: hpa@zytor.com
      Cc: brijesh.singh@amd.com
      Cc: kexec@lists.infradead.org
      Cc: dyoung@redhat.com
      Cc: bhe@redhat.com
      Cc: ghook@redhat.com
      Link: https://lkml.kernel.org/r/20180927123845.32052-1-kasong@redhat.com
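The failure mode described in the commit above comes down to address arithmetic carried out in 32 bits. A minimal sketch in C, assuming a hypothetical load address above 4 GiB; the function name and the addresses are illustrative and are not taken from the kernel source:

```c
#include <stdint.h>

/*
 * Hypothetical helper: models 32-bit code that computes the address of a
 * variable as base + offset using 32-bit registers, so the upper 32 bits
 * of the real address are lost.
 */
uint64_t addr_seen_by_32bit_code(uint64_t load_base, uint32_t offset)
{
    uint32_t truncated = (uint32_t)(load_base + offset);

    return (uint64_t)truncated;
}
```

With a base below 4 GiB the computed address matches the real one. With a base such as 0x140000000 it points somewhere else entirely, which is consistent with the "garbage will be read" behavior the commit describes.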
  2. 24 Aug, 2018 · 1 commit
  3. 23 Aug, 2018 · 1 commit
  4. 02 Aug, 2018 · 1 commit
  5. 31 Jul, 2018 · 1 commit
  6. 25 Jul, 2018 · 1 commit
    • x86/boot: Fix if_changed build flip/flop bug · 92a47286
      Kees Cook committed
      Dirk Gouders reported that two consecutive "make" invocations on an
      already compiled tree will show alternating behaviors:
      
      $ make
        CALL    scripts/checksyscalls.sh
        DESCEND  objtool
        CHK     include/generated/compile.h
        DATAREL arch/x86/boot/compressed/vmlinux
      Kernel: arch/x86/boot/bzImage is ready  (#48)
        Building modules, stage 2.
        MODPOST 165 modules
      
      $ make
        CALL    scripts/checksyscalls.sh
        DESCEND  objtool
        CHK     include/generated/compile.h
        LD      arch/x86/boot/compressed/vmlinux
        ZOFFSET arch/x86/boot/zoffset.h
        AS      arch/x86/boot/header.o
        LD      arch/x86/boot/setup.elf
        OBJCOPY arch/x86/boot/setup.bin
        OBJCOPY arch/x86/boot/vmlinux.bin
        BUILD   arch/x86/boot/bzImage
      Setup is 15644 bytes (padded to 15872 bytes).
      System is 6663 kB
      CRC 3eb90f40
      Kernel: arch/x86/boot/bzImage is ready  (#48)
        Building modules, stage 2.
        MODPOST 165 modules
      
      He bisected it back to:
      
          commit 98f78525 ("x86/boot: Refuse to build with data relocations")
      
      The root cause was the use of the "if_changed" kbuild function multiple
      times for the same target. It was designed to only be used once per
      target, otherwise it will effectively always trigger, flipping back and
      forth between the two commands getting recorded by "if_changed". Instead,
      this patch merges the two commands into a single function to get stable
      build artifacts (i.e. .vmlinux.cmd), and a single build behavior.
      Bisected-and-Reported-by: Dirk Gouders <dirk@gouders.net>
      Fix-Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180724230827.GA37823@beast
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
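The re-triggering described in the commit above can be modelled in a few lines of C. This is a deliberately simplified sketch, not kbuild itself: it assumes that "if_changed" only compares the command against the one recorded for the target, and the command strings and function names are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a per-target ".cmd" file: the last recorded command. */
struct target {
    char saved_cmd[64];
};

/*
 * Simplified stand-in for if_changed: run (and record) the command only
 * when it differs from the recorded one. Returns true if it ran.
 */
bool if_changed(struct target *t, const char *cmd)
{
    if (strcmp(t->saved_cmd, cmd) == 0)
        return false;
    strncpy(t->saved_cmd, cmd, sizeof(t->saved_cmd) - 1);
    t->saved_cmd[sizeof(t->saved_cmd) - 1] = '\0';
    return true;
}

/* One "make" run with two if_changed uses on the same target. */
int make_two_commands(struct target *t)
{
    int ran = 0;

    ran += if_changed(t, "ld");
    ran += if_changed(t, "check_data_rel");
    return ran;
}

/* One "make" run with both steps merged into a single recorded command. */
int make_merged_command(struct target *t)
{
    return if_changed(t, "ld; check_data_rel");
}
```

In this model, each of the two commands overwrites the record left by the other, so both run on every invocation. The merged command runs once and is then skipped, which is the stable behavior the patch aims for.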
  7. 22 Jul, 2018 · 6 commits
  8. 16 Jul, 2018 · 1 commit
  9. 11 Jul, 2018 · 1 commit
    • efi/x86: Fix mixed mode reboot loop by removing pointless call to PciIo->Attributes() · e2967018
      Ard Biesheuvel committed
      Hans de Goede reported that his mixed EFI mode Bay Trail tablet
      would not boot at all any more, but enter a reboot loop without
      any logs printed by the kernel.
      
      Unbreak 64-bit Linux/x86 on 32-bit UEFI:
      
      When it was first introduced, the EFI stub code that copies the
      contents of PCI option ROMs was intended to do so only if
      the EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM attribute was *not* set.
      
      The reason was that the UEFI spec permits PCI option ROM images
      to be provided by the platform directly, rather than via the ROM
      BAR, and in this case, the OS can only access them at runtime if
      they are preserved at boot time by copying them from the areas
      described by PciIo->RomImage and PciIo->RomSize.
      
      However, it implemented this check erroneously, as can be seen in
      commit:
      
        dd5fc854 ("EFI: Stash ROMs if they're not in the PCI BAR")
      
      which introduced:
      
          if (!attributes & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM)
                  continue;
      
      and given that the numeric value of EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM
      is 0x4000, this condition never becomes true, and so the option ROMs
      were copied unconditionally.
      
      This was spotted and 'fixed' by commit:
      
        886d751a ("x86, efi: correct precedence of operators in setup_efi_pci")
      
      but inadvertently inverted the logic at the same time, defeating
      the purpose of the code, since it now only preserves option ROM
      images that can be read from the ROM BAR as well.
      
      Unsurprisingly, this broke some systems, and so the check was removed
      entirely in the following commit:
      
        73970188 ("x86, efi: remove attribute check from setup_efi_pci")
      
      It is debatable whether this check should have been included in the
      first place, since the option ROM image provided to the UEFI driver by
      the firmware may be different from the one that is actually present in
      the card's flash ROM, and so whatever PciIo->RomImage points at should
      be preferred regardless of whether the attribute is set.
      
      As this was the only use of the attributes field, we can remove
      the call to PciIo->Attributes() entirely, which is especially
      nice because its prototype involves uint64_t type by-value
      arguments which the EFI mixed mode has trouble dealing with.
      
      Any mixed mode system with PCI is likely to be affected.
      Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
      Tested-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180711090235.9327-2-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
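The precedence problem quoted in the commit above is easy to reproduce in isolation. The sketch below uses the 0x4000 value stated in the commit message; the two function names are hypothetical, and the second form is only presumed to be what the later precedence fix produced.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM 0x4000ULL

/*
 * The original check was written as `!attributes & MASK`. Because '!'
 * binds tighter than '&', the compiler parses it as shown here: the
 * left operand is 0 or 1, so ANDing with 0x4000 always yields 0.
 */
bool skip_as_written(uint64_t attributes)
{
    return ((!attributes) & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM) != 0;
}

/* Parenthesized form: true when the EMBEDDED_ROM bit is clear. */
bool skip_parenthesized(uint64_t attributes)
{
    return !(attributes & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM);
}
```

The first function never returns true, which matches the observation that the option ROMs were copied unconditionally. The second one skips every device whose EMBEDDED_ROM bit is clear, which is the inverted behavior the commit message complains about.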
  10. 03 Jul, 2018 · 2 commits
    • x86/boot/KASLR: Skip specified number of 1GB huge pages when doing physical randomization (KASLR) · 747ff626
      Baoquan He committed
      When KASLR is enabled, 1GB huge page allocations might fail
      sporadically.
      
      To reproduce on a KVM guest with 4GB RAM:
      
      - add the following options to the kernel command-line:
      
         'default_hugepagesz=1G hugepagesz=1G hugepages=1'
      
      - boot the guest and check number of 1GB pages reserved:
      
          # grep HugePages_Total /proc/meminfo
      
      - compare the output of this command across several boots: with
        "nokaslr", HugePages_Total is always 1, while without "nokaslr"
        HugePages_Total is sometimes 0 (that is, reserving the 1GB page
        failed).
      
      Note that you may need to boot a few times to trigger the issue,
      because it's somewhat non-deterministic.
      
      The root cause is that the kernel may be randomly placed into the only
      good 1GB huge page, in the physical range [0x40000000, 0x7fffffff].
      
      Below is the dmesg output snippet from the KVM guest. We can see that only
      the [0x40000000, 0x7fffffff] region is a good 1GB huge page, because
      [0x100000000, 0x13fffffff] will be touched by the memblock top-down allocation:
      
      [...] e820: BIOS-provided physical RAM map:
      [...] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
      [...] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
      [...] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
      [...] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable
      [...] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved
      [...] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
      [...] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
      [...] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable
      
      Besides, on bare-metal machines with larger memory, one less 1GB huge page
      might be available with KASLR enabled. That too is because the kernel
      image might be randomized into those "good" 1GB huge pages.
      
      To fix this, first parse the kernel command-line to find out how many 1GB
      huge pages are specified. Then try to skip the specified number of 1GB huge
      pages when deciding which memory region the kernel can be randomized into.
      
      Also rename handle_mem_memmap() to handle_mem_options(),
      since it now handles not only 'mem=' and 'memmap=', but also 'hugepagesxxx'.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: douly.fnst@cn.fujitsu.com
      Cc: fanc.fnst@cn.fujitsu.com
      Cc: indou.takao@jp.fujitsu.com
      Cc: keescook@chromium.org
      Cc: lcapitulino@redhat.com
      Cc: yasu.isimatu@gmail.com
      Link: http://lkml.kernel.org/r/20180625031656.12443-3-bhe@redhat.com
      [ Rewrote the changelog, fixed style problems in the code. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
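To see which parts of the e820 map in the commit above can hold a 1GB huge page at all, it helps to count the 1GB-aligned, fully contained pages in a range. The following is a minimal arithmetic sketch, not the kernel's implementation; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GB_SHIFT 30
#define GB_SIZE  (1ULL << GB_SHIFT)

/*
 * Count the 1GB-aligned, fully contained 1GB pages in the inclusive
 * physical range [start, end]. Assumes start <= end < UINT64_MAX.
 */
uint64_t count_gb_pages(uint64_t start, uint64_t end)
{
    uint64_t first, limit;

    assert(start <= end && end != UINT64_MAX);

    /* Round the start up and the exclusive end down to 1GB boundaries. */
    first = (start + GB_SIZE - 1) & ~(GB_SIZE - 1);
    limit = (end + 1) & ~(GB_SIZE - 1);

    if (limit <= first)
        return 0;
    return (limit - first) >> GB_SHIFT;
}
```

For the usable range 0x100000-0xbffdffff this yields one page (0x40000000-0x7fffffff), since the next gigabyte is cut short by the reserved region at 0xbffe0000. The range above 4G also yields one page by pure arithmetic, but the commit notes that it is consumed by the memblock top-down allocation.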
    • x86/boot/KASLR: Add two new functions for 1GB huge pages handling · 9b912485
      Baoquan He committed
      Introduce two new functions: parse_gb_huge_pages() and process_gb_huge_pages(),
      which handle a conflict between KASLR and huge pages of 1GB.
      
      These two functions will be used in the next patch:
      
      - parse_gb_huge_pages() is used to parse kernel command-line to get
        how many 1GB huge pages have been specified. A static global
        variable 'max_gb_huge_pages' is added to store the number.
      
      - process_gb_huge_pages() is used to skip as many 1GB huge pages
        as possible from the passed in memory region according to the
        specified number.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: douly.fnst@cn.fujitsu.com
      Cc: fanc.fnst@cn.fujitsu.com
      Cc: indou.takao@jp.fujitsu.com
      Cc: keescook@chromium.org
      Cc: lcapitulino@redhat.com
      Cc: yasu.isimatu@gmail.com
      Link: http://lkml.kernel.org/r/20180625031656.12443-2-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  11. 24 Jun, 2018 · 1 commit
    • efi/x86: Fix incorrect invocation of PciIo->Attributes() · 2e6eb40c
      Ard Biesheuvel committed
      The following commit:
      
        2c3625cb ("efi/x86: Fold __setup_efi_pci32() and __setup_efi_pci64() into one function")
      
      ... merged the two versions of __setup_efi_pciXX(), without taking into
      account that the 32-bit version used a rather dodgy trick to pass an
      immediate 0 constant as argument for a uint64_t parameter.
      
      The issue is caused by the fact that on x86, UEFI protocol method calls
      are redirected via struct efi_config::call(), which is a variadic function,
      and so the compiler has to infer the types of the parameters from the
      arguments rather than from the prototype.
      
      As the 32-bit x86 calling convention passes arguments via the stack,
      passing the unqualified constant 0 twice is the same as passing 0ULL,
      which is why the 32-bit code in __setup_efi_pci32() contained the
      following call:
      
        status = efi_early->call(pci->attributes, pci,
                                 EfiPciIoAttributeOperationGet, 0, 0,
                                 &attributes);
      
      to invoke this UEFI protocol method:
      
        typedef
        EFI_STATUS
        (EFIAPI *EFI_PCI_IO_PROTOCOL_ATTRIBUTES) (
          IN  EFI_PCI_IO_PROTOCOL                     *This,
          IN  EFI_PCI_IO_PROTOCOL_ATTRIBUTE_OPERATION Operation,
          IN  UINT64                                  Attributes,
          OUT UINT64                                  *Result OPTIONAL
          );
      
      After the merge, we inadvertently ended up with this version for both
      32-bit and 64-bit builds, breaking the latter.
      
      So replace the two zeroes with the explicitly typed constant 0ULL,
      which works as expected on both 32-bit and 64-bit builds.
      
      Wilfried tested the 64-bit build, and I checked the generated assembly
      of a 32-bit build with and without this patch, and they are identical.
      Reported-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
      Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdegoede@redhat.com
      Cc: linux-efi@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
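The point about variadic calls in the commit above can be illustrated outside the EFI stub. In the sketch below, sum_ull() is a hypothetical variadic function standing in for a call routed through a variadic function pointer: the callee reads each argument as an unsigned long long, so the caller has to pass values of exactly that type.

```c
#include <stdarg.h>

/*
 * Hypothetical variadic callee that expects 'count' arguments of type
 * unsigned long long. Nothing in the prototype tells the compiler this,
 * so the argument types are taken from the call site.
 */
unsigned long long sum_ull(int count, ...)
{
    va_list ap;
    unsigned long long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, unsigned long long);
    va_end(ap);
    return total;
}
```

A call such as sum_ull(1, 0ULL) is well defined. Writing sum_ull(1, 0) instead passes an int where the callee reads an unsigned long long, which is undefined behavior in C; per the commit message, passing two plain zeroes only happened to have the right stack layout on 32-bit x86.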
  12. 06 Jun, 2018 · 1 commit
  13. 19 May, 2018 · 3 commits
  14. 16 May, 2018 · 2 commits
  15. 14 May, 2018 · 4 commits
  16. 12 Apr, 2018 · 1 commit
    • x86/mm: Do not auto-massage page protections · fb43d6cb
      Dave Hansen committed
      A PTE is constructed from a physical address and a pgprotval_t.
      __PAGE_KERNEL, for instance, is a pgprot_t and must be converted
      into a pgprotval_t before it can be used to create a PTE.  This is
      done implicitly within functions like pfn_pte() by massage_pgprot().
      
      However, this makes it very challenging to set bits (and keep them
      set) if your bit is being filtered out by massage_pgprot().
      
      This moves the bit filtering out of pfn_pte() and friends.  For
      users of PAGE_KERNEL*, filtering will be done automatically inside
      those macros but for users of __PAGE_KERNEL*, they need to do their
      own filtering now.
      
      Note that we also just move pfn_pte/pmd/pud() over to check_pgprot()
      instead of massage_pgprot().  This way, we still *look* for
      unsupported bits and properly warn about them if we find them.  This
      might happen if an unfiltered __PAGE_KERNEL* value was passed in,
      for instance.
      
      - printk format warning fix from: Arnd Bergmann <arnd@arndb.de>
      - boot crash fix from:            Tom Lendacky <thomas.lendacky@amd.com>
      - crash bisected by:              Mike Galbraith <efault@gmx.de>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reported-and-fixed-by: Arnd Bergmann <arnd@arndb.de>
      Fixed-by: Tom Lendacky <thomas.lendacky@amd.com>
      Bisected-by: Mike Galbraith <efault@gmx.de>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180406205509.77E1D7F6@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
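The difference between filtering protection bits and merely checking them, as described in the commit above, can be sketched as follows. The bit values, the "demo_" names and the supported-mask parameter are assumptions made for illustration; they are not the kernel's definitions of massage_pgprot() or check_pgprot().

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pgprotval_t;

/* Hypothetical bit layout, for illustration only. */
#define DEMO_PAGE_PRESENT 0x001ULL
#define DEMO_PAGE_RW      0x002ULL
#define DEMO_PAGE_GLOBAL  0x100ULL

/* Filtering: silently drop bits that are not in the supported mask. */
pgprotval_t demo_filter_pgprot(pgprotval_t val, pgprotval_t supported)
{
    return val & supported;
}

/* Checking: report whether an unsupported bit is set, without dropping it. */
bool demo_has_unsupported_bits(pgprotval_t val, pgprotval_t supported)
{
    return (val & ~supported) != 0;
}
```

In this model, a caller that filters first never trips the check, while an unfiltered value with an unsupported bit is reported rather than silently altered. That mirrors the split the commit describes between the PAGE_KERNEL* and __PAGE_KERNEL* users.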
  17. 31 Mar, 2018 · 1 commit
  18. 28 Mar, 2018 · 1 commit
    • x86/boot: Fix SEV boot failure from change to __PHYSICAL_MASK_SHIFT · 07344b15
      Tom Lendacky committed
      In arch/x86/boot/compressed/kaslr_64.c, CONFIG_AMD_MEM_ENCRYPT support was
      initially #undef'd to support SME with minimal effort.  When support for
      SEV was added, the #undef remained and some minimal support for setting the
      encryption bit was added for building identity mapped pagetable entries.
      
      Commit b83ce5ee ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
      changed __PHYSICAL_MASK_SHIFT from 46 to 52 in support of 5-level paging.
      This change resulted in SEV guests failing to boot because the encryption
      bit was no longer being automatically masked out.  The compressed boot
      path now requires sme_me_mask to be defined in order for the pagetable
      functions, such as pud_present(), to properly mask out the encryption bit
      (currently bit 47) when evaluating pagetable entries.
      
      Add an sme_me_mask variable in arch/x86/boot/compressed/mem_encrypt.S,
      which is set when SEV is active, delete the #undef CONFIG_AMD_MEM_ENCRYPT
      from arch/x86/boot/compressed/kaslr_64.c and use sme_me_mask when building
      the identity mapped pagetable entries.
      
      Fixes: b83ce5ee ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Link: https://lkml.kernel.org/r/20180327220711.8702.55842.stgit@tlendack-t1.amdoffice.net
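The masking effect described in the commit above follows directly from the bit arithmetic. A minimal sketch, assuming the encryption bit is bit 47 as stated in the commit message; the function names are hypothetical and only model how a physical-address mask is derived from a shift.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering physical address bits [0, shift). Requires shift < 64. */
uint64_t physical_mask(unsigned int shift)
{
    assert(shift < 64);
    return (1ULL << shift) - 1;
}

/* The same mask with the memory-encryption bit(s) removed. */
uint64_t physical_mask_without_enc(unsigned int shift, uint64_t sme_me_mask)
{
    return physical_mask(shift) & ~sme_me_mask;
}
```

With a shift of 46, bit 47 lies outside the mask and is dropped implicitly. With a shift of 52 it falls inside the mask, so it is only removed when the encryption mask is applied explicitly, which is why the compressed boot path needs sme_me_mask to be defined.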
  19. 20 Mar, 2018 · 1 commit
  20. 12 Mar, 2018 · 9 commits
    • x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G · 194a9749
      Kirill A. Shutemov committed
      This patch addresses a shortcoming in the current boot process on machines
      that support 5-level paging.
      
      If a bootloader enables 64-bit mode with 4-level paging, we might need to
      switch over to 5-level paging. Switching requires disabling paging, which
      works fine if the kernel itself is loaded below 4G.
      
      But if the bootloader puts the kernel above 4G (not sure if anybody does
      this), we would lose control as soon as paging is disabled, because the
      code becomes unreachable to the CPU.
      
      This patch implements a trampoline in lower memory to handle this
      situation.
      
      We only need the memory for a very short time, until the main kernel
      image sets up its own page tables.
      
      We go through the trampoline even if we don't have to: if we're already
      in 5-level paging mode or if we don't need to switch to it. This way the
      trampoline gets tested on every boot.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180312100246.89175-5-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/boot/compressed/64: Use page table in trampoline memory · 0a1756bd
      Kirill A. Shutemov committed
      If a bootloader enables 64-bit mode with 4-level paging, we might need to
      switch over to 5-level paging. Switching requires disabling paging, which
      works fine if the kernel itself is loaded below 4G.
      
      But if the bootloader puts the kernel above 4G (e.g. in the kexec() case),
      we would lose control as soon as paging is disabled, because the code
      becomes unreachable to the CPU.
      
      To handle the situation, we need a trampoline in lower memory that would
      take care of switching on 5-level paging.
      
      Apart from the trampoline code itself, we also need a place to store the
      top-level page table in lower memory, as we don't have a way to load
      64-bit values into CR3 in 32-bit mode. We only really need 8 bytes there
      as we only use the very first entry of the page table. But we allocate a
      whole page anyway.
      
      This patch switches the 32-bit code to use the page table in trampoline memory.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180312100246.89175-4-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/boot/compressed/64: Use stack from trampoline memory · f7ff53e4
      Kirill A. Shutemov committed
      As the first step in using trampoline memory, let's make the 32-bit code
      use a stack there.
      
      A separate stack is required to return from the trampoline, and we cannot
      use the stack from 64-bit mode as it may be above 4G.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180312100246.89175-3-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/boot/compressed/64: Make sure we have a 32-bit code segment · 7beebacc
      Kirill A. Shutemov committed
      When the kernel starts in 64-bit mode, we inherit the GDT from the
      bootloader. This may cause a problem if the GDT doesn't have a 32-bit
      code segment where we expect it to be.
      
      Load our own GDT with known segments.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180312100246.89175-2-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • efi: Use string literals for efi_char16_t variable initializers · 36b64976
      Ard Biesheuvel committed
      Now that we unambiguously build the entire kernel with -fshort-wchar,
      it is no longer necessary to open code efi_char16_t[] initializers as
      arrays of characters, and we can move to the L"xxx" notation instead.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180312084500.10764-6-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/boot/compressed/64: Prepare new top-level page table for trampoline · e9d0e633
      Kirill A. Shutemov committed
      If the trampoline code needs to switch between 4- and 5-level paging
      modes, we have to use a page table in trampoline memory.
      
      Having it in trampoline memory guarantees that it's below 4G and we can
      point CR3 to it from 32-bit trampoline code.
      
      We only use the page table if the desired paging mode doesn't match the
      mode we are in. Otherwise the page table is unused and the trampoline
      code does not touch CR3.
      
      For the 4- to 5-level paging transition, we set up the current (4-level
      paging) CR3 as the first and only entry in a new top-level page table.
      
      For the 5- to 4-level paging transition, we copy the page table pointed
      to by the first entry in the current top-level page table as our new
      top-level page table.
      
      If the page table is used by the trampoline, we need to copy it to a new
      page table outside the trampoline and update CR3 before restoring
      trampoline memory.
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180226180451.86788-6-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
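The two transitions described in the commit above can be sketched with plain arrays standing in for page tables. This is an illustration under stated assumptions, not the kernel code: the function names, the 512-entry table size and the flag value (present plus writable) are assumptions.

```c
#include <stdint.h>
#include <string.h>

#define PTRS_PER_TABLE   512
#define DEMO_TABLE_FLAGS 0x3ULL /* assumed: present + writable */

/* 4- to 5-level: the current top-level table becomes entry 0 of a new one. */
void prepare_4_to_5(uint64_t new_top[PTRS_PER_TABLE], uint64_t cur_cr3)
{
    memset(new_top, 0, PTRS_PER_TABLE * sizeof(uint64_t));
    new_top[0] = (cur_cr3 & ~0xfffULL) | DEMO_TABLE_FLAGS;
}

/* 5- to 4-level: the table referenced by entry 0 becomes the new top level. */
void prepare_5_to_4(uint64_t new_top[PTRS_PER_TABLE],
                    const uint64_t first_child[PTRS_PER_TABLE])
{
    memcpy(new_top, first_child, PTRS_PER_TABLE * sizeof(uint64_t));
}
```

In both cases the new table lives in a caller-provided buffer, which corresponds to the page reserved in trampoline memory so that 32-bit code can load its address into CR3.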
    • x86/boot/compressed/64: Set up trampoline memory · 32fcefa2
      Kirill A. Shutemov committed
      This patch clears the trampoline memory and copies the trampoline code
      into place. It is not used yet, though.
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180226180451.86788-5-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/boot/compressed/64: Save and restore trampoline memory · fb526835
      Kirill A. Shutemov committed
      The memory area we found for the trampoline shouldn't contain anything
      useful. But let's preserve the data anyway, just to be on the safe side.
      
      paging_prepare() saves the data into a buffer.
      
      cleanup_trampoline() restores it once we are done with the
      trampoline.
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180226180451.86788-4-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
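The save-and-restore pattern from the commit above amounts to two copies around the period in which the trampoline is live. A minimal sketch, assuming a fixed area size and using hypothetical "demo_" functions; the real paging_prepare() and cleanup_trampoline() do more than this, and the clearing step here only stands in for setting up the trampoline.

```c
#include <string.h>

#define DEMO_TRAMPOLINE_SIZE 8192 /* assumed size, for illustration */

/* Buffer that holds the original contents of the trampoline area. */
static unsigned char saved_area[DEMO_TRAMPOLINE_SIZE];

/* Save the original contents, then clear the area for the trampoline. */
void demo_paging_prepare(unsigned char *trampoline)
{
    memcpy(saved_area, trampoline, DEMO_TRAMPOLINE_SIZE);
    memset(trampoline, 0, DEMO_TRAMPOLINE_SIZE);
}

/* Put the original contents back once the trampoline is no longer needed. */
void demo_cleanup_trampoline(unsigned char *trampoline)
{
    memcpy(trampoline, saved_area, DEMO_TRAMPOLINE_SIZE);
}
```

Keeping the backup in a static buffer avoids any allocation, which suits an early-boot environment where no allocator can be assumed.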
    • x86/boot/compressed/64: Find a place for 32-bit trampoline · 3548e131
      Kirill A. Shutemov committed
      If a bootloader enables 64-bit mode with 4-level paging, we might need to
      switch over to 5-level paging. The switching requires the disabling of
      paging, which works fine if kernel itself is loaded below 4G.
      
      But if the bootloader puts the kernel above 4G (not sure if anybody does
      this), we would lose control as soon as paging is disabled, because the
      code becomes unreachable to the CPU.
      
      To handle the situation, we need a trampoline in lower memory that would
      take care of switching on 5-level paging.
      
      This patch finds a spot in low memory for a trampoline.
      
      The heuristic is based on code in reserve_bios_regions().
      
      We find the end of low memory based on the BIOS and EBDA start addresses.
      The trampoline is put just before the end of low memory. This mimics the
      approach taken to allocate memory for the real-mode trampoline.
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180226180451.86788-3-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
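The placement heuristic described in the commit above can be sketched as a pure function of the two firmware-reported addresses. The sanity bounds, the trampoline size and all "demo_" names below are assumptions chosen for illustration; the commit message does not state the actual constants.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE       0x1000UL
#define DEMO_BIOS_START_MIN  0x20000UL /* assumed lower sanity bound */
#define DEMO_BIOS_START_MAX  0x9f000UL /* assumed upper sanity bound */
#define DEMO_TRAMPOLINE_SIZE (2 * DEMO_PAGE_SIZE) /* assumed size */

/* Return a page-aligned start address just below the end of low memory. */
uint32_t demo_find_trampoline(uint32_t ebda_start, uint32_t bios_start)
{
    /* Fall back to the upper bound if the reported BIOS start looks bogus. */
    if (bios_start < DEMO_BIOS_START_MIN || bios_start > DEMO_BIOS_START_MAX)
        bios_start = DEMO_BIOS_START_MAX;

    /* A plausible EBDA below the BIOS area lowers the end of low memory. */
    if (ebda_start > DEMO_BIOS_START_MIN && ebda_start < bios_start)
        bios_start = ebda_start;

    /* Place the trampoline just below that end, rounded down to a page. */
    return (uint32_t)((bios_start - DEMO_TRAMPOLINE_SIZE) &
                      ~(DEMO_PAGE_SIZE - 1));
}
```

Taking the addresses as parameters keeps the sketch testable; in a real boot environment they would be read from firmware data areas.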