1. 25 Jan, 2020 1 commit
  2. 22 Jan, 2020 1 commit
    • A
      efi/x86: Disallow efi=old_map in mixed mode · 0779221e
      Ard Biesheuvel committed
      Before:
      
        1f299fad: ("efi/x86: Limit EFI old memory map to SGI UV machines")
      
      enabling the old EFI memory map on mixed mode systems
      disabled EFI runtime services altogether.
      
      Given that efi=old_map is a debug feature designed to work around
      firmware problems related to EFI runtime services, and disabling
      them can be achieved more straightforwardly using 'noefi' or
      'efi=noruntime', it makes more sense to ignore efi=old_map on
      mixed mode systems.
      
      Currently, we do neither, and try to use the old memory map in
      combination with mixed mode routines, which results in crashes,
      so let's fix this by making efi=old_map functional on native
      systems only.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0779221e
  3. 20 Jan, 2020 10 commits
    • A
      x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld · bc310baf
      Ard Biesheuvel committed
      The final build stage of the x86 kernel captures some symbol
      addresses from the decompressor binary and copies them into zoffset.h.
      It uses sed with a regular expression that matches the address, symbol
      type and symbol name, and mangles the captured addresses and the names
      of symbols of interest into #define directives that are added to
      zoffset.h
      
      The symbol type is indicated by a single letter, which we match
      strictly: only letters in the set 'ABCDGRSTVW' are matched, even
      though the actual symbol type is irrelevant and therefore ignored.
      
      Commit bc7c9d62 ("efi/libstub/x86: Force 'hidden' visibility for
      extern declarations") made a change to the way external symbol
      references are classified, resulting in 'startup_32' now being
      emitted as a hidden symbol. This prevents the use of GOT entries to
      refer to this symbol via its absolute address, which recent toolchains
      (including Clang based ones) already avoid by default, making this
      change a no-op in the majority of cases.
      
      However, as it turns out, the LLVM linker classifies such hidden
      symbols as symbols with static linkage in fully linked ELF binaries,
      causing tools such as NM to output a lowercase 't' rather than an upper
      case 'T' for the type of such symbols. Since our sed expression only
      matches upper case letters for the symbol type, the line describing
      startup_32 is disregarded, resulting in a build error like the following
      
        arch/x86/boot/header.S:568:18: error: symbol 'ZO_startup_32' can not be
                                              undefined in a subtraction expression
        init_size: .long (0x00000000008fd000 - ZO_startup_32 +
                          (((0x0000000001f6361c + ((0x0000000001f6361c >> 8) + 65536)
                           - 0x00000000008c32e5) + 4095) & ~4095)) # kernel initialization size
      
      Given that we are only interested in the value of the symbol, let's match
      any character in the set 'a-zA-Z' instead.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com>
      bc310baf
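The effect of relaxing the symbol type class can be sketched in C with POSIX regular expressions. This is only an illustration of the matching behaviour described in the commit message above, not the sed rule from the kernel Makefile: the helper name `zo_line_matches()` and the short symbol list in the pattern are assumptions made for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Returns true if an nm-style line ("<hex address> <type> <name>") matches
 * when the symbol type must belong to the bracket expression type_class.
 * The helper name and the symbol list are illustrative only.
 */
static bool zo_line_matches(const char *line, const char *type_class)
{
    char pattern[128];
    regex_t re;
    bool ok;

    assert(line != NULL && type_class != NULL);

    snprintf(pattern, sizeof(pattern),
             "^[0-9a-fA-F]+ [%s] (startup_32|startup_64|_end)$", type_class);

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    ok = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

With the strict class `ABCDGRSTVW`, a line such as `0000000000000000 t startup_32` is skipped, which is how the `ZO_startup_32` define went missing. With `a-zA-Z`, both the `T` and the `t` form are captured.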
    • A
      efi/x86: avoid KASAN false positives when accessing the 1:1 mapping · 3cc02861
      Ard Biesheuvel committed
      When installing the EFI virtual address map during early boot, we
      access the EFI system table to retrieve the 1:1 mapped address of
      the SetVirtualAddressMap() EFI runtime service. This memory is not
      known to KASAN, so on KASAN enabled builds, this may result in a
      splat like
      
        ==================================================================
        BUG: KASAN: user-memory-access in efi_set_virtual_address_map+0x141/0x354
        Read of size 4 at addr 000000003fbeef38 by task swapper/0/0
      
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc5+ #758
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        Call Trace:
         dump_stack+0x8b/0xbb
         ? efi_set_virtual_address_map+0x141/0x354
         ? efi_set_virtual_address_map+0x141/0x354
         __kasan_report+0x176/0x192
         ? efi_set_virtual_address_map+0x141/0x354
         kasan_report+0xe/0x20
         efi_set_virtual_address_map+0x141/0x354
         ? efi_thunk_runtime_setup+0x148/0x148
         ? __inc_numa_state+0x19/0x90
         ? memcpy+0x34/0x50
         efi_enter_virtual_mode+0x5fd/0x67d
         start_kernel+0x5cd/0x682
         ? mem_encrypt_init+0x6/0x6
         ? x86_family+0x5/0x20
         ? load_ucode_bsp+0x46/0x154
         secondary_startup_64+0xa4/0xb0
        ==================================================================
      
      Since this code runs only a single time during early boot, let's annotate
      it as __no_sanitize_address so KASAN disregards it entirely.
      
      Fixes: 69829470 ("efi/x86: Split SetVirtualAddresMap() wrappers into ...")
      Reported-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3cc02861
    • D
      efi: Add tracking for dynamically allocated memmaps · 1db91035
      Dan Williams committed
      In preparation for fixing efi_memmap_alloc() leaks, add support for
      recording whether the memmap was dynamically allocated from slab,
      memblock, or is the original physical memmap provided by the platform.
      
      Given this tracking is established in efi_memmap_alloc() and needs to be
      carried to efi_memmap_install(), use 'struct efi_memory_map_data' to
      convey the flags.
      
      Some small cleanups result from this reorganization, specifically the
      removal of local variables for 'phys' and 'size' that are already
      tracked in @data.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-12-ardb@kernel.org
      1db91035
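The idea of carrying allocation provenance from the allocator to the install step can be sketched in user space as follows. This is a simplified model under stated assumptions: the flag names, the `memmap_data` struct and all three helpers are hypothetical stand-ins, and `malloc()` replaces both kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag names: where the memmap buffer came from. */
#define MEMMAP_FROM_MEMBLOCK (1UL << 0)
#define MEMMAP_FROM_SLAB     (1UL << 1)

struct memmap_data {
    void *map;            /* buffer holding the descriptors */
    unsigned long size;   /* size of that buffer in bytes */
    unsigned long flags;  /* provenance; 0 means firmware-provided */
};

static struct memmap_data installed;

/* Allocation records the provenance in @data next to the buffer. */
static int memmap_alloc(unsigned long size, int slab_ready,
                        struct memmap_data *data)
{
    assert(data != NULL);
    data->map = malloc(size);   /* stand-in for both kernel allocators */
    if (!data->map)
        return -1;
    data->size = size;
    data->flags = slab_ready ? MEMMAP_FROM_SLAB : MEMMAP_FROM_MEMBLOCK;
    return 0;
}

/* Install copies the whole descriptor, so the flags travel with the map. */
static void memmap_install(const struct memmap_data *data)
{
    installed = *data;
}

/* Only dynamically allocated maps may be released; returns 1 if freed. */
static int memmap_release_installed(void)
{
    int dynamic =
        (installed.flags & (MEMMAP_FROM_SLAB | MEMMAP_FROM_MEMBLOCK)) != 0;

    if (dynamic)
        free(installed.map);
    installed = (struct memmap_data){ 0 };
    return dynamic;
}
```

Keeping the flags in the same struct as the address and size means a later release path can pick the matching free routine without guessing, which is the precondition for fixing the leaks the commit message mentions.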
    • A
      efi/x86: Limit EFI old memory map to SGI UV machines · 1f299fad
      Ard Biesheuvel committed
      We carry a quirk in the x86 EFI code to switch back to an older
      method of mapping the EFI runtime services memory regions, because
      it was deemed risky at the time to implement a new method without
      providing a fallback to the old method in case problems arose.
      
      Such problems did arise, but they appear to be limited to SGI UV1
      machines, and so these are the only ones for which the fallback gets
      enabled automatically (via a DMI quirk). The fallback can be enabled
      manually as well, by passing efi=old_map, but there is very little
      evidence that suggests that this is something that is being relied
      upon in the field.
      
      Given that UV1 support is not enabled by default by the distros
      (Ubuntu, Fedora), there is no point in carrying this fallback code
      all the time if there are no other users. So let's move it into the
      UV support code, and document that efi=old_map now requires this
      support code to be enabled.
      
      Note that efi=old_map has been used in the past on other SGI UV
      machines to work around kernel regressions in production, so we
      keep the option to enable it by hand, but only if the kernel was
      built with UV support.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-8-ardb@kernel.org
      1f299fad
    • A
      efi/x86: Avoid RWX mappings for all of DRAM · 97bb9cdc
      Ard Biesheuvel committed
      The EFI code creates RWX mappings for all memory regions that are
      occupied after the stub completes, and in the mixed mode case, it
      even creates RWX mappings for all of the remaining DRAM as well.
      
      Let's try to avoid this, by setting the NX bit for all memory
      regions except the ones that are marked as EFI runtime services
      code [which means text+rodata+data in practice, so we cannot mark
      them read-only right away]. For cases of buggy firmware where boot
      services code is called during SetVirtualAddressMap(), map those
      regions with exec permissions as well - they will be unmapped in
      efi_free_boot_services().
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-7-ardb@kernel.org
      97bb9cdc
    • A
      efi/x86: Don't map the entire kernel text RW for mixed mode · d9e3d2c4
      Ard Biesheuvel committed
      The mixed mode thunking routine requires a part of it to be
      mapped 1:1, and for this reason, we currently map the entire
      kernel .text read/write in the EFI page tables, which is bad.
      
      In fact, the kernel_map_pages_in_pgd() invocation that installs
      this mapping is entirely redundant, since all of DRAM is already
      1:1 mapped read/write in the EFI page tables when we reach this
      point, which means that .rodata is mapped read-write as well.
      
      So let's remap both .text and .rodata read-only in the EFI
      page tables.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-6-ardb@kernel.org
      d9e3d2c4
    • A
      x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd · 75fbef0a
      Ard Biesheuvel committed
      The following commit:
      
        15f003d2 ("x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()")
      
      modified kernel_map_pages_in_pgd() to manage writable permissions
      of memory mappings in the EFI page table in a different way, but
      in the process, it removed the ability to clear NX attributes from
      read-only mappings, by clobbering the clear mask if _PAGE_RW is not
      being requested.
      
      Failure to remove the NX attribute from read-only mappings is
      unlikely to be a security issue, but it does prevent us from
      tightening the permissions in the EFI page tables going forward,
      so let's fix it now.
      
      Fixes: 15f003d2 ("x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()")
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-5-ardb@kernel.org
      75fbef0a
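The clobbering described above can be reproduced with two small functions that compute a "bits to clear" mask from the requested page flags. This is a sketch of the failure pattern only: the function names are hypothetical, the bit constants are defined locally for the example, and the real kernel code operates on its own change-page-attribute structure rather than a plain integer.

```c
#include <stdint.h>

/* Local stand-ins for the writable and no-execute page attribute bits. */
#define PAGE_RW_BIT (UINT64_C(1) << 1)
#define PAGE_NX_BIT (UINT64_C(1) << 63)

/* Buggy shape: the second assignment overwrites the first. */
static uint64_t clear_mask_buggy(uint64_t requested)
{
    uint64_t clr = 0;

    if (!(requested & PAGE_NX_BIT))
        clr = PAGE_NX_BIT;
    if (!(requested & PAGE_RW_BIT))
        clr = PAGE_RW_BIT;   /* clobbers the NX bit chosen above */
    return clr;
}

/* Fixed shape: clear every attribute the caller did not ask for. */
static uint64_t clear_mask_fixed(uint64_t requested)
{
    return ~requested & (PAGE_NX_BIT | PAGE_RW_BIT);
}
```

For a read-only, executable request (neither bit set), the buggy variant asks only for RW to be cleared, so a mapping that already carries NX stays non-executable. The fixed variant clears both.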
    • A
      efi/libstub/x86: Fix unused-variable warning · bd1d7093
      Arnd Bergmann committed
      The only users of these got removed, so they also need to be
      removed to avoid warnings:
      
        arch/x86/boot/compressed/eboot.c: In function 'setup_efi_pci':
        arch/x86/boot/compressed/eboot.c:117:16: error: unused variable 'nr_pci' [-Werror=unused-variable]
          unsigned long nr_pci;
                        ^~~~~~
        arch/x86/boot/compressed/eboot.c: In function 'setup_uga':
        arch/x86/boot/compressed/eboot.c:244:16: error: unused variable 'nr_ugas' [-Werror=unused-variable]
          unsigned long nr_ugas;
                        ^~~~~~~
      
      Fixes: 2732ea0d ("efi/libstub: Use a helper to iterate over a EFI handle array")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-4-ardb@kernel.org
      bd1d7093
    • A
      efi/libstub/x86: Use mandatory 16-byte stack alignment in mixed mode · ac3c76cc
      Ard Biesheuvel committed
      Reduce the stack frame of the EFI stub's mixed mode thunk routine by
      8 bytes, by moving the GDT and return addresses to EBP and EBX, which
      we need to preserve anyway, since their top halves will be cleared by
      the call into 32-bit firmware code. Doing so results in the UEFI code
      being entered with a 16 byte aligned stack, as mandated by the UEFI
      spec, fixing the last occurrence in the 64-bit kernel where we violate
      this requirement.
      
      Also, move the saved GDT from a global variable to an unused part of the
      stack frame, and touch up some other parts of the code.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-3-ardb@kernel.org
      ac3c76cc
    • A
      efi/libstub/x86: Use const attribute for efi_is_64bit() · 796eb8d2
      Ard Biesheuvel committed
      Reshuffle the x86 stub code a bit so that we can tag the efi_is_64bit()
      function with the 'const' attribute, which permits the compiler to
      optimize away any redundant calls. Since we have two different entry
      points for 32 and 64 bit firmware in the startup code, this also
      simplifies the C code since we'll enter it with the efi_is64 variable
      already set.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-2-ardb@kernel.org
      796eb8d2
  4. 17 Jan, 2020 4 commits
  5. 15 Jan, 2020 1 commit
  6. 11 Jan, 2020 17 commits
    • M
      efi: Allow disabling PCI busmastering on bridges during boot · 4444f854
      Matthew Garrett committed
      Add an option to disable the busmaster bit in the control register on
      all PCI bridges before calling ExitBootServices() and passing control
      to the runtime kernel. System firmware may configure the IOMMU to prevent
      malicious PCI devices from being able to attack the OS via DMA. However,
      since firmware can't guarantee that the OS is IOMMU-aware, it will tear
      down IOMMU configuration when ExitBootServices() is called. This leaves
      a window in which a hostile device could still cause damage before
      Linux configures the IOMMU again.
      
      If CONFIG_EFI_DISABLE_PCI_DMA is enabled or "efi=disable_early_pci_dma"
      is passed on the command line, the EFI stub will clear the busmaster bit
      on all PCI bridges before ExitBootServices() is called. This will
      prevent any malicious PCI devices from being able to perform DMA until
      the kernel reenables busmastering after configuring the IOMMU.
      
      This option may cause failures with some poorly behaved hardware and
      should not be enabled without testing. The kernel commandline options
      "efi=disable_early_pci_dma" or "efi=no_disable_early_pci_dma" may be
      used to override the default. Note that PCI devices downstream from PCI
      bridges are disconnected from their drivers first, using the UEFI
      driver model API, so that DMA can be disabled safely at the bridge
      level.
      
      [ardb: disconnect PCI I/O handles first, as suggested by Arvind]
      Co-developed-by: Matthew Garrett <mjg59@google.com>
      Signed-off-by: Matthew Garrett <mjg59@google.com>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <matthewgarrett@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-18-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4444f854
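Two pieces of the behaviour described above lend themselves to a small sketch: clearing the bus master enable bit in a bridge's PCI command register value, and resolving the build-time default against the two `efi=` overrides. Both helper names are hypothetical, the `build_default` parameter stands in for CONFIG_EFI_DISABLE_PCI_DMA, and the substring search is a deliberate simplification of real command line parsing.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bus master enable is bit 2 of the PCI command register. */
#define PCI_COMMAND_MASTER 0x4

/* Return the command register value with bus mastering switched off. */
static uint16_t pci_command_without_busmaster(uint16_t command)
{
    return command & (uint16_t)~PCI_COMMAND_MASTER;
}

/*
 * Resolve the effective setting: the build-time default can be overridden
 * in either direction by the efi= options named in the commit message.
 * Simplified: a real parser would tokenize the command line first.
 */
static bool disable_early_pci_dma(bool build_default, const char *cmdline)
{
    if (strstr(cmdline, "efi=no_disable_early_pci_dma"))
        return false;
    if (strstr(cmdline, "efi=disable_early_pci_dma"))
        return true;
    return build_default;
}
```

The sketch leaves out the step the commit message calls out as necessary for safety, namely disconnecting downstream devices from their drivers before the bridge bit is cleared.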
    • A
      efi/x86: Allow translating 64-bit arguments for mixed mode calls · ea7d87f9
      Arvind Sankar committed
      Introduce the ability to define macros to perform argument translation
      for the calls that need it, and define them for the boot services that
      we currently use.
      
      When calling 32-bit firmware methods in mixed mode, all output
      parameters that are 32-bit according to the firmware, but 64-bit in the
      kernel (ie OUT UINTN * or OUT VOID **) must be initialized in the
      kernel, or the upper 32 bits may contain garbage. Define macros that
      zero out the upper 32 bits of the output before invoking the firmware
      method.
      
      When a 32-bit EFI call takes 64-bit arguments, the mixed-mode call must
      push the two 32-bit halves as separate arguments onto the stack. This
      can be achieved by splitting the argument into its two halves when
      calling the assembler thunk. Define a macro to do this for the
      free_pages boot service.
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-17-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ea7d87f9
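Both translations described above can be modelled in plain C. The sketch below is illustrative and uses hypothetical helper names throughout: `arg_lo()`/`arg_hi()` show the split of a 64-bit argument into two 32-bit stack slots, and `firmware_write_u32()` simulates 32-bit firmware that stores only the low half of a 64-bit output variable.

```c
#include <stdint.h>

/* Low and high 32-bit halves of a 64-bit argument, passed as two slots. */
static uint32_t arg_lo(uint64_t v) { return (uint32_t)v; }
static uint32_t arg_hi(uint64_t v) { return (uint32_t)(v >> 32); }

/*
 * Stand-in for 32-bit firmware writing an OUT UINTN: it updates only the
 * low 32 bits and leaves the upper half of the kernel's variable untouched.
 */
static void firmware_write_u32(uint64_t *out, uint32_t value)
{
    *out = (*out & UINT64_C(0xffffffff00000000)) | value;
}

/*
 * Simulate one call. 'stale' is whatever the output variable held before;
 * 'clear_first' models the macro that zeroes it ahead of the firmware call.
 */
static uint64_t mixed_mode_out_call(uint64_t stale, uint32_t value,
                                    int clear_first)
{
    uint64_t out = stale;

    if (clear_first)
        out = 0;
    firmware_write_u32(&out, value);
    return out;
}
```

Without the clearing step, the caller reads back a 64-bit value whose upper half is leftover data. With it, the result equals the 32-bit value the firmware wrote.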
    • A
      efi/x86: Check number of arguments to variadic functions · 14b864f4
      Arvind Sankar committed
      On x86 we need to thunk through assembler stubs to call the EFI services
      for mixed mode, and for runtime services in 64-bit mode. The assembler
      stubs have limits on how many arguments they handle. Introduce a few
      macros to check that we do not try to pass too many arguments to the
      stubs.
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-16-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      14b864f4
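A compile-time argument count check of this kind can be built from a well-known preprocessor counting trick. The macros below are a generic sketch, not the ones added by the commit: `NARGS`, `STUB_MAX_ARGS`, `checked_stub_call` and the limit of 7 are all assumptions chosen for the example.

```c
/* Select the 11th argument; the trailing numbers shift with the count. */
#define NARGS_PICK(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, n, ...) n
#define NARGS(...) \
    NARGS_PICK(__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* Hypothetical limit on how many arguments the assembler stub accepts. */
#define STUB_MAX_ARGS 7

/*
 * Call fn(...) only if the argument count fits the stub. A negative array
 * size makes the build fail when too many arguments are passed.
 */
#define checked_stub_call(fn, ...)                                        \
    ((void)sizeof(char[(NARGS(__VA_ARGS__) <= STUB_MAX_ARGS) ? 1 : -1]),  \
     fn(__VA_ARGS__))

/* Example callee standing in for a firmware service. */
static int add3(int a, int b, int c)
{
    return a + b + c;
}
```

`checked_stub_call(add3, 1, 2, 3)` compiles and evaluates the call, whereas a call site passing eight or more arguments is rejected at build time instead of silently dropping arguments in the stub. The counting macro as written handles between one and ten arguments.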
    • A
      efi/x86: Remove unreachable code in kexec_enter_virtual_mode() · 4684abe3
      Ard Biesheuvel committed
      Remove some code that is guaranteed to be unreachable, given
      that we have already bailed by this time if EFI_OLD_MEMMAP is
      set.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-15-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4684abe3
    • A
      efi/x86: Don't panic or BUG() on non-critical error conditions · e2d68a95
      Ard Biesheuvel committed
      The logic in __efi_enter_virtual_mode() does a number of steps in
      sequence, all of which may fail in one way or the other. In most
      cases, we simply print an error and disable EFI runtime services
      support, but in some cases, we BUG() or panic() and bring down the
      system when encountering conditions that we could easily handle in
      the same way.
      
      While at it, replace a pointless page-to-virt-phys conversion with
      one that goes straight from struct page to physical.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-14-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e2d68a95
    • A
      efi/x86: Clean up efi_systab_init() routine for legibility · 5b279a26
      Ard Biesheuvel committed
      Clean up the efi_systab_init() routine which maps the EFI system
      table and copies the relevant pieces of data out of it.
      
      The current routine is very difficult to read, so let's clean that
      up. Also, switch to a R/O mapping of the system table since that is
      all we need.
      
      Finally, use a plain u64 variable to record the physical address of
      the system table instead of pointlessly stashing it in a struct efi
      that is never used for anything else.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-13-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5b279a26
    • A
      efi/x86: Drop two near identical versions of efi_runtime_init() · 33b85447
      Ard Biesheuvel committed
      The routines efi_runtime_init32() and efi_runtime_init64() are
      almost indistinguishable, and the only relevant difference is
      the offset in the runtime struct from where to obtain the physical
      address of the SetVirtualAddressMap() routine.
      
      However, this address is only used once, when installing the virtual
      address map that the OS will use to invoke EFI runtime services, and
      at the time of the call, we will necessarily be running with a 1:1
      mapping, and so there is no need to do the map/unmap dance here to
      retrieve the address. In fact, in the preceding changes to these users,
      we stopped using the address recorded here entirely.
      
      So let's just get rid of all this code since it no longer serves a
      purpose. While at it, tweak the logic so that we handle unsupported
      and disabled EFI runtime services in the same way, and unmap the EFI
      memory map in both cases.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-12-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      33b85447
    • A
      efi/x86: Simplify mixed mode call wrapper · ea5e1919
      Ard Biesheuvel committed
      Calling 32-bit EFI runtime services from a 64-bit OS involves
      switching back to the flat mapping with a stack carved out of
      memory that is 32-bit addressable.
      
      There is no need to actually execute the 64-bit part of this
      routine from the flat mapping as well, as long as the entry
      and return address fit in 32 bits. There is also no need to
      preserve part of the calling context in global variables: we
      can simply push the old stack pointer value to the new stack,
      and keep the return address from the code32 section in EBX.
      
      While at it, move the conditional check whether to invoke
      the mixed mode version of SetVirtualAddressMap() into the
      64-bit implementation of the wrapper routine.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-11-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ea5e1919
    • A
      efi/x86: Simplify 64-bit EFI firmware call wrapper · e5f930fe
      Ard Biesheuvel committed
      The efi_call() wrapper used to invoke EFI runtime services serves
      a number of purposes:
      - realign the stack to 16 bytes
      - preserve FP and CR0 register state
      - translate from SysV to MS calling convention.
      
      Preserving CR0.TS is no longer necessary in Linux, and preserving the
      FP register state is also redundant in most cases, since efi_call() is
      almost always used from within the scope of a pair of kernel_fpu_begin()/
      kernel_fpu_end() calls, with the exception of the early call to
      SetVirtualAddressMap() and the SGI UV support code.
      
      So let's add a pair of kernel_fpu_begin()/_end() calls there as well,
      and remove the unnecessary code from the assembly implementation of
      efi_call(), and only keep the pieces that deal with the stack
      alignment and the ABI translation.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-10-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e5f930fe
    • A
      efi/x86: Simplify i386 efi_call_phys() firmware call wrapper · a46d6740
      Ard Biesheuvel committed
      The variadic efi_call_phys() wrapper that exists on i386 was
      originally created to call into any EFI firmware runtime service,
      but in practice, we only use it once, to call SetVirtualAddressMap()
      during early boot.
      The flexibility provided by the variadic nature also makes it
      type unsafe, and makes the assembler code more complicated than
      needed, since it has to deal with an unknown number of arguments
      living on the stack.
      
      So clean this up, by renaming the helper to efi_call_svam(), and
      dropping the unneeded complexity. Let's also drop the reference
      to the efi_phys struct and grab the address from the EFI system
      table directly.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-9-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a46d6740
    • A
      efi/x86: Split SetVirtualAddresMap() wrappers into 32 and 64 bit versions · 69829470
      Ard Biesheuvel committed
      Split the phys_efi_set_virtual_address_map() routine into 32 and 64 bit
      versions, so we can simplify them individually in subsequent patches.
      
      There is very little overlap between the logic anyway, and this has
      already been factored out in prolog/epilog routines which are completely
      different between 32 bit and 64 bit. So let's take it one step further,
      and get rid of the overlap completely.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-8-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      69829470
    • A
      efi/x86: Split off some old memmap handling into separate routines · 98dd0e3a
      Ard Biesheuvel committed
      In a subsequent patch, we will fold the prolog/epilog routines that are
      part of the support code to call SetVirtualAddressMap() with a 1:1
      mapping into the callers. However, the 64-bit version mostly consists
      of ugly mapping code that is only used when efi=old_map is in effect,
      which is extremely rare. So let's move this code out of the way so it
      does not clutter the common code.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-7-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      98dd0e3a
    • A
      efi/x86: Avoid redundant cast of EFI firmware service pointer · 89ed4865
      Ard Biesheuvel committed
      All EFI firmware call prototypes have been annotated as __efiapi,
      permitting us to attach attributes regarding the calling convention
      by overriding __efiapi to an architecture specific value.
      
      On 32-bit x86, EFI firmware calls use the plain calling convention
      where all arguments are passed via the stack, and cleaned up by the
      caller. Let's add this to the __efiapi definition so we no longer
      need to cast the function pointers before invoking them.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-6-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      89ed4865
    • A
      efi/x86: Map the entire EFI vendor string before copying it · ffc2760b
      Ard Biesheuvel committed
      Fix a couple of issues with the way we map and copy the vendor string:
      - we map only 2 bytes, which usually works since you get at least a
        page, but if the vendor string happens to cross a page boundary,
        a crash will result
      - only call early_memunmap() if early_memremap() succeeded, or we will
        call it with a NULL address which it doesn't like,
      - while at it, switch to early_memremap_ro(), and array indexing rather
        than pointer dereferencing to read the CHAR16 characters.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Fixes: 5b83683f ("x86: EFI runtime service support")
      Link: https://lkml.kernel.org/r/20200103113953.9571-5-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ffc2760b
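The copy loop implied by the three fixes above might look roughly like this in user space. It is a sketch under assumptions: `copy_vendor()` and `vendor_map_size()` are hypothetical names, a `NULL` pointer models a failed mapping, and the choice to map one CHAR16 per destination byte is an assumption for the example rather than something the commit message states.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of the window to map: one 16-bit character per destination byte. */
static size_t vendor_map_size(size_t dst_size)
{
    return dst_size * sizeof(uint16_t);
}

/*
 * Copy a NUL-terminated CHAR16 string into a byte buffer using array
 * indexing, stopping at the terminator or when dst is full. 'mapped' is
 * NULL if the mapping failed, in which case nothing is read or unmapped.
 * Returns the number of characters copied.
 */
static size_t copy_vendor(const uint16_t *mapped, char *dst, size_t dst_size)
{
    size_t i;

    if (dst_size == 0)
        return 0;
    dst[0] = '\0';
    if (!mapped)
        return 0;

    for (i = 0; i < dst_size - 1 && mapped[i]; i++)
        dst[i] = (char)mapped[i];
    dst[i] = '\0';
    return i;
}
```

Bounding the loop by the destination size means the reads never go past the mapped window, regardless of where the string sits relative to a page boundary.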
    • A
      efi/x86: Re-disable RT services for 32-bit kernels running on 64-bit EFI · 6cfcd6f0
      Ard Biesheuvel committed
      Commit a8147dba ("efi/x86: Rename efi_is_native() to efi_is_mixed()")
      renamed and refactored efi_is_native() into efi_is_mixed(), but failed
      to take into account that these are not diametrical opposites.
      
      Mixed mode is a construct that permits 64-bit kernels to boot on 32-bit
      firmware, but there is another non-native combination which is supported,
      i.e., 32-bit kernels booting on 64-bit firmware, but only for boot and not
      for runtime services. Also, mixed mode can be disabled in Kconfig, in
      which case the 64-bit kernel can still be booted from 32-bit firmware,
      but without access to runtime services.
      
      Due to this oversight, efi_runtime_supported() now incorrectly returns
      true for such configurations, resulting in crashes at boot. So fix this
      by making efi_runtime_supported() aware of this.
      
      As a side effect, some efi_thunk_xxx() stubs have become obsolete, so
      remove them as well.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-4-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6cfcd6f0
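The distinction between "native", "mixed" and the boot-only combinations can be written down as a small truth function. This is a simplified sketch of the decision the commit message describes, not the kernel's `efi_runtime_supported()` itself: the function name `runtime_supported()` and its three boolean parameters are assumptions, and any further conditions the real helper checks are left out.

```c
#include <stdbool.h>

/*
 * kernel_64bit:   the kernel was built for 64-bit
 * firmware_64bit: the firmware is 64-bit
 * mixed_enabled:  mixed mode support was enabled in Kconfig
 */
static bool runtime_supported(bool kernel_64bit, bool firmware_64bit,
                              bool mixed_enabled)
{
    if (kernel_64bit == firmware_64bit)
        return true;    /* native: word sizes match */
    if (kernel_64bit && !firmware_64bit && mixed_enabled)
        return true;    /* mixed mode: 64-bit kernel on 32-bit firmware */
    return false;       /* boot-only combinations */
}
```

A 32-bit kernel on 64-bit firmware, and a 64-bit kernel on 32-bit firmware with mixed mode disabled, both fall through to `false`. Those are the two cases that "not native" alone fails to separate from mixed mode.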
    • A
      efi/libstub/x86: Force 'hidden' visibility for extern declarations · bc7c9d62
      Ard Biesheuvel committed
      Commit c3710de5 ("efi/libstub/x86: Drop __efi_early() export and
      efi_config struct") introduced a reference from C code in eboot.c to
      the startup_32 symbol defined in the .S startup code. This results in
      a GOT based reference to startup_32, and since GOT entries carry
      absolute addresses, they need to be fixed up before they can be used.
      
      On modern toolchains (binutils 2.26 or later), this reference is
      relaxed into a R_386_GOTOFF relocation (or the analogous X86_64 one)
      which never uses the absolute address in the entry, and so we get
      away with not fixing up the GOT table before calling the EFI entry
      point. However, GCC 4.6 combined with a binutils of the era (2.24)
      will produce a true GOT indirected reference, resulting in a wrong
      value to be returned for the address of startup_32() if the boot
      code is not running at the address it was linked at.
      
      Fortunately, we can easily override this behavior, and force GCC to
      emit the GOTOFF relocations explicitly, by setting the visibility
      pragma 'hidden'.
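
      A minimal sketch of the pragma technique, assuming GCC or a
      compatible compiler such as Clang. The symbol model_startup_marker
      and the function model_startup_address() are hypothetical stand-ins
      for illustration and do not appear in the kernel sources.

      ```c
      /*
       * Extern declarations between push(hidden) and pop get hidden
       * visibility, so the compiler may assume they are defined in the same
       * linked object and does not need a GOT entry to reach them.
       */
      #pragma GCC visibility push(hidden)
      extern char model_startup_marker[];	/* hypothetical stand-in for startup_32 */
      #pragma GCC visibility pop

      /* Defined in the same file only to keep this sketch linkable. */
      char model_startup_marker[1];

      /* Take the address of the hidden symbol. */
      const void *model_startup_address(void)
      {
      	return model_startup_marker;
      }
      ```

      Building a file like this with -fPIC and inspecting the relocations
      with and without the pragma is one way to observe the difference in
      how the symbol reference is emitted.
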
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-3-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      bc7c9d62
    • A
      efi/libstub: Fix boot argument handling in mixed mode entry code · 12dc9e15
      Ard Biesheuvel committed
      The mixed mode refactor actually broke mixed mode by failing to
      pass the bootparam structure to startup_32(). This went unnoticed
      because it apparently has a high tolerance for being passed random
      junk, and still boots fine in some cases. So let's fix this by
      populating %esi as required when entering via efi32_stub_entry,
      and while at it, preserve the arguments themselves instead of their
      address in memory (via the stack pointer) since that memory could
      be clobbered before we get to it.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-2-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      12dc9e15
  7. 07 Jan, 2020 1 commit
  8. 05 Jan, 2020 1 commit
    • D
      mm/memory_hotplug: shrink zones when offlining memory · feee6b29
      David Hildenbrand committed
      We currently try to shrink a single zone when removing memory.  We use
      the zone of the first page of the memory we are removing.  If that
      memmap was never initialized (e.g., memory was never onlined), we will
      read garbage and can trigger kernel BUGs (due to a stale pointer):
      
          BUG: unable to handle page fault for address: 000000000000353d
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          PGD 0 P4D 0
          Oops: 0002 [#1] SMP PTI
          CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
          Workqueue: kacpi_hotplug acpi_hotplug_work_fn
          RIP: 0010:clear_zone_contiguous+0x5/0x10
          Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
          RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
          RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
          RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
          R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
          FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           __remove_pages+0x4b/0x640
           arch_remove_memory+0x63/0x8d
           try_remove_memory+0xdb/0x130
           __remove_memory+0xa/0x11
           acpi_memory_device_remove+0x70/0x100
           acpi_bus_trim+0x55/0x90
           acpi_device_hotplug+0x227/0x3a0
           acpi_hotplug_work_fn+0x1a/0x30
           process_one_work+0x221/0x550
           worker_thread+0x50/0x3b0
           kthread+0x105/0x140
           ret_from_fork+0x3a/0x50
          Modules linked in:
          CR2: 000000000000353d
      
      Instead, shrink the zones when offlining memory or when onlining failed.
      Introduce and use remove_pfn_range_from_zone() for that.  We now
      properly shrink the zones, even if we have DIMMs whereby
      
       - Some memory blocks fall into no zone (never onlined)
      
       - Some memory blocks fall into multiple zones (offlined+re-onlined)
      
       - Multiple memory blocks that fall into different zones
      
      Drop the zone parameter (with a potential dubious value) from
      __remove_pages() and __remove_section().
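
      The idea of shrinking a zone by the range being removed, rather
      than by a zone looked up from the first page, can be sketched with a
      toy userspace model. Everything here is an assumption made for
      illustration: struct model_zone, MODEL_NR_BLOCKS and the model_*()
      helpers are hypothetical and much simpler than the kernel's zone
      handling.

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      #define MODEL_NR_BLOCKS 16	/* hypothetical number of memory blocks */

      /* Toy zone: which blocks belong to it, plus the span covering them. */
      struct model_zone {
      	bool   present[MODEL_NR_BLOCKS];
      	size_t start;	/* first block in the span */
      	size_t spanned;	/* blocks covered by the span, 0 if empty */
      };

      /* Recompute the span from the blocks that still belong to the zone. */
      static void model_update_span(struct model_zone *z)
      {
      	size_t first = MODEL_NR_BLOCKS, last = 0;

      	for (size_t i = 0; i < MODEL_NR_BLOCKS; i++) {
      		if (!z->present[i])
      			continue;
      		if (first == MODEL_NR_BLOCKS)
      			first = i;
      		last = i;
      	}
      	z->start = (first == MODEL_NR_BLOCKS) ? 0 : first;
      	z->spanned = (first == MODEL_NR_BLOCKS) ? 0 : last - first + 1;
      }

      /* Onlining adds blocks to the zone and grows the span. */
      static void model_online_range(struct model_zone *z, size_t start, size_t nr)
      {
      	for (size_t i = start; i < start + nr && i < MODEL_NR_BLOCKS; i++)
      		z->present[i] = true;
      	model_update_span(z);
      }

      /* Offlining removes blocks; a never-onlined range changes nothing. */
      static void model_remove_range_from_zone(struct model_zone *z,
      					 size_t start, size_t nr)
      {
      	for (size_t i = start; i < start + nr && i < MODEL_NR_BLOCKS; i++)
      		z->present[i] = false;
      	model_update_span(z);
      }
      ```

      In this model, removing a range that was never onlined leaves the
      span untouched, which mirrors the "fall into no zone" case above.
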
      
      Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: <stable@vger.kernel.org>	[5.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      feee6b29
  9. 03 Jan, 2020 1 commit
  10. 31 Dec, 2019 1 commit
    • Q
      x86/resctrl: Fix an imbalance in domain_remove_cpu() · e278af89
      Qian Cai committed
      A system that supports resource monitoring may have multiple resources
      while not all of these resources are capable of monitoring. Monitoring
      related state is initialized only for resources that are capable of
      monitoring and correspondingly this state should subsequently only be
      removed from these resources that are capable of monitoring.
      
      domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
      is true where it will initialize d->mbm_over. However,
      domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
      checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
      all resources, even those that never initialized d->mbm_over because
      they are not capable of monitoring. Hence, it triggers a debugobjects
      warning when offlining CPUs because those timer debugobjects are never
      initialized:
      
        ODEBUG: assert_init not available (active state 0) object type:
        timer_list hint: 0x0
        WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
        debug_print_object
        Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS I40 05/23/2018
        RIP: 0010:debug_print_object
        Call Trace:
        debug_object_assert_init
        del_timer
        try_to_grab_pending
        cancel_delayed_work
        resctrl_offline_cpu
        cpuhp_invoke_callback
        cpuhp_thread_fun
        smpboot_thread_fn
        kthread
        ret_from_fork
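
      The imbalance, and the symmetric check that removes it, can be
      modeled in userspace. This is a sketch only: struct model_resource,
      struct model_domain and the model_*() functions are hypothetical
      names, and the counter merely stands in for the point where the
      debugobjects warning would fire.

      ```c
      #include <stdbool.h>

      struct model_resource {
      	bool mon_capable;
      };

      struct model_domain {
      	bool mbm_over_initialized;	/* models the delayed work being set up */
      	int  invalid_cancels;		/* cancels attempted on uninitialized state */
      };

      /* Set up monitoring state only for monitoring-capable resources. */
      static void model_domain_add_cpu(const struct model_resource *r,
      				 struct model_domain *d)
      {
      	d->mbm_over_initialized = false;
      	d->invalid_cancels = 0;
      	if (r->mon_capable)
      		d->mbm_over_initialized = true;
      }

      /* Cancelling state that was never set up is counted as invalid. */
      static void model_cancel_mbm_over(struct model_domain *d)
      {
      	if (!d->mbm_over_initialized)
      		d->invalid_cancels++;
      	d->mbm_over_initialized = false;
      }

      /* Tear down under the same condition that guarded the setup. */
      static void model_domain_remove_cpu(const struct model_resource *r,
      				    struct model_domain *d)
      {
      	if (r->mon_capable)
      		model_cancel_mbm_over(d);
      }
      ```

      Dropping the r->mon_capable check from model_domain_remove_cpu()
      makes invalid_cancels nonzero for a resource that cannot monitor,
      which is the situation the trace above reports.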
      
      Fixes: e3302683 ("x86/intel_rdt/mbm: Handle counter overflow")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: john.stultz@linaro.org
      Cc: sboyd@kernel.org
      Cc: <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: tj@kernel.org
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191211033042.2188-1-cai@lca.pw
      e278af89
  11. 25 Dec, 2019 2 commits