1. 22 2月, 2016 2 次提交
    • S
      x86/efi: Map EFI_MEMORY_{XP,RO} memory region bits to EFI page tables · 6d0cc887
      Sai Praneeth 提交于
      Now that we have EFI memory region bits that indicate which regions do
      not need execute permission or read/write permission in the page tables,
      let's use them.
      
      We also check for EFI_NX_PE_DATA and only enforce the restrictive
      mappings if it's present (to allow us to ignore buggy firmware that sets
      bits it didn't mean to and to preserve backwards compatibility).
      
      Instead of assuming that firmware would set appropriate attributes in
      memory descriptor like EFI_MEMORY_RO for code and EFI_MEMORY_XP for
      data, we can expect some firmware out there which might only set *type*
      in memory descriptor to be EFI_RUNTIME_SERVICES_CODE or
      EFI_RUNTIME_SERVICES_DATA leaving away attribute. This will lead to
      improper mappings of EFI runtime regions. In order to avoid it, we check
      attribute and type of memory descriptor to update mappings and moreover
      Windows works this way.
      Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Lee, Chun-Yi <jlee@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Shankar <ravi.v.shankar@intel.com>
      Cc: Ricardo Neri <ricardo.neri@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1455712566-16727-13-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6d0cc887
    • S
      x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd() · 15f003d2
      Sai Praneeth 提交于
      As part of the preparation for the EFI_MEMORY_RO flag added in the UEFI
      2.5 specification, we need the ability to map pages in kernel page
      tables without _PAGE_RW being set.
      
      Modify kernel_map_pages_in_pgd() to require its callers to pass _PAGE_RW
      if the pages need to be mapped read/write. Otherwise, we'll map the
      pages as read-only.
      Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Lee, Chun-Yi <jlee@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Shankar <ravi.v.shankar@intel.com>
      Cc: Ricardo Neri <ricardo.neri@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1455712566-16727-12-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      15f003d2
  2. 03 2月, 2016 3 次提交
    • R
      x86/efi: Show actual ending addresses in efi_print_memmap · 1e82b947
      Robert Elliott 提交于
      Adjust efi_print_memmap to print the real end address of each
      range, not 1 byte beyond. This matches other prints like those
      for SRAT and nosave memory.
      
      While investigating grub persistent memory corruption issues, it
      was helpful to make this table match the ending address
      convention used by:
      * the kernel's e820 table prints
      	BIOS-e820: [mem 0x0000001680000000-0x0000001c7fffffff] reserved
      * the kernel's nosave memory prints
      	PM: Registered nosave memory: [mem 0x880000000-0xc7fffffff]
      * the kernel's ACPI System Resource Affinity Table prints
      	SRAT: Node 1 PXM 1 [mem 0x480000000-0x87fffffff]
      * grub's lsmmap and lsefimmap commands
      	reserved  0000001680000000-0000001c7fffffff 00600000     24GiB UC WC WT WB NV
      * the UEFI shell's memmap command
      	Reserved   000000007FC00000-000000007FFFFFFF 0000000000000400 0000000000000001
      
      For example, if you grep all the various logs for c7fffffff, you
      won't find the kernel's line if it uses c80000000.
      
      Also, change the closing ) to ] to match the opening [.
      
      old:
          efi: mem61: [Persistent Memory  |   |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c80000000) (16384MB)
      
      new:
          efi: mem61: [Persistent Memory  |   |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff] (16384MB)
      Signed-off-by: NRobert Elliott <elliott@hpe.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: NLaszlo Ersek <lersek@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Leif Lindholm <leif.lindholm@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1454364428-494-12-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1e82b947
    • M
      x86/efi/bgrt: Don't ignore the BGRT if the 'valid' bit is 0 · 66dbe99c
      Môshe van der Sterre 提交于
      Unintuitively, the BGRT graphic is apparently meant to be usable
      if the valid bit in not set. The valid bit only conveys
      uncertainty about the validity in relation to the screen state.
      
      Windows 10 actually uses the BGRT image for its boot screen even
      if not 'valid', for example when the user triggered the boot
      menu. Because it is unclear if all firmwares will provide a
      usable graphic in this case, we now look at the BMP magic number
      as an additional check.
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: NMôshe van der Sterre <me@moshe.nl>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: =?UTF-8?q?M=C3=B4she=20van=20der=20Sterre?= <me@moshe.nl>
      Link: http://lkml.kernel.org/r/1454364428-494-10-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      66dbe99c
    • A
      efi: Add nonblocking option to efi_query_variable_store() · ca0e30dc
      Ard Biesheuvel 提交于
      The function efi_query_variable_store() may be invoked by
      efivar_entry_set_nonblocking(), which itself takes care to only
      call a non-blocking version of the SetVariable() runtime
      wrapper. However, efi_query_variable_store() may call the
      SetVariable() wrapper directly, as well as the wrapper for
      QueryVariableInfo(), both of which could deadlock in the same
      way we are trying to prevent by calling
      efivar_entry_set_nonblocking() in the first place.
      
      So instead, modify efi_query_variable_store() to use the
      non-blocking variants of QueryVariableInfo() (and give up rather
      than free up space if the available space is below
      EFI_MIN_RESERVE) if invoked with the 'nonblocking' argument set
      to true.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1454364428-494-5-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ca0e30dc
  3. 22 1月, 2016 1 次提交
    • M
      x86/efi: Setup separate EFI page tables in kexec paths · 753b11ef
      Matt Fleming 提交于
      The switch to using a new dedicated page table for EFI runtime
      calls in commit commit 67a9108e ("x86/efi: Build our own
      page table structures") failed to take into account changes
      required for the kexec code paths, which are unfortunately
      duplicated in the EFI code.
      
      Call the allocation and setup functions in
      kexec_enter_virtual_mode() just like we do for
      __efi_enter_virtual_mode() to avoid hitting NULL-pointer
      dereferences when making EFI runtime calls.
      
      At the very least, the call to efi_setup_page_tables() should
      have existed for kexec before the following commit:
      
        67a9108e ("x86/efi: Build our own page table structures")
      
      Things just magically worked because we were actually using
      the kernel's page tables that contained the required mappings.
      Reported-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Tested-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1453385519-11477-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      753b11ef
  4. 21 1月, 2016 1 次提交
  5. 19 1月, 2016 2 次提交
  6. 07 1月, 2016 1 次提交
    • M
      x86/efi-bgrt: Replace early_memremap() with memremap() · e2c90dd7
      Matt Fleming 提交于
      Môshe reported the following warning triggered on his machine since
      commit 50a0cb56 ("x86/efi-bgrt: Fix kernel panic when mapping BGRT
      data"),
      
        [    0.026936] ------------[ cut here ]------------
        [    0.026941] WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:137 __early_ioremap+0x102/0x1bb()
        [    0.026941] Modules linked in:
        [    0.026944] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.0-rc1 #2
        [    0.026945] Hardware name: Dell Inc. XPS 13 9343/09K8G1, BIOS A05 07/14/2015
        [    0.026946]  0000000000000000 900f03d5a116524d ffffffff81c03e60 ffffffff813a3fff
        [    0.026948]  0000000000000000 ffffffff81c03e98 ffffffff810a0852 00000000d7b76000
        [    0.026949]  0000000000000000 0000000000000001 0000000000000001 000000000000017c
        [    0.026951] Call Trace:
        [    0.026955]  [<ffffffff813a3fff>] dump_stack+0x44/0x55
        [    0.026958]  [<ffffffff810a0852>] warn_slowpath_common+0x82/0xc0
        [    0.026959]  [<ffffffff810a099a>] warn_slowpath_null+0x1a/0x20
        [    0.026961]  [<ffffffff81d8c395>] __early_ioremap+0x102/0x1bb
        [    0.026962]  [<ffffffff81d8c602>] early_memremap+0x13/0x15
        [    0.026964]  [<ffffffff81d78361>] efi_bgrt_init+0x162/0x1ad
        [    0.026966]  [<ffffffff81d778ec>] efi_late_init+0x9/0xb
        [    0.026968]  [<ffffffff81d58ff5>] start_kernel+0x46f/0x49f
        [    0.026970]  [<ffffffff81d58120>] ? early_idt_handler_array+0x120/0x120
        [    0.026972]  [<ffffffff81d58339>] x86_64_start_reservations+0x2a/0x2c
        [    0.026974]  [<ffffffff81d58485>] x86_64_start_kernel+0x14a/0x16d
        [    0.026977] ---[ end trace f9b3812eb8e24c58 ]---
        [    0.026978] efi_bgrt: Ignoring BGRT: failed to map image memory
      
      early_memremap() has an upper limit on the size of mapping it can
      handle which is ~200KB. Clearly the BGRT image on Môshe's machine is
      much larger than that.
      
      There's actually no reason to restrict ourselves to using the early_*
      version of memremap() - the ACPI BGRT driver is invoked late enough in
      boot that we can use the standard version, with the benefit that the
      late version allows mappings of arbitrary size.
      Reported-by: NMôshe van der Sterre <me@moshe.nl>
      Tested-by: NMôshe van der Sterre <me@moshe.nl>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1450707172-12561-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      e2c90dd7
  7. 14 12月, 2015 2 次提交
    • S
      x86/efi-bgrt: Fix kernel panic when mapping BGRT data · 50a0cb56
      Sai Praneeth 提交于
      Starting with this commit 35eb8b81edd4 ("x86/efi: Build our own page
      table structures") efi regions have a separate page directory called
      "efi_pgd". In order to access any efi region we have to first shift %cr3
      to this page table. In the bgrt code we are trying to copy bgrt_header
      and image, but these regions fall under "EFI_BOOT_SERVICES_DATA"
      and to access these regions we have to shift %cr3 to efi_pgd and not
      doing so will cause page fault as shown below.
      
      [    0.251599] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
      [    0.259126] Freeing SMP alternatives memory: 32K (ffffffff8230e000 - ffffffff82316000)
      [    0.271803] BUG: unable to handle kernel paging request at fffffffefce35002
      [    0.279740] IP: [<ffffffff821bca49>] efi_bgrt_init+0x144/0x1fd
      [    0.286383] PGD 300f067 PUD 0
      [    0.289879] Oops: 0000 [#1] SMP
      [    0.293566] Modules linked in:
      [    0.297039] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.0-rc1-eywa-eywa-built-in-47041+ #2
      [    0.306619] Hardware name: Intel Corporation Skylake Client platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2R1.R00.B104.B01.1511110114 11/11/2015
      [    0.320925] task: ffffffff820134c0 ti: ffffffff82000000 task.ti: ffffffff82000000
      [    0.329420] RIP: 0010:[<ffffffff821bca49>]  [<ffffffff821bca49>] efi_bgrt_init+0x144/0x1fd
      [    0.338821] RSP: 0000:ffffffff82003f18  EFLAGS: 00010246
      [    0.344852] RAX: fffffffefce35000 RBX: fffffffefce35000 RCX: fffffffefce2b000
      [    0.352952] RDX: 000000008a82b000 RSI: ffffffff8235bb80 RDI: 000000008a835000
      [    0.361050] RBP: ffffffff82003f30 R08: 000000008a865000 R09: ffffffffff202850
      [    0.369149] R10: ffffffff811ad62f R11: 0000000000000000 R12: 0000000000000000
      [    0.377248] R13: ffff88016dbaea40 R14: ffffffff822622c0 R15: ffffffff82003fb0
      [    0.385348] FS:  0000000000000000(0000) GS:ffff88016d800000(0000) knlGS:0000000000000000
      [    0.394533] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    0.401054] CR2: fffffffefce35002 CR3: 000000000300c000 CR4: 00000000003406f0
      [    0.409153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [    0.417252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [    0.425350] Stack:
      [    0.427638]  ffffffffffffffff ffffffff82256900 ffff88016dbaea40 ffffffff82003f40
      [    0.436086]  ffffffff821bbce0 ffffffff82003f88 ffffffff8219c0c2 0000000000000000
      [    0.444533]  ffffffff8219ba4a ffffffff822622c0 0000000000083000 00000000ffffffff
      [    0.452978] Call Trace:
      [    0.455763]  [<ffffffff821bbce0>] efi_late_init+0x9/0xb
      [    0.461697]  [<ffffffff8219c0c2>] start_kernel+0x463/0x47f
      [    0.467928]  [<ffffffff8219ba4a>] ? set_init_arg+0x55/0x55
      [    0.474159]  [<ffffffff8219b120>] ? early_idt_handler_array+0x120/0x120
      [    0.481669]  [<ffffffff8219b5ee>] x86_64_start_reservations+0x2a/0x2c
      [    0.488982]  [<ffffffff8219b72d>] x86_64_start_kernel+0x13d/0x14c
      [    0.495897] Code: 00 41 b4 01 48 8b 78 28 e8 09 36 01 00 48 85 c0 48 89 c3 75 13 48 c7 c7 f8 ac d3 81 31 c0 e8 d7 3b fb fe e9 b5 00 00 00 45 84 e4 <44> 8b 6b 02 74 0d be 06 00 00 00 48 89 df e8 ae 34 0$
      [    0.518151] RIP  [<ffffffff821bca49>] efi_bgrt_init+0x144/0x1fd
      [    0.524888]  RSP <ffffffff82003f18>
      [    0.528851] CR2: fffffffefce35002
      [    0.532615] ---[ end trace 7b06521e6ebf2aea ]---
      [    0.537852] Kernel panic - not syncing: Attempted to kill the idle task!
      
      As said above one way to fix this bug is to shift %cr3 to efi_pgd but we
      are not doing that way because it leaks inner details of how we switch
      to EFI page tables into a new call site and it also adds duplicate code.
      Instead, we remove the call to efi_lookup_mapped_addr() and always
      perform early_mem*() instead of early_io*() because we want to remap RAM
      regions and not I/O regions. We also delete efi_lookup_mapped_addr()
      because we are no longer using it.
      Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Reported-by: NWendy Wang <wendy.wang@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Ricardo Neri <ricardo.neri@intel.com>
      Cc: Ravi Shankar <ravi.v.shankar@intel.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      50a0cb56
    • M
      x86/efi: Preface all print statements with efi* tag · 26d7f65f
      Matt Fleming 提交于
      The pr_*() calls in the x86 EFI code may or may not include a
      subsystem tag, which makes it difficult to grep the kernel log for all
      relevant EFI messages and leads users to miss important information.
      
      Recently, a bug reporter provided all the EFI print messages from the
      kernel log when trying to diagnose an issue but missed the following
      statement because it wasn't prefixed with anything indicating it was
      related to EFI,
      
        pr_err("Error ident-mapping new memmap (0x%lx)!\n", pa_memmap);
      
      Cc: Borislav Petkov <bp@suse.de>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      26d7f65f
  8. 11 12月, 2015 1 次提交
    • I
      x86/platform/uv: Include clocksource.h for clocksource_touch_watchdog() · d51953b0
      Ingo Molnar 提交于
      This build failure triggers on 64-bit allmodconfig:
      
        arch/x86/platform/uv/uv_nmi.c:493:2: error: implicit declaration of function ‘clocksource_touch_watchdog’ [-Werror=implicit-function-declaration]
      
      which is caused by recent changes exposing a missing clocksource.h include
      in uv_nmi.c:
      
        cc1e24fd x86/vdso: Remove pvclock fixmap machinery
      
      this file got clocksource.h indirectly via fixmap.h - that stealth route
      of header inclusion is now gone.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d51953b0
  9. 09 12月, 2015 1 次提交
  10. 29 11月, 2015 4 次提交
    • M
      x86/efi: Build our own page table structures · 67a9108e
      Matt Fleming 提交于
      With commit e1a58320 ("x86/mm: Warn on W^X mappings") all
      users booting on 64-bit UEFI machines see the following warning,
      
        ------------[ cut here ]------------
        WARNING: CPU: 7 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x5dc/0x780()
        x86/mm: Found insecure W+X mapping at address ffff88000005f000/0xffff88000005f000
        ...
        x86/mm: Checked W+X mappings: FAILED, 165660 W+X pages found.
        ...
      
      This is caused by mapping EFI regions with RWX permissions.
      There isn't much we can do to restrict the permissions for these
      regions due to the way the firmware toolchains mix code and
      data, but we can at least isolate these mappings so that they do
      not appear in the regular kernel page tables.
      
      In commit d2f7cbe7 ("x86/efi: Runtime services virtual
      mapping") we started using 'trampoline_pgd' to map the EFI
      regions because there was an existing identity mapping there
      which we use during the SetVirtualAddressMap() call and for
      broken firmware that accesses those addresses.
      
      But 'trampoline_pgd' shares some PGD entries with
      'swapper_pg_dir' and does not provide the isolation we require.
      Notably the virtual address for __START_KERNEL_map and
      MODULES_START are mapped by the same PGD entry so we need to be
      more careful when copying changes over in
      efi_sync_low_kernel_mappings().
      
      This patch doesn't go the full mile, we still want to share some
      PGD entries with 'swapper_pg_dir'. Having completely separate
      page tables brings its own issues such as synchronising new
      mappings after memory hotplug and module loading. Sharing also
      keeps memory usage down.
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1448658575-17029-6-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      67a9108e
    • M
      x86/efi: Hoist page table switching code into efi_call_virt() · c9f2a9a6
      Matt Fleming 提交于
      This change is a prerequisite for pending patches that switch to
      a dedicated EFI page table, instead of using 'trampoline_pgd'
      which shares PGD entries with 'swapper_pg_dir'. The pending
      patches make it impossible to dereference the runtime service
      function pointer without first switching %cr3.
      
      It's true that we now have duplicated switching code in
      efi_call_virt() and efi_call_phys_{prolog,epilog}() but we are
      sacrificing code duplication for a little more clarity and the
      ease of writing the page table switching code in C instead of
      asm.
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1448658575-17029-5-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c9f2a9a6
    • M
      x86/efi: Map RAM into the identity page table for mixed mode · b61a76f8
      Matt Fleming 提交于
      We are relying on the pre-existing mappings in 'trampoline_pgd'
      when accessing function arguments in the EFI mixed mode thunking
      code.
      
      Instead let's map memory explicitly so that things will continue
      to work when we move to a separate page table in the future.
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1448658575-17029-4-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b61a76f8
    • M
      x86/mm/pat: Ensure cpa->pfn only contains page frame numbers · edc3b912
      Matt Fleming 提交于
      The x86 pageattr code is confused about the data that is stored
      in cpa->pfn, sometimes it's treated as a page frame number,
      sometimes it's treated as an unshifted physical address, and in
      one place it's treated as a pte.
      
      The result of this is that the mapping functions do not map the
      intended physical address.
      
      This isn't a problem in practice because most of the addresses
      we're mapping in the EFI code paths are already mapped in
      'trampoline_pgd' and so the pageattr mapping functions don't
      actually do anything in this case. But when we move to using a
      separate page table for the EFI runtime this will be an issue.
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1448658575-17029-3-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      edc3b912
  11. 28 10月, 2015 1 次提交
    • A
      efi: Use correct type for struct efi_memory_map::phys_map · 44511fb9
      Ard Biesheuvel 提交于
      We have been getting away with using a void* for the physical
      address of the UEFI memory map, since, even on 32-bit platforms
      with 64-bit physical addresses, no truncation takes place if the
      memory map has been allocated by the firmware (which only uses
      1:1 virtually addressable memory), which is usually the case.
      
      However, commit:
      
        0f96a99d ("efi: Add "efi_fake_mem" boot option")
      
      adds code that clones and modifies the UEFI memory map, and the
      clone may live above 4 GB on 32-bit platforms.
      
      This means our use of void* for struct efi_memory_map::phys_map has
      graduated from 'incorrect but working' to 'incorrect and
      broken', and we need to fix it.
      
      So redefine struct efi_memory_map::phys_map as phys_addr_t, and
      get rid of a bunch of casts that are now unneeded.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: izumi.taku@jp.fujitsu.com
      Cc: kamezawa.hiroyu@jp.fujitsu.com
      Cc: linux-efi@vger.kernel.org
      Cc: matt.fleming@intel.com
      Link: http://lkml.kernel.org/r/1445593697-1342-1-git-send-email-ard.biesheuvel@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      44511fb9
  12. 12 10月, 2015 3 次提交
  13. 01 10月, 2015 1 次提交
    • M
      x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down · a5caa209
      Matt Fleming 提交于
      Beginning with UEFI v2.5 EFI_PROPERTIES_TABLE was introduced
      that signals that the firmware PE/COFF loader supports splitting
      code and data sections of PE/COFF images into separate EFI
      memory map entries. This allows the kernel to map those regions
      with strict memory protections, e.g. EFI_MEMORY_RO for code,
      EFI_MEMORY_XP for data, etc.
      
      Unfortunately, an unwritten requirement of this new feature is
      that the regions need to be mapped with the same offsets
      relative to each other as observed in the EFI memory map. If
      this is not done crashes like this may occur,
      
        BUG: unable to handle kernel paging request at fffffffefe6086dd
        IP: [<fffffffefe6086dd>] 0xfffffffefe6086dd
        Call Trace:
         [<ffffffff8104c90e>] efi_call+0x7e/0x100
         [<ffffffff81602091>] ? virt_efi_set_variable+0x61/0x90
         [<ffffffff8104c583>] efi_delete_dummy_variable+0x63/0x70
         [<ffffffff81f4e4aa>] efi_enter_virtual_mode+0x383/0x392
         [<ffffffff81f37e1b>] start_kernel+0x38a/0x417
         [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
         [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef
      
      Here 0xfffffffefe6086dd refers to an address the firmware
      expects to be mapped but which the OS never claimed was mapped.
      The issue is that included in these regions are relative
      addresses to other regions which were emitted by the firmware
      toolchain before the "splitting" of sections occurred at
      runtime.
      
      Needless to say, we don't satisfy this unwritten requirement on
      x86_64 and instead map the EFI memory map entries in reverse
      order. The above crash is almost certainly triggerable with any
      kernel newer than v3.13 because that's when we rewrote the EFI
      runtime region mapping code, in commit d2f7cbe7 ("x86/efi:
      Runtime services virtual mapping"). For kernel versions before
      v3.13 things may work by pure luck depending on the
      fragmentation of the kernel virtual address space at the time we
      map the EFI regions.
      
      Instead of mapping the EFI memory map entries in reverse order,
      where entry N has a higher virtual address than entry N+1, map
      them in the same order as they appear in the EFI memory map to
      preserve this relative offset between regions.
      
      This patch has been kept as small as possible with the intention
      that it should be applied aggressively to stable and
      distribution kernels. It is very much a bugfix rather than
      support for a new feature, since when EFI_PROPERTIES_TABLE is
      enabled we must map things as outlined above to even boot - we
      have no way of asking the firmware not to split the code/data
      regions.
      
      In fact, this patch doesn't even make use of the more strict
      memory protections available in UEFI v2.5. That will come later.
      Suggested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reported-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chun-Yi <jlee@suse.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: James Bottomley <JBottomley@Odin.com>
      Cc: Lee, Chun-Yi <jlee@suse.com>
      Cc: Leif Lindholm <leif.lindholm@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1443218539-7610-2-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a5caa209
  14. 18 9月, 2015 1 次提交
  15. 14 9月, 2015 1 次提交
    • M
      x86/platform/uv: Implement simple dump failover if kdump fails · d0a9964e
      Mike Travis 提交于
      The ability to trigger a kdump using the system NMI command
      was added by
      
          commit 12ba6c99 ("x86/UV: Add kdump to UV NMI handler")
          Author: Mike Travis <travis@sgi.com>
          Date:   Mon Sep 23 16:25:03 2013 -0500
      
      This is useful because when kdump is working the information
      gathered is more informative than the original per CPU stack
      traces or "dump" option.  However a number of things can go
      wrong with kdump and then the stack traces are more useful than
      nothing.
      
      The two most common reasons for kdump to not be available are:
      
        1) if a problem occurs during boot before the kdump service is
           started, or
        2) the kdump daemon failed to start.
      
      In either case the call to crash_kexec() returns unexpectedly.
      
      When this happens uv_nmi_kdump() also sets the
      uv_nmi_kexec_failed flag which causes the slave CPU's to also
      return to the NMI handler. Upon this unexpected return to the
      NMI handler, the NMI handler will revert to the "dump" action
      which uses show_regs() to obtain a process trace dump for all
      the CPU's.
      
      Other minor changes:
          The "dump" action now generates both the show_regs() stack trace
          and show instruction pointer information.  Whereas the "ips"
          action only shows instruction pointers for non-idle CPU's.  This
          is more like an abbreviated "ps" display.
      
          Change printk(KERN_DEFAULT...) --> pr_info()
      Signed-off-by: NMike Travis <travis@sgi.com>
      Signed-off-by: NGeorge Beshers <gbeshers@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Hedi Berriche <hedi@sgi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d0a9964e
  16. 11 9月, 2015 1 次提交
    • D
      kexec: split kexec_load syscall from kexec core code · 2965faa5
      Dave Young 提交于
      There are two kexec load syscalls, kexec_load another and kexec_file_load.
       kexec_file_load has been splited as kernel/kexec_file.c.  In this patch I
      split kexec_load syscall code to kernel/kexec.c.
      
      And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
      use kexec_file_load only, or vice verse.
      
      The original requirement is from Ted Ts'o, he want kexec kernel signature
      being checked with CONFIG_KEXEC_VERIFY_SIG enabled.  But kexec-tools use
      kexec_load syscall can bypass the checking.
      
      Vivek Goyal proposed to create a common kconfig option so user can compile
      in only one syscall for loading kexec kernel.  KEXEC/KEXEC_FILE selects
      KEXEC_CORE so that old config files still work.
      
      Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
      architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
      KEXEC_CORE in arch Kconfig.  Also updated general kernel code with to
      kexec_load syscall.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2965faa5
  17. 25 8月, 2015 1 次提交
    • P
      x86/platform: Make atom/pmc_atom.c explicitly non-modular · e971aa2c
      Paul Gortmaker 提交于
      The Kconfig currently controlling compilation of this code is:
      
      config PMC_ATOM
              def_bool y
      
      ...meaning that it currently is not being built as a module by
      anyone.
      
      Lets remove the couple traces of modularity so that when reading
      the driver there is no doubt it is builtin-only.
      
      Since module_init() translates to device_initcall() in the
      non-modular case, the init ordering remains unchanged with this
      commit.
      
      We leave some tags like MODULE_AUTHOR() for documentation
      purposes.
      
      Also note that MODULE_DEVICE_TABLE() is a no-op for non-modular
      code. We correct a comment that indicates the data was only used
      by that macro, as it actually is used by the code directly.
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1440459295-21814-2-git-send-email-paul.gortmaker@windriver.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e971aa2c
  18. 08 8月, 2015 2 次提交
  19. 31 7月, 2015 2 次提交
    • V
      x86/uv/time: Migrate to new set-state interface · ca53d434
      Viresh Kumar 提交于
      Migrate uv driver to the new 'set-state' interface provided by
      clockevents core, the earlier 'set-mode' interface is marked obsolete
      now.
      
      This also enables us to implement callbacks for new states of clockevent
      devices, for example: ONESHOT_STOPPED.
      
      We weren't doing anything while switching modes other than in shutdown
      mode and so those are not implemented.
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/52e04139746222a2e82a96d13953cbc306cfb59b.1437042675.git.viresh.kumar@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      ca53d434
    • R
      efi: Check for NULL efi kernel parameters · 9115c758
      Ricardo Neri 提交于
      Even though it is documented how to specifiy efi parameters, it is
      possible to cause a kernel panic due to a dereference of a NULL pointer when
      parsing such parameters if "efi" alone is given:
      
      PANIC: early exception 0e rip 10:ffffffff812fb361 error 0 cr2 0
      [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.2.0-rc1+ #450
      [ 0.000000]  ffffffff81fe20a9 ffffffff81e03d50 ffffffff8184bb0f 00000000000003f8
      [ 0.000000]  0000000000000000 ffffffff81e03e08 ffffffff81f371a1 64656c62616e6520
      [ 0.000000]  0000000000000069 000000000000005f 0000000000000000 0000000000000000
      [ 0.000000] Call Trace:
      [ 0.000000]  [<ffffffff8184bb0f>] dump_stack+0x45/0x57
      [ 0.000000]  [<ffffffff81f371a1>] early_idt_handler_common+0x81/0xae
      [ 0.000000]  [<ffffffff812fb361>] ? parse_option_str+0x11/0x90
      [ 0.000000]  [<ffffffff81f4dd69>] arch_parse_efi_cmdline+0x15/0x42
      [ 0.000000]  [<ffffffff81f376e1>] do_early_param+0x50/0x8a
      [ 0.000000]  [<ffffffff8106b1b3>] parse_args+0x1e3/0x400
      [ 0.000000]  [<ffffffff81f37a43>] parse_early_options+0x24/0x28
      [ 0.000000]  [<ffffffff81f37691>] ? loglevel+0x31/0x31
      [ 0.000000]  [<ffffffff81f37a78>] parse_early_param+0x31/0x3d
      [ 0.000000]  [<ffffffff81f3ae98>] setup_arch+0x2de/0xc08
      [ 0.000000]  [<ffffffff8109629a>] ? vprintk_default+0x1a/0x20
      [ 0.000000]  [<ffffffff81f37b20>] start_kernel+0x90/0x423
      [ 0.000000]  [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
      [ 0.000000]  [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef
      [ 0.000000] RIP 0xffffffff81ba2efc
      
      This panic is not reproducible with "efi=" as this will result in a non-NULL
      zero-length string.
      
      Thus, verify that the pointer to the parameter string is not NULL. This is
      consistent with other parameter-parsing functions which check for NULL pointers.
      Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      9115c758
  20. 16 7月, 2015 5 次提交
  21. 14 7月, 2015 1 次提交
  22. 07 7月, 2015 1 次提交
  23. 25 6月, 2015 1 次提交
  24. 17 6月, 2015 1 次提交
    • P
      x86: don't use module_init in non-modular intel_mid_vrtc.c · 4711e2f9
      Paul Gortmaker 提交于
      The X86_INTEL_MID option is bool, and hence this code is either
      present or absent.  It will never be modular, so using
      module_init as an alias for __initcall is rather misleading.
      
      Fix this up now, so that we can relocate module_init from
      init.h into module.h in the future.  If we don't do this, we'd
      have to add module.h to obviously non-modular code, and that
      would be a worse thing.
      
      Note that direct use of __initcall is discouraged, vs. one
      of the priority categorized subgroups.  As __initcall gets
      mapped onto device_initcall, our use of device_initcall
      directly in this change means that the runtime impact is
      zero -- it will remain at level 6 in initcall ordering.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      4711e2f9