1. 21 Mar 2017 (1 commit)
  2. 07 Feb 2017 (1 commit)
  3. 23 Jan 2017 (1 commit)
  4. 13 Dec 2016 (1 commit)
    • mm: remove x86-only restriction of movable_node · 39fa104d
      Reza Arbab committed
      In commit c5320926 ("mem-hotplug: introduce movable_node boot
      option"), the memblock allocation direction is changed to bottom-up and
      then back to top-down like this:
      
      1. memblock_set_bottom_up(true), called by cmdline_parse_movable_node().
      2. memblock_set_bottom_up(false), called by x86's numa_init().
      
      Even though (1) occurs in generic mm code, it is wrapped by #ifdef
      CONFIG_MOVABLE_NODE, which depends on X86_64.
      
      This means that when we extend CONFIG_MOVABLE_NODE to non-x86 arches,
      things will be unbalanced.  (1) will happen for them, but (2) will not.
      
      This toggle was added in the first place because x86 has a delay between
      adding memblocks and marking them as hotpluggable.  Since other arches
      do this marking either immediately or not at all, they do not require
      the bottom-up toggle.
      
      So, resolve things by moving (1) from cmdline_parse_movable_node() to
      x86's setup_arch(), immediately after the movable_node parameter has
      been parsed.
      
      Link: http://lkml.kernel.org/r/1479160961-25840-3-git-send-email-arbab@linux.vnet.ibm.com
      Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Alistair Popple <apopple@au1.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      39fa104d
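      A minimal sketch of the resulting split, assuming the simplified bodies
      below (the generic parser only records the flag; x86's setup_arch()
      applies the direction toggle after early parameters are parsed, and
      numa_init() still flips it back):

          /* mm/memory_hotplug.c (sketch): generic code only records the flag */
          static int __init cmdline_parse_movable_node(char *p)
          {
                  movable_node_enabled = true;    /* no memblock toggle here */
                  return 0;
          }
          early_param("movable_node", cmdline_parse_movable_node);

          /* arch/x86/kernel/setup.c (sketch): only x86 needs bottom-up */
          parse_early_param();
          if (movable_node_is_enabled())          /* helper name assumed */
                  memblock_set_bottom_up(true);   /* numa_init() resets this */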
  5. 29 Oct 2016 (1 commit)
    • x86/smpboot: Init apic mapping before usage · 1e90a13d
      Thomas Gleixner committed
      The recent changes, which forced the registration of the boot CPU on UP
      systems that do not have ACPI tables, have been fixed for systems w/o a
      local APIC, but left a wreckage for systems which have neither ACPI nor
      mptables while the CPU does have an APIC, e.g. virtualbox.
      
      The boot process crashes in prefill_possible_map() as it wants to register
      the boot cpu, which needs to access the local apic, but the local APIC is
      not yet mapped.
      
      There is no reason why init_apic_mapping() can't be invoked before
      prefill_possible_map(). So instead of playing another silly early mapping
      game, as the ACPI/mptables code does, we just move init_apic_mapping()
      before the call to prefill_possible_map().
      
      In hindsight, I should have noticed that combination earlier.
      
      Sorry for the churn (also in stable)!
      
      Fixes: ff856051 ("x86/boot/smp: Don't try to poke disabled/non-existent APIC")
      Reported-and-debugged-by: Michal Necasek <michal.necasek@oracle.com>
      Reported-and-tested-by: Wolfgang Bauer <wbauer@tmo.at>
      Cc: prarit@redhat.com
      Cc: ville.syrjala@linux.intel.com
      Cc: michael.thayer@oracle.com
      Cc: knut.osmundsen@oracle.com
      Cc: frank.mehnert@oracle.com
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      1e90a13d
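      A sketch of the reordered tail of setup_arch(), using the mainline
      spelling init_apic_mappings() (surrounding calls elided):

          init_apic_mappings();     /* map the local APIC first ...        */
          prefill_possible_map();   /* ... so registering the boot CPU can
                                       safely access the APIC             */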
  6. 30 Sep 2016 (2 commits)
  7. 21 Sep 2016 (1 commit)
  8. 09 Sep 2016 (2 commits)
    • x86/efi: Defer efi_esrt_init until after memblock_x86_fill · 3dad6f7f
      Ricardo Neri committed
      Commit 7b02d53e7852 ("efi: Allow drivers to reserve boot services forever")
      introduced a new efi_mem_reserve to reserve the boot services memory
      regions forever. This reservation involves allocating a new EFI memory
      range descriptor. However, allocation can only succeed if there is memory
      available for the allocation. Otherwise, an error such as the following
      may occur:
      
      esrt: Reserving ESRT space from 0x000000003dd6a000 to 0x000000003dd6a010.
      Kernel panic - not syncing: ERROR: Failed to allocate 0x9f0 bytes below \
       0x0.
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.7.0-rc5+ #503
       0000000000000000 ffffffff81e03ce0 ffffffff8131dae8 ffffffff81bb6c50
       ffffffff81e03d70 ffffffff81e03d60 ffffffff8111f4df 0000000000000018
       ffffffff81e03d70 ffffffff81e03d08 00000000000009f0 00000000000009f0
      Call Trace:
       [<ffffffff8131dae8>] dump_stack+0x4d/0x65
       [<ffffffff8111f4df>] panic+0xc5/0x206
       [<ffffffff81f7c6d3>] memblock_alloc_base+0x29/0x2e
       [<ffffffff81f7c6e3>] memblock_alloc+0xb/0xd
       [<ffffffff81f6c86d>] efi_arch_mem_reserve+0xbc/0x134
       [<ffffffff81fa3280>] efi_mem_reserve+0x2c/0x31
       [<ffffffff81fa3280>] ? efi_mem_reserve+0x2c/0x31
       [<ffffffff81fa40d3>] efi_esrt_init+0x19e/0x1b4
       [<ffffffff81f6d2dd>] efi_init+0x398/0x44a
       [<ffffffff81f5c782>] setup_arch+0x415/0xc30
       [<ffffffff81f55af1>] start_kernel+0x5b/0x3ef
       [<ffffffff81f55434>] x86_64_start_reservations+0x2f/0x31
       [<ffffffff81f55520>] x86_64_start_kernel+0xea/0xed
      ---[ end Kernel panic - not syncing: ERROR: Failed to allocate 0x9f0
           bytes below 0x0.
      
      An inspection of the memblock configuration reveals that there is no memory
      available for the allocation:
      
      MEMBLOCK configuration:
       memory size = 0x0 reserved size = 0x4f339c0
       memory.cnt  = 0x1
       memory[0x0]    [0x00000000000000-0xffffffffffffffff], 0x0 bytes on node 0\
                       flags: 0x0
       reserved.cnt  = 0x4
       reserved[0x0]  [0x0000000008c000-0x0000000008c9bf], 0x9c0 bytes flags: 0x0
       reserved[0x1]  [0x0000000009f000-0x000000000fffff], 0x61000 bytes\
                       flags: 0x0
       reserved[0x2]  [0x00000002800000-0x0000000394bfff], 0x114c000 bytes\
                       flags: 0x0
       reserved[0x3]  [0x000000304e4000-0x00000034269fff], 0x3d86000 bytes\
                       flags: 0x0
      
      This situation can be avoided if we call efi_esrt_init after memblock has
      memory regions for the allocation.
      
      Also, the EFI ESRT driver makes use of early_memremap() mappings.
      Therefore, we do not want to defer efi_esrt_init for too long. We must
      call this function while calls to early_memremap are still valid.
      
      A good place to meet the two aforementioned conditions is right after
      memblock_x86_fill, grouped with other EFI-related functions.
      Reported-by: Scott Lawson <scott.lawson@intel.com>
      Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Peter Jones <pjones@redhat.com>
      Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
      3dad6f7f
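      A sketch of the new placement; the efi_enabled() guard is an
      assumption, the ordering follows the commit text:

          memblock_x86_fill();            /* memblock now has memory regions */
          if (efi_enabled(EFI_MEMMAP))
                  efi_esrt_init();        /* early_memremap() still valid,
                                             efi_mem_reserve() can allocate */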
    • x86/efi: Test for EFI_MEMMAP functionality when iterating EFI memmap · 4971531a
      Matt Fleming committed
      Both efi_find_mirror() and efi_fake_memmap() really want to know
      whether the EFI memory map is available, not just whether the machine
      was booted using EFI. efi_fake_memmap() even has a check for
      EFI_MEMMAP at the start of the function.
      
      Since we've already got other code that has this dependency, merge
      everything under one if() conditional, and remove the now superfluous
      check from efi_fake_memmap().
      
      Tested-by: Dave Young <dyoung@redhat.com> [kexec/kdump]
      Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [arm]
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
      4971531a
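      A sketch of the merged conditional in setup_arch(); the ordering inside
      the block is illustrative:

          if (efi_enabled(EFI_MEMMAP)) {
                  efi_fake_memmap();      /* its internal EFI_MEMMAP check
                                             is now superfluous and removed */
                  efi_find_mirror();
          }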
  9. 15 Aug 2016 (1 commit)
    • x86/mm/numa: Open code function early_get_boot_cpu_id() · a91bf718
      Baoquan He committed
      Previously early_acpi_boot_init() was called in early_get_boot_cpu_id()
      to get the value for boot_cpu_physical_apicid. Now that
      early_acpi_boot_init() has been taken out and moved to setup_arch(), the
      name of early_get_boot_cpu_id() doesn't match its implementation
      anymore; only the code that gets the boot-time SMP configuration is left.
      
      So in this patch we open code it.
      
      Also move the smp_found_config check into default_get_smp_config to
      simplify code, because both early_get_smp_config() and get_smp_config()
      call x86_init.mpparse.get_smp_config().
      
      Also remove the redundant CONFIG_X86_MPPARSE #ifdef check when we call
      early_get_smp_config().
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-acpi@vger.kernel.org
      Cc: rjw@rjwysocki.net
      Link: http://lkml.kernel.org/r/1470985033-22493-1-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a91bf718
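      A sketch of the relocated check, assuming the simplified hook body below:

          /* arch/x86/kernel/mpparse.c (sketch) */
          void __init default_get_smp_config(unsigned int early)
          {
                  /* moved here: covers both early_get_smp_config() and
                   * get_smp_config(), which funnel into this hook
                   */
                  if (!smp_found_config)
                          return;
                  ...
          }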
  10. 11 Aug 2016 (3 commits)
    • x86/boot: Defer setup_real_mode() to early_initcall time · d0de0f68
      Andy Lutomirski committed
      There's no need to run setup_real_mode() as early as we run it.
      Defer it to the same early_initcall that sets up the page
      permissions for the real mode code.
      
      This should be a code size reduction.  More importantly, it gives us
      a longer window in which we can allocate the real mode trampoline.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matt Fleming <mfleming@suse.de>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/fd62f0da4f79357695e9bf3e365623736b05f119.1470821230.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d0de0f68
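      A sketch of the deferral; init_real_mode is a hypothetical wrapper name,
      and the permission fixup is the pre-existing early_initcall mentioned
      above:

          /* arch/x86/realmode/init.c (sketch) */
          static int __init init_real_mode(void)
          {
                  setup_real_mode();            /* was called from setup_arch() */
                  set_real_mode_permissions();  /* existing permission fixup   */
                  return 0;
          }
          early_initcall(init_real_mode);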
    • x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly · 18bc7bd5
      Andy Lutomirski committed
      The initialization process for trampoline_cr4_features and
      mmu_cr4_features was confusing.  The intent is for mmu_cr4_features
      and *trampoline_cr4_features to stay in sync, but
      trampoline_cr4_features is NULL until setup_real_mode() runs.  The
      old code synchronized *trampoline_cr4_features *twice*, once in
      setup_real_mode() and once in setup_arch().  It also initialized
      mmu_cr4_features in setup_real_mode(), which causes the actual value
      of mmu_cr4_features to potentially depend on when setup_real_mode()
      is called.
      
      With this patch, mmu_cr4_features is initialized directly in
      setup_arch(), and *trampoline_cr4_features is synchronized to
      mmu_cr4_features when the trampoline is set up.
      
      After this patch, it should be safe to defer setup_real_mode().
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matt Fleming <mfleming@suse.de>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/d48a263f9912389b957dd495a7127b009259ffe0.1470821230.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      18bc7bd5
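      A sketch of the one-way flow after the change (the exact CR4 read may
      differ from this simplification):

          /* arch/x86/kernel/setup.c (sketch): the one authoritative init */
          mmu_cr4_features = __read_cr4();

          /* arch/x86/realmode/init.c, setup_real_mode() (sketch):
           * sync the trampoline copy once the trampoline exists
           */
          trampoline_cr4_features = &trampoline_header->cr4;
          *trampoline_cr4_features = mmu_cr4_features;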
    • x86/boot: Run reserve_bios_regions() after we initialize the memory map · 007b7560
      Andy Lutomirski committed
      reserve_bios_regions() is a quirk that reserves memory that we might
      otherwise think is available.  There's no need to run it so early,
      and running it before we have the memory map initialized with its
      non-quirky inputs makes it hard to make reserve_bios_regions() more
      intelligent.
      
      Move it right after we populate the memblock state.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matt Fleming <mfleming@suse.de>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/59f58618911005c799c6c9979ce6ae4881d907c2.1470821230.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      007b7560
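      A sketch of the move; neighboring calls elided:

          memblock_x86_fill();        /* non-quirky inputs populated first  */
          reserve_bios_regions();     /* quirk now sees the real memory map */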
  11. 10 Aug 2016 (2 commits)
    • x86: Apply more __ro_after_init and const · 404f6aac
      Kees Cook committed
      Guided by grsecurity's analogous __read_only markings in arch/x86,
      this applies several uses of __ro_after_init to structures that are
      only updated during __init, and const for some structures that are
      never updated.  It additionally extends __init markings to some
      functions that are only used during __init, and adds some missing C99
      style static initializers.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Brown <david.brown@linaro.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-hardening@lists.openwall.com
      Link: http://lkml.kernel.org/r/20160808232906.GA29731@www.outflux.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      404f6aac
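      A hedged illustration of the two annotations (the identifiers are
      hypothetical, not taken from the patch):

          /* written once during __init, read-only afterwards */
          static unsigned long machine_quirks __ro_after_init;

          static int __init detect_quirks(void)
          {
                  machine_quirks = 0x1;           /* last legitimate write */
                  return 0;
          }
          early_initcall(detect_quirks);

          /* never written at runtime at all: plain const */
          static const char *const quirk_names[] = { "a", "b" };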
    • x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization · c7d2361f
      Thomas Garnier committed
      Initialize KASLR memory randomization after max_pfn is initialized. Also
      ensure the size is rounded up. Without this, machines with more than 1TB
      of memory could hit problems at certain random addresses.
      Signed-off-by: Thomas Garnier <thgarnie@google.com>
      Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: kernel-hardening@lists.openwall.com
      Fixes: 021182e5 ("Enable KASLR for physical mapping memory regions")
      Link: http://lkml.kernel.org/r/1470762665-88032-1-git-send-email-thgarnie@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c7d2361f
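      A sketch of the two fixes; the rounding expression follows the commit's
      intent and may not match the final code exactly:

          /* arch/x86/kernel/setup.c (sketch): randomize only after max_pfn */
          max_pfn = e820_end_of_ram_pfn();
          ...
          kernel_randomize_memory();      /* sizes the physmap from max_pfn */

          /* arch/x86/mm/kaslr.c (sketch): round up so a partial terabyte
           * is still fully covered
           */
          memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT);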
  12. 14 Jul 2016 (1 commit)
    • x86/kernel: Audit and remove any unnecessary uses of module.h · 186f4360
      Paul Gortmaker committed
      Historically a lot of these existed because we did not have
      a distinction between what was modular code and what was providing
      support to modules via EXPORT_SYMBOL and friends.  That changed
      when we forked out support for the latter into the export.h file.
      
      This means we should be able to reduce the usage of module.h
      in code that is obj-y Makefile or bool Kconfig.  The advantage
      in doing so is that module.h itself sources about 15 other headers,
      adding significantly to what we feed cpp, and it can obscure what
      headers we are effectively using.
      
      Since module.h was the source for init.h (for __init) and for
      export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance
      for the presence of either and replace as needed.  Build testing
      revealed some implicit header usage that was fixed up accordingly.
      
      Note that some bool/obj-y instances remain since module.h is
      the header for some exception table entry stuff, and for things
      like __init_or_module (code that is tossed when MODULES=n).
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160714001901.31603-4-paul.gortmaker@windriver.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      186f4360
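      A sketch of the typical replacement in obj-y code:

          /* before: one heavyweight header sourcing ~15 others */
          #include <linux/module.h>

          /* after (sketch): include only what the file actually uses */
          #include <linux/init.h>       /* __init and friends */
          #include <linux/export.h>     /* EXPORT_SYMBOL      */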
  13. 08 Jul 2016 (1 commit)
    • x86/mm: Implement ASLR for kernel memory regions · 0483e1fa
      Thomas Garnier committed
      Randomizes the virtual address space of kernel memory regions for
      x86_64. This first patch adds the infrastructure and does not randomize
      any region. The following patches will randomize the physical memory
      mapping, vmalloc and vmemmap regions.
      
      This security feature mitigates exploits relying on predictable kernel
      addresses. These addresses can be used to disclose the kernel modules
      base addresses or corrupt specific structures to elevate privileges
      bypassing the current implementation of KASLR. This feature can be
      enabled with the CONFIG_RANDOMIZE_MEMORY option.
      
      The order of each memory region is not changed. The feature looks at the
      available space for the regions based on different configuration options
      and randomizes the base and space between each. The size of the physical
      memory mapping is the available physical memory. No performance impact
      was detected while testing the feature.
      
      Entropy is generated using the KASLR early boot functions now shared in
      the lib directory (originally written by Kees Cook). Randomization is
      done on PGD & PUD page table levels to increase possible addresses. The
      physical memory mapping code was adapted to support PUD level virtual
      addresses. On the best configuration, this implementation provides
      30,000 possible virtual addresses on average for each memory region.  An
      additional low memory page is used to ensure each CPU can start with a
      PGD aligned virtual address (for realmode).
      
      x86/dump_pagetable was updated to correctly display each region.
      
      Updated documentation on x86_64 memory layout accordingly.
      
      Performance data, after all patches in the series:
      
      Kernbench shows almost no difference (± less than 1%):
      
      Before:
      
      Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.63 (1.2695)
      User Time 1034.89 (1.18115) System Time 87.056 (0.456416) Percent CPU 1092.9
      (13.892) Context Switches 199805 (3455.33) Sleeps 97907.8 (900.636)
      
      After:
      
      Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.489 (1.10636)
      User Time 1034.86 (1.36053) System Time 87.764 (0.49345) Percent CPU 1095
      (12.7715) Context Switches 199036 (4298.1) Sleeps 97681.6 (1031.11)
      
      Hackbench shows 0% difference on average (hackbench 90 repeated 10 times):
      
      attempt   before    after
      1         0.076     0.069
      2         0.072     0.069
      3         0.066     0.066
      4         0.066     0.068
      5         0.066     0.067
      6         0.066     0.069
      7         0.067     0.066
      8         0.063     0.067
      9         0.067     0.065
      10        0.068     0.071
      average   0.0677    0.0677
      Signed-off-by: Thomas Garnier <thgarnie@google.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-6-git-send-email-keescook@chromium.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0483e1fa
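      A condensed sketch of the per-region randomization loop described above;
      the identifiers approximate the new arch/x86/mm/kaslr.c code rather than
      quoting it:

          for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++) {
                  /* split the remaining entropy across remaining regions */
                  entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);
                  prandom_bytes_state(&rand_state, &rand, sizeof(rand));
                  entropy = (rand % (entropy + 1)) & PUD_MASK;
                  vaddr += entropy;
                  *kaslr_regions[i].base = vaddr;

                  /* skip the region's own space, then realign to PUD */
                  vaddr += get_padding(&kaslr_regions[i]);
                  vaddr = round_up(vaddr + 1, PUD_SIZE);
                  remain_entropy -= entropy;
          }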
  14. 22 Jun 2016 (1 commit)
  15. 19 Apr 2016 (2 commits)
    • ACPI / x86: Cleanup initrd related code · af06f8b7
      Lv Zheng committed
      In arch/x86/kernel/setup.c, the #ifdef kept for CONFIG_ACPI is actually
      related to the accessibility of initrd_start/initrd_end, so the stub
      should be provided from this source file and should depend only on
      CONFIG_BLK_DEV_INITRD.
      
      Note that when ACPI=n and BLK_DEV_INITRD=y, early_initrd_acpi_init() is
      still a stub because of the stub prepared for early_acpi_table_init().
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      af06f8b7
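      A sketch of the relocated stub, keyed to CONFIG_BLK_DEV_INITRD rather
      than CONFIG_ACPI:

          /* arch/x86/kernel/setup.c (sketch) */
          #ifdef CONFIG_BLK_DEV_INITRD
          static void __init early_initrd_acpi_init(void)
          {
                  early_acpi_table_init((void *)initrd_start,
                                        initrd_end - initrd_start);
          }
          #else
          static inline void early_initrd_acpi_init(void) { }
          #endif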
    • ACPI / tables: Move table override mechanisms to tables.c · 5ae74f2c
      Lv Zheng committed
      This patch moves acpi_os_table_override() and
      acpi_os_physical_table_override() to tables.c.
      
      Along with the mechanisms, acpi_initrd_initialize_tables() is also moved to
      tables.c to form a static function. The following functions are renamed
      according to this change:
       1. acpi_initrd_override() -> renamed to early_acpi_table_init(), which
          invokes acpi_table_initrd_init()
       2. acpi_os_physical_table_override() -> now invokes
          acpi_table_initrd_override()
       3. acpi_initialize_initrd_tables() -> renamed to acpi_table_initrd_scan()
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      5ae74f2c
  16. 15 Apr 2016 (1 commit)
  17. 07 Apr 2016 (1 commit)
  18. 19 Feb 2016 (1 commit)
    • x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps · c1192f84
      Dave Hansen committed
      The protection key can now be just as important as read/write
      permissions on a VMA.  We need some debug mechanism to help
      figure out if it is in play.  smaps seems like a logical
      place to expose it.
      
      arch/x86/kernel/setup.c is a bit of a weirdo place to put
      this code, but it already had seq_file.h and there was not
      a much better existing place to put it.
      
      We also use no #ifdef.  If protection keys are .config'd out, we
      effectively get the same behavior as if we used the weak generic
      function.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mark Williamson <mwilliamson@undo-software.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160212210227.4F8EB3F8@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c1192f84
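      A sketch of the override pattern: a __weak no-op in generic code, a
      strong x86 definition in setup.c (function and output format assumed):

          /* fs/proc/task_mmu.c (sketch): default prints nothing */
          void __weak arch_show_smap(struct seq_file *m,
                                     struct vm_area_struct *vma)
          {
          }

          /* arch/x86/kernel/setup.c (sketch) */
          void arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
          {
                  seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
          }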
  19. 30 Jan 2016 (1 commit)
    • x86/e820: Set System RAM type and descriptor · f33b14a4
      Toshi Kani committed
      Change e820_reserve_resources() to set 'flags' and 'desc' from
      e820 types.
      
      Set E820_RESERVED_KERN and E820_RAM's (System RAM) io resource
      type to IORESOURCE_SYSTEM_RAM.
      
      Do the same for "Kernel data", "Kernel code", and "Kernel bss",
      which are child nodes of System RAM.
      
      The I/O resource descriptor is set in 'desc' for entries that are
      (and will be) target ranges of walk_iomem_res() and
      region_intersects().
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm <linux-mm@kvack.org>
      Link: http://lkml.kernel.org/r/1453841853-11383-5-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f33b14a4
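      A sketch of the assignments in e820_reserve_resources(); the helper
      names follow the commit's pattern and are assumptions:

          res->flags = e820_type_to_iomem_type(e);  /* IORESOURCE_SYSTEM_RAM
                                                       for E820_RAM and
                                                       E820_RESERVED_KERN   */
          res->desc  = e820_type_to_iores_desc(e);  /* consumed later by
                                                       walk_iomem_res() and
                                                       region_intersects()  */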
  20. 06 Dec 2015 (1 commit)
  21. 23 Nov 2015 (1 commit)
  22. 07 Nov 2015 (1 commit)
  23. 21 Oct 2015 (6 commits)
  24. 16 Oct 2015 (1 commit)
    • x86/setup: Extend low identity map to cover whole kernel range · f5f3497c
      Paolo Bonzini committed
      On 32-bit systems, the initial_page_table is reused by
      efi_call_phys_prolog as an identity map to call
      SetVirtualAddressMap.  efi_call_phys_prolog takes care of
      converting the current CPU's GDT to a physical address too.
      
      For PAE kernels the identity mapping is achieved by aliasing the
      first PDPE for the kernel memory mapping into the first PDPE
      of initial_page_table.  This makes the EFI stub's trick "just work".
      
      However, for non-PAE kernels there is no guarantee that the identity
      mapping in the initial_page_table extends as far as the GDT; in this
      case, accesses to the GDT will cause a page fault (which quickly becomes
      a triple fault).  Fix this by copying the kernel mappings from
      swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
      the identity mapping.
      
      For some reason, this is only reproducible with QEMU's dynamic translation
      mode, and not for example with KVM.  However, even under KVM one can clearly
      see that the page table is bogus:
      
          $ qemu-system-i386 -pflash OVMF.fd -M q35 vmlinuz0 -s -S -daemonize
          $ gdb
          (gdb) target remote localhost:1234
          (gdb) hb *0x02858f6f
          Hardware assisted breakpoint 1 at 0x2858f6f
          (gdb) c
          Continuing.
      
          Breakpoint 1, 0x02858f6f in ?? ()
          (gdb) monitor info registers
          ...
          GDT=     0724e000 000000ff
          IDT=     fffbb000 000007ff
          CR0=0005003b CR2=ff896000 CR3=032b7000 CR4=00000690
          ...
      
      The page directory is sane:
      
          (gdb) x/4wx 0x32b7000
          0x32b7000:	0x03398063	0x03399063	0x0339a063	0x0339b063
          (gdb) x/4wx 0x3398000
          0x3398000:	0x00000163	0x00001163	0x00002163	0x00003163
          (gdb) x/4wx 0x3399000
          0x3399000:	0x00400003	0x00401003	0x00402003	0x00403003
      
      but our particular page directory entry is empty:
      
          (gdb) x/1wx 0x32b7000 + (0x724e000 >> 22) * 4
          0x32b7070:	0x00000000
      
      [ It appears that you can skate past this issue if you don't receive
        any interrupts while the bogus GDT pointer is loaded, or if you avoid
        reloading the segment registers in general.
      
        Andy Lutomirski provides some additional insight:
      
         "AFAICT it's entirely permissible for the GDTR and/or LDT
          descriptor to point to unmapped memory.  Any attempt to use them
          (segment loads, interrupts, IRET, etc) will try to access that memory
          as if the access came from CPL 0 and, if the access fails, will
          generate a valid page fault with CR2 pointing into the GDT or
          LDT."
      
        Up until commit 23a0d4e8 ("efi: Disable interrupts around EFI
        calls, not in the epilog/prolog calls") interrupts were disabled
        around the prolog and epilog calls, and the functional GDT was
        re-installed before interrupts were re-enabled.
      
        Which explains why no one has hit this issue until now. ]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reported-by: Laszlo Ersek <lersek@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      [ Updated changelog. ]
      f5f3497c
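      A sketch of the fix: clone the kernel mappings into initial_page_table
      at both locations; the min() bound (an assumption) keeps the identity
      copy from spilling into the kernel half:

          clone_pgd_range(initial_page_table,
                          swapper_pg_dir + KERNEL_PGD_BOUNDARY,
                          min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
          clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
                          swapper_pg_dir + KERNEL_PGD_BOUNDARY,
                          KERNEL_PGD_PTRS);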
  25. 12 Oct 2015 (1 commit)
    • efi: Add "efi_fake_mem" boot option · 0f96a99d
      Taku Izumi committed
      This patch introduces a new boot option named "efi_fake_mem".
      By specifying this parameter, you can add an arbitrary attribute
      to a specific memory range.
      This is useful for debugging the Address Range Mirroring feature.
      
      For example, if "efi_fake_mem=2G@4G:0x10000,2G@0x10a0000000:0x10000"
      is specified, the original (firmware provided) EFI memmap will be
      updated so that the specified memory regions have
      EFI_MEMORY_MORE_RELIABLE attribute (0x10000):
      
       <original>
         efi: mem36: [Conventional Memory|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000100000000-0x00000020a0000000) (129536MB)
      
       <updated>
         efi: mem36: [Conventional Memory|  |MR|  |  |  |   |WB|WT|WC|UC] range=[0x0000000100000000-0x0000000180000000) (2048MB)
         efi: mem37: [Conventional Memory|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000180000000-0x00000010a0000000) (61952MB)
         efi: mem38: [Conventional Memory|  |MR|  |  |  |   |WB|WT|WC|UC] range=[0x00000010a0000000-0x0000001120000000) (2048MB)
         efi: mem39: [Conventional Memory|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000001120000000-0x00000020a0000000) (63488MB)
      
      And you will find that the following message is output:
      
         efi: Memory: 4096M/131455M mirrored memory
      Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      0f96a99d
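      A hedged sketch of parsing one "<size>@<offset>:<attr>" chunk with
      memparse(); error handling trimmed:

          u64 mem_size, start, attribute;

          mem_size = memparse(p, &p);                 /* e.g. "2G"         */
          if (*p != '@')
                  return -EINVAL;
          start = memparse(p + 1, &p);                /* e.g. "4G"         */
          if (*p != ':')
                  return -EINVAL;
          attribute = simple_strtoull(p + 1, &p, 0);  /* 0x10000 == MR bit */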
  26. 11 Sep 2015 (1 commit)
    • kexec: split kexec_load syscall from kexec core code · 2965faa5
      Dave Young committed
      There are two kexec load syscalls: kexec_load and kexec_file_load.
      kexec_file_load has already been split out into kernel/kexec_file.c.  In
      this patch I split the kexec_load syscall code out into kernel/kexec.c.
      
      Also add a new kconfig option KEXEC_CORE, so we can disable kexec_load
      and use kexec_file_load only, or vice versa.
      
      The original requirement is from Ted Ts'o: he wants the kexec kernel
      signature to be checked with CONFIG_KEXEC_VERIFY_SIG enabled, but
      kexec-tools can bypass that check by using the kexec_load syscall.
      
      Vivek Goyal proposed to create a common kconfig option so users can
      compile in only one syscall for loading a kexec kernel.  KEXEC/KEXEC_FILE
      select KEXEC_CORE so that old config files still work.
      
      Because some general code needs CONFIG_KEXEC_CORE, I updated all the
      architecture Kconfigs with the new option KEXEC_CORE, and let KEXEC
      select KEXEC_CORE in each arch Kconfig.  General kernel code that refers
      to the kexec_load syscall was updated as well.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2965faa5
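      A sketch of how shared code keys off the new option after the split, so
      a kexec_file_load-only kernel still builds it:

          /* arch/x86/kernel/setup.c (sketch) */
          #ifdef CONFIG_KEXEC_CORE        /* was: CONFIG_KEXEC */
          static void __init reserve_crashkernel(void)
          {
                  ...
          }
          #endif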
  27. 09 Sep 2015 (1 commit)
    • x86: use generic early mem copy · 5dd2c4bd
      Mark Salter committed
      The early_ioremap library now has a generic copy_from_early_mem()
      function.  Use the generic copy function for x86 relocate_initrd().
      
      [akpm@linux-foundation.org: remove MAX_MAP_CHUNK define, per Yinghai Lu]
      Signed-off-by: Mark Salter <msalter@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5dd2c4bd
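      A sketch of relocate_initrd() after the conversion; the surrounding
      allocation code is elided:

          /* arch/x86/kernel/setup.c (sketch) */
          static void __init relocate_initrd(void)
          {
                  ...
                  /* the generic helper does the chunked early_memremap copy */
                  copy_from_early_mem((void *)initrd_start, ramdisk_image,
                                      ramdisk_size);
                  ...
          }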
  28. 21 Jul 2015 (1 commit)
  29. 25 Jun 2015 (1 commit)