1. 22 1月, 2014 1 次提交
    • S
      x86: memblock: set current limit to max low memory address · 5b6e5295
      Santosh Shilimkar 提交于
      The memblock current limit value is used to limit early boot memory
      allocations below max low memory address by default, as the kernel can
      access only to the low memory.
      
      Hence, set memblock current limit value to the max mapped low memory
      address instead of max mapped memory address.
      Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Grygorii Strashko <grygorii.strashko@ti.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b6e5295
  2. 14 1月, 2014 1 次提交
  3. 29 12月, 2013 2 次提交
    • D
      x86: Reserve setup_data ranges late after parsing memmap cmdline · 77ea8c94
      Dave Young 提交于
      Currently e820_reserve_setup_data() is called before parsing early
      params, it works in normal case. But for memmap=exactmap, the final
      memory ranges are created after parsing memmap= cmdline params, so the
      previous e820_reserve_setup_data() has no effect. For example,
      setup_data ranges will still be marked as normal system ram, thus when
      later sysfs driver ioremap them kernel will warn about mapping normal
      ram.
      
      This patch fix it by moving the e820_reserve_setup_data() callback after
      parsing early params so they can be set as reserved ranges and later
      ioremap will be fine with it.
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      77ea8c94
    • D
      x86/efi: Pass necessary EFI data for kexec via setup_data · 1fec0533
      Dave Young 提交于
      Add a new setup_data type SETUP_EFI for kexec use.  Passing the saved
      fw_vendor, runtime, config tables and EFI runtime mappings.
      
      When entering virtual mode, directly mapping the EFI runtime regions
      which we passed in previously. And skip the step to call
      SetVirtualAddressMap().
      
      Specially for HP z420 workstation we need save the smbios physical
      address.  The kernel boot sequence proceeds in the following order.
      Step 2 requires efi.smbios to be the physical address.  However, I found
      that on HP z420 EFI system table has a virtual address of SMBIOS in step
      1.  Hence, we need set it back to the physical address with the smbios
      in efi_setup_data.  (When it is still the physical address, it simply
      sets the same value.)
      
      1. efi_init() - Set efi.smbios from EFI system table
      2. dmi_scan_machine() - Temporary map efi.smbios to access SMBIOS table
      3. efi_enter_virtual_mode() - Map EFI ranges
      
      Tested on ovmf+qemu, lenovo thinkpad, a dell laptop and an
      HP z420 workstation.
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Tested-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      1fec0533
  4. 13 11月, 2013 1 次提交
    • T
      x86, acpi, crash, kdump: do reserve_crashkernel() after SRAT is parsed. · fa591c4a
      Tang Chen 提交于
      Memory reserved for crashkernel could be large.  So we should not allocate
      this memory bottom up from the end of kernel image.
      
      When SRAT is parsed, we will be able to know which memory is hotpluggable,
      and we can avoid allocating this memory for the kernel.  So reorder
      reserve_crashkernel() after SRAT is parsed.
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fa591c4a
  5. 24 10月, 2013 1 次提交
  6. 13 10月, 2013 1 次提交
  7. 14 8月, 2013 2 次提交
  8. 07 8月, 2013 1 次提交
  9. 15 7月, 2013 1 次提交
    • P
      x86: delete __cpuinit usage from all x86 files · 148f9bb8
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
      are flagged as __cpuinit  -- so if we remove the __cpuinit from
      arch specific callers, we will also get section mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      content into no-ops as early as possible, since that will get rid
      of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the arch/x86 uses of the __cpuinit macros from
      all C files.  x86 only had the one __CPUINIT used in assembly files,
      and it wasn't paired off with a .previous or a __FINIT, so we can
      delete it directly w/o any corresponding additional change there.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      148f9bb8
  10. 04 7月, 2013 1 次提交
  11. 01 5月, 2013 1 次提交
    • T
      dump_stack: implement arch-specific hardware description in task dumps · 98e5e1bf
      Tejun Heo 提交于
      x86 and ia64 can acquire extra hardware identification information
      from DMI and print it along with task dumps; however, the usage isn't
      consistent.
      
      * x86 show_regs() collects vendor, product and board strings and print
        them out with PID, comm and utsname.  Some of the information is
        printed again later in the same dump.
      
      * warn_slowpath_common() explicitly accesses the DMI board and prints
        it out with "Hardware name:" label.  This applies to both x86 and
        ia64 but is irrelevant on all other archs.
      
      * ia64 doesn't show DMI information on other non-WARN dumps.
      
      This patch introduces arch-specific hardware description used by
      dump_stack().  It can be set by calling dump_stack_set_arch_desc()
      during boot and, if exists, printed out in a separate line with
      "Hardware name:" label.
      
      dmi_set_dump_stack_arch_desc() is added which sets arch-specific
      description from DMI data.  It uses dmi_ids_string[] which is set from
      dmi_present() used for DMI debug message.  It is superset of the
      information x86 show_regs() is using.  The function is called from x86
      and ia64 boot code right after dmi_scan_machine().
      
      This makes the explicit DMI handling in warn_slowpath_common()
      unnecessary.  Removed.
      
      show_regs() isn't yet converted to use generic debug information
      printing and this patch doesn't remove the duplicate DMI handling in
      x86 show_regs().  The next patch will unify show_regs() handling and
      remove the duplication.
      
      An example WARN dump follows.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #3
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f500 ffffffff82228240 0000000000000040 ffffffff8234a08e
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a0c3>] init_workqueues+0x35/0x505
        ...
      
      v2: Use the same string as the debug message from dmi_present() which
          also contains BIOS information.  Move hardware name into its own
          line as warn_slowpath_common() did.  This change was suggested by
          Bjorn Helgaas.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      98e5e1bf
  12. 25 4月, 2013 1 次提交
  13. 18 4月, 2013 3 次提交
  14. 03 4月, 2013 1 次提交
  15. 07 3月, 2013 1 次提交
  16. 03 3月, 2013 1 次提交
    • Y
      x86, ACPI, mm: Revert movablemem_map support · 20e6926d
      Yinghai Lu 提交于
      Tim found:
      
        WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
        Hardware name: S2600CP
        sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
        smpboot: Booting Node   1, Processors  #1
        Modules linked in:
        Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
        Call Trace:
          set_cpu_sibling_map+0x279/0x449
          start_secondary+0x11d/0x1e5
      
      Don Morris reproduced on a HP z620 workstation, and bisected it to
      commit e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock
      is ready")
      
      It turns out movable_map has some problems, and it breaks several things
      
      1. numa_init is called several times, NOT just for srat. so those
      	nodes_clear(numa_nodes_parsed)
      	memset(&numa_meminfo, 0, sizeof(numa_meminfo))
         can not be just removed.  Need to consider sequence is: numaq, srat, amd, dummy.
         and make fall back path working.
      
      2. simply split acpi_numa_init to early_parse_srat.
         a. that early_parse_srat is NOT called for ia64, so you break ia64.
         b.  for (i = 0; i < MAX_LOCAL_APIC; i++)
      	     set_apicid_to_node(i, NUMA_NO_NODE)
           still left in numa_init. So it will just clear result from early_parse_srat.
           it should be moved before that....
         c.  it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved
             early before override from INITRD is settled.
      
      3. that patch TITLE is total misleading, there is NO x86 in the title,
         but it changes critical x86 code. It caused x86 guys did not
         pay attention to find the problem early. Those patches really should
         be routed via tip/x86/mm.
      
      4. after that commit, following range can not use movable ram:
        a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
        b. initrd... it will be freed after booting, so it could be on movable...
        c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G
      	anymore.
        d. init_mem_mapping: can not put page table high anymore.
        e. initmem_init: vmemmap can not be high local node anymore. That is
           not good.
      
      If node is hotplugable, the mem related range like page table and
      vmemmap could be on the that node without problem and should be on that
      node.
      
      We have workaround patch that could fix some problems, but some can not
      be fixed.
      
      So just remove that offending commit and related ones including:
      
       f7210e6c ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
          protect movablecore_map in memblock_overlaps_region().")
      
       01a178a9 ("acpi, memory-hotplug: support getting hotplug info from
          SRAT")
      
       27168d38 ("acpi, memory-hotplug: extend movablemem_map ranges to
          the end of node")
      
       e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock is
          ready")
      
       fb06bc8e ("page_alloc: bootmem limit with movablecore_map")
      
       42f47e27 ("page_alloc: make movablemem_map have higher priority")
      
       6981ec31 ("page_alloc: introduce zone_movable_limit[] to keep
          movable limit for nodes")
      
       34b71f1e ("page_alloc: add movable_memmap kernel parameter")
      
       4d59a751 ("x86: get pg_data_t's memory from other node")
      
      Later we should have patches that will make sure kernel put page table
      and vmemmap on local node ram instead of push them down to node0.  Also
      need to find way to put other kernel used ram to local node ram.
      Reported-by: NTim Gardner <tim.gardner@canonical.com>
      Reported-by: NDon Morris <don.morris@hp.com>
      Bisected-by: NDon Morris <don.morris@hp.com>
      Tested-by: NDon Morris <don.morris@hp.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      20e6926d
  17. 24 2月, 2013 1 次提交
    • T
      acpi, memory-hotplug: parse SRAT before memblock is ready · e8d19552
      Tang Chen 提交于
      On linux, the pages used by kernel could not be migrated.  As a result,
      if a memory range is used by kernel, it cannot be hot-removed.  So if we
      want to hot-remove memory, we should prevent kernel from using it.
      
      The way now used to prevent this is specify a memory range by
      movablemem_map boot option and set it as ZONE_MOVABLE.
      
      But when the system is booting, memblock will allocate memory, and
      reserve the memory for kernel.  And before we parse SRAT, and know the
      node memory ranges, memblock is working.  And it may allocate memory in
      ranges to be set as ZONE_MOVABLE.  This memory can be used by kernel,
      and never be freed.
      
      So, let's parse SRAT before memblock is called first.  And it is early
      enough.
      
      The first call of memblock_find_in_range_node() is in:
      
        setup_arch()
          |-->setup_real_mode()
      
      so, this patch add a function early_parse_srat() to parse SRAT, and call
      it before setup_real_mode() is called.
      
      NOTE:
      
      1) early_parse_srat() is called before numa_init(), and has initialized
         numa_meminfo.  So DO NOT clear numa_nodes_parsed in numa_init() and DO
         NOT zero numa_meminfo in numa_init(), otherwise we will lose memory
         numa info.
      
      2) I don't know why using count of memory affinities parsed from SRAT
         as a return value in original acpi_numa_init().  So I add a static
         variable srat_mem_cnt to remember this count and use it as the return
         value of the new acpi_numa_init()
      
      [mhocko@suse.cz: parse SRAT before memblock is ready fix]
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: NWen Congyang <wency@cn.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wu Jianguo <wujianguo@huawei.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: "Brown, Len" <len.brown@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e8d19552
  18. 15 2月, 2013 1 次提交
  19. 14 2月, 2013 1 次提交
  20. 31 1月, 2013 1 次提交
    • M
      efi: Make 'efi_enabled' a function to query EFI facilities · 83e68189
      Matt Fleming 提交于
      Originally 'efi_enabled' indicated whether a kernel was booted from
      EFI firmware. Over time its semantics have changed, and it now
      indicates whether or not we are booted on an EFI machine with
      bit-native firmware, e.g. 64-bit kernel with 64-bit firmware.
      
      The immediate motivation for this patch is the bug report at,
      
          https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
      
      which details how running a platform driver on an EFI machine that is
      designed to run under BIOS can cause the machine to become
      bricked. Also, the following report,
      
          https://bugzilla.kernel.org/show_bug.cgi?id=47121
      
      details how running said driver can also cause Machine Check
      Exceptions. Drivers need a new means of detecting whether they're
      running on an EFI machine, as sadly the expression,
      
          if (!efi_enabled)
      
      hasn't been a sufficient condition for quite some time.
      
      Users actually want to query 'efi_enabled' for different reasons -
      what they really want access to is the list of available EFI
      facilities.
      
      For instance, the x86 reboot code needs to know whether it can invoke
      the ResetSystem() function provided by the EFI runtime services, while
      the ACPI OSL code wants to know whether the EFI config tables were
      mapped successfully. There are also checks in some of the platform
      driver code to simply see if they're running on an EFI machine (which
      would make it a bad idea to do BIOS-y things).
      
      This patch is a prereq for the samsung-laptop fix patch.
      
      Cc: David Airlie <airlied@linux.ie>
      Cc: Corentin Chary <corentincj@iksaif.net>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Steve Langasek <steve.langasek@canonical.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      83e68189
  21. 30 1月, 2013 12 次提交
  22. 14 1月, 2013 2 次提交
  23. 12 1月, 2013 1 次提交
    • J
      x86/Sandy Bridge: reserve pages when integrated graphics is present · a9acc536
      Jesse Barnes 提交于
      SNB graphics devices have a bug that prevent them from accessing certain
      memory ranges, namely anything below 1M and in the pages listed in the
      table.  So reserve those at boot if set detect a SNB gfx device on the
      CPU to avoid GPU hangs.
      
      Stephane Marchesin had a similar patch to the page allocator awhile
      back, but rather than reserving pages up front, it leaked them at
      allocation time.
      
      [ hpa: made a number of stylistic changes, marked arrays as static
        const, and made less verbose; use "memblock=debug" for full
        verbosity. ]
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      a9acc536
  24. 06 12月, 2012 1 次提交