    x86/mem-hotplug: support initialize page tables in bottom-up · b959ed6c
    Tang Chen committed
    The Linux kernel cannot migrate pages used by the kernel.  As a result,
    kernel pages cannot be hot-removed.  So we cannot allocate hotpluggable
    memory for the kernel.
    
    In a memory hotplug system, any NUMA node the kernel resides in should be
    unhotpluggable.  On a modern server, each node may have 16GB of memory or
    more, so the memory around the kernel image is very likely to be
    unhotpluggable.
    
    ACPI SRAT (System Resource Affinity Table) contains the memory hotplug
    info.  But before SRAT is parsed, memblock has already started to allocate
    memory for the kernel.  So we need to prevent memblock from handing out
    hotpluggable memory to the kernel during this window.
    
    Setting up the direct memory mapping page tables is one such case:
    init_mem_mapping() is called before SRAT is parsed.  To prevent page
    tables from being allocated within hotpluggable memory, we allocate them
    bottom-up, starting from the end of the kernel image and moving toward
    higher memory.
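
The bottom-up walk described above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the code in this patch: plan_bottom_up(), FIRST_STEP and STEP_SHIFT are hypothetical names and values chosen for the example. The idea is to start just above the kernel image with a small chunk, so the page tables for it fit in memory that is already usable, and then let each chunk grow, since the memory mapped so far can hold the page tables for the next, larger chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STEP_SHIFT 5            /* hypothetical growth factor: 32x per chunk */
#define FIRST_STEP (2ULL << 20) /* hypothetical first chunk size: 2 MiB */

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Walk [map_start, map_end) from low to high addresses and store the end
 * address of each chunk in ends[].  Returns the number of chunks written,
 * which is at most max_chunks.
 */
static size_t plan_bottom_up(uint64_t map_start, uint64_t map_end,
                             uint64_t *ends, size_t max_chunks)
{
    uint64_t start = map_start;
    uint64_t step = FIRST_STEP;
    size_t n = 0;

    assert(map_start <= map_end);

    while (start < map_end && n < max_chunks) {
        uint64_t next = map_end;

        /* Stop at the next step-aligned boundary unless the rest fits. */
        if (map_end - start > step) {
            next = round_up_pow2(start + 1, step);
            if (next > map_end)
                next = map_end;
        }

        ends[n++] = next;
        start = next;

        /* Grow the chunk size, guarding against 64-bit overflow. */
        if (step <= (UINT64_MAX >> STEP_SHIFT))
            step <<= STEP_SHIFT;
    }
    return n;
}
```

With a hypothetical kernel image ending at 16 MiB and 4 GiB of memory to map, calling plan_bottom_up(16ULL << 20, 4ULL << 30, ends, 8) yields four chunks ending at 18 MiB, 64 MiB, 2 GiB and 4 GiB, so every chunk lies above the previous one.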
    
    Note:
    As for allocating page tables in lower memory, TJ said:
    
    : This is an optional behavior which is triggered by a very specific kernel
    : boot param, which I suspect is gonna need to stick around to support
    : memory hotplug in the current setup unless we add another layer of address
    : translation to support memory hotplug.
    
    As for the concern that page tables may occupy too much lower memory when
    4K mapping is used (CONFIG_DEBUG_PAGEALLOC and CONFIG_KMEMCHECK both
    disable the use of pages larger than 4K), TJ said:
    
    : But as I said in the same paragraph, parsing SRAT earlier doesn't solve
    : the problem in itself either.  Ignoring the option if 4k mapping is
    : required and memory consumption would be prohibitive should work, no?
    : Something like that would be necessary if we're gonna worry about cases
    : like this no matter how we implement it, but, frankly, I'm not sure this
    : is something worth worrying about.
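
To put a rough number on the 4K mapping concern, the sketch below estimates how much memory the lowest-level page tables need for a given amount of mapped memory. It assumes x86_64 paging with 8-byte entries and 512 entries per table, and it ignores the upper-level tables, which are smaller by a factor of 512 per level. leaf_table_bytes() is a hypothetical helper written for this example.

```c
#include <stdint.h>

#define ENTRY_SIZE        8ULL    /* bytes per page-table entry on x86_64 */
#define ENTRIES_PER_TABLE 512ULL  /* entries in one 4 KiB table */
#define TABLE_SIZE        (ENTRY_SIZE * ENTRIES_PER_TABLE)

/*
 * Bytes of lowest-level page tables needed to map mem_bytes of memory
 * using pages of page_size bytes.  Upper-level tables are not counted.
 */
static uint64_t leaf_table_bytes(uint64_t mem_bytes, uint64_t page_size)
{
    uint64_t entries = (mem_bytes + page_size - 1) / page_size;
    uint64_t tables = (entries + ENTRIES_PER_TABLE - 1) / ENTRIES_PER_TABLE;

    return tables * TABLE_SIZE;
}
```

For the 16GB node mentioned earlier, leaf_table_bytes(16ULL << 30, 4096) gives 32 MiB with 4K pages, while leaf_table_bytes(16ULL << 30, 2ULL << 20) gives 64 KiB with 2M pages.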
    
    Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
    Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Acked-by: Toshi Kani <toshi.kani@hp.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
    Cc: Thomas Renninger <trenn@suse.de>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Cc: Jiang Liu <jiang.liu@huawei.com>
    Cc: Wen Congyang <wency@cn.fujitsu.com>
    Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
    Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Michal Nazarewicz <mina86@mina86.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>