Commit f24cd454 authored by Vijay Balakrishna, committed by Zheng Zengkai

arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones

stable inclusion
from stable-v5.10.110
commit a25864c5bc20966cdc5ba5eb65b74b9b1e9ec8d2
bugzilla: https://gitee.com/openeuler/kernel/issues/I574AL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a25864c5bc20966cdc5ba5eb65b74b9b1e9ec8d2

--------------------------------

commit 03149563 upstream.

The following patches resulted in deferring crash kernel reservation to
mem_init(), mainly aimed at platforms with DMA memory zones (no IOMMU),
in particular Raspberry Pi 4.

commit 1a8e1cef ("arm64: use both ZONE_DMA and ZONE_DMA32")
commit 8424ecdd ("arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges")
commit 0a30c535 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
commit 2687275a ("arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required")

The above changes introduced a boot slowdown due to linear map creation for
all the memory banks with NO_BLOCK_MAPPINGS, see the discussion at [1].  The
proposed changes restore crash kernel reservation to the earlier behavior and
thus avoid the slow boot, particularly for platforms with an IOMMU (no DMA
memory zones).

Tested the changes to confirm the ~150ms boot slowdown is gone on our SoC
with an IOMMU and 8GB of memory.  Also tested with ZONE_DMA and/or ZONE_DMA32
configs to confirm no regression in the deferred scheme of crash kernel
memory reservation.  In both cases a kernel crash dump was successfully
collected.

[1] https://lore.kernel.org/all/9436d033-579b-55fa-9b00-6f4b661c2dd7@linux.microsoft.com/

Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Cc: stable@vger.kernel.org
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/r/1646242689-20744-1-git-send-email-vijayb@linux.microsoft.com
[will: Add #ifdef CONFIG_KEXEC_CORE guards to fix 'crashk_res' references in allnoconfig build]
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>

 Conflicts:
	arch/arm64/mm/mmu.c
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Parent a934c00c
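In outline, the patch keys the call site of reserve_crashkernel() off the
DMA zone configs.  A condensed sketch of the resulting control flow in
arch/arm64/mm/init.c (function bodies elided; the comments here are
editorial, not part of the patch):

	void __init arm64_memblock_init(void)
	{
		/* ... */
		/* No DMA zones: arm64_dma_phys_limit is statically
		 * PHYS_MASK + 1, so the crash kernel can be reserved
		 * early, before the linear map is created, keeping
		 * block mappings possible for the rest of memory. */
		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
			reserve_crashkernel();
		/* ... */
	}

	void __init bootmem_init(void)
	{
		/* ... */
		/* DMA zones configured: defer until zone_sizes_init()
		 * has fixed up arm64_dma_phys_limit, so the reservation
		 * stays clear of the DMA zone range. */
		if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
			reserve_crashkernel();
		/* ... */
	}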
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -63,8 +63,34 @@ EXPORT_SYMBOL(memstart_addr);
  * unless restricted on specific platforms (e.g. 30-bit on Raspberry Pi 4).
  * In such case, ZONE_DMA32 covers the rest of the 32-bit addressable memory,
  * otherwise it is empty.
+ *
+ * Memory reservation for crash kernel either done early or deferred
+ * depending on DMA memory zones configs (ZONE_DMA) --
+ *
+ * In absence of ZONE_DMA configs arm64_dma_phys_limit initialized
+ * here instead of max_zone_phys().  This lets early reservation of
+ * crash kernel memory which has a dependency on arm64_dma_phys_limit.
+ * Reserving memory early for crash kernel allows linear creation of block
+ * mappings (greater than page-granularity) for all the memory bank ranges.
+ * In this scheme a comparatively quicker boot is observed.
+ *
+ * If ZONE_DMA configs are defined, crash kernel memory reservation
+ * is delayed until DMA zone memory range size initialization performed in
+ * zone_sizes_init().  The defer is necessary to steer clear of DMA zone
+ * memory range to avoid overlap allocation.  So crash kernel memory boundaries
+ * are not known when mapping all bank memory ranges, which otherwise means
+ * not possible to exclude crash kernel range from creating block mappings
+ * so page-granularity mappings are created for the entire memory range.
+ * Hence a slightly slower boot is observed.
+ *
+ * Note: Page-granularity mappings are necessary for crash kernel memory
+ * range for shrinking its size via /sys/kernel/kexec_crash_size interface.
  */
-phys_addr_t arm64_dma_phys_limit __ro_after_init;
+#if IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)
+phys_addr_t __ro_after_init arm64_dma_phys_limit;
+#else
+phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
+#endif
 
 #ifndef CONFIG_KEXEC_CORE
 static void __init reserve_crashkernel(void)
@@ -173,8 +199,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 	if (!arm64_dma_phys_limit)
 		arm64_dma_phys_limit = dma32_phys_limit;
 #endif
-	if (!arm64_dma_phys_limit)
-		arm64_dma_phys_limit = PHYS_MASK + 1;
 	max_zone_pfns[ZONE_NORMAL] = max;
 
 	free_area_init(max_zone_pfns);
@@ -498,6 +522,9 @@ void __init arm64_memblock_init(void)
 	reserve_elfcorehdr();
 
+	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
+		reserve_crashkernel();
+
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }
@@ -553,6 +580,7 @@ void __init bootmem_init(void)
 	 * request_standard_resources() depends on crashkernel's memory being
 	 * reserved, so do it here.
 	 */
-	reserve_crashkernel();
+	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
+		reserve_crashkernel();
 
 	reserve_quick_kexec();
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -562,16 +562,6 @@ static void __init map_mem(pgd_t *pgdp)
 		       PAGE_KERNEL, NO_CONT_MAPPINGS);
 	memblock_clear_nomap(kernel_start, kernel_end - kernel_start);
 
-#ifdef CONFIG_KEXEC_CORE
-	if (crashk_res.end) {
-		__map_memblock(pgdp, crashk_res.start,
-			       crashk_res.end + 1,
-			       PAGE_KERNEL,
-			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
-		memblock_clear_nomap(crashk_res.start,
-				     resource_size(&crashk_res));
-	}
-#endif
 #ifdef CONFIG_KFENCE
 	/*
 	 * Map the __kfence_pool at page granularity now.
@@ -584,6 +574,22 @@ static void __init map_mem(pgd_t *pgdp)
 		memblock_clear_nomap(__pa(__kfence_pool), KFENCE_POOL_SIZE);
 	}
 #endif
+
+	/*
+	 * Use page-level mappings here so that we can shrink the region
+	 * in page granularity and put back unused memory to buddy system
+	 * through /sys/kernel/kexec_crash_size interface.
+	 */
+#ifdef CONFIG_KEXEC_CORE
+	if (crashk_res.end) {
+		__map_memblock(pgdp, crashk_res.start,
+			       crashk_res.end + 1,
+			       PAGE_KERNEL,
+			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
+		memblock_clear_nomap(crashk_res.start,
+				     resource_size(&crashk_res));
+	}
+#endif
 }
 
 void mark_rodata_ro(void)
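The page-granularity mapping above is what allows the crash kernel region to
be shrunk from user space.  A minimal user-space sketch of exercising that
interface (the sysfs file is real; the halving target, error handling, and
the program itself are illustrative, and shrinking requires root plus
CONFIG_KEXEC_CORE):

	#include <stdio.h>

	int main(void)
	{
		FILE *f = fopen("/sys/kernel/kexec_crash_size", "r+");
		unsigned long long size;

		if (!f) {
			perror("kexec_crash_size");
			return 1;
		}
		if (fscanf(f, "%llu", &size) != 1) {
			fclose(f);
			return 1;
		}
		printf("crash kernel size: %llu bytes\n", size);

		/* The interface is shrink-only: writing a smaller value
		 * releases the freed pages back to the buddy allocator,
		 * which works because the region is mapped at page
		 * granularity (NO_BLOCK_MAPPINGS above). */
		rewind(f);
		fprintf(f, "%llu\n", size / 2);	/* illustrative target */
		fclose(f);
		return 0;
	}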