    x86, mm: Only direct map addresses that are marked as E820_RAM · 66520ebc
    Committed by Jacob Shin
    Currently direct mappings are created for [ 0 to max_low_pfn<<PAGE_SHIFT )
    and [ 4GB to max_pfn<<PAGE_SHIFT ), which may include regions that are not
    backed by actual DRAM. This is fine for holes under 4GB which are covered
    by fixed and variable range MTRRs to be UC. However, we run into trouble
    on higher memory addresses which cannot be covered by MTRRs.
    
    Our system with 1TB of RAM has an e820 that looks like this:
    
     BIOS-e820: [mem 0x0000000000000000-0x00000000000983ff] usable
     BIOS-e820: [mem 0x0000000000098400-0x000000000009ffff] reserved
     BIOS-e820: [mem 0x00000000000d0000-0x00000000000fffff] reserved
     BIOS-e820: [mem 0x0000000000100000-0x00000000c7ebffff] usable
     BIOS-e820: [mem 0x00000000c7ec0000-0x00000000c7ed7fff] ACPI data
     BIOS-e820: [mem 0x00000000c7ed8000-0x00000000c7ed9fff] ACPI NVS
     BIOS-e820: [mem 0x00000000c7eda000-0x00000000c7ffffff] reserved
     BIOS-e820: [mem 0x00000000fec00000-0x00000000fec0ffff] reserved
     BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
     BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved
     BIOS-e820: [mem 0x0000000100000000-0x000000e037ffffff] usable
     BIOS-e820: [mem 0x000000e038000000-0x000000fcffffffff] reserved
     BIOS-e820: [mem 0x0000010000000000-0x0000011ffeffffff] usable
    
    and so direct mappings are created for the huge memory hole between
    0x000000e038000000 and 0x0000010000000000. Even though the kernel never
    generates memory accesses in that region, the page tables incorrectly
    mark it as WB, so our (AMD) processor ends up causing an MCE while
    doing some memory bookkeeping/optimizations around that area.
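
To put a number on the problem, here is a minimal sketch (not kernel code) that counts how many bytes of the old [ 4GB to max_pfn<<PAGE_SHIFT ) mapping are not backed by RAM. `struct mem_range` and `unbacked_bytes()` are hypothetical names invented for this illustration, and the sketch assumes the entries do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an e820 entry. */
struct mem_range {
    uint64_t start;  /* first byte of the region */
    uint64_t end;    /* one past the last byte (exclusive) */
    int is_ram;      /* nonzero if the firmware reported it as usable RAM */
};

/*
 * Return the number of bytes inside [lo, hi) that no RAM entry covers.
 * Assumes the entries in map[] do not overlap each other.
 */
uint64_t unbacked_bytes(const struct mem_range *map, size_t n,
                        uint64_t lo, uint64_t hi)
{
    uint64_t backed = 0;

    assert(map != NULL && lo <= hi);
    for (size_t i = 0; i < n; i++) {
        if (!map[i].is_ram)
            continue;
        /* Clamp the RAM entry to the window being examined. */
        uint64_t s = map[i].start > lo ? map[i].start : lo;
        uint64_t e = map[i].end < hi ? map[i].end : hi;
        if (s < e)
            backed += e - s;
    }
    return (hi - lo) - backed;
}
```

Fed the three entries above 4GB from the listing, with the window running from 0x100000000 to 0x11fff000000, this returns 0x1fc8000000 bytes, roughly 127 GiB of address space that the old scheme mapped as WB with no DRAM behind it.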
    
    This patch iterates through e820 and only direct maps ranges that are
    marked as E820_RAM, and keeps track of those pfn ranges. Depending on
    the alignment of the E820 ranges, this may result in smaller page
    sizes (i.e. 4K instead of 2M or 1G) being used in the page tables.
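
The approach can be sketched in user-space C as follows. This is an illustration under stated assumptions, not the code in this patch: `struct sketch_entry`, `collect_ram_ranges()`, `is_pfn_mapped()` and `page_shift_for()` are hypothetical names, and the real kernel walks its own memory-range iterators rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* 4K base pages, as on x86 */

/* Hypothetical stand-ins for the e820 type constants and entry layout. */
enum { SKETCH_RAM = 1, SKETCH_RESERVED = 2 };

struct sketch_entry {
    uint64_t start;  /* first byte of the region */
    uint64_t end;    /* one past the last byte (exclusive) */
    int type;
};

struct pfn_range {
    uint64_t start_pfn;  /* inclusive */
    uint64_t end_pfn;    /* exclusive */
};

/* Record only RAM entries as pfn ranges, merging ranges that touch. */
size_t collect_ram_ranges(const struct sketch_entry *map, size_t n,
                          struct pfn_range *out, size_t max_out)
{
    const uint64_t page_mask = (1ULL << SKETCH_PAGE_SHIFT) - 1;
    size_t count = 0;

    assert(map != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (map[i].type != SKETCH_RAM)
            continue;  /* holes and reserved regions are never mapped */
        /* Round start up and end down so only whole pages are kept. */
        uint64_t start_pfn = (map[i].start + page_mask) >> SKETCH_PAGE_SHIFT;
        uint64_t end_pfn = map[i].end >> SKETCH_PAGE_SHIFT;
        if (start_pfn >= end_pfn)
            continue;
        if (count > 0 && out[count - 1].end_pfn == start_pfn) {
            out[count - 1].end_pfn = end_pfn;  /* extend previous range */
            continue;
        }
        if (count == max_out)
            break;
        out[count].start_pfn = start_pfn;
        out[count].end_pfn = end_pfn;
        count++;
    }
    return count;
}

/* Check whether a pfn falls inside any recorded range. */
int is_pfn_mapped(const struct pfn_range *r, size_t n, uint64_t pfn)
{
    for (size_t i = 0; i < n; i++)
        if (pfn >= r[i].start_pfn && pfn < r[i].end_pfn)
            return 1;
    return 0;
}

/*
 * Largest page size (as a shift: 30 = 1G, 21 = 2M, 12 = 4K) that can map
 * memory starting at addr without running past end. Returns 0 if addr is
 * not 4K-aligned or less than one page remains.
 */
unsigned page_shift_for(uint64_t addr, uint64_t end)
{
    static const unsigned shifts[] = { 30, 21, 12 };

    if (addr >= end)
        return 0;
    for (size_t i = 0; i < sizeof(shifts) / sizeof(shifts[0]); i++) {
        uint64_t size = 1ULL << shifts[i];
        if ((addr & (size - 1)) == 0 && end - addr >= size)
            return shifts[i];
    }
    return 0;
}
```

Applied to the listing above, the usable range starting at 0x100000 is not 2M-aligned, so `page_shift_for()` selects 4K pages at its head, while the range starting at 0x100000000 is 1G-aligned and can begin with 1G pages. This is the alignment effect the paragraph describes.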
    
    -v2: move changes from setup.c to mm/init.c, also use for_each_mem_pfn_range
    	instead.  - Yinghai Lu
    -v3: add calculate_all_table_space_size() to get correct needed page table
    	size. - Yinghai Lu
    -v4: fix add_pfn_range_mapped() to get correct max_low_pfn_mapped when
         the mem map has a hole under 4G, as found by Konrad on a Xen
         domU with 8G of RAM. - Yinghai
    Signed-off-by: Jacob Shin <jacob.shin@amd.com>
    Link: http://lkml.kernel.org/r/1353123563-3103-16-git-send-email-yinghai@kernel.org
    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Reviewed-by: Pekka Enberg <penberg@kernel.org>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
setup.c 26.3 KB