• S
    x86/xen: don't copy bogus duplicate entries into kernel page tables · 0b5a5063
    Stefan Bader 提交于
    When RANDOMIZE_BASE (KASLR) is enabled; or the sum of all loaded
    modules exceeds 512 MiB, then loading modules fails with a warning
    (and hence a vmalloc allocation failure) because the PTEs for the
    newly-allocated vmalloc address space are not zero.
    
      WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128
               vmap_page_range_noflush+0x2a1/0x360()
    
    This is caused by xen_setup_kernel_pagetables() copying
    level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present
    entries.
    
    Without KASLR, the normal kernel image size only covers the first half
    of level2_kernel_pgt and module space starts after that.
    
    L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[  0..255]->kernel
                                                      [256..511]->module
                              [511]->level2_fixmap_pgt[  0..505]->module
    
    This allows 512 MiB of of module vmalloc space to be used before
    having to use the corrupted level2_fixmap_pgt entries.
    
    With KASLR enabled, the kernel image uses the full PUD range of 1G and
    module space starts in the level2_fixmap_pgt. So basically:
    
    L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[0..511]->kernel
                              [511]->level2_fixmap_pgt[0..505]->module
    
    And now no module vmalloc space can be used without using the corrupt
    level2_fixmap_pgt entries.
    
    Fix this by properly converting the level2_fixmap_pgt entries to MFNs,
    and setting level1_fixmap_pgt as read-only.
    
    A number of comments were also using the the wrong L3 offset for
    level2_kernel_pgt.  These have been corrected.
    Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
    Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
    Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: stable@vger.kernel.org
    0b5a5063
pgtable_64.h 4.9 KB