• J
    vmalloc: eagerly clear ptes on vunmap · 64141da5
    Jeremy Fitzhardinge 提交于
    On stock 2.6.37-rc4, running:
    
      # mount lilith:/export /mnt/lilith
      # find  /mnt/lilith/ -type f -print0 | xargs -0 file
    
    crashes the machine fairly quickly under Xen.  Often it results in oops
    messages, but the couple of times I tried just now, it just hung quietly
    and made Xen print some rude messages:
    
        (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
        3000000000000000) for mfn 1d7058 (pfn 18fa7)
        (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
        (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
        1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
        (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04
    
    Which means the domain tried to map a pagetable page RW, which would
    allow it to map arbitrary memory, so Xen stopped it.  This is because
    vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
    finished with them, and those pages got recycled as pagetable pages
    while still having these RW aliases.
    
    Removing those mappings immediately removes the Xen-visible aliases, and
    so it has no problem with those pages being reused as pagetable pages.
    Deferring the TLB flush doesn't upset Xen because it can flush the TLB
    itself as needed to maintain its invariants.
    
    When unmapping a region in the vmalloc space, clear the ptes
    immediately.  There's no point in deferring this because there's no
    amortization benefit.
    
    The TLBs are left dirty, and they are flushed lazily to amortize the
    cost of the IPIs.
    
    This specific motivation for this patch is an oops-causing regression
    since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
    of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
    pages") .  XFS also uses vm_map_ram() and could cause similar problems.
    Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    Cc: Nick Piggin <npiggin@kernel.dk>
    Cc: Bryan Schumaker <bjschuma@netapp.com>
    Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
    Cc: Alex Elder <aelder@sgi.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    64141da5
vmalloc.h 4.1 KB