1. 24 Jan 2013, 1 commit
  2. 09 Oct 2012, 1 commit
    • mm: Add and use update_mmu_cache_pmd() in transparent huge page code. · b113da65
      Authored by David Miller
      The transparent huge page code passes a PMD pointer in as the third
      argument of update_mmu_cache(), which expects a PTE pointer.
      
      This never got noticed because X86 implements update_mmu_cache() as a
      macro and thus we don't get any type checking, and X86 is the only
      architecture which supports transparent huge pages currently.
      
      Before other architectures can support transparent huge pages properly we
      need to add a new interface which will take a PMD pointer as the third
      argument rather than a PTE pointer.
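
      For architectures where these hooks are real functions rather than
      empty macros, the resulting interface pair has roughly the shape
      sketched below (an illustration of the API, not the verbatim patch;
      on x86 both remain no-op macros):

        /* Existing PTE-level hook: third argument is a pte_t pointer. */
        void update_mmu_cache(struct vm_area_struct *vma,
                              unsigned long address, pte_t *ptep);

        /* New PMD-level hook for transparent huge pages: same shape, but
         * the third argument is a pmd_t pointer, so THP call sites are
         * actually type-checked. */
        void update_mmu_cache_pmd(struct vm_area_struct *vma,
                                  unsigned long address, pmd_t *pmd);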
      
      [akpm@linux-foundation.org: implement update_mmu_cache_pmd() for s390]
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b113da65
  3. 27 Oct 2010, 1 commit
  4. 21 Oct 2010, 1 commit
  5. 19 Aug 2010, 1 commit
    • x86-32: Separate 1:1 pagetables from swapper_pg_dir · fd89a137
      Authored by Joerg Roedel
      This patch fixes machine crashes which occur when heavily exercising the
      CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
      AMD Erratum 383 and result in a fatal machine check exception. Here's
      the scenario:
      
      1. On 32-bit, the swapper_pg_dir page table is used as the initial page
      table for booting a secondary CPU.
      
      2. To make this work, swapper_pg_dir needs a direct mapping of physical
      memory in it (the low mappings). By adding those low, large page (2M)
      mappings (PAE kernel), we create the necessary conditions for Erratum
      383 to occur.
      
      3. Other CPUs which do not participate in the off- and onlining game may
      use swapper_pg_dir while the low mappings are present (when leave_mm is
      called). For all steps below, the CPU referred to is a CPU that is using
      swapper_pg_dir, and not the CPU which is being onlined.
      
      4. The presence of the low mappings in swapper_pg_dir can result
      in TLB entries for addresses below __PAGE_OFFSET being established
      speculatively. These TLB entries are marked global and large.
      
      5. When the CPU with such a TLB entry switches to another page table,
      this TLB entry remains because it is global.
      
      6. The process then generates an access to an address covered by the
      above TLB entry but there is a permission mismatch - the TLB entry
      covers a large global page not accessible to userspace.
      
      7. Due to this permission mismatch a new 4KB user TLB entry gets
      established. Further, Erratum 383 provides for a small window of
      time where both TLB entries are present. This results in an
      uncorrectable machine check exception signalling a TLB multimatch,
      which panics the machine.
      
      There are two ways to fix this issue:
      
              1. Always do a global TLB flush when a new cr3 is loaded and
              the old page table was swapper_pg_dir. I consider this a hack
              that is hard to understand, and it has performance implications.
      
              2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
              does.
      
      This patch implements solution 2. It introduces a trampoline_pg_dir
      which has the same layout as swapper_pg_dir with low_mappings. This page
      table is used as the initial page table of the booting CPU. Later in the
      bringup process, it switches to swapper_pg_dir and does a global TLB
      flush. This fixes the crashes in our test cases.
      
      -v2: switch to swapper_pg_dir right after entering start_secondary() so
      that we are able to access percpu data which might not be mapped in the
      trampoline page table.
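
      In sketch form, the bringup flow this patch establishes looks like
      the following (an illustration based on the description above, not
      the full diff):

        /* arch/x86/kernel/smpboot.c (sketch) */
        static void __cpuinit start_secondary(void *unused)
        {
        	/* ... early bringup, still on trampoline_pg_dir ... */
        #ifdef CONFIG_X86_32
        	/* Switch away from the trampoline page table right away and
        	 * flush global TLB entries, so that no large global entries
        	 * derived from the low mappings survive - this closes the
        	 * Erratum 383 window and makes percpu data reachable. */
        	load_cr3(swapper_pg_dir);
        	__flush_tlb_all();
        #endif
        	/* ... rest of secondary-CPU bringup ... */
        }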
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      LKML-Reference: <20100816123833.GB28147@aftab>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      fd89a137
  6. 30 Mar 2010, 1 commit
    • x86: don't include slab.h from arch/x86/include/asm/pgtable_32.h · 57f4c226
      Authored by Tejun Heo
      Including slab.h from x86 pgtable_32.h creates a troublesome
      dependency chain with ftrace enabled.  The following chain leads to
      inclusion of pgtable_32.h from define_trace.h.
      
       trace/define_trace.h
       trace/ftrace.h
       linux/ftrace_event.h
       linux/ring_buffer.h
       linux/mm.h
       asm/pgtable.h
       asm/pgtable_32.h
      
      slab.h itself defines trace hooks via
      
       linux/sl[aou]b_def.h
       linux/kmemtrace.h
       trace/events/kmem.h
      
      If slab.h is not included before define_trace.h is included, this
      leads to duplicate definitions of kmemtrace hooks or other include
      dependency problems.
      
      pgtable_32.h doesn't need slab.h to begin with.  Don't include it from
      there.
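
      The fix itself is a one-line include removal; in sketch form (the
      neighbouring include is an assumption for illustration, not quoted
      from the header):

        /* arch/x86/include/asm/pgtable_32.h (illustrative excerpt) */
        #include <linux/threads.h>
        /* #include <linux/slab.h> -- removed: the header never needed it,
         * and via linux/mm.h it dragged kmemtrace hooks into every
         * define_trace.h expansion */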
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      57f4c226
  7. 28 Feb 2010, 1 commit
  8. 21 Feb 2010, 1 commit
    • MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself · 4b3073e1
      Authored by Russell King
      On VIVT ARM, when we have multiple shared mappings of the same file
      in the same MM, we need to ensure that we have coherency across all
      copies.  We do this via make_coherent() by making the pages
      uncacheable.
      
      This used to work fine, until we allowed highmem with highpte - we
      now have a page table which is mapped as required, and is not available
      for modification via update_mmu_cache().
      
      Ralf Baechle suggested getting rid of the PTE value passed to
      update_mmu_cache():
      
        On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
        to construct a pointer to the pte again.  Passing a pte_t * is much
        more elegant.  Maybe we might even replace the pte argument with the
        pte_t?
      
      Ben Herrenschmidt would also like the pte pointer for PowerPC:
      
        Passing the ptep in there is exactly what I want.  I want that
        -instead- of the PTE value, because I have issue on some ppc cases,
        for I$/D$ coherency, where set_pte_at() may decide to mask out the
        _PAGE_EXEC.
      
      So, pass the mapped page table pointer into update_mmu_cache(), and
      remove the PTE value, updating all implementations and call sites to
      suit.
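
      Concretely, the prototype change has this shape (a sketch of the
      API change only; each architecture supplies its own body):

        /* Before: callers passed the PTE value, so implementations could
         * neither reach nor modify the entry itself. */
        void update_mmu_cache(struct vm_area_struct *vma,
                              unsigned long address, pte_t pte);

        /* After: callers pass the mapped PTE pointer instead. */
        void update_mmu_cache(struct vm_area_struct *vma,
                              unsigned long address, pte_t *ptep);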
      
      Includes a fix from Stephen Rothwell:
      
        sparc: fix fallout from update_mmu_cache API change
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      4b3073e1
  9. 15 Jun 2009, 2 commits
    • x86: Add NMI types for kmap_atomic, fix · 0990b1c6
      Authored by Peter Zijlstra
      I just realized this has a kmap_atomic bug in it...
      
      The change below would fix it - but it complicates this code
      some more.
      
      Alternatively I would have to introduce something like
      pte_offset_map_irq(), which would do the irq/nmi detection and
      leave the regular code paths alone. However, that would mean either
      duplicating the gup_fast() pagewalk or passing down a pte function
      pointer, which would only duplicate the gup_pte_range() bit;
      neither is really attractive ...
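
      For reference, the shape this fix took in pgtable_32.h was roughly
      the following (a hedged reconstruction from the description above;
      kmap_atomic_pte() is the HIGHPTE wrapper of this era):

        /* Pick the kmap slot by context, so an NMI that arrives in the
         * middle of a regular pte_offset_map() cannot reuse its slot: */
        #define __KM_PTE			\
        	(in_nmi() ? KM_NMI_PTE :	\
        	 in_irq() ? KM_IRQ_PTE :	\
        	 KM_PTE0)

        #define pte_offset_map(dir, address)				\
        	((pte_t *)kmap_atomic_pte(pmd_page(*(dir)), __KM_PTE) +	\
        	 pte_index((address)))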
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0990b1c6
    • x86: Add NMI types for kmap_atomic · 3ff0141a
      Authored by Peter Zijlstra
      Two new kmap_atomic slots for NMI context. And teach pte_offset_map()
      about NMI context.
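
      A sketch of the slot additions (abridged; the enum in
      asm-generic/kmap_types.h is longer, and the exact neighbours shown
      here are assumptions):

        enum km_type {
        	KM_BOUNCE_READ,
        	/* ... existing slots ... */
        	KM_IRQ_PTE,
        	KM_NMI,		/* general mapping from NMI context */
        	KM_NMI_PTE,	/* pte_offset_map() from NMI context */
        	KM_TYPE_NR
        };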
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3ff0141a
  10. 15 Mar 2009, 1 commit
  11. 12 Feb 2009, 2 commits
  12. 07 Feb 2009, 13 commits
  13. 19 Dec 2008, 1 commit
  14. 23 Oct 2008, 2 commits
  15. 21 Aug 2008, 1 commit
    • i386: vmalloc size fix · e621bd18
      Authored by Dave Young
      Booting the kernel with vmalloc=[any size <= 16M] will oops on my
      PC (i386, 1G memory).
      
      The BUG_ON in arch/x86/mm/init_32.c triggers:
      BUG_ON((unsigned long)high_memory > VMALLOC_START);
      
      It's due to the vm area hole.
      
      In include/asm-x86/pgtable_32.h:
      #define VMALLOC_OFFSET	(8 * 1024 * 1024)
      #define VMALLOC_START	(((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
      			 & ~(VMALLOC_OFFSET - 1))
      
      There are several related points:
      1. MAXMEM:
       currently (-__PAGE_OFFSET - __VMALLOC_RESERVE), which wrongly
      counts the space after VMALLOC_END as well. I set it to
      (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE).
      
      2. VMALLOC_OFFSET is not accounted for in __VMALLOC_RESERVE;
      fixed by adding VMALLOC_OFFSET to it.
      
      3. VMALLOC_START:
       (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1))
      So the gap is not always 8M; it can be bigger than 8M. I set it to
      ((unsigned long)high_memory + VMALLOC_OFFSET).
      
      4. VMALLOC_RESERVE is an unused macro, so remove it here.
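
      Assembling the values the commit states, the adjusted definitions
      come out roughly as follows (a sketch built from the description
      above, not the verbatim diff):

        /* include/asm-x86/pgtable_32.h (sketch) */
        #define VMALLOC_OFFSET	(8 * 1024 * 1024)

        /* Exactly one VMALLOC_OFFSET guard hole above high_memory; no
         * rounding, so the hole can no longer grow past 8M. */
        #define VMALLOC_START	((unsigned long)high_memory + VMALLOC_OFFSET)

        /* The direct mapping must stop where the vmalloc reservation
         * begins, measured back from VMALLOC_END: */
        #define MAXMEM	(VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)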
      Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
      Cc: akpm@linux-foundation.org
      Cc: hidave.darkstar@gmail.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      e621bd18
  16. 23 Jul 2008, 2 commits
  17. 22 Jul 2008, 2 commits
  18. 16 Jul 2008, 1 commit
  19. 08 Jul 2008, 1 commit
  20. 20 May 2008, 1 commit
  21. 07 May 2008, 1 commit
    • x86: fix PAE pmd_bad bootup warning · aeed5fce
      Authored by Hugh Dickins
      Fix warning from pmd_bad() at bootup on a HIGHMEM64G HIGHPTE x86_32.
      
      That came from 9fc34113 x86: debug pmd_bad();
      but we understand now that the typecasting was wrong for PAE in the previous
      version: pagetable pages above 4GB looked bad and stopped Arjan from booting.
      
      And revert that cded932b x86: fix pmd_bad
      and pud_bad to support huge pages.  It was the wrong way round: we shouldn't
      weaken every pmd_bad and pud_bad check to let huge pages slip through - in
      part they check that we _don't_ have a huge page where it's not expected.
      
      Put the x86 pmd_bad() and pud_bad() definitions back to what they have long
      been: they can be improved (x86_32 should use PTE_MASK, to stop PAE thinking
      junk in the upper word is good; and x86_64 should follow x86_32's stricter
      comparison, to stop thinking any subset of required bits is good); but that
      should be a later patch.
      
      Fix Hans' good observation that follow_page() will never find pmd_huge()
      because that would have already failed the pmd_bad test: test pmd_huge in
      between the pmd_none and pmd_bad tests.  Tighten x86's pmd_huge() check?
      No, once it's a hugepage entry, it can get quite far from a good pmd: for
      example, PROT_NONE leaves it with only ACCESSED of the KERN_PGTABLE bits.
      
      However... though follow_page() contains this and another test for
      huge pages, so it's nice to keep it working on them, where does it
      actually get called on a huge page?  get_user_pages() checks
      is_vm_hugetlb_page(vma) to call alternative hugetlb processing, as
      does unmap_vmas() and others.
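
      The follow_page() ordering described above, in sketch form (an
      illustration of the check sequence only; follow_huge_pmd()'s exact
      signature is an assumption):

        if (pmd_none(*pmd))
        	goto no_page_table;
        if (pmd_huge(*pmd)) {
        	/* Must come before pmd_bad(): a hugepage entry (e.g. after
        	 * PROT_NONE) can be arbitrarily far from a "good" pmd. */
        	page = follow_huge_pmd(mm, address, pmd, write);
        	goto out;
        }
        if (pmd_bad(*pmd))
        	goto no_page_table;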
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Earlier-version-tested-by: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jeff Chua <jeff.chua.linux@gmail.com>
      Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      aeed5fce
  22. 26 Apr 2008, 1 commit
  23. 25 Apr 2008, 1 commit