1. 04 Mar 2008, 1 commit
  2. 01 Mar 2008, 1 commit
    • x86: fix pmd_bad and pud_bad to support huge pages · cded932b
      Hans Rosenfeld committed
      I recently stumbled upon a problem in the support for huge pages. If a
      program using huge pages does not explicitly unmap them, they remain
      mapped (and are therefore lost) after the program exits.
      
      I observed that the free huge page count in /proc/meminfo decreased when
      running my program, and it did not increase after the program exited.
      After running the program a few times, no more huge pages could be
      allocated.
      
      The reason for this seems to be that the x86 pmd_bad and pud_bad consider
      pmd/pud entries with the PSE bit set to be invalid. I think there is
      nothing wrong with this bit being set; it just indicates that the lowest
      level of translation has been reached. This bit has to be (and is)
      checked after the basic validity of the entry has been checked, as in
      this fragment from follow_page() in mm/memory.c:
      
        if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
                goto no_page_table;
      
        if (pmd_huge(*pmd)) {
                BUG_ON(flags & FOLL_GET);
                page = follow_huge_pmd(mm, address, pmd, flags & FOLL_WRITE);
                goto out;
        }
      
      Note that this code currently doesn't work as intended if the pmd refers
      to a huge page: because pmd_bad() rejects such an entry, the pmd_huge()
      check can never be reached.
      
      Extending pmd_bad() (and, for future 1GB page support, pud_bad()) to
      allow for the PSE bit being set fixes this. For similar reasons, allowing
      the NX bit to be set is necessary, too. I have seen huge pages with the
      NX bit set in their pmd entry, which would cause the same problem.
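      
      A hedged sketch of what such a relaxed check could look like is below; the
      flag and mask macro names (PTE_MASK, _PAGE_USER, _PAGE_PSE, _PAGE_NX,
      _KERNPG_TABLE) follow the x86 pgtable headers of this era, and the exact
      form of the committed change may differ:
      
        /* Sketch: an entry is "bad" only if, after masking off the frame
         * address plus the USER, PSE (huge page) and NX bits, the remaining
         * flags differ from the expected kernel page-table flags. */
        static inline int pmd_bad(pmd_t pmd)
        {
                return (pmd_val(pmd) &
                        ~(PTE_MASK | _PAGE_USER | _PAGE_PSE | _PAGE_NX))
                        != _KERNPG_TABLE;
        }
      
        static inline int pud_bad(pud_t pud)
        {
                return (pud_val(pud) &
                        ~(PTE_MASK | _PAGE_USER | _PAGE_PSE | _PAGE_NX))
                        != _KERNPG_TABLE;
        }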
      Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 10 Feb 2008, 1 commit
    • x86: construct 32-bit boot time page tables in native format. · 551889a6
      Ian Campbell committed
      Specifically, the boot time page tables in a kernel built with
      CONFIG_X86_PAE=y are in PAE format.
      
      early_ioremap is updated to use the standard page table accessors.
      
      Clear any mappings beyond max_low_pfn from the boot page tables in
      native_pagetable_setup_start because the initial mappings can extend
      beyond the range of physical memory and into the vmalloc area.
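      
      As an illustration of that clearing step, the loop below walks the boot
      page tables from just above max_low_pfn and clears every present pte it
      finds, using the standard accessors mentioned above. It is a minimal
      sketch of the idea, not necessarily the exact committed code:
      
        static void __init native_pagetable_setup_start(pgd_t *base)
        {
                unsigned long pfn, va;
                pgd_t *pgd;
                pud_t *pud;
                pmd_t *pmd;
                pte_t *pte;
      
                /* Remove boot-time mappings which extend past the end of low
                 * memory; they would otherwise overlap the vmalloc area. */
                for (pfn = max_low_pfn + 1; pfn < 1 << (32 - PAGE_SHIFT); pfn++) {
                        va = PAGE_OFFSET + (pfn << PAGE_SHIFT);
                        pgd = base + pgd_index(va);
                        if (!pgd_present(*pgd))
                                break;
                        pud = pud_offset(pgd, va);
                        pmd = pmd_offset(pud, va);
                        if (!pmd_present(*pmd))
                                break;
                        pte = pte_offset_kernel(pmd, va);
                        if (!pte_present(*pte))
                                break;
                        pte_clear(NULL, va, pte);
                }
        }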
      
      Derived from patches by Eric Biederman and H. Peter Anvin.
      
      [ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ]
      Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Mika Penttilä <mika.penttila@kolumbus.fi>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  4. 06 Feb 2008, 1 commit
  5. 04 Feb 2008, 1 commit
  6. 30 Jan 2008, 16 commits
  7. 20 Oct 2007, 1 commit
  8. 17 Oct 2007, 1 commit
  9. 11 Oct 2007, 1 commit
  10. 18 Jul 2007, 2 commits
  11. 17 Jul 2007, 2 commits
  12. 17 Jun 2007, 2 commits
  13. 13 May 2007, 1 commit
  14. 09 May 2007, 2 commits
  15. 08 May 2007, 2 commits
  16. 03 May 2007, 5 commits
    • [PATCH] i386: pte simplify ops · 9e5e3162
      Zachary Amsden committed
      Add a comment and condense code to make use of the
      native_local_ptep_get_and_clear function.  Also, it turns out the 2-level
      and 3-level paging definitions were identical, so move the common
      definition into pgtable.h.
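      
      A hedged sketch of the shape this takes: a purely local get-and-clear
      helper plus a single shared definition built on top of it (treat the
      exact names and placement as assumptions, not the committed diff):
      
        /* Read and clear a pte without notifying a hypervisor; only safe
         * when no other CPU can be using the mapping. */
        static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
        {
                pte_t res = *ptep;
      
                native_pte_clear(NULL, 0, ptep);
                return res;
        }
      
        /* Common 2-level/3-level definition, now shared in pgtable.h. */
        static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
                                                    unsigned long addr,
                                                    pte_t *ptep, int full)
        {
                pte_t pte;
                if (full)               /* whole mm is being torn down */
                        pte = native_local_ptep_get_and_clear(ptep);
                else
                        pte = ptep_get_and_clear(mm, addr, ptep);
                return pte;
        }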
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: pte clear optimization · c2c1accd
      Zachary Amsden committed
      When exiting from an address space, no special hypervisor notification of
      page table updates needs to occur; direct page table hypervisors, such as
      Xen, switch to another address space first (init_mm) and unprotect the
      page tables to avoid the cost of trapping to the hypervisor for each
      pte_clear.  Shadow mode hypervisors, such as VMI and lhype, don't need to
      do the extra work of calling through paravirt-ops, and can just directly
      clear the page table entries without notifying the hypervisor, since all
      the page tables are about to be freed.
      
      So introduce native_pte_clear functions which bypass any paravirt-ops
      notification.  This results in a significant performance win for VMI and
      removes some indirect calls from zap_pte_range.
      
      Note that the 3-level paging code already had a native_pte_clear
      function; the 2-level definition therefore has to conform to the same
      argument list and take extra, unused arguments.
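      
      For illustration, the 2-level native clear can be nothing more than a
      plain word store, with mm and addr present only to match the 3-level
      signature; this is a sketch under those assumptions:
      
        /* Clear a pte with no paravirt/hypervisor notification.  The mm and
         * addr arguments exist only for conformance with the 3-level (PAE)
         * version's argument list. */
        static inline void native_pte_clear(struct mm_struct *mm,
                                            unsigned long addr, pte_t *xp)
        {
                *xp = native_make_pte(0);
        }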
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: PARAVIRT: drop unused ptep_get_and_clear · 4cdd9c89
      Jeremy Fitzhardinge committed
      In shadow mode hypervisors, ptep_get_and_clear achieves the desired
      purpose of keeping the shadows in sync by issuing a native_get_and_clear,
      followed by a call to pte_update, which indicates the PTE has been
      modified.
      
      Direct mode hypervisors (Xen) have no need for this anyway, and will trap
      the update using writable pagetables.
      
      This means no hypervisor makes use of ptep_get_and_clear; there is no
      reason to have it in the paravirt-ops structure.  Change confusing
      terminology about raw vs. native functions into consistent use of
      native_pte_xxx for operations which do not invoke paravirt-ops.
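      
      A rough sketch of the resulting generic wrapper; since pte_update()
      already gives a shadow-mode hypervisor the notification it needs, no
      dedicated paravirt-ops entry is required (names follow the description
      above and are illustrative):
      
        static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
                                               unsigned long addr, pte_t *ptep)
        {
                /* Plain native read-and-clear, no paravirt-ops call... */
                pte_t pte = native_ptep_get_and_clear(ptep);
      
                /* ...followed by pte_update(), which is all a shadow-mode
                 * hypervisor needs to keep its shadows in sync. */
                pte_update(mm, addr, ptep);
                return pte;
        }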
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: PARAVIRT: add kmap_atomic_pte for mapping highpte pages · ce6234b5
      Jeremy Fitzhardinge committed
      Xen and VMI both have special requirements when mapping a highmem pte
      page into the kernel address space.  These can be dealt with by adding
      a new kmap_atomic_pte() function for mapping highptes, and hooking it
      into the paravirt_ops infrastructure.
      
      Xen specifically wants to map the pte page RO, so this patch exposes a
      helper function, kmap_atomic_prot, which maps the page with the
      specified page protections.
      
      This also adds a kmap_flush_unused() function to clear out the cached
      kmap mappings.  Xen needs this to clear out any potential stray RW
      mappings of pages which will become part of a pagetable.
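      
      As a hedged illustration of how a backend might use the new hook (the
      xen_kmap_atomic_pte name below is hypothetical, and km_type reflects the
      kmap_atomic interface of this era):
      
        /* Hypothetical backend implementation: map a highmem pte page
         * read-only via kmap_atomic_prot(), as Xen requires. */
        static void *xen_kmap_atomic_pte(struct page *page, enum km_type type)
        {
                return kmap_atomic_prot(page, type, PAGE_KERNEL_RO);
        }
      
      The default (native) hook can simply fall back to kmap_atomic(page, type).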
      
      [ Zach - vmi.c will need some attention after this patch.  It wasn't
        immediately obvious to me what needs to be done. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
    • [PATCH] i386: PARAVIRT: revert map_pt_hook. · a27fe809
      Jeremy Fitzhardinge committed
      Back out the map_pt_hook to clear the way for kmap_atomic_pte.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>