1. 25 10月, 2015 1 次提交
  2. 23 9月, 2015 4 次提交
  3. 25 8月, 2015 1 次提交
  4. 07 6月, 2015 2 次提交
    • T
      x86/mm/pat: Add set_memory_wt() for Write-Through type · 623dffb2
      Toshi Kani 提交于
      Now that reserve_ram_pages_type() accepts the WT type, add
      set_memory_wt(), set_memory_array_wt() and set_pages_array_wt()
      in order to be able to set memory to Write-Through page cache
      mode.
      
      Also, extend ioremap_change_attr() to accept the WT type.
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Elliott@hp.com
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: hch@lst.de
      Cc: hmh@hmh.eng.br
      Cc: jgross@suse.com
      Cc: konrad.wilk@oracle.com
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: linux-nvdimm@lists.01.org
      Cc: stefan.bader@canonical.com
      Cc: yigal@plexistor.com
      Link: http://lkml.kernel.org/r/1433436928-31903-13-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      623dffb2
    • B
      x86/mm/pat: Remove pat_enabled() checks · 7202fdb1
      Borislav Petkov 提交于
      Now that we emulate a PAT table when PAT is disabled, there's no
      need for those checks anymore as the PAT abstraction will handle
      those cases too.
      
      Based on a conglomerate patch from Toshi Kani.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Elliott@hp.com
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: hch@lst.de
      Cc: hmh@hmh.eng.br
      Cc: jgross@suse.com
      Cc: konrad.wilk@oracle.com
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: linux-nvdimm@lists.01.org
      Cc: stefan.bader@canonical.com
      Cc: yigal@plexistor.com
      Link: http://lkml.kernel.org/r/1433436928-31903-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7202fdb1
  5. 03 6月, 2015 1 次提交
    • S
      x86/mm: Decouple <linux/vmalloc.h> from <asm/io.h> · d6472302
      Stephen Rothwell 提交于
      Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so
      remove it from there and fix up the resulting build problems
      triggered on x86 {64|32}-bit {def|allmod|allno}configs.
      
      The breakages were triggering in places where x86 builds relied
      on vmalloc() facilities but did not include <linux/vmalloc.h>
      explicitly and relied on the implicit inclusion via <asm/io.h>.
      
      Also add:
      
        - <linux/init.h> to <linux/io.h>
        - <asm/pgtable_types> to <asm/io.h>
      
      ... which were two other implicit header file dependencies.
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      [ Tidied up the changelog. ]
      Acked-by: NDavid Miller <davem@davemloft.net>
      Acked-by: NTakashi Iwai <tiwai@suse.de>
      Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: NVinod Koul <vinod.koul@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Colin Cross <ccross@android.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: James E.J. Bottomley <JBottomley@odin.com>
      Cc: Jaroslav Kysela <perex@perex.cz>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Suma Ramars <sramars@cisco.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      d6472302
  6. 27 5月, 2015 1 次提交
    • L
      x86/mm/pat: Wrap pat_enabled into a function API · cb32edf6
      Luis R. Rodriguez 提交于
      We use pat_enabled in x86-specific code to see if PAT is enabled
      or not but we're granting full access to it even though readers
      do not need to set it. If, for instance, we granted access to it
      to modules later they then could override the variable
      setting... no bueno.
      
      This renames pat_enabled to a new static variable __pat_enabled.
      Folks are redirected to use pat_enabled() now.
      
      Code that sets this can only be internal to pat.c. Apart from
      the early kernel parameter "nopat" to disable PAT, we also have
      a few cases that disable it later and make use of a helper
      pat_disable(). It is wrapped under an ifdef but since that code
      cannot run unless PAT was enabled its not required to wrap it
      with ifdefs, unwrap that. Likewise, since "nopat" doesn't really
      change non-PAT systems just remove that ifdef as well.
      
      Although we could add and use an early_param_off(), these
      helpers don't use __read_mostly but we want to keep
      __read_mostly for __pat_enabled as this is a hot path -- upon
      boot, for instance, a simple guest may see ~4k accesses to
      pat_enabled(). Since __read_mostly early boot params are not
      that common we don't add a helper for them just yet.
      Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Walls <awalls@md.metrocast.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kyle McMartin <kyle@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1430425520-22275-3-git-send-email-mcgrof@do-not-panic.com
      Link: http://lkml.kernel.org/r/1432628901-18044-13-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cb32edf6
  7. 11 5月, 2015 3 次提交
    • D
      x86/mm/pageattr: Remove an unused variable in slow_virt_to_phys() · 1fcb61c5
      Dexuan Cui 提交于
      The patch doesn't change any logic.
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1429776428-4475-1-git-send-email-decui@microsoft.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1fcb61c5
    • L
      x86/mm: Add ioremap_uc() helper to map memory uncacheable (not UC-) · e4b6be33
      Luis R. Rodriguez 提交于
      ioremap_nocache() currently uses UC- by default. Our goal is to
      eventually make UC the default. Linux maps UC- to PCD=1, PWT=0
      page attributes on non-PAT systems. Linux maps UC to PCD=1,
      PWT=1 page attributes on non-PAT systems. On non-PAT and PAT
      systems a WC MTRR has different effects on pages with either of
      these attributes. In order to help with a smooth transition its
      best to enable use of UC (PCD,1, PWT=1) on a region as that
      ensures a WC MTRR will have no effect on a region, this however
      requires us to have an way to declare a region as UC and we
      currently do not have a way to do this.
      
        WC MTRR on non-PAT system with PCD=1, PWT=0 (UC-) yields WC.
        WC MTRR on non-PAT system with PCD=1, PWT=1 (UC)  yields UC.
      
        WC MTRR on PAT system with PCD=1, PWT=0 (UC-) yields WC.
        WC MTRR on PAT system with PCD=1, PWT=1 (UC)  yields UC.
      
      A flip of the default ioremap_nocache() behaviour from UC- to UC
      can therefore regress a memory region from effective memory type
      WC to UC if MTRRs are used. Use of MTRRs should be phased out
      and in the best case only arch_phys_wc_add() use will remain,
      even if this happens arch_phys_wc_add() will have an effect on
      non-PAT systems and changes to default ioremap_nocache()
      behaviour could regress drivers.
      
      Now, ideally we'd use ioremap_nocache() on the regions in which
      we'd need uncachable memory types and avoid any MTRRs on those
      regions. There are however some restrictions on MTRRs use, such
      as the requirement of having the base and size of variable sized
      MTRRs to be powers of two, which could mean having to use a WC
      MTRR over a large area which includes a region in which
      write-combining effects are undesirable.
      
      Add ioremap_uc() to help with the both phasing out of MTRR use
      and also provide a way to blacklist small WC undesirable regions
      in devices with mixed regions which are size-implicated to use
      large WC MTRRs. Use of ioremap_uc() helps phase out MTRR use by
      avoiding regressions with an eventual flip of default behaviour
      or ioremap_nocache() from UC- to UC.
      
      Drivers working with WC MTRRs can use the below table to review
      and consider the use of ioremap*() and similar helpers to ensure
      appropriate behaviour long term even if default
      ioremap_nocache() behaviour changes from UC- to UC.
      
      Although ioremap_uc() is being added we leave set_memory_uc() to
      use UC- as only initial memory type setup is required to be able
      to accommodate existing device drivers and phase out MTRR use.
      It should also be clarified that set_memory_uc() cannot be used
      with IO memory, even though its use will not return any errors,
      it really has no effect.
      
        ----------------------------------------------------------------------
        MTRR Non-PAT   PAT    Linux ioremap value        Effective memory type
        ----------------------------------------------------------------------
                                                          Non-PAT |  PAT
             PAT
             |PCD
             ||PWT
             |||
        WC   000      WB      _PAGE_CACHE_MODE_WB            WC   |   WC
        WC   001      WC      _PAGE_CACHE_MODE_WC            WC*  |   WC
        WC   010      UC-     _PAGE_CACHE_MODE_UC_MINUS      WC*  |   WC
        WC   011      UC      _PAGE_CACHE_MODE_UC            UC   |   UC
        ----------------------------------------------------------------------
      Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-fbdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1430343851-967-2-git-send-email-mcgrof@do-not-panic.com
      Link: http://lkml.kernel.org/r/1431332153-18566-9-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e4b6be33
    • R
      x86/mm: Do not flush last cacheline twice in clflush_cache_range() · 6c434d61
      Ross Zwisler 提交于
      The current algorithm used in clflush_cache_range() can cause
      the last cache line of the buffer to be flushed twice. Fix that
      algorithm so that each cache line will only be flushed once.
      Reported-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/1430259192-18802-1-git-send-email-ross.zwisler@linux.intel.com
      Link: http://lkml.kernel.org/r/1431332153-18566-5-git-send-email-bp@alien8.de
      [ Changed it to 'void *' to simplify the type conversions. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6c434d61
  8. 05 3月, 2015 1 次提交
    • L
      x86/mm: Simplify enabling direct_gbpages · e5008abe
      Luis R. Rodriguez 提交于
      direct_gbpages can be force enabled as an early parameter
      but not really have taken effect when DEBUG_PAGEALLOC
      or KMEMCHECK is enabled. You can also enable direct_gbpages
      right now if you have an x86_64 architecture but your CPU
      doesn't really have support for this feature. In both cases
      PG_LEVEL_1G won't actually be enabled but direct_gbpages is used
      in other areas under the assumptions that PG_LEVEL_1G
      was set. Fix this by putting together all requirements
      which make this feature sensible to enable under, and only
      enable both finally flipping on PG_LEVEL_1G and leaving
      PG_LEVEL_1G set when this is true.
      
      We only enable this feature then to be possible on sensible
      builds defined by the new ENABLE_DIRECT_GBPAGES. If the
      CPU has support for it you can either enable this by using
      the DIRECT_GBPAGES option or using the early kernel parameter.
      If a platform had support for this you can always force disable
      it as well.
      Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: JBeulich@suse.com
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: julia.lawall@lip6.fr
      Link: http://lkml.kernel.org/r/1425518654-3403-3-git-send-email-mcgrof@do-not-panic.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e5008abe
  9. 28 2月, 2015 1 次提交
  10. 14 12月, 2014 1 次提交
    • J
      mm/debug-pagealloc: make debug-pagealloc boottime configurable · 031bc574
      Joonsoo Kim 提交于
      Now, we have prepared to avoid using debug-pagealloc in boottime.  So
      introduce new kernel-parameter to disable debug-pagealloc in boottime, and
      makes related functions to be disabled in this case.
      
      Only non-intuitive part is change of guard page functions.  Because guard
      page is effective only if debug-pagealloc is enabled, turning off
      according to debug-pagealloc is reasonable thing to do.
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Jungsoo Son <jungsoo.son@lge.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      031bc574
  11. 04 12月, 2014 1 次提交
  12. 16 11月, 2014 4 次提交
  13. 29 10月, 2014 1 次提交
  14. 14 3月, 2014 1 次提交
  15. 05 3月, 2014 2 次提交
  16. 28 2月, 2014 1 次提交
  17. 02 11月, 2013 8 次提交
  18. 12 4月, 2013 1 次提交
  19. 11 4月, 2013 1 次提交
  20. 10 4月, 2013 1 次提交
  21. 24 2月, 2013 2 次提交
    • A
      x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge · a8aed3e0
      Andrea Arcangeli 提交于
      Without this patch any kernel code that reads kernel memory in
      non present kernel pte/pmds (as set by pageattr.c) will crash.
      
      With this kernel code:
      
      static struct page *crash_page;
      static unsigned long *crash_address;
      [..]
      	crash_page = alloc_pages(GFP_KERNEL, 9);
      	crash_address = page_address(crash_page);
      	if (set_memory_np((unsigned long)crash_address, 1))
      		printk("set_memory_np failure\n");
      [..]
      
      The kernel will crash if inside the "crash tool" one would try
      to read the memory at the not present address.
      
      crash> p crash_address
      crash_address = $8 = (long unsigned int *) 0xffff88023c000000
      crash> rd 0xffff88023c000000
      [ *lockup* ]
      
      The lockup happens because _PAGE_GLOBAL and _PAGE_PROTNONE
      shares the same bit, and pageattr leaves _PAGE_GLOBAL set on a
      kernel pte which is then mistaken as _PAGE_PROTNONE (so
      pte_present returns true by mistake and the kernel fault then
      gets confused and loops).
      
      With THP the same can happen after we taught pmd_present to
      check _PAGE_PROTNONE and _PAGE_PSE in commit
      027ef6c8 ("mm: thp: fix pmd_present for
      split_huge_page and PROT_NONE with THP").  THP has the same
      problem with _PAGE_GLOBAL as the 4k pages, but it also has a
      problem with _PAGE_PSE, which must be cleared too.
      
      After the patch is applied copy_user correctly returns -EFAULT
      and doesn't lockup anymore.
      
      crash> p crash_address
      crash_address = $9 = (long unsigned int *) 0xffff88023c000000
      crash> rd 0xffff88023c000000
      rd: read error: kernel virtual address: ffff88023c000000  type:
      "64-bit KVADDR"
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Shaohua Li <shaohua.li@intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a8aed3e0
    • W
      memory-hotplug: common APIs to support page tables hot-remove · ae9aae9e
      Wen Congyang 提交于
      When memory is removed, the corresponding pagetables should alse be
      removed.  This patch introduces some common APIs to support vmemmap
      pagetable and x86_64 architecture direct mapping pagetable removing.
      
      All pages of virtual mapping in removed memory cannot be freed if some
      pages used as PGD/PUD include not only removed memory but also other
      memory.  So this patch uses the following way to check whether a page
      can be freed or not.
      
      1) When removing memory, the page structs of the removed memory are
         filled with 0FD.
      
      2) All page structs are filled with 0xFD on PT/PMD, PT/PMD can be
         cleared.  In this case, the page used as PT/PMD can be freed.
      
      For direct mapping pages, update direct_pages_count[level] when we freed
      their pagetables.  And do not free the pages again because they were
      freed when offlining.
      
      For vmemmap pages, free the pages and their pagetables.
      
      For larger pages, do not split them into smaller ones because there is
      no way to know if the larger page has been split.  As a result, there is
      no way to decide when to split.  We deal the larger pages in the
      following way:
      
      1) For direct mapped pages, all the pages were freed when they were
         offlined.  And since menmory offline is done section by section, all
         the memory ranges being removed are aligned to PAGE_SIZE.  So only need
         to deal with unaligned pages when freeing vmemmap pages.
      
      2) For vmemmap pages being used to store page_struct, if part of the
         larger page is still in use, just fill the unused part with 0xFD.  And
         when the whole page is fulfilled with 0xFD, then free the larger page.
      
      [akpm@linux-foundation.org: fix typo in comment]
      [tangchen@cn.fujitsu.com: do not calculate direct mapping pages when freeing vmemmap pagetables]
      [tangchen@cn.fujitsu.com: do not free direct mapping pages twice]
      [tangchen@cn.fujitsu.com: do not free page split from hugepage one by one]
      [tangchen@cn.fujitsu.com: do not split pages when freeing pagetable pages]
      [akpm@linux-foundation.org: use pmd_page_vaddr()]
      [akpm@linux-foundation.org: fix used-uninitialised bug]
      Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: NJianguo Wu <wujianguo@huawei.com>
      Signed-off-by: NWen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wu Jianguo <wujianguo@huawei.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ae9aae9e
  22. 26 1月, 2013 1 次提交
    • D
      x86, mm: Create slow_virt_to_phys() · d7656534
      Dave Hansen 提交于
      This is necessary because __pa() does not work on some kinds of
      memory, like vmalloc() or the alloc_remap() areas on 32-bit
      NUMA systems.  We have some functions to do conversions _like_
      this in the vmalloc() code (like vmalloc_to_page()), but they
      do not work on sizes other than 4k pages.  We would potentially
      need to be able to handle all the page sizes that we use for
      the kernel linear mapping (4k, 2M, 1G).
      
      In practice, on 32-bit NUMA systems, the percpu areas get stuck
      in the alloc_remap() area.  Any __pa() call on them will break
      and basically return garbage.
      
      This patch introduces a new function slow_virt_to_phys(), which
      walks the kernel page tables on x86 and should do precisely
      the same logical thing as __pa(), but actually work on a wider
      range of memory.  It should work on the normal linear mapping,
      vmalloc(), kmap(), etc...
      Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20130122212433.4D1FCA62@kernel.stglabs.ibm.comAcked-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      d7656534