1. 09 9月, 2017 1 次提交
    • Z
      mm: thp: enable thp migration in generic path · 616b8371
      Zi Yan 提交于
      Add thp migration's core code, including conversions between a PMD entry
      and a swap entry, setting PMD migration entry, removing PMD migration
      entry, and waiting on PMD migration entries.
      
      This patch makes it possible to support thp migration.  If you fail to
      allocate a destination page as a thp, you just split the source thp as
      we do now, and then enter the normal page migration.  If you succeed to
      allocate destination thp, you enter thp migration.  Subsequent patches
      actually enable thp migration for each caller of page migration by
      allowing its get_new_page() callback to allocate thps.
      
      [zi.yan@cs.rutgers.edu: fix gcc-4.9.0 -Wmissing-braces warning]
        Link: http://lkml.kernel.org/r/A0ABA698-7486-46C3-B209-E95A9048B22C@cs.rutgers.edu
      [akpm@linux-foundation.org: fix x86_64 allnoconfig warning]
      Signed-off-by: NZi Yan <zi.yan@cs.rutgers.edu>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Nellans <dnellans@nvidia.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      616b8371
  2. 10 3月, 2017 1 次提交
  3. 25 2月, 2017 1 次提交
  4. 18 3月, 2016 1 次提交
  5. 12 2月, 2016 2 次提交
  6. 16 1月, 2016 2 次提交
  7. 15 1月, 2016 1 次提交
  8. 17 10月, 2015 3 次提交
  9. 25 6月, 2015 2 次提交
  10. 13 2月, 2015 2 次提交
  11. 30 8月, 2014 1 次提交
  12. 19 12月, 2013 2 次提交
    • R
      mm: fix TLB flush race between migration, and change_protection_range · 20841405
      Rik van Riel 提交于
      There are a few subtle races, between change_protection_range (used by
      mprotect and change_prot_numa) on one side, and NUMA page migration and
      compaction on the other side.
      
      The basic race is that there is a time window between when the PTE gets
      made non-present (PROT_NONE or NUMA), and the TLB is flushed.
      
      During that time, a CPU may continue writing to the page.
      
      This is fine most of the time, however compaction or the NUMA migration
      code may come in, and migrate the page away.
      
      When that happens, the CPU may continue writing, through the cached
      translation, to what is no longer the current memory location of the
      process.
      
      This only affects x86, which has a somewhat optimistic pte_accessible.
      All other architectures appear to be safe, and will either always flush,
      or flush whenever there is a valid mapping, even with no permissions
      (SPARC).
      
      The basic race looks like this:
      
      CPU A			CPU B			CPU C
      
      						load TLB entry
      make entry PTE/PMD_NUMA
      			fault on entry
      						read/write old page
      			start migrating page
      			change PTE/PMD to new page
      						read/write old page [*]
      flush TLB
      						reload TLB from new entry
      						read/write new page
      						lose data
      
      [*] the old page may belong to a new user at this point!
      
      The obvious fix is to flush remote TLB entries, by making sure that
      pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
      still be accessible if there is a TLB flush pending for the mm.
      
      This should fix both NUMA migration and compaction.
      
      [mgorman@suse.de: fix build]
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      20841405
    • M
      mm: clear pmd_numa before invalidating · 67f87463
      Mel Gorman 提交于
      On x86, PMD entries are similar to _PAGE_PROTNONE protection and are
      handled as NUMA hinting faults.  The following two page table protection
      bits are what defines them
      
      	_PAGE_NUMA:set	_PAGE_PRESENT:clear
      
      A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE,
      _PAGE_PSE or _PAGE_NUMA bits are set.  If pmdp_invalidate encounters a
      pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be
      considered not present by the CPU but present by pmd_present.  The
      existing caller of pmdp_invalidate should handle it but it's an
      inconsistent state for a PMD.  This patch keeps the state consistent
      when calling pmdp_invalidate.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      67f87463
  13. 15 11月, 2013 2 次提交
    • K
      mm: convert the rest to new page table lock api · c4088ebd
      Kirill A. Shutemov 提交于
      Only trivial cases left. Let's convert them altogether.
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NAlex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c4088ebd
    • K
      mm, thp: do not access mm->pmd_huge_pte directly · c389a250
      Kirill A. Shutemov 提交于
      Currently mm->pmd_huge_pte protected by page table lock.  It will not
      work with split lock.  We have to have per-pmd pmd_huge_pte for proper
      access serialization.
      
      For now, let's just introduce wrapper to access mm->pmd_huge_pte.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NAlex Thorlton <athorlton@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c389a250
  14. 12 9月, 2013 1 次提交
  15. 20 6月, 2013 1 次提交
  16. 11 12月, 2012 2 次提交
    • R
      mm: Only flush the TLB when clearing an accessible pte · 8d1acce4
      Rik van Riel 提交于
      If ptep_clear_flush() is called to clear a page table entry that is
      accessible anyway by the CPU, eg. a _PAGE_PROTNONE page table entry,
      there is no need to flush the TLB on remote CPUs.
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-vm3rkzevahelwhejx5uwm8ex@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8d1acce4
    • R
      mm,generic: only flush the local TLB in ptep_set_access_flags · cef23d9d
      Rik van Riel 提交于
      The function ptep_set_access_flags is only ever used to upgrade
      access permissions to a page. That means the only negative side
      effect of not flushing remote TLBs is that other CPUs may incur
      spurious page faults, if they happen to access the same address,
      and still have a PTE with the old permissions cached in their
      TLB.
      
      Having another CPU maybe incur a spurious page fault is faster
      than always incurring the cost of a remote TLB flush, so replace
      the remote TLB flush with a purely local one.
      
      This should be safe on every architecture that correctly
      implements flush_tlb_fix_spurious_fault() to actually invalidate
      the local TLB entry that caused a page fault, as well as on
      architectures where the hardware invalidates TLB entries that
      cause page faults.
      
      In the unlikely event that you are hitting what appears to be
      an infinite loop of page faults, and 'git bisect' took you to
      this changeset, your architecture needs to implement
      flush_tlb_fix_spurious_fault to actually flush the TLB entry.
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      cef23d9d
  17. 09 10月, 2012 2 次提交
    • G
      thp: introduce pmdp_invalidate() · 46dcde73
      Gerald Schaefer 提交于
      On s390, a valid page table entry must not be changed while it is attached
      to any CPU.  So instead of pmd_mknotpresent() and set_pmd_at(), an IDTE
      operation would be necessary there.  This patch introduces the
      pmdp_invalidate() function, to allow architecture-specific
      implementations.
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      46dcde73
    • G
      thp: remove assumptions on pgtable_t type · e3ebcf64
      Gerald Schaefer 提交于
      The thp page table pre-allocation code currently assumes that pgtable_t is
      of type "struct page *".  This may not be true for all architectures, so
      this patch removes that assumption by replacing the functions
      prepare_pmd_huge_pte() and get_pmd_huge_pte() with two new functions that
      can be defined architecture-specific.
      
      It also removes two VM_BUG_ON checks for page_count() and page_mapcount()
      operating on a pgtable_t.  Apart from the VM_BUG_ON removal, there will be
      no functional change introduced by this patch.
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e3ebcf64
  18. 26 5月, 2012 1 次提交
    • C
      arch/tile: allow building Linux with transparent huge pages enabled · 73636b1a
      Chris Metcalf 提交于
      The change adds some infrastructure for managing tile pmd's more generally,
      using pte_pmd() and pmd_pte() methods to translate pmd values to and
      from ptes, since on TILEPro a pmd is really just a nested structure
      holding a pgd (aka pte).  Several existing pmd methods are moved into
      this framework, and a whole raft of additional pmd accessors are defined
      that are used by the transparent hugepage framework.
      
      The tile PTE now has a "client2" bit.  The bit is used to indicate a
      transparent huge page is in the process of being split into subpages.
      
      This change also fixes a generic bug where the return value of the
      generic pmdp_splitting_flush() was incorrect.
      Signed-off-by: NChris Metcalf <cmetcalf@tilera.com>
      73636b1a
  19. 22 3月, 2012 1 次提交
  20. 26 1月, 2011 1 次提交
    • A
      mm/pgtable-generic.c: fix CONFIG_SWAP=n build · f95ba941
      Andrew Morton 提交于
      mips (and sparc32):
      
        In file included from arch/mips/include/asm/tlb.h:21,
                         from mm/pgtable-generic.c:9:
        include/asm-generic/tlb.h: In function `tlb_flush_mmu':
        include/asm-generic/tlb.h:76: error: implicit declaration of function `release_pages'
        include/asm-generic/tlb.h: In function `tlb_remove_page':
        include/asm-generic/tlb.h:105: error: implicit declaration of function `page_cache_release'
      
      free_pages_and_swap_cache() and free_page_and_swap_cache() are macros
      which call release_pages() and page_cache_release().  The obvious fix is
      to include pagemap.h in swap.h, where those macros are defined.  But that
      breaks sparc for weird reasons.
      
      So fix it within mm/pgtable-generic.c instead.
      Reported-by: NYoichi Yuasa <yuasa@linux-mips.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      Cc: Sergei Shtylyov <sshtylyov@mvista.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f95ba941
  21. 17 1月, 2011 1 次提交
    • A
      fix non-x86 build failure in pmdp_get_and_clear · b3697c02
      Andrea Arcangeli 提交于
      pmdp_get_and_clear/pmdp_clear_flush/pmdp_splitting_flush were trapped as
      BUG() and they were defined only to diminish the risk of build issues on
      not-x86 archs and to be consistent with the generic pte methods previously
      defined in include/asm-generic/pgtable.h.
      
      But they are causing more trouble than they were supposed to solve, so
      it's simpler not to define them when THP is off.
      
      This is also correcting the export of pmdp_splitting_flush which is
      currently unused (x86 isn't using the generic implementation in
      mm/pgtable-generic.c and no other arch needs that [yet]).
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Sam Ravnborg <sam@ravnborg.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3697c02
  22. 14 1月, 2011 1 次提交