1. 12 3月, 2016 8 次提交
  2. 05 3月, 2016 1 次提交
  3. 03 3月, 2016 4 次提交
  4. 02 3月, 2016 1 次提交
  5. 01 3月, 2016 4 次提交
    • D
      powerpc/mm: Clean up memory hotplug failure paths · 1dace6c6
      David Gibson 提交于
      This makes a number of cleanups to handling of mapping failures during
      memory hotplug on Power:
      
      For errors creating the linear mapping for the hot-added region:
        * This is now reported with EFAULT which is more appropriate than the
          previous EINVAL (the failure is unlikely to be related to the
          function's parameters)
        * An error in this path now prints a warning message, rather than just
          silently failing to add the extra memory.
        * Previously a failure here could result in the region being partially
          mapped.  We now clean up any partial mapping before failing.
      
      For errors creating the vmemmap for the hot-added region:
         * This is now reported with EFAULT instead of causing a BUG() - this
           could happen for external reason (e.g. full hash table) so it's better
           to handle this non-fatally
         * An error message is also printed, so the failure won't be silent
         * As above a failure could cause a partially mapped region, we now
           clean this up. [mpe: move htab_remove_mapping() out of #ifdef
           CONFIG_MEMORY_HOTPLUG to enable this]
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1dace6c6
    • D
      powerpc/mm: Handle removing maybe-present bolted HPTEs · 27828f98
      David Gibson 提交于
      At the moment the hpte_removebolted callback in ppc_md returns void and
      will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
      place.  This is awkward for the case of cleaning up a mapping which was
      partially made before failing.
      
      So, we add a return value to hpte_removebolted, and have it return ENOENT
      in the case that the HPTE to remove didn't exist in the first place.
      
      In the (sole) caller, we propagate errors in hpte_removebolted to its
      caller to handle.  However, we handle ENOENT specially, continuing to
      complete the unmapping over the specified range before returning the error
      to the caller.
      
      This means that htab_remove_mapping() will work sanely on a partially
      present mapping, removing any HPTEs which are present, while also returning
      ENOENT to its caller in case it's important there.
      
      There are two callers of htab_remove_mapping():
         - In remove_section_mapping() we already WARN_ON() any error return,
           which is reasonable - in this case the mapping should be fully
           present
         - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
           just a WARN_ON() in the case of ENOENT, since failing to remove a
           mapping that wasn't there in the first place probably shouldn't be
           fatal.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      27828f98
    • D
      powerpc/mm: Clean up error handling for htab_remove_mapping · abd0a0e7
      David Gibson 提交于
      Currently, the only error that htab_remove_mapping() can report is -EINVAL,
      if removal of bolted HPTEs isn't implemeted for this platform.  We make
      a few clean ups to the handling of this:
      
       * EINVAL isn't really the right code - there's nothing wrong with the
         function's arguments - use ENODEV instead
       * We were also printing a warning message, but that's a decision better
         left up to the callers, so remove it
       * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
         error, making the warning message redundant, so no change is needed
         there.
       * The other caller is remove_section_mapping().  This is called in the
         memory hot remove path at a point after vmemmap_remove_mapping() so
         if hpte_removebolted isn't implemented, we'd expect to have already
         BUG()ed anyway.  Put a WARN_ON() here, in lieu of a printk() since this
         really shouldn't be happening.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      abd0a0e7
    • A
      446957ba
  6. 29 2月, 2016 3 次提交
  7. 27 2月, 2016 2 次提交
  8. 22 2月, 2016 1 次提交
    • A
      powerpc/mm/hash: Clear the invalid slot information correctly · 9ab3ac23
      Aneesh Kumar K.V 提交于
      We can get a hash pte fault with 4k base page size and find the pte
      already inserted with 64K base page size. In that case we need to clear
      the existing slot information from the old pte. Fix this correctly
      
      With THP, we also clear the slot information with respect to all
      the 64K hash pte mapping that 16MB page. They are all invalid
      now. This make sure we don't find the slot valid when we fault with
      4k base page size. Finding the slot valid should not result in any wrong
      behavior because we do check again in hash page table for the validity.
      But we can avoid that check completely.
      
      Fixes: a43c0eb8 ("powerpc/mm: Convert 4k hash insert to C")
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9ab3ac23
  9. 15 2月, 2016 1 次提交
    • A
      powerpc/mm: Fix Multi hit ERAT cause by recent THP update · c777e2a8
      Aneesh Kumar K.V 提交于
      With ppc64 we use the deposited pgtable_t to store the hash pte slot
      information. We should not withdraw the deposited pgtable_t without
      marking the pmd none. This ensure that low level hash fault handling
      will skip this huge pte and we will handle them at upper levels.
      
      Recent change to pmd splitting changed the above in order to handle the
      race between pmd split and exit_mmap. The race is explained below.
      
      Consider following race:
      
      		CPU0				CPU1
      shrink_page_list()
        add_to_swap()
          split_huge_page_to_list()
            __split_huge_pmd_locked()
              pmdp_huge_clear_flush_notify()
      	// pmd_none() == true
      					exit_mmap()
      					  unmap_vmas()
      					    zap_pmd_range()
      					      // no action on pmd since pmd_none() == true
      	pmd_populate()
      
      As result the THP will not be freed. The leak is detected by check_mm():
      
      	BUG: Bad rss-counter state mm:ffff880058d2e580 idx:1 val:512
      
      The above required us to not mark pmd none during a pmd split.
      
      The fix for ppc is to clear the huge pte of _PAGE_USER, so that low
      level fault handling code skip this pte. At higher level we do take ptl
      lock. That should serialze us against the pmd split. Once the lock is
      acquired we do check the pmd again using pmd_same. That should always
      return false for us and hence we should retry the access. We do the
      pmd_same check in all case after taking plt with
      THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and
      huge_pmd_set_accessed)
      
      Also make sure we wait for irq disable section in other cpus to finish
      before flipping a huge pte entry with a regular pmd entry. Code paths
      like find_linux_pte_or_hugepte depend on irq disable to get
      a stable pte_t pointer. A parallel thp split need to make sure we
      don't convert a pmd pte to a regular pmd entry without waiting for the
      irq disable section to finish.
      
      Fixes: eef1b3ba ("thp: implement split_huge_pmd()")
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c777e2a8
  10. 25 1月, 2016 1 次提交
  11. 16 1月, 2016 3 次提交
  12. 09 1月, 2016 1 次提交
  13. 27 12月, 2015 1 次提交
  14. 23 12月, 2015 1 次提交
  15. 19 12月, 2015 1 次提交
  16. 14 12月, 2015 7 次提交