1. 22 2月, 2016 1 次提交
    • G
      powerpc/eeh: Fix partial hotplug criterion · f6bf0fa1
      Gavin Shan 提交于
      During error recovery, the device could be removed as part of the
      partial hotplug. The criterion used to come with partial hotplug
      is: if the device driver provides error_detected(), slot_reset()
      and resume() callbacks, it's immune from hotplug. Otherwise,
      it's going to experience partial hotplug during EEH recovery. But
      the criterion isn't correct enough: mlx4_core driver for Mellanox
      adapters provides error_detected(), slot_reset() callbacks, but
      resume() isn't there. Those Mellanox adapters won't be to involved
      in the partial hotplug.
      
      This fixes the criterion to a practical one: adpater with driver
      that provides error_detected(), slot_reset() will be immune from
      partial hotplug. resume() isn't mandatory.
      
      Fixes: f2da4ccf ("powerpc/eeh: More relaxed hotplug criterion")
      Cc: stable@vger.kernel.org #v4.4+
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f6bf0fa1
  2. 17 2月, 2016 1 次提交
  3. 15 2月, 2016 4 次提交
    • A
      powerpc/mm: Fix Multi hit ERAT cause by recent THP update · c777e2a8
      Aneesh Kumar K.V 提交于
      With ppc64 we use the deposited pgtable_t to store the hash pte slot
      information. We should not withdraw the deposited pgtable_t without
      marking the pmd none. This ensure that low level hash fault handling
      will skip this huge pte and we will handle them at upper levels.
      
      Recent change to pmd splitting changed the above in order to handle the
      race between pmd split and exit_mmap. The race is explained below.
      
      Consider following race:
      
      		CPU0				CPU1
      shrink_page_list()
        add_to_swap()
          split_huge_page_to_list()
            __split_huge_pmd_locked()
              pmdp_huge_clear_flush_notify()
      	// pmd_none() == true
      					exit_mmap()
      					  unmap_vmas()
      					    zap_pmd_range()
      					      // no action on pmd since pmd_none() == true
      	pmd_populate()
      
      As result the THP will not be freed. The leak is detected by check_mm():
      
      	BUG: Bad rss-counter state mm:ffff880058d2e580 idx:1 val:512
      
      The above required us to not mark pmd none during a pmd split.
      
      The fix for ppc is to clear the huge pte of _PAGE_USER, so that low
      level fault handling code skip this pte. At higher level we do take ptl
      lock. That should serialze us against the pmd split. Once the lock is
      acquired we do check the pmd again using pmd_same. That should always
      return false for us and hence we should retry the access. We do the
      pmd_same check in all case after taking plt with
      THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and
      huge_pmd_set_accessed)
      
      Also make sure we wait for irq disable section in other cpus to finish
      before flipping a huge pte entry with a regular pmd entry. Code paths
      like find_linux_pte_or_hugepte depend on irq disable to get
      a stable pte_t pointer. A parallel thp split need to make sure we
      don't convert a pmd pte to a regular pmd entry without waiting for the
      irq disable section to finish.
      
      Fixes: eef1b3ba ("thp: implement split_huge_pmd()")
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c777e2a8
    • G
      powerpc/powernv: Fix stale PE primary bus · 1bc74f1c
      Gavin Shan 提交于
      When PCI bus is unplugged during full hotplug for EEH recovery,
      the platform PE instance (struct pnv_ioda_pe) isn't released and
      it dereferences the stale PCI bus that has been released. It leads
      to kernel crash when referring to the stale PCI bus.
      
      This fixes the issue by correcting the PE's primary bus when it's
      oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
      be called by pcibios_fixup_bus().
      
      Cc: stable@vger.kernel.org # v4.1+
      Reported-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reported-by: NPradipta Ghosh <pradghos@in.ibm.com>
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Tested-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1bc74f1c
    • G
      powerpc/eeh: Fix stale cached primary bus · 05ba75f8
      Gavin Shan 提交于
      When PE is created, its primary bus is cached to pe->bus. At later
      point, the cached primary bus is returned from eeh_pe_bus_get().
      However, we could get stale cached primary bus and run into kernel
      crash in one case: full hotplug as part of fenced PHB error recovery
      releases all PCI busses under the PHB at unplugging time and recreate
      them at plugging time. pe->bus is still dereferencing the PCI bus
      that was released.
      
      This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
      of pe->bus. pe->bus is updated when its first child EEH device is
      online and the flag is set. Before unplugging in full hotplug for
      error recovery, the flag is cleared.
      
      Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
      Cc: stable@vger.kernel.org #v3.11+
      Reported-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reported-by: NPradipta Ghosh <pradghos@in.ibm.com>
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Tested-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      05ba75f8
    • D
      powerpc/pseries: Don't trace hcalls on offline CPUs · 126df08c
      Denis Kirjanov 提交于
      If a cpu is hotplugged while the hcall trace points are active, it's
      possible to hit a warning from RCU due to the trace points calling into
      RCU from an offline cpu, eg:
      
        RCU used illegally from offline CPU!
        rcu_scheduler_active = 1, debug_locks = 1
      
      Make the hypervisor tracepoints conditional by using
      TRACE_EVENT_FN_COND.
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      126df08c
  4. 08 2月, 2016 1 次提交
    • A
      powerpc: Fix dedotify for binutils >= 2.26 · f15838e9
      Andreas Schwab 提交于
      Since binutils 2.26 BFD is doing suffix merging on STRTAB sections.  But
      dedotify modifies the symbol names in place, which can also modify
      unrelated symbols with a name that matches a suffix of a dotted name.  To
      remove the leading dot of a symbol name we can just increment the pointer
      into the STRTAB section instead.
      
      Backport to all stables to avoid breakage when people update their
      binutils - mpe.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f15838e9
  5. 31 1月, 2016 1 次提交
    • A
      powerpc/book3s_32: Fix build error with checkpoint restart · 19f97c98
      Aneesh Kumar K.V 提交于
      In file included from mm/vmscan.c:54:0:
      include/linux/swapops.h: In function ‘pte_to_swp_entry’:
      include/linux/swapops.h:69:2: error: implicit declaration of function ‘pte_swp_soft_dirty’ [-Werror=implicit-function-declaration]
        if (pte_swp_soft_dirty(pte))
        ^
      include/linux/swapops.h:70:3: error: implicit declaration of function ‘pte_swp_clear_soft_dirty’ [-Werror=implicit-function-declaration]
         pte = pte_swp_clear_soft_dirty(pte);
      
      We support soft dirty tracking only with book3s 64 for now.
      So change the Kconfig dependency accordingly. Also CHECKPOINT_RESTORE
      feature is not really dependent on SOFT_DIRTY. We track the dependency
      between MEM_SOFT_DIRTY and ARCH_SOFT_DIRTY through headers
      
      Fixes: 7207f436 ("powerpc/mm: Add page soft dirty tracking")
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      19f97c98
  6. 28 1月, 2016 2 次提交
    • A
      powerpc/mm: Fixup _HPAGE_CHG_MASK · 2d19fc63
      Aneesh Kumar K.V 提交于
      This was wrongly updated by commit 7aa9a23c ("powerpc, thp: remove
      infrastructure for handling splitting PMDs") during the last merge
      window. Fix it up.
      
      This could lead to incorrect behaviour in THP and/or mprotect(), at a
      minimum.
      
      Fixes: 7aa9a23c ("powerpc, thp: remove infrastructure for handling splitting PMDs")
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      2d19fc63
    • M
      powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8 · 370f06c8
      Madhavan Srinivasan 提交于
      Commit 7a786832 ("powerpc/perf: Add an explict flag indicating
      presence of SLOT field") introduced the PPMU_HAS_SSLOT flag to remove
      the assumption that MMCRA[SLOT] was present when PPMU_ALT_SIPR was not
      set.
      
      That commit's changelog also mentions that Power8 does not support
      MMCRA[SLOT]. However when the Power8 PMU support was merged, it
      errnoeously included the PPMU_HAS_SSLOT flag.
      
      So remove PPMU_HAS_SSLOT from the Power8 flags.
      
      mpe: On systems where MMCRA[SLOT] exists, the field occupies bits 37:39
      (IBM numbering). On Power8 bit 37 is reserved, and 38:39 overlap with
      the high bits of the Threshold Event Counter Mantissa. I am not aware of
      any published events which use the threshold counting mechanism, which
      would cause the mantissa bits to be set. So in practice this bug is
      unlikely to trigger.
      
      Fixes: e05b9b9e ("powerpc/perf: Power8 PMU support")
      Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      370f06c8
  7. 27 1月, 2016 1 次提交
  8. 25 1月, 2016 1 次提交
  9. 21 1月, 2016 10 次提交
  10. 20 1月, 2016 18 次提交