1. 05 Dec, 2017 1 commit
    • D
      Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier" · ab9dbf77
      David Gibson committed
      This reverts commit a3b2cb30.
      
      That commit tried to fix problems with panic on powerpc in certain
      circumstances, where some output from the generic panic code was being
      dropped.
      
      Unfortunately, it breaks things worse in other circumstances. In
      particular when running a PAPR guest, it will now attempt to reboot
      instead of informing the hypervisor (KVM or PowerVM) that the guest
      has crashed. The crash notification is important to some
      virtualization management layers.
      
      Revert it for now until we can come up with a better solution.
      
      Fixes: a3b2cb30 ("powerpc: Do not call ppc_md.panic in fadump panic notifier")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      [mpe: Tweak change log a bit]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ab9dbf77
  2. 31 Aug, 2017 1 commit
  3. 19 Jun, 2017 1 commit
  4. 20 Apr, 2017 1 commit
  5. 30 Nov, 2016 1 commit
  6. 20 Sep, 2016 1 commit
    • M
      powerpc: Remove all usages of NO_IRQ · ef24ba70
      Michael Ellerman committed
      NO_IRQ has been == 0 on powerpc for just over ten years (since commit
      0ebfff14 ("[POWERPC] Add new interrupt mapping core and change
      platforms to use it")). It's also 0 on most other arches.
      
      Although it's fairly harmless, every now and then it causes confusion
      when a driver is built on powerpc and another arch which doesn't define
      NO_IRQ. There are at least 6 definitions of NO_IRQ in drivers/, at least
      some of which are to work around that problem.
      
      So we'd like to remove it. This is fairly trivial in the arch code, we
      just convert:
      
          if (irq == NO_IRQ)	to	if (!irq)
          if (irq != NO_IRQ)	to	if (irq)
          irq = NO_IRQ;	to	irq = 0;
          return NO_IRQ;	to	return 0;
      
      And a few other odd cases as well.
      
      At least for now we keep the #define NO_IRQ, because there is driver
      code that uses NO_IRQ and the fixes to remove those will go via other
      trees.
      
      Note we also change some occurrences in PPC sound drivers, drivers/ps3,
      and drivers/macintosh.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ef24ba70
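The conversion described in this commit can be illustrated with a small stand-alone sketch. It is not code from the patch: `irq_is_valid_old()`, `irq_is_valid_new()` and `toy_map_irq()` are hypothetical helpers, and the local `NO_IRQ` define only mirrors the statement that NO_IRQ is 0 on powerpc.

```c
#include <stdbool.h>

/* Assumption for illustration: NO_IRQ is 0, as the commit message states. */
#define NO_IRQ 0

/* Old style: explicit comparison against NO_IRQ. */
static bool irq_is_valid_old(unsigned int irq)
{
    return irq != NO_IRQ;
}

/* New style: a plain truth test, with no dependency on NO_IRQ. */
static bool irq_is_valid_new(unsigned int irq)
{
    if (irq)
        return true;
    return false;
}

/* Hypothetical mapping helper: returns 0 (formerly NO_IRQ) on failure. */
static unsigned int toy_map_irq(int hwirq)
{
    if (hwirq < 0)
        return 0;                      /* was: return NO_IRQ; */
    return (unsigned int)hwirq + 16;   /* arbitrary offset for the sketch */
}
```

Because NO_IRQ is 0, both validity checks agree for every input, which is why the arch code can be converted mechanically.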
  7. 04 Aug, 2016 1 commit
    • M
      powerpc/mm: Move register_process_table() out of ppc_md · eea8148c
      Michael Ellerman committed
      We want to initialise register_process_table() before ppc_md is set up,
      so that it can be called as part of MMU init (at least on Radix ATM).
      
      That no longer works because probe_machine() requires that ppc_md be
      empty before it's called, and we now do probe_machine() much later.
      
      So make register_process_table a global for now. It will probably move
      into a mmu_radix_ops struct at some point in the future.
      
      This was broken by me when applying commit 7025776e "powerpc/mm:
      Move hash table ops to a separate structure" due to conflicts with other
      patches.
      
      Fixes: 7025776e ("powerpc/mm: Move hash table ops to a separate structure")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      eea8148c
  8. 21 Jul, 2016 2 commits
  9. 17 Jul, 2016 1 commit
  10. 14 Jul, 2016 1 commit
  11. 01 May, 2016 1 commit
  12. 01 Mar, 2016 2 commits
    • D
      powerpc/mm: Handle removing maybe-present bolted HPTEs · 27828f98
      David Gibson committed
      At the moment the hpte_removebolted callback in ppc_md returns void and
      will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
      place.  This is awkward for the case of cleaning up a mapping which was
      partially made before failing.
      
      So, we add a return value to hpte_removebolted, and have it return ENOENT
      in the case that the HPTE to remove didn't exist in the first place.
      
      In the (sole) caller, we propagate errors in hpte_removebolted to its
      caller to handle.  However, we handle ENOENT specially, continuing to
      complete the unmapping over the specified range before returning the error
      to the caller.
      
      This means that htab_remove_mapping() will work sanely on a partially
      present mapping, removing any HPTEs which are present, while also returning
      ENOENT to its caller in case it's important there.
      
      There are two callers of htab_remove_mapping():
         - In remove_section_mapping() we already WARN_ON() any error return,
           which is reasonable - in this case the mapping should be fully
           present
         - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
           just a WARN_ON() in the case of ENOENT, since failing to remove a
           mapping that wasn't there in the first place probably shouldn't be
           fatal.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      27828f98
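The error handling described in this commit (keep unmapping on ENOENT, then report it) can be sketched as follows. This is a user-space model, not the kernel code: `struct toy_htab`, `toy_removebolted()` and `toy_remove_mapping()` are hypothetical names, and the negative-errno return convention is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: present[i] says whether page i has a bolted HPTE. */
struct toy_htab {
    bool *present;
    size_t npages;
};

/* Returns -ENOENT instead of crashing when the entry is not there. */
static int toy_removebolted(struct toy_htab *h, size_t page)
{
    if (!h->present[page])
        return -ENOENT;
    h->present[page] = false;
    return 0;
}

/* Removes [start, end); ENOENT is remembered but does not stop the loop. */
static int toy_remove_mapping(struct toy_htab *h, size_t start, size_t end)
{
    int ret = 0;

    for (size_t page = start; page < end && page < h->npages; page++) {
        int rc = toy_removebolted(h, page);

        if (rc == -ENOENT) {
            ret = rc;       /* note it, keep unmapping the rest */
            continue;
        }
        if (rc < 0)
            return rc;      /* any other error aborts immediately */
    }
    return ret;
}
```

A caller can then decide how serious ENOENT is for it, for example warning instead of treating it as fatal.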
    • A
      446957ba
  13. 09 Oct, 2015 1 commit
    • C
      powerpc: Fix checkstop in native_hpte_clear() with lockdep · fdf880a6
      Cyril Bur committed
      native_hpte_clear() is called in real mode from two places:
      - Early in boot during htab initialisation if firmware assisted dump is
        active.
      - Late in the kexec path.
      
      In both contexts there is no need to disable interrupts as they are
      already disabled. Furthermore, locking around the tlbie() is only required
      for pre-POWER5 hardware.
      
      On POWER5 or newer hardware concurrent tlbie()s work as expected, and on
      pre-POWER5 hardware concurrent tlbie()s could result in deadlock. This code
      would only be executed at crashdump time, during which all bets are off:
      concurrent tlbie()s are unlikely and taking locks is unsafe, so the best
      course of action is simply to do nothing. Concurrent tlbie()s are not
      possible in the first case as secondary CPUs have not come up yet.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      fdf880a6
  14. 23 Jul, 2015 1 commit
    • P
      powerpc: Use hardware RNG for arch_get_random_seed_* not arch_get_random_* · 01c9348c
      Paul Mackerras committed
      The hardware RNG on POWER8 and POWER7+ can be relatively slow, since
      it can only supply one 64-bit value per microsecond.  Currently we
      read it in arch_get_random_long(), but that slows down reading from
      /dev/urandom since the code in random.c calls arch_get_random_long()
      for every longword read from /dev/urandom.
      
      Since the hardware RNG supplies high-quality entropy on every read, it
      matches the semantics of arch_get_random_seed_long() better than those
      of arch_get_random_long().  Therefore this commit makes the code use
      the POWER8/7+ hardware RNG only for arch_get_random_seed_{long,int}
      and not for arch_get_random_{long,int}.
      
      This won't affect any other PowerPC-based platforms because none of
      them currently support a hardware RNG.  To make it clear that the
      ppc_md function pointer is used for arch_get_random_seed_*, we rename
      it from get_random_long to get_random_seed.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      01c9348c
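The split described in this commit, where the hardware RNG backs only the seed variant, might look roughly like the sketch below. It is a simplified model: `struct toy_machdep_calls`, `toy_ppc_md`, the `toy_arch_*` wrappers and the fake `toy_hw_rng()` are all hypothetical, and only the hook name `get_random_seed` comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of ppc_md: only the renamed hook is modelled. */
struct toy_machdep_calls {
    int (*get_random_seed)(unsigned long *v);   /* nonzero on success */
};

static struct toy_machdep_calls toy_ppc_md;

/* The fast path no longer touches the (slow) hardware RNG. */
static bool toy_arch_get_random_long(unsigned long *v)
{
    (void)v;
    return false;
}

/* The seed path uses the platform hook when one is registered. */
static bool toy_arch_get_random_seed_long(unsigned long *v)
{
    if (toy_ppc_md.get_random_seed)
        return toy_ppc_md.get_random_seed(v) != 0;
    return false;
}

/* Stand-in for a hardware RNG; a fixed value keeps the sketch testable. */
static int toy_hw_rng(unsigned long *v)
{
    *v = 0x1234UL;
    return 1;
}
```

Platforms without a hardware RNG simply leave the hook NULL, so both wrappers report that no value is available.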
  15. 11 Jun, 2015 1 commit
    • A
      powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table · da004c36
      Alexey Kardashevskiy committed
      This adds an iommu_table_ops struct and puts a pointer to it into
      the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush
      callbacks from ppc_md to the new struct where they really belong.
      
      This adds the requirement for @it_ops to be initialized before calling
      iommu_init_table() to make sure that we do not leave any IOMMU table
      with iommu_table_ops uninitialized. This is not a parameter of
      iommu_init_table() though as there will be cases when iommu_init_table()
      will not be called on TCE tables, for example - VFIO.
      
      This does s/tce_build/set/, s/tce_free/clear/ and removes "tce_"
      redundant prefixes.
      
      This removes tce_xxx_rm handlers from ppc_md but does not add
      them to iommu_table_ops as this will be done later if we decide to
      support TCE hypercalls in real mode. This removes _vm callbacks as
      only virtual mode is supported for now so this also removes @rm parameter.
      
      For pSeries, this always uses tce_buildmulti_pSeriesLP/
      tce_freemulti_pSeriesLP. This changes multi callback to fall back to
      tce_build_pSeriesLP/tce_free_pSeriesLP if FW_FEATURE_MULTITCE is not
      present. The reason for this is we still have to support "multitce=off"
      boot parameter in disable_multitce() and we do not want to walk through
      all IOMMU tables in the system and replace "multi" callbacks with single
      ones.
      
      For powernv, this defines _ops per PHB type which are P5IOC2/IODA1/IODA2.
      This makes the callbacks for them public. Later patches will extend
      callbacks for IODA1/2.
      
      No change in behaviour is expected.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      da004c36
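The shape of the change, with per-table callbacks reached through an ops pointer that must be set before initialisation, can be sketched in plain C. Everything prefixed `toy_` is hypothetical; the real structures carry more fields and callbacks, and returning -1 for a missing `it_ops` is an assumption chosen to keep the sketch testable.

```c
#include <stddef.h>

#define TOY_TABLE_ENTRIES 8

struct toy_iommu_table;

/* Hypothetical subset of the callbacks, using the renamed set/clear. */
struct toy_iommu_table_ops {
    int (*set)(struct toy_iommu_table *tbl, size_t index, size_t npages,
               unsigned long val);
    void (*clear)(struct toy_iommu_table *tbl, size_t index, size_t npages);
};

struct toy_iommu_table {
    unsigned long entries[TOY_TABLE_ENTRIES];
    const struct toy_iommu_table_ops *it_ops;
};

static int toy_set(struct toy_iommu_table *tbl, size_t index, size_t npages,
                   unsigned long val)
{
    if (index > TOY_TABLE_ENTRIES || npages > TOY_TABLE_ENTRIES - index)
        return -1;
    for (size_t i = 0; i < npages; i++)
        tbl->entries[index + i] = val + i;
    return 0;
}

static void toy_clear(struct toy_iommu_table *tbl, size_t index, size_t npages)
{
    for (size_t i = 0; i < npages && index + i < TOY_TABLE_ENTRIES; i++)
        tbl->entries[index + i] = 0;
}

static const struct toy_iommu_table_ops toy_ops = {
    .set = toy_set,
    .clear = toy_clear,
};

/* Models the stated requirement: it_ops must be set before init. */
static int toy_iommu_init_table(struct toy_iommu_table *tbl)
{
    if (tbl->it_ops == NULL)
        return -1;
    tbl->it_ops->clear(tbl, 0, TOY_TABLE_ENTRIES);
    return 0;
}
```

Keeping the ops pointer in the table rather than in a global lets different tables, for example different PHB types, use different callbacks.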
  16. 02 Jun, 2015 1 commit
  17. 11 Apr, 2015 2 commits
  18. 31 Mar, 2015 2 commits
    • W
      powerpc/powernv: Implement pcibios_iov_resource_alignment() on powernv · 5350ab3f
      Wei Yang committed
      Implement pcibios_iov_resource_alignment() on powernv platform.
      
      On PowerNV platform, there are 3 cases for the IOV BAR:
      1. initial state, the IOV BAR size is a multiple of the VF BAR size
      2. after expansion, the IOV BAR size is expanded to meet the M64 segment size
      3. sizing stage, the IOV BAR is truncated to 0
      
      pnv_pci_iov_resource_alignment() handles these three cases respectively.
      
      [bhelgaas: adjust to drop "align" parameter, return pci_iov_resource_size()
      if no ppc_md machdep_call version]
      Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      5350ab3f
    • W
      powerpc/powernv: Reserve additional space for IOV BAR according to the number of total_pe · 6e628c7d
      Wei Yang committed
      On PHB3, PF IOV BAR will be covered by M64 BAR to have better PE isolation.
      M64 BAR is a type of hardware resource in PHB3, which could map a range of
      MMIO to PE numbers on powernv platform. And this range is divided equally
      by the number of total_pe with each divided range mapping to a PE number.
      Also, the M64 BAR must map a MMIO range with power-of-two size.
      
      The total_pe number is usually different from total_VFs, which can lead to
      a conflict between MMIO space and the PE number.
      
      For example, if total_VFs is 128 and total_pe is 256, the second half of
      the M64 BAR will be part of another PCI device, which may already belong
      to other PEs.
      
      This patch prevents the conflict by reserving additional space for the PF
      IOV BAR, which is total_pe number of VF's BAR size.
      
      [bhelgaas: make dev_printk() output more consistent, index resource[]
      conventionally]
      Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      6e628c7d
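The arithmetic behind the reservation can be worked through with the numbers from the example (total_VFs = 128, total_pe = 256). The sketch below is only a model: the helper names are hypothetical, the 1 MiB VF BAR size used in the usage note is an assumed value, and the power-of-two constraint on the M64 BAR is not modelled.

```c
#include <stdint.h>

/* Size of an IOV BAR that holds `count` VF BARs of vf_bar_size bytes. */
static uint64_t iov_bar_size(uint64_t vf_bar_size, unsigned int count)
{
    return vf_bar_size * count;
}

/* The M64 window is split into total_pe equal segments, one VF BAR each. */
static uint64_t m64_window_size(uint64_t vf_bar_size, unsigned int total_pe)
{
    return vf_bar_size * total_pe;
}

/* Bytes of the M64 window that fall outside the IOV BAR (the conflict). */
static uint64_t m64_spill(uint64_t window, uint64_t iov_bar)
{
    return window > iov_bar ? window - iov_bar : 0;
}
```

With an assumed 1 MiB VF BAR, sizing the IOV BAR by total_VFs gives 128 MiB against a 256 MiB window, so 128 MiB spills into space that may belong to other devices. Sizing it by total_pe gives 256 MiB and the spill drops to zero.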
  19. 17 Mar, 2015 1 commit
  20. 05 Dec, 2014 1 commit
    • A
      powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault · aefa5688
      Aneesh Kumar K.V committed
      updatepp can get called for a nohpte fault when we find from the linux
      page table that the translation was hashed before. In that case
      we are sure that there is no existing translation, hence we could
      avoid doing tlbie.
      
      We could possibly race with a parallel fault filling the TLB. But
      that should be ok because updatepp is only ever relaxing permissions.
      We also look at linux pte permission bits when filling hash pte
      permission bits. We also hold the linux pte busy bits while
      inserting/updating a hashpte entry, hence a parallel update of
      linux pte is not possible. On the other hand mprotect involves
      ptep_modify_prot_start which causes a hpte invalidate and not updatepp.
      
      Performance number:
      We use random_access_bench written by Anton.
      
      Kernel with THP disabled and smaller hash page table size.
      
          86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
           2.10%  random_access_b  random_access_bench              [.] doit
           1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
           1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
           1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
           0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
           0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
      
      With Fix:
      
          27.54%  random_access_b  random_access_bench              [.] doit
          22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
           3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
           1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      aefa5688
  21. 02 Dec, 2014 1 commit
  22. 05 Nov, 2014 2 commits
  23. 03 Nov, 2014 1 commit
    • A
      powerpc: Convert power off logic to pm_power_off · 9178ba29
      Alexander Graf committed
      The generic Linux framework to power off the machine is a function pointer
      called pm_power_off. The trick about this pointer is that device drivers can
      potentially implement it rather than board files.
      
      Today on powerpc we set pm_power_off to invoke our generic full machine power
      off logic which then calls ppc_md.power_off to invoke machine specific power
      off.
      
      However, when we want to add a power off GPIO via the "gpio-poweroff" driver,
      this house of cards falls apart. That driver only registers itself if pm_power_off
      is NULL to ensure it doesn't override board specific logic. However, since we
      always set pm_power_off to the generic power off logic (which will just not
      power off the machine if no ppc_md.power_off call is implemented), we can't
      implement power off via the generic GPIO power off driver.
      
      To fix this up, let's get rid of the ppc_md.power_off logic and just always use
      pm_power_off as was intended. Then individual drivers such as the GPIO power off
      driver can implement power off logic via that function pointer.
      
      With this patch set applied and a few patches on top of QEMU that implement a
      power off GPIO on the virt e500 machine, I can successfully turn off my virtual
      machine after halt.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      [mpe: Squash into one patch and update changelog based on cover letter]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      9178ba29
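The registration rule that motivated this change (a driver installs its handler only when pm_power_off is still NULL) can be modelled in a few lines. This is a user-space sketch with hypothetical `toy_` names; the -1 error value is a placeholder, not the driver's actual return code.

```c
#include <stddef.h>

/* Model of the global hook: NULL until some code registers a handler. */
static void (*toy_pm_power_off)(void);

/* Records which handler ran, so the sketch can be checked. */
static int toy_powered_off_by;

static void toy_gpio_power_off(void)  { toy_powered_off_by = 1; }
static void toy_board_power_off(void) { toy_powered_off_by = 2; }

/* Driver-style probe: refuse to override an existing handler. */
static int toy_gpio_poweroff_probe(void)
{
    if (toy_pm_power_off != NULL)
        return -1;                  /* placeholder error value */
    toy_pm_power_off = toy_gpio_power_off;
    return 0;
}

/* Generic path: call whatever handler is registered, if any. */
static void toy_machine_power_off(void)
{
    if (toy_pm_power_off)
        toy_pm_power_off();
}
```

If arch code always fills in the hook up front, the probe can never succeed, which is the problem the commit describes.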
  24. 02 Oct, 2014 2 commits
  25. 13 Aug, 2014 1 commit
  26. 05 Aug, 2014 1 commit
  27. 28 Jul, 2014 1 commit
  28. 05 Jun, 2014 1 commit
  29. 28 Apr, 2014 2 commits
    • G
      powerpc: powernv: Framework to show the correct clock in /proc/cpuinfo · 2299d03a
      Gautham R. Shenoy committed
      Currently, the code in setup-common.c for powerpc assumes that all
      clock rates are the same in an SMP system. This value is cached in the
      variable named ppc_proc_freq and is the value that is reported in
      /proc/cpuinfo.
      
      However on the PowerNV platform, the clock rate is the same only across
      the threads of the same core. Hence the value that is reported in
      /proc/cpuinfo is incorrect on PowerNV platforms. We need a better way
      to query and report the correct value of the processor clock in
      /proc/cpuinfo.
      
      The patch achieves this by creating a machdep_call named
      get_proc_freq() which is expected to return the frequency in Hz. The
      code in show_cpuinfo() can invoke this method to display the correct
      clock rate on platforms that have implemented this method. On the
      other powerpc platforms it can use the value cached in ppc_proc_freq.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2299d03a
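The fallback logic described here, using the platform hook when present and the cached value otherwise, can be sketched as below. The `toy_` names, the eight-threads-per-core layout and the frequency values are assumptions made for illustration; only `get_proc_freq` and `ppc_proc_freq` are named in the commit message.

```c
#include <stddef.h>

/* Hypothetical miniature of ppc_md with just the new hook. */
struct toy_machdep_calls {
    unsigned long (*get_proc_freq)(unsigned int cpu);   /* frequency in Hz */
};

static struct toy_machdep_calls toy_ppc_md;

/* Stand-in for the cached, system-wide value (assumed 3 GHz). */
static unsigned long toy_ppc_proc_freq = 3000000000UL;

/* What a show_cpuinfo()-style caller would do to pick the value. */
static unsigned long toy_cpu_freq_hz(unsigned int cpu)
{
    if (toy_ppc_md.get_proc_freq)
        return toy_ppc_md.get_proc_freq(cpu);
    return toy_ppc_proc_freq;
}

/* Fake platform hook: threads of one core share a frequency. */
static unsigned long toy_pnv_get_proc_freq(unsigned int cpu)
{
    unsigned int core = cpu / 8;    /* assumes 8 threads per core */

    return core == 0 ? 3500000000UL : 2000000000UL;
}
```

Platforms that do not set the hook keep reporting the cached value, so their output is unchanged.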
    • G
      powerpc/pci: Mask linkDown on resetting PCI bus · d92a208d
      Gavin Shan committed
      The problem was initially reported by Wendy, who tried to pass through an
      IPR adapter, which was connected to the PHB root port directly, to a KVM
      based guest. When doing that, pci_reset_bridge_secondary_bus() was
      called by the VFIO driver and linkDown was detected by the root port.
      That caused all PEs to be frozen.
      
      The patch fixes the issue by routing the reset for the secondary bus
      of the root port to the underlying firmware. For that, one more weak function
      pci_reset_secondary_bus() is introduced so that the individual platforms
      can override that and do specific reset for bridge's secondary bus.
      Reported-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      d92a208d
  30. 07 Mar, 2014 2 commits
    • N
      powerpc/pseries: Use remove_memory() to remove memory · 9ac8cde9
      Nathan Fontenot committed
      The memory remove code for powerpc/pseries should call remove_memory()
      so that we are holding the hotplug_memory lock during memory remove
      operations.
      
      This patch updates the memory node remove handler to call remove_memory()
      and adds a ppc_md.remove_memory() entry to handle pseries specific work
      that is called from arch_remove_memory().
      
      During memory remove in pseries_remove_memblock() we have to stay with
      removing memory one section at a time. This is needed because of how memory
      resources are handled. During memory add for pseries (via the probe file in
      sysfs) we add memory one section at a time which gives us a memory resource
      for each section. Future patches will aim to address this so that we will
      not have to remove memory one section at a time.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      9ac8cde9
    • M
      powerpc/book3s: Recover from MC in sapphire on SCOM read via MMIO. · 55672ecf
      Mahesh Salgaonkar committed
      Detect and recover from a machine check when inside OPAL on special
      SCOM load instructions. On a specific SCOM read via MMIO we may get a machine
      check exception with SRR0 pointing inside OPAL. To recover from the MC
      in this scenario, get a recovery instruction address and return to it from
      the MC.
      
      OPAL will export the machine check recoverable ranges through
      device tree node mcheck-recoverable-ranges under ibm,opal:
      
      # hexdump /proc/device-tree/ibm,opal/mcheck-recoverable-ranges
      0000000 0000 0000 3000 2804 0000 000c 0000 0000
      0000010 3000 2814 0000 0000 3000 27f0 0000 000c
      0000020 0000 0000 3000 2814 xxxx xxxx xxxx xxxx
      0000030 llll llll yyyy yyyy yyyy yyyy
      ...
      ...
      #
      
      where:
      	xxxx xxxx xxxx xxxx = Starting instruction address
      	llll llll           = Length of the address range.
      	yyyy yyyy yyyy yyyy = recovery address
      
      Each recoverable address range entry is (start address, len,
      recovery address), 2 cells each for start and recovery address, 1 cell for
      len, totalling 5 cells per entry. During kernel boot time, build up the
      recovery table with the list of recovery ranges from device-tree node which
      will be used during machine check exception to recover from MMIO SCOM UE.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      55672ecf
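The five-cell entry layout described above (two cells of start address, one cell of length, two cells of recovery address) can be decoded as in the following sketch. It is a stand-alone model rather than the kernel implementation: the `toy_` names are hypothetical, cells are assumed to be 32-bit big-endian values as is usual for device-tree properties, and returning 0 for "no recovery address" is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_CELLS_PER_ENTRY 5
#define TOY_ENTRY_BYTES     (TOY_CELLS_PER_ENTRY * 4)

struct toy_mcheck_range {
    uint64_t start_addr;
    uint64_t end_addr;       /* exclusive: start + length */
    uint64_t recover_addr;
};

/* Read one big-endian 32-bit cell. */
static uint32_t toy_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Build the recovery table from the raw property bytes. */
static size_t toy_parse_ranges(const uint8_t *prop, size_t len,
                               struct toy_mcheck_range *out, size_t max)
{
    size_t n = len / TOY_ENTRY_BYTES;

    if (n > max)
        n = max;
    for (size_t i = 0; i < n; i++) {
        const uint8_t *e = prop + i * TOY_ENTRY_BYTES;
        uint64_t start = ((uint64_t)toy_be32(e) << 32) | toy_be32(e + 4);
        uint32_t length = toy_be32(e + 8);

        out[i].start_addr = start;
        out[i].end_addr = start + length;
        out[i].recover_addr =
            ((uint64_t)toy_be32(e + 12) << 32) | toy_be32(e + 16);
    }
    return n;
}

/* Look up the faulting address; 0 means "not recoverable" in this sketch. */
static uint64_t toy_find_recovery(const struct toy_mcheck_range *r, size_t n,
                                  uint64_t nip)
{
    for (size_t i = 0; i < n; i++)
        if (nip >= r[i].start_addr && nip < r[i].end_addr)
            return r[i].recover_addr;
    return 0;
}
```

At machine check time, a handler would pass the faulting instruction address to the lookup and resume at the returned address when one is found.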
  31. 11 Oct, 2013 2 commits