提交 · 4dea2d1a927c61114a168d4509b56329ea6effb7 · openanolis / cloud-kernel

31 8月, 2017 8 次提交

powerpc/powernv/vas: Define vas_init() and vas_exit() · 4dea2d1a

由 Sukadev Bhattiprolu 提交于 8月 28, 2017

Implement vas_init() and vas_exit() functions for a new VAS module.
This VAS module is essentially a library for other device drivers
and kernel users of the NX coprocessors like NX-842 and NX-GZIP.
In the future this will be extended to add support for user space
to access the NX coprocessors.

VAS is currently only supported with 64K page size.
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

4dea2d1a

powerpc/powernv/vas: Define macros, register fields and structures · 96768914

由 Sukadev Bhattiprolu 提交于 8月 28, 2017

Define macros for the VAS hardware registers and bit-fields as well
as couple of data structures needed by the VAS driver.
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[mpe: Fixup include guard to use _ASM_POWERPC_VAS_H]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

96768914

powerpc/eeh: Remove unnecessary config_addr from eeh_dev · 405b33a7

由 Alexey Kardashevskiy 提交于 8月 29, 2017

The eeh_dev struct hold a config space address of an associated node
and the very same address is also stored in the pci_dn struct which
is always present during the eeh_dev lifetime.

This uses bus:devfn directly from pci_dn instead of cached and packed
config_addr.

Since config_addr is made from device's bus:dev.fn, there is no point
in keeping it in the debugfs either so remove that too.
Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

405b33a7

powerpc/eeh: Remove unnecessary pointer to phb from eeh_dev · 69672bd7

由 Alexey Kardashevskiy 提交于 8月 29, 2017

The eeh_dev struct already holds a pointer to pci_dn which it does not
exist without and pci_dn itself holds the very same pointer so just
use it.
Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

69672bd7

powerpc/eeh: Reduce to one the number of places where edev is allocated · 8bae6a23

由 Alexey Kardashevskiy 提交于 8月 29, 2017

arch/powerpc/kernel/eeh_dev.c:57 is the only legit place where edev
is allocated; other 2 places allocate it on stack and in the heap for
a very short period of time to use eeh_pe_get() as takes edev.

This changes eeh_pe_get() to receive required parameters explicitly.

This removes unnecessary temporary allocation of edev.

This uses the "pe_no" name instead of the "pe_config_addr" name as
it actually is a PE number and not a config space address as it seemed.
Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: NRussell Currey <ruscur@russell.cc>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8bae6a23

powerpc/powernv: Use kernel crash path for machine checks · 6fcd6baa

由 Nicholas Piggin 提交于 7月 19, 2017

There are quite a few machine check exceptions that can be caused by
kernel bugs. To make debugging easier, use the kernel crash path in
cases of synchronous machine checks that occur in kernel mode, if that
would not result in the machine going straight to panic or crash dump.

There is a downside here that die()ing the process in kernel mode can
still leave the system unstable. panic_on_oops will always force the
system to fail-stop, so systems where that behaviour is important will
still do the right thing.

As a test, when triggering an i-side 0111b error (ifetch from foreign
address) in kernel mode process context on POWER9, the kernel currently
dies quickly like this:

  Severe Machine check interrupt [Not recovered]
    NIP [ffff000000000000]: 0xffff000000000000
    Initiator: CPU
    Error type: Real address [Instruction fetch (foreign)]
  [  127.426651616,0] OPAL: Reboot requested due to Platform error.
      Effective[  127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000
  opal: Reboot type 1 not supported
  Kernel panic - not syncing: PowerNV Unrecovered Machine Check
  CPU: 56 PID: 4425 Comm: syscall Tainted: G   M            4.12.0-rc1-13857-ga4700a26-dirty #35
  Call Trace:
  [  128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to
    buggy/mising code in OPAL for this BMC
    Rebooting in 10 seconds..
  Trying to free IRQ 496 from IRQ context!

After this patch, the process is killed and the kernel continues with
this message, which gives enough information to identify the offending
branch (i.e., with CFAR):

  Severe Machine check interrupt [Not recovered]
    NIP [ffff000000000000]: 0xffff000000000000
    Initiator: CPU
    Error type: Real address [Instruction fetch (foreign)]
      Effective address: ffff000000000000
  Oops: Machine check, sig: 7 [#1]
  SMP NR_CPUS=2048
  NUMA
  PowerNV
  Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ...
  CPU: 22 PID: 4436 Comm: syscall Tainted: G   M            4.12.0-rc1-13857-ga4700a26-dirty #36
  task: c000000932300000 task.stack: c000000932380000
  NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000
  REGS: c00000000fc8fd80 TRAP: 0200   Tainted: G   M             (4.12.0-rc1-13857-ga4700a26-dirty)
  MSR: 90000000001c1003 <SF,HV,ME,RI,LE>
    CR: 24000484  XER: 20000000
  CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1
  GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000
  GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8
  GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030
  GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918
  NIP [ffff000000000000] 0xffff000000000000
  LR [00000000217706a4] 0x217706a4
  Call Trace:
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Reviewed-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6fcd6baa

powerpc/powernv: Flush console before platform error reboot · b746e3e0

由 Nicholas Piggin 提交于 7月 19, 2017

Unrecovered MCE and HMI errors are sent through a special restart OPAL
call to log the platform error. The downside is that they don't go
through normal Linux crash paths, so they don't give much information
to the Linux console.

Change this by providing a special crash function which does some of
the console flushing from the panic() path before calling firmware to
reboot.

The downside of this is a little more code to execute before reaching
the firmware reboot. However in practice, it's critical to get the
Linux console messages output in order to debug a problem. So this is
a desirable tradeoff.

Note on the implementation: It is difficult to plumb a custom reboot
handler into the panic path, because panic does a little bit too much
work. For example, it will try to delay with the timebase, but that
may be corrupted in some cases resulting in a hang without reaching
the platform reboot. Another problem is that panic can invoke the
crash dump code which is not what we want in the case of a hardware
platform error. Long-term the best solution will be to rework the
panic path so it can be suitable for this kind of panic, but for now
we just duplicate a bit of the code.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Reviewed-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b746e3e0

powerpc/powernv: powernv platform is not constrained by RMA · 76b42e28

由 Nicholas Piggin 提交于 8月 13, 2017

Remove incorrect comment about real mode address restrictions on
powernv (bare metal), and unnecessary clamping to ppc64_rma_size.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

76b42e28

28 8月, 2017 1 次提交

powerpc/powernv: Fix build error in opal-imc.c when NUMA=n · 6538ac30

由 LABBE Corentin 提交于 8月 16, 2017

When building a random powerpc kernel I hit this build error:

  arch/powerpc/platforms/powernv/opal-imc.c:130:13: error : assignment
  discards « const » qualifier from pointer target type
  [-Werror=discarded-qualifiers]
     l_cpumask = cpumask_of_node(nid);
             ^

This happens because when CONFIG_NUMA=n cpumask_of_node() returns a
const pointer.

This patch simply adds const to l_cpumask to fix this issue.
Signed-off-by: NCorentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Flesh out change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6538ac30

24 8月, 2017 1 次提交

powerpc/powernv: Enable removal of memory for in memory tracing · 9d5171a8

由 Rashmica Gupta 提交于 6月 01, 2017

The hardware trace macro feature requires access to a chunk of real
memory. This patch provides a debugfs interface to do this. By
writing an integer containing the size of memory to be unplugged into
/sys/kernel/debug/powerpc/memtrace/enable, the code will attempt to
remove that much memory from the end of each NUMA node.

This patch also adds additional debugsfs files for each node that
allows the tracer to interact with the removed memory, as well as
a trace file that allows userspace to read the generated trace.

Note that this patch does not invoke the hardware trace macro, it
only allows memory to be removed during runtime for the trace macro
to utilise.
Signed-off-by: NRashmica Gupta <rashmica.g@gmail.com>
[mpe: Minor formatting etc fixups]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9d5171a8

23 8月, 2017 1 次提交

powerpc: Convert to using %pOF instead of full_name · b7c670d6

由 Rob Herring 提交于 8月 21, 2017

Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: NRob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: Scott Wood <oss@buserror.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
Reviewed-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b7c670d6

17 8月, 2017 1 次提交

powerpc: Add const to bin_attribute structures · 8bfa42ab

由 Bhumika Goyal 提交于 8月 02, 2017

Declare bin_attribute structures as const as they are only passed as an
argument to the function sysfs_create_bin_file. This argument is of
type const, so declare the structure as const.
Signed-off-by: NBhumika Goyal <bhumirks@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8bfa42ab

10 8月, 2017 3 次提交

powerpc/powernv: Add support to clear sensor groups data · bf957155

由 Shilpasri G Bhat 提交于 8月 10, 2017

Adds support for clearing different sensor groups. OCC inband sensor
groups like CSM, Profiler, Job Scheduler can be cleared using this
driver. The min/max of all sensors belonging to these sensor groups
will be cleared.
Signed-off-by: NShilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bf957155

powerpc/powernv: Add support to set power-shifting-ratio · 8e84b2d1

由 Shilpasri G Bhat 提交于 8月 10, 2017

This patch adds support to set power-shifting-ratio which hints the
firmware how to distribute/throttle power between different entities
in a system (e.g CPU v/s GPU). This ratio is used by OCC for power
capping algorithm.
Signed-off-by: NShilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8e84b2d1

powerpc/powernv: Add support for powercap framework · cb8b340d

由 Shilpasri G Bhat 提交于 8月 10, 2017

Adds a generic powercap framework to change the system powercap
inband through OPAL-OCC command/response interface.
Signed-off-by: NShilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

cb8b340d

08 8月, 2017 3 次提交

powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails · 785a12af

由 Gautham R. Shenoy 提交于 8月 08, 2017

Currently, we use the opal call opal_slw_set_reg() to inform the
Sleep-Winkle Engine (SLW) to restore the contents of some of the
Hypervisor state on wakeup from deep idle states that lose full
hypervisor context (characterized by the flag
OPAL_PM_LOSE_FULL_CONTEXT).

However, the current code has a bug in that if opal_slw_set_reg()
fails, we don't disable the use of these deep states (winkle on
POWER8, stop4 onwards on POWER9).

This patch fixes this bug by ensuring that if programing the
sleep-winkle engine to restore the hypervisor states in
pnv_save_sprs_for_deep_states() fails, then we exclude such states by
clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from
supported_cpuidle_states. As a result POWER8 will be prevented from
using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to
the default stop state when available.

Further, we ensure in the initialization of the cpuidle-powernv driver
to only include those states whose flags are present in
supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT
states when they have been disabled due to stop-api failure.

Fixes: 1e1601b3 ("powerpc/powernv/idle: Restore SPRs for deep idle
states via stop API.")
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

785a12af

powerpc/powernv: Use darn instruction for get_random_seed() on Power9 · e66ca3db

由 Matt Brown 提交于 8月 04, 2017

This adds powernv_get_random_darn() which utilises the darn instruction,
introduced in ISA v3.0/POWER9.

The darn instruction can potentially return an error, which is supported
by the get_random_seed() API, in normal usage if we see an error we just
return that to the caller.

However when detecting whether darn is functional at boot we try up to
10 times, before deciding that darn doesn't work and failing the
registration of get_random_seed(). That way an intermittent failure
at boot doesn't deprive the system of randomness until the next reboot.
Signed-off-by: NMatt Brown <matthew.brown.dev@gmail.com>
[mpe: Move init into a function, tweak change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

e66ca3db

powerpc/powernv: Enable PCI peer-to-peer · 25529100

由 Frederic Barrat 提交于 8月 04, 2017

P9 has support for PCI peer-to-peer, enabling a device to write in the
MMIO space of another device directly, without interrupting the CPU.

This patch adds support for it on powernv, by adding a new API to be
called by drivers. The pnv_pci_set_p2p(...) call configures an
'initiator', i.e the device which will issue the MMIO operation, and a
'target', i.e. the device on the receiving side.

P9 really only supports MMIO stores for the time being but that's
expected to change in the future, so the API allows to define both
load and store operations.

  /* PCI p2p descriptor */
  #define OPAL_PCI_P2P_ENABLE           0x1
  #define OPAL_PCI_P2P_LOAD             0x2
  #define OPAL_PCI_P2P_STORE            0x4

  int pnv_pci_set_p2p(struct pci_dev *initiator, struct pci_dev *target,
                      u64 desc)

It uses a new OPAL call, as the configuration magic is done on the
PHBs by skiboot.
Signed-off-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: NRussell Currey <ruscur@russell.cc>
[mpe: Drop unrelated OPAL calls, s/uint64_t/u64/, minor formatting]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

25529100

01 8月, 2017 1 次提交

powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug · 24be85a2

由 Gautham R. Shenoy 提交于 7月 21, 2017

Currently we use the stop-api provided by the firmware to program the
SLW engine to restore the values of hypervisor resources that get lost
on deeper idle states (such as winkle). Since the deep states were
only used for CPU-Hotplug on POWER8 systems, we would program the LPCR
to have the PECE1 bit since Hotplugged CPUs shouldn't be spuriously
woken up by decrementer.

On POWER9, some of the deep platform idle states such as stop4 can be
used in cpuidle as well. In this case, we want the CPU in stop4 to be
woken up by the decrementer when some timer on the CPU expires.

In this patch, we program the stop-api for LPCR with PECE1
bit cleared only when we are offlining the CPU and set it
back once the CPU is online.
Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

24be85a2

28 7月, 2017 1 次提交

powerpc/powernv/pci: Return failure for some uses of dma_set_mask() · 253fd51e

由 Alistair Popple 提交于 7月 26, 2017

Commit 8e3f1b1d ("powerpc/powernv/pci: Enable 64-bit devices to access
>4GB DMA space") introduced the ability for PCI device drivers to request a
DMA mask between 64 and 32 bits and actually get a mask greater than
32-bits. However currently if certain machine configuration dependent
conditions are not meet the code silently falls back to a 32-bit mask.

This makes it hard for device drivers to detect which mask they actually
got. Instead we should return an error when the request could not be
fulfilled which allows drivers to either fallback or implement other
workarounds as documented in DMA-API-HOWTO.txt.
Signed-off-by: NAlistair Popple <alistair@popple.id.au>
Acked-by: NRussell Currey <ruscur@russell.cc>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

253fd51e

25 7月, 2017 2 次提交

powerpc/perf: Add nest IMC PMU support · 885dcd70

由 Anju T Sudhakar 提交于 7月 19, 2017

Add support to register Nest In-Memory Collection PMU counters.
Patch adds a new device file called "imc-pmu.c" under powerpc/perf
folder to contain all the device PMU functions.

Device tree parser code added to parse the PMU events information
and create sysfs event attributes for the PMU.

Cpumask attribute added along with Cpu hotplug online/offline functions
specific for nest PMU. A new state "CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE"
added for the cpu hotplug callbacks. Error handle path frees the memory
and unregisters the CPU hotplug callbacks.
Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: NHemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

885dcd70

powerpc/powernv: Detect and create IMC device · 8f95faaa

由 Madhavan Srinivasan 提交于 7月 19, 2017

Code to create platform device for the In-Memory Collection (IMC)
counters. Platform devices are created based on the IMC compatibility.
New header file created to contain the data structures and macros
needed for In-Memory Collection (IMC) counter pmu devices.

The device tree for IMC counters starts at the node "imc-counters".
This node contains all the IMC PMU nodes and event nodes for these IMC
PMUs. Device probe() parses the device to locate three possible IMC
device types (Nest/Core/Thread). Function then branch to parse each
unit nodes to populate vital information such as device memory sizes,
event nodes information, base address for reserve memory access (if
any) and so on. Simple bare-minimum shutdown function added which only
"stops" the engines.
Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: NHemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Fix build with CONFIG_PERF_EVENTS=n]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8f95faaa

24 7月, 2017 3 次提交

powerpc/powernv: Add IMC OPAL APIs · 28a5db00

由 Madhavan Srinivasan 提交于 7月 19, 2017

In-Memory Collection (IMC) counters are performance monitoring
infrastructure. These counters need special sequence of SCOMs to
init/start/stop which is handled by OPAL. And OPAL provides three APIs
to init and control these IMC engines.

OPAL API documentation:
  https://github.com/open-power/skiboot/blob/master/doc/opal-api/opal-imc-counters.rst

Patch updates the kernel side powernv platform code to support the new
OPAL APIs
Signed-off-by: NHemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

28a5db00

powerpc/powernv: use memdup_user · 5588b29a

由 Geliang Tang 提交于 4月 29, 2017

Use memdup_user() helper instead of open-coding to simplify the code.
Signed-off-by: NGeliang Tang <geliangtang@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5588b29a

powerpc/powernv: Get cpu only after validity check · 76d98ab4

由 Santosh Sivaraj 提交于 7月 04, 2017

Check for validity of cpu before calling get_hard_smp_processor_id().

Found with coverity.
Signed-off-by: NSantosh Sivaraj <santosh@fossix.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

76d98ab4

17 7月, 2017 1 次提交

powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores() · a70b487b

由 Michael Ellerman 提交于 7月 17, 2017

In commit 1c0eaf0f ("powerpc/powernv: Tell OPAL about our MMU mode
on POWER9"), we added additional flags to the OPAL call to configure
CPUs at boot.

These flags only work on Power9 firmwares, and worse can cause boot
failures on Power8 machines, so we check for CPU_FTR_ARCH_300 (aka POWER9)
before adding the extra flags.

Unfortunately we forgot that opal_configure_cores() is called before
the CPU feature checks are dynamically patched, meaning the check
always returns true.

We definitely need to do something to make the CPU feature checks less
prone to bugs like this, but for now the minimal fix is to use
early_cpu_has_feature().
Reported-and-tested-by: NAbdul Haleem <abdhalee@linux.vnet.ibm.com>
Fixes: 1c0eaf0f ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9")
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a70b487b

10 7月, 2017 1 次提交

powerpc/powernv: Tell OPAL about our MMU mode on POWER9 · 1c0eaf0f

由 Benjamin Herrenschmidt 提交于 6月 30, 2017

That will allow OPAL to configure the CPU in an optimal way.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

1c0eaf0f

28 6月, 2017 2 次提交

powerpc/smp: Convert NR_CPUS to nr_cpu_ids · c642af9c

由 Santosh Sivaraj 提交于 6月 27, 2017

nr_cpu_ids can be limited by nr_cpus boot parameter, whereas NR_CPUS is a
compile time constant, which shouldn't be compared against during cpu kick.
Signed-off-by: NSantosh Sivaraj <santosh@fossix.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

c642af9c

powerpc/smp: Do not BUG_ON if invalid CPU during kick · f8d0d5dc

由 Santosh Sivaraj 提交于 6月 27, 2017

During secondary start, we do not need to BUG_ON if an invalid CPU number
is passed. We already print an error if secondary cannot be started, so
just return an error instead.
Signed-off-by: NSantosh Sivaraj <santosh@fossix.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f8d0d5dc

27 6月, 2017 5 次提交

powerpc/powernv/pci: Enable 64-bit devices to access >4GB DMA space · 8e3f1b1d

由 Russell Currey 提交于 6月 21, 2017

On PHB3/POWER8 systems, devices can select between two different sections
of address space, TVE#0 and TVE#1. TVE#0 is intended for 32bit devices
that aren't capable of addressing more than 4GB. Selecting TVE#1 instead,
with the capability of addressing over 4GB, is performed by setting bit 59
of a PCI address.

However, some devices aren't capable of addressing at least 59 bits, but
still want more than 4GB of DMA space. In order to enable this, reconfigure
TVE#0 to be suitable for 64-bit devices by allocating memory past the
initial 4GB that is inaccessible by 64-bit DMAs.

This bypass mode is only enabled if a device requests 4GB or more of DMA
address space, if the system has PHB3 (POWER8 systems), and if the device
does not share a PE with any devices from different vendors.
Signed-off-by: NRussell Currey <ruscur@russell.cc>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8e3f1b1d

powerpc/powernv/pci: Add helper to check if a PE has a single vendor · a0f98629

由 Russell Currey 提交于 6月 21, 2017

Add a helper that determines if all the devices contained in a given PE
are all from the same vendor or not.  This can be useful in determining
if it's okay to make PE-wide changes that may be suitable for some
devices but not for others.

This is used later in the series.
Signed-off-by: NRussell Currey <ruscur@russell.cc>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a0f98629

powerpc/powernv/pci: Add support for PHB4 diagnostics · a4b48ba9

由 Russell Currey 提交于 6月 14, 2017

As with P7IOC and PHB3, add kernel-side support for decoding and printing
diagnostic data for PHB4.
Signed-off-by: NRussell Currey <ruscur@russell.cc>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a4b48ba9

powerpc/powernv/pci: Dynamically allocate PHB diag data · 5cb1f8fd

由 Russell Currey 提交于 6月 14, 2017

Diagnostic data for PHBs currently works by allocated a fixed-sized buffer.
This is simple, but either wastes memory (though only a few kilobytes) or
in the case of PHB4 isn't enough to fit the whole data blob.

For machines that don't describe the diagnostic data size in the device
tree, use the hardcoded buffer size as before.  For those that do, only
allocate exactly what's needed.

In the special case of P7IOC (which has two types of diag data), the larger
should be specified in the device tree.
Signed-off-by: NRussell Currey <ruscur@russell.cc>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5cb1f8fd

powerpc/powernv/pci: Reduce spam when dumping PEST · 31bbd45a

由 Russell Currey 提交于 6月 14, 2017

Dumping the PE State Tables (PEST) can be highly verbose if a number of PEs
are affected, especially in the case where the whole PHB is frozen and 512
lines get printed.  Check for duplicates when dumping the PEST to reduce
useless output.

For example:

    PE[0f8] A/B: 9700002600000000 80000080d00000f8
    PE[0f9] A/B: 8000000000000000 0000000000000000
    PE[..0fe] A/B: as above
    PE[0ff] A/B: 8440002b00000000 0000000000000000

instead of:

    PE[0f8] A/B: 9700002600000000 80000080d00000f8
    PE[0f9] A/B: 8000000000000000 0000000000000000
    PE[0fa] A/B: 8000000000000000 0000000000000000
    PE[0fb] A/B: 8000000000000000 0000000000000000
    PE[0fc] A/B: 8000000000000000 0000000000000000
    PE[0fd] A/B: 8000000000000000 0000000000000000
    PE[0fe] A/B: 8000000000000000 0000000000000000
    PE[0ff] A/B: 8440002b00000000 0000000000000000

and you can imagine how much worse it can get for 512 PEs.
Signed-off-by: NRussell Currey <ruscur@russell.cc>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

31bbd45a

22 6月, 2017 1 次提交

powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD · bbd5ff50

由 Alistair Popple 提交于 6月 20, 2017

NPU2 requires an extra explicit flush to an active GPU PID when
sending address translation shoot downs (ATSDs) to reliably flush the
GPU TLB. This patch adds just such a flush at the end of each sequence
of ATSDs.

We can safely use PID 0 which is always reserved and active on the
GPU. PID 0 is only used for init_mm which will never be a user mm on
the GPU. To enforce this we add a check in pnv_npu2_init_context()
just in case someone tries to use PID 0 on the GPU.
Signed-off-by: NAlistair Popple <alistair@popple.id.au>
[mpe: Use true/false for bool literals]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bbd5ff50

19 6月, 2017 4 次提交

powerpc/64s/idle: Run latch switch is done with MSR[EE]=0 · 40d24343

由 Nicholas Piggin 提交于 6月 13, 2017

In the idle sleep/wake code we know that MSR[EE] is clear, so we can
avoid 2 x mfmsr and 2 x mtmsr by calling the double-underscore
versions of the run latch routines which assume interrupts are already
disabled.
Acked-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

40d24343

powerpc/64s/idle: Process interrupts from system reset wakeup · 771d4304

由 Nicholas Piggin 提交于 6月 13, 2017

When the CPU wakes from low power state, it begins at the system reset
interrupt with the exception that caused the wakeup encoded in SRR1.

Today, powernv idle wakeup ignores the wakeup reason (except a special
case for HMI), and the regular interrupt corresponding to the
exception will fire after the idle wakeup exits.

Change this to replay the interrupt from the idle wakeup before
interrupts are hard-enabled.

Test on POWER8 of context_switch selftests benchmark with polling idle
disabled (e.g., always nap, giving cross-CPU IPIs) gives the following
results:

                                original         wakeup direct
Different threads, same core:   315k/s           264k/s
Different cores:                235k/s           242k/s

There is a slowdown for doorbell IPI (same core) case because system
reset wakeup does not clear the message and the doorbell interrupt
fires again needlessly.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

771d4304

powerpc/powernv: Simplify lazy IRQ handling in CPU offline · 2525db04

由 Nicholas Piggin 提交于 6月 13, 2017

Rather than concern ourselves with any soft-mask logic in the CPU
hotplug handler, just hard disable interrupts. This ensures there
are no lazy-irqs pending, which means we can call directly to idle
instruction in order to sleep.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

2525db04

powerpc/64s/idle: Move soft interrupt mask logic into C code · 2201f994

由 Nicholas Piggin 提交于 6月 13, 2017

This simplifies the asm and fixes irq-off tracing over sleep
instructions.

Also move powersave_nap check for POWER8 into C code, and move
PSSCR register value calculation for POWER9 into C.
Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

2201f994

14 6月, 2017 1 次提交

powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node · 377aa6b0

由 Alistair Popple 提交于 6月 14, 2017

Commit 4c3b89ef ("powerpc/powernv: Add sanity checks to
pnv_pci_get_{gpu|npu}_dev") introduced explicit warnings in
pnv_pci_get_npu_dev() when a PCIe device has no associated device-tree
node. However not all PCIe devices have an of_node and
pnv_pci_get_npu_dev() gets indirectly called at least once for every
PCIe device in the system. This results in spurious WARN_ON()'s so
remove it.

The same situation should not exist for pnv_pci_get_gpu_dev() as any
NPU based PCIe device requires a device-tree node.

Fixes: 4c3b89ef ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu|npu}_dev")
Reported-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NAlistair Popple <alistair@popple.id.au>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

377aa6b0

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功