Commits · c90fca951e90ba470a3dc6087667edffcf8db21b · openanolis / cloud-kernel

05 Jun, 2018 · 1 commit

powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted" · b0c4acb1

Committed by Colin Ian King on May 26, 2018

Trivial fix to spelling mistake in hmi_error_types text
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

04 Jun, 2018 · 1 commit

powerpc/powernv: copy/paste - Mask SO bit in CR · 75743649

Committed by Haren Myneni on Jun 04, 2018

NX can set the 3rd bit in CR register for XER[SO] (Summary overflow)
which is not related to paste request. The current paste function
returns failure for a successful request when this bit is set. So mask
this bit and check the proper return status.
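
The masking described above can be sketched in plain C. This is only an illustration, not the kernel's code: the helper name `paste_status`, the macro names, and the assumption that the status sits in the CR0 field (the top four bits of the 32-bit CR, ordered LT, GT, EQ, SO) are ours.

```c
#include <stdint.h>

/* CR0 is assumed to be the most significant 4-bit field of the 32-bit CR. */
#define CR0_SHIFT 28u
/* Field layout, high to low: LT, GT, EQ, SO. Keep LT/GT/EQ, drop SO. */
#define CR0_STATUS_MASK 0xeu

/* Hypothetical helper: extract the status from a CR value, ignoring SO. */
static inline unsigned int paste_status(uint32_t cr)
{
	return (cr >> CR0_SHIFT) & CR0_STATUS_MASK;
}
```

With this mask a CR0 field of 0b0011 and one of 0b0010 produce the same status, so a stray SO bit no longer changes the result the caller checks.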

Fixes: 2392c8c8 ("powerpc/powernv/vas: Define copy/paste interfaces")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

03 Jun, 2018 · 8 commits

powerpc/64s: Enable barrier_nospec based on firmware settings · cb3d6759

Committed by Michal Suchanek on Apr 24, 2018

Check what firmware told us and enable/disable the barrier_nospec as
appropriate.

We err on the side of enabling the barrier, as it's a no-op on older
systems, see the comment for more detail.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/perf: Unregister thread-imc if core-imc not supported · 25af86b2

Committed by Anju T Sudhakar on May 22, 2018

Since thread-imc internally uses the core-imc hardware infrastructure
and depends on it, having thread-imc in the kernel in the
absence of core-imc is pointless. This patch disables thread-imc if
core-imc is not registered.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/perf: Rearrange memory freeing in imc init · cb094fa5

Committed by Anju T Sudhakar on May 22, 2018

When any of the IMC (In-Memory Collection counter) devices fail
to initialize, imc_common_mem_free() frees a set of memory. In doing so,
the pmu_ptr pointer is also freed. But pmu_ptr is used in a subsequent
function (imc_common_cpuhp_mem_free()), which is wrong. This patch reorders
the code to avoid such access.

Also free the memory which is dynamically allocated during imc
initialization, wherever required.
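
The ordering problem can be illustrated with a small user-space sketch. Every name here (`demo_pmu`, `demo_cpuhp_teardown`, `demo_mem_free`, `demo_init_failed`) is hypothetical and only stands in for the kernel functions named above. The point is that the step which still reads the descriptor runs before the step that frees it.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the PMU descriptor; not the kernel's struct. */
struct demo_pmu {
	int *events;           /* dynamically allocated during init */
	int cpuhp_registered;  /* state the teardown step still needs */
};

/* Teardown step that reads and writes fields of the descriptor. */
static void demo_cpuhp_teardown(struct demo_pmu *pmu)
{
	pmu->cpuhp_registered = 0;
}

/* Release the descriptor and everything it owns. */
static void demo_mem_free(struct demo_pmu *pmu)
{
	free(pmu->events);
	free(pmu);
}

/* Error path: use the pointer first, free it last. */
static void demo_init_failed(struct demo_pmu *pmu)
{
	demo_cpuhp_teardown(pmu);  /* touches *pmu, so it must run before the free */
	demo_mem_free(pmu);        /* pmu is invalid after this call */
}
```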
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: use time64_t in read_persistent_clock · 5bfd6435

Committed by Arnd Bergmann on Apr 23, 2018

Looking through the remaining users of the deprecated mktime()
function, I found the powerpc rtc handlers, which use it in
place of rtc_tm_to_time64().

To clean this up, I'm changing over the read_persistent_clock()
function to the read_persistent_clock64() variant, and change
all the platform specific handlers along with it.
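
To see why a 64-bit result matters, here is a user-space sketch of a calendar-to-seconds conversion that returns a 64-bit value. The name `demo_mktime64` is hypothetical and the code is an illustration built on the usual closed-form Gregorian arithmetic, not a copy of any kernel function.

```c
#include <stdint.h>

/*
 * Convert a Gregorian date and time (UTC) to seconds since 1970-01-01.
 * The result is 64-bit, so it does not wrap in January 2038.
 */
static int64_t demo_mktime64(unsigned int year, unsigned int mon,
			     unsigned int day, unsigned int hour,
			     unsigned int min, unsigned int sec)
{
	int m = (int)mon - 2;  /* count months from March so February comes last */
	int64_t y = year;
	int64_t days;

	if (m <= 0) {
		m += 12;
		y -= 1;
	}

	/* leap days up to year y, plus days contributed by the months and day */
	days = y / 4 - y / 100 + y / 400 + 367 * m / 12 + day;
	days += y * 365 - 719499;

	return ((days * 24 + hour) * 60 + min) * 60 + sec;
}
```

For 2038-01-19 03:14:08 UTC this returns 2147483648, one more than a signed 32-bit counter can hold.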
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action · 19df3958

Committed by Alastair D'Silva on May 11, 2018

The function removes the process element from NPU cache.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: process all OPAL event interrupts with kopald · 56c0b48b

Committed by Nicholas Piggin on May 11, 2018

Using irq_work for processing OPAL event interrupts is not necessary.
irq_work is typically used to schedule work from NMI context, a
softirq may be more appropriate. However OPAL events are not
particularly performance or latency critical, so they can all be
invoked by kopald.

This patch removes the irq_work queueing, and instead wakes up
kopald when there is an event to be processed. kopald processes
interrupts individually, enabling irqs and calling cond_resched
between each one to minimise latencies.

Event handlers themselves should still use threaded handlers,
workqueues, etc. as necessary to avoid high interrupts-off latencies
within any single interrupt.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: call OPAL_QUIESCE before OPAL_SIGNAL_SYSTEM_RESET · ee03b9b4

Committed by Nicholas Piggin on May 10, 2018

Although it is often possible to recover a CPU that was interrupted
from OPAL with a system reset NMI, it's undesirable to interrupt them
for a few reasons. Firstly because dump/debug code itself needs to
call firmware, so it could hang on a lock or possibly corrupt a
per-cpu data structure if it or another CPU was interrupted from
OPAL. Secondly, the kexec crash dump code will not return from
interrupt to unwind the OPAL call.

Call OPAL_QUIESCE with QUIESCE_HOLD before sending an NMI IPI to
another CPU, which waits for it to leave firmware (or time out) to
avoid this problem in normal conditions. Firmware bugs may still
result in a timeout and interrupting OPAL, but that is the best
option (stops the CPU, and possibly allows firmware to be debugged).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv/ioda2: Remove redundant free of TCE pages · 98fd72fe

Committed by Alexey Kardashevskiy on May 30, 2018

When IODA2 creates a PE, it creates an IOMMU table with it_ops::free
set to pnv_ioda2_table_free() which calls pnv_pci_ioda2_table_free_pages().

Since iommu_tce_table_put() calls it_ops::free when the last reference
to the table is released, explicit call to pnv_pci_ioda2_table_free_pages()
is not needed so let's remove it.
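
The pattern is easier to see in a reduced form. The sketch below uses hypothetical names (`demo_table`, `demo_table_ops`, `demo_table_put`) and plain `malloc`/`free`. It is not the IOMMU code, only an illustration of a release hook that runs when the last reference is dropped.

```c
#include <stdlib.h>

struct demo_table;

/* Hypothetical ops structure with a single release hook. */
struct demo_table_ops {
	void (*free)(struct demo_table *tbl);
};

struct demo_table {
	int refcount;
	void *pages;                       /* backing storage */
	const struct demo_table_ops *ops;
};

/* Release hook: frees the backing pages, then the table itself. */
static void demo_table_free(struct demo_table *tbl)
{
	free(tbl->pages);
	free(tbl);
}

static const struct demo_table_ops demo_ops = { .free = demo_table_free };

/* Drop one reference; the last put runs ops->free. Returns 1 if freed. */
static int demo_table_put(struct demo_table *tbl)
{
	if (--tbl->refcount > 0)
		return 0;
	tbl->ops->free(tbl);
	return 1;
}
```

A caller that frees `pages` by hand and then drops the last reference would have the hook free the same memory a second time, which is the kind of double free the commit message describes.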

This should fix a double free in the case of PCI hotplug as
pnv_pci_ioda2_table_free_pages() resets neither
iommu_table::it_base nor ::it_size.

This was not exposed by SRIOV as it uses different code path via
pnv_pcibios_sriov_disable().

IODA1 does not initialize it_ops::free so it does not have this issue.

Fixes: c5f7700b ("powerpc/powernv: Dynamically release PE")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

28 May, 2018 · 1 commit

powerpc/powernv/cpuidle: Init all present cpus for deep states · ac9816dc

Committed by Akshay Adiga on May 16, 2018

Init all present cpus for deep states instead of "all possible" cpus.
Init fails if a possible cpu is guarded, which leaves only
non-deep states available for cpuidle/hotplug.

Stewart says, this means that for single threaded workloads, if you
guard out a CPU core you'll not get WoF (Workload Optimised
Frequency), which means that performance goes down when you wouldn't
expect it to.

Fixes: 77b54e9f ("powernv/powerpc: Add winkle support for offline cpus")
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

24 May, 2018 · 1 commit

powerpc/reg: Add TEXASR related macros · ab3759b5

Committed by Simon Guo on May 23, 2018

This patch adds some macros for CR0/TEXASR bits so that PR KVM TM
logic (tbegin./treclaim./tabort.) can make use of them later.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

22 May, 2018 · 1 commit

powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit · a048a07d

Committed by Nicholas Piggin on May 22, 2018

On some CPUs we can prevent a vulnerability related to store-to-load
forwarding by preventing store forwarding between privilege domains,
by inserting a barrier in kernel entry and exit paths.

This is known to be the case on at least Power7, Power8 and Power9
powerpc CPUs.

Barriers must be inserted generally before the first load after moving
to a higher privilege, and after the last store before moving to a
lower privilege. HV and PR privilege transitions must be protected.

Barriers are added as patch sections, with all kernel/hypervisor entry
points patched, and the exit points to lower privilege levels patched
similarly to the RFI flush patching.

Firmware advertisement is not implemented yet, so CPU flush types
are hard coded.

Thanks to Michal Suchánek for bug fixes and review.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

21 May, 2018 · 1 commit

powernv: opal-sensor: Add support to read 64bit sensor values · 5cdcb01e

Committed by Shilpasri G Bhat on May 07, 2018

This patch adds support to read 64-bit sensor values. This method is
used to read energy sensors and counters which are of type u64.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

18 May, 2018 · 2 commits

powerpc/powernv: Use __raw_[rm_]writeq_be() in npu-dma.c · c786cf76

Committed by Michael Ellerman on May 14, 2018

This allows us to squash some sparse warnings and also avoids having
to do explicit endian conversions in the code.
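
As background on what a big-endian 64-bit store means, here is a portable user-space sketch. `demo_store_be64` is a hypothetical helper that writes into an ordinary byte buffer. It is not the kernel accessor named in the title, which targets MMIO.

```c
#include <stdint.h>

/* Store a 64-bit value most-significant byte first, whatever the host order. */
static void demo_store_be64(uint8_t *dst, uint64_t val)
{
	for (int i = 0; i < 8; i++)
		dst[i] = (uint8_t)(val >> (56 - 8 * i));
}
```

An accessor that fixes the byte order internally lets callers pass native values and drop the explicit conversion at each call site, which is the cleanup the message refers to.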
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>

powerpc/powernv: Use __raw_[rm_]writeq_be() in pci-ioda.c · 001ff2ee

Committed by Michael Ellerman on May 14, 2018

This allows us to squash some sparse warnings and also avoids having
to do explicit endian conversions in the code.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>

17 May, 2018 · 2 commits

powerpc/powernv: Fix NVRAM sleep in invalid context when crashing · c1d2a313

Committed by Nicholas Piggin on May 15, 2018

Similarly to opal_event_shutdown, opal_nvram_write can be called in
the crash path with irqs disabled. Special case the delay to avoid
sleeping in invalid context.

Fixes: 3b807033 ("powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops")
Cc: stable@vger.kernel.org # v3.2
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled · c0beffc4

Committed by Nicholas Piggin on May 15, 2018

A kernel crash in process context that calls emergency_restart from
panic will end up calling opal_event_shutdown with interrupts disabled
but not in interrupt. This causes a sleeping function to be called
which gives the following warning with sysrq+c:

    Rebooting in 10 seconds..
    BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
    in_atomic(): 0, irqs_disabled(): 1, pid: 7669, name: bash
    CPU: 20 PID: 7669 Comm: bash Tainted: G      D W         4.17.0-rc5+ #3
    Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    ___might_sleep+0x174/0x1a0
    mutex_lock+0x38/0xb0
    __free_irq+0x68/0x460
    free_irq+0x70/0xc0
    opal_event_shutdown+0xb4/0xf0
    opal_shutdown+0x24/0xa0
    pnv_shutdown+0x28/0x40
    machine_shutdown+0x44/0x60
    machine_restart+0x28/0x80
    emergency_restart+0x30/0x50
    panic+0x2a0/0x328
    oops_end+0x1ec/0x1f0
    bad_page_fault+0xe8/0x154
    handle_page_fault+0x34/0x38
    --- interrupt: 300 at sysrq_handle_crash+0x44/0x60
    LR = __handle_sysrq+0xfc/0x260
    flag_spec.62335+0x12b844/0x1e8db4 (unreliable)
    __handle_sysrq+0xfc/0x260
    write_sysrq_trigger+0xa8/0xb0
    proc_reg_write+0xac/0x110
    __vfs_write+0x6c/0x240
    vfs_write+0xd0/0x240
    ksys_write+0x6c/0x110

Fixes: 9f0fd049 ("powerpc/powernv: Add a virtual irqchip for opal events")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

14 May, 2018 · 2 commits

powerpc/ioda: Use ibm,supported-tce-sizes for IOMMU page size mask · 7ef73cd3

Committed by Alexey Kardashevskiy on May 14, 2018

At the moment we assume that IODA2 and newer PHBs can always do 4K/64K/16M
IOMMU pages, however this is not the case for POWER9 and now skiboot
advertises the supported sizes via the device tree so we use that instead
of hard coding the mask.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Fix memtrace build when NUMA=n · 8ccb442d

Committed by Michael Ellerman on May 10, 2018

Currently memtrace doesn't build if NUMA=n:

  In function ‘memtrace_alloc_node’:
  arch/powerpc/platforms/powernv/memtrace.c:134:6:
  error: the address of ‘contig_page_data’ will always evaluate as ‘true’
    if (!NODE_DATA(nid) || !node_spanned_pages(nid))
        ^

This is because for NUMA=n NODE_DATA(nid) points to an always
allocated structure, contig_page_data.

But even in the NUMA=y case memtrace_alloc_node() is only called for
online nodes, and we should always have a NODE_DATA() allocated for an
online node. So remove the (hopefully) overly paranoid check, which
also means we can build when NUMA=n.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

07 May, 2018 · 1 commit

Revert "powerpc/powernv: Increase memory block size to 1GB on radix" · 7acf50e4

Committed by Balbir Singh on May 01, 2018

This commit was a stop-gap to prevent crashes on hotunplug, caused by
the mismatch between the 1G mappings used for the linear mapping and the
memory block size. Those issues are now resolved because we split the
linear mapping at hotunplug time if necessary, as implemented in commit
4dd5f8a9 ("powerpc/mm/radix: Split linear mapping on hot-unplug").
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Rashmica Gupta <rashmica.g@gmail.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

25 Apr, 2018 · 1 commit

rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops · 682e6b4d

Committed by Nicholas Piggin on Apr 10, 2018

The OPAL RTC driver does not sleep in case it gets OPAL_BUSY or
OPAL_BUSY_EVENT from firmware, which causes large scheduling
latencies; up to 50 seconds have been observed here when RTC stops
responding (BMC reboot can do it).

Fix this by converting it to the standard form OPAL_BUSY loop that
sleeps.
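
The shape of such a retry loop can be sketched in user-space C. Everything below is an assumption made for illustration: the `demo_*` names, the numeric return codes, and the callback-based delay and poll hooks. In a driver the delay would be a sleep, which is what keeps the loop from monopolising the CPU.

```c
/* Illustrative return codes; the names echo the text, the values are assumed. */
enum demo_rc {
	DEMO_SUCCESS    = 0,
	DEMO_BUSY       = -2,
	DEMO_BUSY_EVENT = -12,
};

/*
 * Retry a firmware-style call until it stops reporting busy.
 * call():  performs one attempt and returns a demo_rc value.
 * delay(): gives up the CPU between attempts (a sleep in normal context).
 * poll():  lets pending events be processed after DEMO_BUSY_EVENT.
 */
static int demo_busy_loop(int (*call)(void *), void *arg,
			  void (*delay)(void), void (*poll)(void))
{
	int rc = DEMO_BUSY;

	while (rc == DEMO_BUSY || rc == DEMO_BUSY_EVENT) {
		rc = call(arg);
		if (rc == DEMO_BUSY_EVENT) {
			delay();
			poll();
		} else if (rc == DEMO_BUSY) {
			delay();
		}
	}
	return rc;
}

/* Example stub: reports busy twice, then succeeds. arg counts the attempts. */
static int demo_fake_call(void *arg)
{
	int *attempts = arg;

	return (*attempts)++ < 2 ? DEMO_BUSY : DEMO_SUCCESS;
}

/* Example stub used for both delay() and poll(). */
static void demo_nop(void)
{
}
```

Passing the delay as a callback also leaves room for a caller that must not sleep to substitute a busy-wait.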

Fixes: 628daa8d ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

24 Apr, 2018 · 4 commits

powerpc/powernv/npu: Do a PID GPU TLB flush when invalidating a large address range · d0cf9b56

Committed by Alistair Popple on Apr 17, 2018

The NPU has a limited number of address translation shootdown (ATSD)
registers and the GPU has limited bandwidth to process ATSDs. This can
result in contention of ATSD registers leading to soft lockups on some
threads, particularly when invalidating a large address range in
pnv_npu2_mn_invalidate_range().

At some threshold it becomes more efficient to flush the entire GPU
TLB for the given MM context (PID) than individually flushing each
address in the range. This patch will result in ranges greater than
2MB being converted from 32+ ATSDs into a single ATSD which will flush
the TLB for the given PID on each GPU.
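
The threshold decision can be expressed as a small function. This is a sketch under stated assumptions: the names are hypothetical, the 2MB threshold comes from the text, and the 64K page size is inferred from the "32+ ATSDs" figure (2MB / 64K = 32) rather than stated by the commit.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE      (64u * 1024u)         /* assumed 64K pages */
#define DEMO_ATSD_THRESHOLD (2u * 1024u * 1024u)  /* 2MB, from the text */

/* Number of invalidations issued per GPU for a range of `size` bytes. */
static uint64_t demo_atsd_count(uint64_t size)
{
	if (size > DEMO_ATSD_THRESHOLD)
		return 1;  /* one flush of the whole TLB for the PID */

	/* otherwise one invalidation per page, rounding up */
	return (size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}
```

Under these assumptions a range of exactly 2MB still costs 32 invalidations, while anything larger collapses to one.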

Fixes: 1ab66d1f ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters · a1409ada

Committed by Alistair Popple on Apr 11, 2018

There is a single npu context per set of callback parameters. Callers
should be prevented from overwriting existing callback values so
instead return an error if different parameters are passed.

Fixes: 1ab66d1f ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mark Hairgrove <mhairgrove@nvidia.com>
Tested-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv/npu: Add lock to prevent race in concurrent context init/destroy · 28a5933e

Committed by Alistair Popple on Apr 11, 2018

The pnv_npu2_init_context() and pnv_npu2_destroy_context() functions
are used to allocate/free contexts to allow address translation and
shootdown by the NPU on a particular GPU. Context initialisation is
implicitly safe as it is protected by the requirement mmap_sem be held
in write mode, however pnv_npu2_destroy_context() does not require
mmap_sem to be held and it is not safe to call with a concurrent
initialisation for a different GPU.

It was assumed the driver would ensure destruction was not called
concurrently with initialisation. However the driver may be simplified
by allowing concurrent initialisation and destruction for different
GPUs. As npu context creation/destruction is not a performance
critical path and the critical section is not large, a single spinlock
is used for simplicity.

Fixes: 1ab66d1f ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mark Hairgrove <mhairgrove@nvidia.com>
Tested-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv/memtrace: Let the arch hotunplug code flush cache · 7fd6641d

Committed by Balbir Singh on Apr 06, 2018

Don't do this via custom code, instead now that we have support in the
arch hotplug/hotunplug code, rely on those routines to do the right
thing.

The existing flush doesn't work because it uses ppc64_caches.l1d.size
instead of ppc64_caches.l1d.line_size.
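
Why the step size matters can be shown by counting iterations. The helper `demo_flush_ops` is hypothetical, and the figures in the usage note (a 128-byte line, a 32KB cache) are illustrative assumptions, not values taken from the commit.

```c
#include <stddef.h>

/*
 * Count how many cache-line flushes a loop issues over `len` bytes
 * when it advances by `step` bytes per iteration.
 */
static size_t demo_flush_ops(size_t len, size_t step)
{
	size_t ops = 0;

	if (step == 0)
		return 0;  /* avoid an endless loop on a bad step */

	for (size_t off = 0; off < len; off += step)
		ops++;     /* a real loop would flush the line at `off` here */

	return ops;
}
```

Over a 64KB region, stepping by a 128-byte line touches 512 lines, while stepping by a 32KB cache size touches only 2 and leaves the rest unflushed.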

Fixes: 9d5171a8 ("powerpc/powernv: Enable removal of memory for in memory tracing")
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

11 Apr, 2018 · 1 commit

powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops · 3b807033

Committed by Nicholas Piggin on Apr 10, 2018

The OPAL NVRAM driver does not sleep in case it gets OPAL_BUSY or
OPAL_BUSY_EVENT from firmware, which causes large scheduling
latencies, and various lockup errors to trigger (again, BMC reboot
can cause it).

Fix this by converting it to the standard form OPAL_BUSY loop that
sleeps.

Fixes: 628daa8d ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Depends-on: 34dd25de ("powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

07 Apr, 2018 · 1 commit

powerpc/powernv: Create platform devs for nvdimm buses · 3013e173

Committed by Oliver O'Halloran on Apr 06, 2018

Scan the devicetree for an nvdimm-bus compatible and create
a platform device for them.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

03 Apr, 2018 · 2 commits

powerpc/powernv: Always stop secondaries before reboot/shutdown · f2748bdf

Committed by Nicholas Piggin on Apr 01, 2018

Currently powernv reboot and shutdown requests just leave secondaries
to do their own things. This is undesirable because they can trigger
any number of watchdogs while waiting for reboot, but also we don't
know what else they might be doing -- they might be causing trouble,
trampling memory, etc.

The opal scheduled flash update code already ran into watchdog problems
due to flashing taking a long time, and it was fixed with 2196c6f1
("powerpc/powernv: Return secondary CPUs to firmware before FW update"),
which returns secondaries to opal. It's been found that regular reboots
can take over 10 seconds, which can result in the hard lockup watchdog
firing:

  reboot: Restarting system
  [  360.038896709,5] OPAL: Reboot request...
  Watchdog CPU:0 Hard LOCKUP
  Watchdog CPU:44 detected Hard LOCKUP other CPUS:16
  Watchdog CPU:16 Hard LOCKUP
  watchdog: BUG: soft lockup - CPU#16 stuck for 3s! [swapper/16:0]

This patch removes the special case for flash update, and calls
smp_send_stop in all cases before calling reboot/shutdown.

smp_send_stop could return CPUs to OPAL, the main reason not to is
that the request could come from a NMI that interrupts OPAL code,
so re-entry to OPAL can cause a number of problems. Putting
secondaries into simple spin loops improves the chances of a
successful reboot.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Fix SMT4 forcing idle code · a2b5e056

Committed by Nicholas Piggin on Apr 01, 2018

The PSSCR value is not stored to PACA_REQ_PSSCR if the CPU does not
have the XER[SO] bug.

Fix this by storing up-front, outside the workaround code. The initial
test is not required because it is a slow path.

The workaround is made to depend on CONFIG_KVM_BOOK3S_HV_POSSIBLE, to
match pnv_power9_force_smt4_catch() where it is used. Drop the comment
on pnv_power9_force_smt4_catch() as it's no longer true.

Fixes: 7672691a ("powerpc/powernv: Provide a way to force a core into SMT4 mode")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

31 Mar, 2018 · 1 commit

powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplug · 3d4fbffd

Committed by Nicholas Piggin on Nov 18, 2017

Implement a new function to invoke stop, power9_offline_stop, which is
like power9_idle_stop but used by the cpu hotplug code.

Move KVM secondary state manipulation code to the offline case.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

30 Mar, 2018 · 2 commits

powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write() · 741de617

Committed by Nicholas Piggin on Mar 27, 2018

opal_nvram_write currently just assumes success if it encounters an
error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO
on other errors instead.

Fixes: 628daa8d ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/64: Use array of paca pointers and allocate pacas individually · d2e60075

Committed by Nicholas Piggin on Feb 14, 2018

Change the paca array into an array of pointers to pacas. Allocate
pacas individually.

This allows flexibility in where the PACAs are allocated. Future work
will allocate them node-local. Platforms that don't have address limits
on PACAs would be able to defer PACA allocations until later in boot
rather than allocate all possible ones up-front then freeing unused.

This is slightly more overhead (one additional indirection) for cross
CPU paca references, but those aren't too common.
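
The data-structure change can be pictured with a user-space sketch. `demo_paca` and the two helpers are hypothetical stand-ins that use `calloc`. The real allocation happens early in boot with different allocators, so treat this only as an illustration of "array of pointers, each element allocated on its own".

```c
#include <stdlib.h>

/* Hypothetical per-CPU structure standing in for a paca. */
struct demo_paca {
	int cpu;
};

/* Allocate an array of `nr` pointers, then each element individually. */
static struct demo_paca **demo_alloc_pacas(int nr)
{
	struct demo_paca **ptrs;

	if (nr <= 0)
		return NULL;

	ptrs = calloc((size_t)nr, sizeof(*ptrs));
	if (!ptrs)
		return NULL;

	for (int i = 0; i < nr; i++) {
		ptrs[i] = calloc(1, sizeof(*ptrs[i]));
		if (!ptrs[i]) {
			/* undo the elements allocated so far */
			while (i-- > 0)
				free(ptrs[i]);
			free(ptrs);
			return NULL;
		}
		ptrs[i]->cpu = i;
	}
	return ptrs;
}

/* Free every element, then the pointer array. */
static void demo_free_pacas(struct demo_paca **ptrs, int nr)
{
	for (int i = 0; i < nr; i++)
		free(ptrs[i]);
	free(ptrs);
}
```

A cross-CPU access becomes `ptrs[cpu]->field` instead of `array[cpu].field`, which is the one extra indirection the message mentions. In exchange, each element can be placed or deferred independently.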
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

27 Mar, 2018 · 4 commits

powerpc/eeh: Add eeh_state_active() helper · 34a286a4

Committed by Sam Bobroff on Mar 19, 2018

Checking for a "fully active" device state requires testing two flag
bits, which is open coded in several places, so add a function to do
it.
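
A helper of this kind can be sketched as follows. The flag names and bit positions below are assumptions chosen for the example, and `demo_state_active` is a hypothetical name. What matters is that both bits must be set, which a plain `state & mask` truth test would not guarantee.

```c
#include <stdbool.h>

/* Illustrative flag values; only the two-bit test matters here. */
#define DEMO_STATE_MMIO_ACTIVE (1 << 2)
#define DEMO_STATE_DMA_ACTIVE  (1 << 3)

/* True only when both the MMIO and DMA bits are set. */
static inline bool demo_state_active(int state)
{
	const int both = DEMO_STATE_MMIO_ACTIVE | DEMO_STATE_DMA_ACTIVE;

	return (state & both) == both;
}
```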
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv/npu: Do not try invalidating 32bit table when 64bit table is enabled · d41ce7b1

Committed by Alexey Kardashevskiy on Feb 13, 2018

GPUs and the corresponding NVLink bridges get different PEs as they
have separate translation validation entries (TVEs). We put these PEs
to the same IOMMU group so they cannot be passed through separately.
So the iommu_table_group_ops::set_window/unset_window for GPUs do set
tables to the NPU PEs as well which means that iommu_table's list of
attached PEs (iommu_table_group_link) has both GPU and NPU PEs linked.
This list is used for TCE cache invalidation.

The problem is that NPU PE has just a single TVE and can be programmed
to point to 32bit or 64bit windows while GPU PE has two (as any other
PCI device). So we end up having a 32bit iommu_table struct linked to
both PEs even though only the 64bit TCE table cache can be invalidated
on NPU. And a relatively recent skiboot detects this and prints
errors.

This changes GPU's iommu_table_group_ops::set_window/unset_window to
make sure that NPU PE is only linked to the table actually used by the
hardware. If there are two tables used by an IOMMU group, the NPU PE
will use the last programmed one which with the current use scenarios
is expected to be a 64bit one.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Use the security flags in pnv_setup_rfi_flush() · 37c0bdd0

Committed by Michael Ellerman on Mar 27, 2018

Now that we have the security flags we can significantly simplify the
code in pnv_setup_rfi_flush(), because we can use the flags instead of
checking device tree properties and because the security flags have
pessimistic defaults.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Set or clear security feature flags · 77addf6e

Committed by Michael Ellerman on Mar 27, 2018

Now that we have feature flags for security related things, set or
clear them based on what we see in the device tree provided by
firmware.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

23 Mar, 2018 · 1 commit

powerpc/powernv: Provide a way to force a core into SMT4 mode · 7672691a

Committed by Paul Mackerras on Mar 21, 2018

POWER9 processors up to and including "Nimbus" v2.2 have hardware
bugs relating to transactional memory and thread reconfiguration.
One of these bugs has a workaround which is to get the core into
SMT4 state temporarily.  This workaround is only needed when
running bare-metal.

This patch provides a function which gets the core into SMT4 mode
by preventing threads from going to a stop state, and waking up
those which are already in a stop state.  Once at least 3 threads
are not in a stop state, the core will be in SMT4 and we can
continue.

To do this, we add a "dont_stop" flag to the paca to tell the
thread not to go into a stop state.  If this flag is set,
power9_idle_stop() just returns immediately with a return value
of 0.  The pnv_power9_force_smt4_catch() function does the following:

1. Set the dont_stop flag for each thread in the core, except
   ourselves (in fact we use an atomic_inc() in case more than
   one thread is calling this function concurrently).
2. See how many threads are awake, indicated by their
   requested_psscr field in the paca being 0.  If this is at
   least 3, skip to step 5.
3. Send a doorbell interrupt to each thread that was seen as
   being in a stop state in step 2.
4. Until at least 3 threads are awake, scan the threads to which
   we sent a doorbell interrupt and check if they are awake now.

This relies on the following properties:

- Once dont_stop is non-zero, requested_psscr can't go from zero to
  non-zero, except transiently (and without the thread doing stop).
- requested_psscr being zero guarantees that the thread isn't in
  a state-losing stop state where thread reconfiguration could occur.
- Doing stop with a PSSCR value of 0 won't be a state-losing stop
  and thus won't allow thread reconfiguration.
- Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core
  must be in SMT4 mode, since SMT modes are powers of 2.
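
The awake-thread test in step 2 and the last property above can be sketched as a pure function. The names are hypothetical and the per-thread array stands in for the paca fields. The doorbell and dont_stop handling from the other steps is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_THREADS_PER_CORE 4

/*
 * A thread counts as awake when its requested PSSCR value is 0.
 * Once more than half of the threads are awake, the core must be in SMT4.
 */
static bool demo_enough_awake_for_smt4(
	const uint64_t requested_psscr[DEMO_THREADS_PER_CORE])
{
	int awake = 0;

	for (int i = 0; i < DEMO_THREADS_PER_CORE; i++)
		if (requested_psscr[i] == 0)
			awake++;

	return awake >= DEMO_THREADS_PER_CORE / 2 + 1;
}
```

With four threads per core the bound is 3, matching the "at least 3 threads" wording in the steps.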

This does add a sync to power9_idle_stop(), which is necessary to
provide the correct ordering between setting requested_psscr and
checking dont_stop.  The overhead of the sync should be unnoticeable
compared to the latency of going into and out of a stop state.

Because some objected to incurring this extra latency on systems where
the XER[SO] bug is not relevant, I have put the test in
power9_idle_stop inside a feature section.  This means that
pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems
without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will
probably hang the system.

In order to cater for uses where the caller has an operation that
has to be done while the core is in SMT4, the core continues to be
kept in SMT4 after pnv_power9_force_smt4_catch() function returns,
until the pnv_power9_force_smt4_release() function is called.
It undoes the effect of step 1 above and allows the other threads
to go into a stop state.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

20 Mar, 2018 · 1 commit

powerpc: Use sizeof(*foo) rather than sizeof(struct foo) · a0828cf5

Committed by Markus Elfring on Jan 19, 2017

It's slightly less error prone to use sizeof(*foo) rather than
specifying the type.
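
A short user-space example of the idiom, using a hypothetical `demo_node` type and `malloc` in place of the kernel allocators:

```c
#include <stdlib.h>

struct demo_node {
	int id;
	struct demo_node *next;
};

/* The size follows the pointer's type, so it stays right if that type changes. */
static struct demo_node *demo_node_new(int id)
{
	struct demo_node *n = malloc(sizeof(*n));

	if (n) {
		n->id = id;
		n->next = NULL;
	}
	return n;
}
```

If `n` were later changed to point at a different structure, `sizeof(*n)` would follow automatically, whereas a spelled-out `sizeof(struct demo_node)` would silently allocate the wrong size.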
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[mpe: Consolidate into one patch, rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

14 Mar, 2018 · 1 commit

powerpc/vas: Add a couple of trace points · 007bb7d6

Committed by Sukadev Bhattiprolu on Feb 09, 2018

Add a couple of trace points in the VAS driver
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[mpe: Add SPDX tag to new header]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
