Commits · fd5cee7ce8f488768f918e73231d4859a520eb33 · openeuler / raspberrypi-kernel

28 Apr, 2014 7 commits

powerpc/powernv: Reset root port in firmware · fd5cee7c

Authored by Gavin Shan on Apr 24, 2014

Resetting a root port involves more work than resetting a PCIe switch
port, so the root port reset should be done in firmware instead of
in the kernel itself. The problem was introduced by commit 5b2e198e
("powerpc/powernv: Rework EEH reset").

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Fix endless reporting frozen PE · 63796558

Authored by Gavin Shan on Apr 24, 2014

Once a PE has been marked as EEH_PE_ISOLATED, it is either in the
middle of recovery or has been removed permanently, so we needn't
report the frozen PE again. Otherwise, we would endlessly report
the same frozen PE.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
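
The check described in this commit can be sketched in plain user-space C. This is only an illustration under stated assumptions: the struct, the function name, and the flag value are hypothetical stand-ins, and marking the PE as isolated inside the reporting function is an assumption of the sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value; the real EEH_PE_ISOLATED is defined by the kernel. */
#define EEH_PE_ISOLATED (1u << 0)

/* Minimal stand-in for a PE descriptor (not the kernel's struct eeh_pe). */
struct isolated_pe_sketch {
    unsigned int state;   /* bitmask of EEH_PE_* style flags */
    int reports;          /* how many times this PE has been reported */
};

/* Report a frozen PE at most once; returns true if a report was issued. */
bool report_frozen_pe_sketch(struct isolated_pe_sketch *pe)
{
    assert(pe != NULL);

    /* Already isolated: recovery is in flight or the PE is gone for good. */
    if (pe->state & EEH_PE_ISOLATED)
        return false;

    pe->state |= EEH_PE_ISOLATED;   /* assumption: reporting isolates the PE */
    pe->reports++;
    return true;
}
```

Calling the function twice on the same PE issues one report; the second call sees the flag and returns early, which is the behaviour the commit message asks for.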

powerpc/eeh: Allow to disable EEH · 7f52a526

Authored by Gavin Shan on Apr 24, 2014

The patch introduces the bootarg "eeh=off" to disable EEH functionality.
It also creates /sys/kernel/debug/powerpc/eeh_enable to disable
or enable EEH functionality. By default, the functionality is
enabled.

For the PowerNV platform, if EEH functionality is disabled we fall
back to the conventional mechanism of clearing the frozen PE during
PCI config access. Otherwise, we rely on EEH for error recovery.

The patch also fixes ioda_eeh_event(), which did not cover the
case of disabled EEH functionality. Events driven by interrupt
should be cleared to avoid endless reporting.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
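
A minimal user-space sketch of how an "eeh=off" style option might be interpreted. The variable and function names are hypothetical and this is not the kernel's handler; it only shows the "enabled by default, off on request" behaviour the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical global switch; enabled by default, as the commit message says. */
bool eeh_enabled_sketch = true;

/*
 * Hypothetical handler for the value part of "eeh=<value>".
 * Only "off" is recognised, since that is the only value the commit
 * message mentions. Returns 0 on success, -1 for an unknown value.
 */
int eeh_parse_bootarg_sketch(const char *value)
{
    assert(value != NULL);

    if (strcmp(value, "off") == 0) {
        eeh_enabled_sketch = false;
        return 0;
    }
    return -1;   /* unknown value: leave the default untouched */
}
```

Unknown values deliberately leave the switch alone, so a mistyped option does not silently disable error recovery.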

powerpc/eeh: Avoid I/O access during PE reset · 78954700

Authored by Gavin Shan on Apr 24, 2014

We have suffered from recursive frozen PEs a lot, caused
by IO accesses during the PE reset. Ben came up with the good
idea of keeping the PE frozen until recovery (BAR restore) is done.
With that, IO accesses during PE reset are dropped by hardware
and no longer incur recursive frozen PEs.

The patch implements the idea. We don't clear the frozen state
until PE reset is completely done. During that period, the EEH
core expects an unfrozen state from the backend in order to keep
going, so we reuse the EEH_PE_RESET flag, which has been set during
PE reset, to return the normal state from the backend. The side
effect is that we clear the frozen state twice (once by the PE
reset and once explicitly), but that's harmless.

We have some limitations on pHyp. pHyp doesn't allow enabling
IO or DMA for an unfrozen PE, so we don't enable them on an unfrozen
PE in eeh_pci_enable(). We have to enable IO before grabbing logs on
pHyp; otherwise, 0xFF's are always returned from PCI config space.
Also, eeh_pci_enable() returned the wrong value for the
EEH_OPT_THAW_DMA case. The patch fixes that too.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
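
The state-reporting trick in the second paragraph can be sketched as follows. Everything here is a hypothetical user-space model (flag value, enum, struct, and function names are assumptions), meant only to show how a reset-in-progress flag can mask the hardware's frozen state from the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value; the real EEH_PE_RESET is defined by the kernel. */
#define EEH_PE_RESET (1u << 2)

enum pe_state_sketch { PE_STATE_NORMAL, PE_STATE_FROZEN };

struct reset_pe_sketch {
    unsigned int flags;   /* software flags such as EEH_PE_RESET */
    bool hw_frozen;       /* what the hardware currently reports */
};

/*
 * Backend state query. While a reset is in progress the PE stays frozen
 * in hardware (so stray IO is dropped), but the caller is told "normal"
 * so that recovery can keep going.
 */
enum pe_state_sketch pe_get_state_sketch(const struct reset_pe_sketch *pe)
{
    assert(pe != NULL);

    if (pe->flags & EEH_PE_RESET)
        return PE_STATE_NORMAL;

    return pe->hw_frozen ? PE_STATE_FROZEN : PE_STATE_NORMAL;
}
```

Once the reset flag is dropped, the query goes back to reflecting the hardware state directly.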

powerpc/powernv: Use EEH PCI config accessors · 1d9a5446

Authored by Gavin Shan on Apr 24, 2014

The EEH PowerNV backends need to use their own PCI config
accessors, as the normal ones could be blocked during PE reset.
The patch also removes the unnecessary parameter "hose" from the
function ioda_eeh_bridge_reset().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Move PNV_EEH_STATE_ENABLED around · f5bc6b70

Authored by Gavin Shan on Apr 24, 2014

The flag PNV_EEH_STATE_ENABLED is put into pnv_phb::eeh_state,
which is protected by CONFIG_EEH. We don't need that. Instead, we
can have pnv_phb::flags and maintain all flags there, which is
the purpose of the patch. The patch also renames PNV_EEH_STATE_ENABLED
to PNV_PHB_FLAG_EEH.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Remove PNV_EEH_STATE_REMOVED · 467f79a9

Authored by Gavin Shan on Apr 24, 2014

The PHB state PNV_EEH_STATE_REMOVED maintained in pnv_phb isn't
so useful any more and it duplicates EEH_PE_ISOLATED. The
patch replaces PNV_EEH_STATE_REMOVED with EEH_PE_ISOLATED.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

28 Feb, 2014 1 commit

powerpc/powernv: Dump PHB diag-data immediately · 94716604

Authored by Gavin Shan on Feb 25, 2014

The PHB diag-data is important for locating the root cause of
EEH errors such as frozen PE or fenced PHB. However, the EEH core
enables the IO path by clearing part of the HW registers before
collecting this data, causing it to be corrupted.

This patch fixes this by dumping the PHB diag-data immediately when
frozen/fenced state on PE or PHB is detected for the first time in
eeh_ops::get_state() or next_error() backend.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

17 Feb, 2014 2 commits

powerpc/eeh: Disable EEH on reboot · 66f9af83

Authored by Gavin Shan on Feb 12, 2014

We may detect EEH errors during reboot, particularly in the kexec
path, but it's impossible for device drivers and the EEH core to
handle or recover from them properly.

The patch registers a reboot notifier for EEH and disables the EEH
subsystem during reboot. That means the EEH errors are going to be
cleared by hardware reset or by the second kernel during the early
stage of PCI probe.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Rework EEH reset · 5b2e198e

Authored by Gavin Shan on Feb 12, 2014

When doing reset in order to recover the affected PE, we issue
hot reset on PE primary bus if it's not root bus. Otherwise, we
issue hot or fundamental reset on root port or PHB accordingly.
For the latter case, we didn't cover the situation where the PE only
includes the root port, which potentially causes a kernel crash upon
an EEH error on that PE.

The patch reworks the logic of EEH reset to improve the code
readability and also avoid the kernel crash.

Cc: stable@vger.kernel.org
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

15 Jan, 2014 3 commits

powerpc/eeh: Escalate error on non-existing PE · cb5b242c

Authored by Gavin Shan on Jan 15, 2014

Sometimes, especially in the scenario of loading another kernel with
kdump, we get an EEH error on a non-existing PE. That means the PEEV /
PEST in the corresponding PHB would be messy and we can't handle that
case. The patch escalates the error to fenced PHB so that the PHB can
be reset in order to recover from the errors on non-existing PEs.

Reported-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Handle multiple EEH errors · 7e4e7867

Authored by Gavin Shan on Jan 15, 2014

For one PCI-error-relevant OPAL event, we may have multiple
EEH errors, for example multiple frozen PEs detected on
different PHBs. Unfortunately, we didn't cover that case. The patch
enumerates the return values from eeh_ops::next_error() and changes
eeh_handle_special_event() and eeh_ops::next_error() to handle all
existing EEH errors.

As Ben pointed out, we don't need list_for_each_entry_safe() since we
are not deleting any PHB from the hose_list, and the EEH serialized
lock should be held while purging EEH events. The patch covers those
suggestions as well.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
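
The "handle all existing errors" idea amounts to calling the next-error backend in a loop until it reports nothing left. The sketch below models that in user-space C; the error source, the codes, and the function names are hypothetical and are not the kernel's eeh_handle_special_event().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; 0 means "no further errors". */
enum { ERR_NONE = 0, ERR_FROZEN_PE = 1, ERR_FENCED_PHB = 2 };

/* Hypothetical error source: a simple queue of pending error codes. */
struct error_source_sketch {
    const int *pending;   /* pending error codes */
    size_t count;         /* number of entries in pending */
    size_t next;          /* index of the next entry to hand out */
};

/* Return the next pending error, or ERR_NONE once the queue is drained. */
int next_error_sketch(struct error_source_sketch *src)
{
    assert(src != NULL);

    if (src->next >= src->count)
        return ERR_NONE;
    return src->pending[src->next++];
}

/* Handle every pending error for one event; returns how many were handled. */
size_t handle_special_event_sketch(struct error_source_sketch *src)
{
    size_t handled = 0;

    /* Keep asking until the backend says there is nothing left. */
    while (next_error_sketch(src) != ERR_NONE)
        handled++;   /* a real handler would recover the PE or PHB here */

    return handled;
}
```

Handling only the first returned error would leave the remaining frozen PEs unrecovered, which is the gap the commit message describes.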

powerpc: Delete non-required instances of include <linux/init.h> · c141611f

Authored by Paul Gortmaker on Jan 09, 2014

None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

The one instance where we add an include for init.h covers off
a case where that file was implicitly getting it from another
header which itself didn't need it.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

30 Dec, 2013 2 commits

powernv/eeh: Add buffer for P7IOC hub error data · ca1de5de

Authored by Brian W Hart on Dec 20, 2013

Prevent ioda_eeh_hub_diag() from clobbering itself when called by supplying
a per-PHB buffer for P7IOC hub diagnostic data.  Take care to inform OPAL of
the correct size for the buffer.

[Small style change to the use of sizeof -- BenH]

Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powernv/eeh: Fix possible buffer overrun in ioda_eeh_phb_diag() · 20acebdf

Authored by Brian W Hart on Dec 19, 2013

PHB diagnostic buffer may be smaller than PAGE_SIZE, especially when
PAGE_SIZE > 4KB.

Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
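
The overrun risk comes from telling the firmware that the buffer is one page long when the buffer is actually smaller. A hedged user-space sketch of the safer pattern follows; the sizes, the struct, and the stand-in firmware call are all hypothetical and do not reproduce ioda_eeh_phb_diag().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes: a 64 KiB page and a smaller diagnostic buffer. */
#define PAGE_SIZE_SKETCH     65536u
#define DIAG_BUF_SIZE_SKETCH  8192u

struct phb_sketch {
    unsigned char diag[DIAG_BUF_SIZE_SKETCH];   /* per-PHB diag-data buffer */
};

/*
 * Stand-in for a firmware call that fills 'len' bytes of 'buf'.
 * Returns the number of bytes written.
 */
size_t fw_get_diag_sketch(unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    memset(buf, 0xA5, len);
    return len;
}

/* Fetch diag-data, passing the real buffer size to the firmware. */
size_t phb_fetch_diag_sketch(struct phb_sketch *phb)
{
    assert(phb != NULL);

    /* sizeof(phb->diag), not PAGE_SIZE_SKETCH: the latter would overrun. */
    return fw_get_diag_sketch(phb->diag, sizeof(phb->diag));
}
```

Deriving the length from the buffer itself keeps the call correct regardless of the configured page size.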

05 Dec, 2013 2 commits

powerpc/eeh: Output PHB diag-data · 2c77e957

Authored by Gavin Shan on Nov 22, 2013

When hitting a frozen PE or fenced PHB, it's always useful to
have the PHB diag-data dumped for further analysis and diagnosis.
However, we never dump it in those cases. The patch dumps
PHB diag-data at the backend of eeh_ops::get_log() for the PowerNV
platform.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Move PHB-diag dump functions around · 93aef2a7

Authored by Gavin Shan on Nov 22, 2013

Prior to the completion of PCI enumeration, we actively detect
EEH errors on PCI config cycles and dump PHB diag-data if necessary.
The EEH backend also dumps PHB diag-data in case of frozen PE or
fenced PHB. However, we are using different functions to dump the
PHB diag-data for those 2 cases.

The patch merges the functions for dumping PHB diag-data into one so
that we can avoid duplicate code. Also, we never dump PHB3 diag-data
during PCI config cycles with frozen PE. The patch fixes it as well.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

11 Oct, 2013 4 commits

powerpc/eeh: Output PHB3 diag-data · 8c6852e0

Authored by Gavin Shan on Sep 06, 2013

The patch adds function ioda_eeh_phb3_phb_diag() to dump PHB3
PHB diag-data. That's called while detecting informative errors
or frozen PE on the specific PHB.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Output error number · 98cea5fe

Authored by Gavin Shan on Sep 06, 2013

The patch prints the error number when failing to retrieve the error
log from firmware. It's helpful for debugging.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Support inbound error injection · ff6bdcd9

Authored by Gavin Shan on Sep 06, 2013

For now, we only support outbound error injection. Actually, the
hardware supports injecting inbound errors as well. The patch enables
injecting inbound errors.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Enable EEH for PHB3 · 20bb842b

Authored by Gavin Shan on Sep 06, 2013

EEH isn't enabled for PHB3; the patch enables it.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

14 Aug, 2013 1 commit

powerpc/eeh: powerpc/eeh: Fix undefined variable · 20212703

Authored by Mike Qiu on Aug 12, 2013

changes for V4:
	- changes the type of frozen_pe_no from %d to %llu
	  in pr_devel()

'pe_no' hasn't been defined; it is a typo and
should be 'frozen_pe_no'.

Also, '__func__' was missing in IODA_EEH_DBG().

For safety reasons, use pr_devel() directly instead
of IODA_EEH_DBG().

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
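
The %d to %llu change matters because a 64-bit value must be printed with a matching conversion. A small user-space sketch, using snprintf() in place of pr_devel() and a hypothetical function name, and assuming the PE number is a 64-bit unsigned value:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 64-bit PE number into buf. %llu matches unsigned long long;
 * passing such a value to %d would be a format/argument mismatch.
 * __func__ expands to the enclosing function's name.
 */
int format_frozen_pe_sketch(char *buf, size_t len,
                            unsigned long long frozen_pe_no)
{
    assert(buf != NULL && len > 0);

    return snprintf(buf, len, "%s: frozen PE#%llu", __func__, frozen_pe_no);
}
```

Values above the 32-bit range, such as 5000000000, come out intact with this conversion.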

01 Jul, 2013 2 commits

powerpc/eeh: Refactor the output message · 56ca4fde

Authored by Gavin Shan on Jun 27, 2013

We don't need the whole backtrace, only a one-line message, in
the error reporting interrupt handler. For errors triggered by
accessing PCI config space or MMIO, we replace "WARN(1, ...)" with
pr_err() and dump_stack(). The patch also adds more output messages
to indicate what the EEH core is doing. Besides, some printk() calls
are replaced with pr_warning().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Replace variables with flags · 0b9e267d

Authored by Gavin Shan on Jun 27, 2013

We have 2 fields in "struct pnv_phb" to trace the states. The patch
replaces the fields with one and introduces flags for that. The patch
doesn't impact the logic.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

21 Jun, 2013 2 commits

powerpc/eeh: Debugfs for error injection · 8998897b

Authored by Gavin Shan on Jun 20, 2013

The patch creates debugfs entries (powerpc/PCIxxxx/err_injct) for
injecting EEH errors for testing purposes.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Register OPAL notifier for PCI error · 7cb9d93d

Authored by Gavin Shan on Jun 20, 2013

The patch registers an OPAL event notifier and processes the PCI errors
from firmware. If we have pending PCI errors, a special EEH event
(without binding PE) will be sent to the EEH core for processing.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

20 Jun, 2013 7 commits

powerpc/eeh: I/O chip next error · 70f942db

Authored by Gavin Shan on Jun 20, 2013

The patch implements the backend for the EEH core to retrieve the next
EEH error to handle. For informational errors, we won't bother
the EEH core. Otherwise, the EEH core should take appropriate actions
depending on the return value:

	0 - No further errors detected
	1 - Frozen PE
	2 - Fenced PHB
	3 - Dead PHB
	4 - Dead IOC

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
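
The return values listed in this commit map naturally onto an enumeration. The numeric values and descriptions below come from the commit message; the identifier names and helper functions are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the commit message; identifier names are hypothetical. */
enum next_error_sketch {
    NEXT_ERR_NONE       = 0,
    NEXT_ERR_FROZEN_PE  = 1,
    NEXT_ERR_FENCED_PHB = 2,
    NEXT_ERR_DEAD_PHB   = 3,
    NEXT_ERR_DEAD_IOC   = 4
};

/* Map a return value to its description, or NULL if it is not in the list. */
const char *next_error_name_sketch(int code)
{
    switch (code) {
    case NEXT_ERR_NONE:       return "No further errors detected";
    case NEXT_ERR_FROZEN_PE:  return "Frozen PE";
    case NEXT_ERR_FENCED_PHB: return "Fenced PHB";
    case NEXT_ERR_DEAD_PHB:   return "Dead PHB";
    case NEXT_ERR_DEAD_IOC:   return "Dead IOC";
    default:                  return NULL;
    }
}

/* Non-zero when the caller has something to act on. */
int next_error_needs_action_sketch(int code)
{
    assert(code >= NEXT_ERR_NONE && code <= NEXT_ERR_DEAD_IOC);

    return code != NEXT_ERR_NONE;
}
```

Naming the values keeps the caller's switch statements readable compared with bare integers.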

powerpc/eeh: I/O chip PE log and bridge setup · bf90dfea

Authored by Gavin Shan on Jun 20, 2013

The patch adds backends to retrieve the error log and configure p2p
bridges for the indicated PE.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: I/O chip PE reset · 9d5cab00

Authored by Gavin Shan on Jun 20, 2013

The patch adds the I/O chip backend to do PE reset. For now, we
focus on PCI-bus-dependent PEs. If the PHB PE has been put into error
state, the PHB will take a complete reset. Besides, the root bridge
will take a fundamental or hot reset accordingly if the indicated
PE is located at the top of the PCI hierarchy tree. Otherwise, the
upstream p2p bridge will take a hot reset.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
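
The three cases in this message form a simple decision chain. A hedged user-space sketch, with hypothetical type and function names, that only models which component gets reset (not how the reset is performed):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum reset_target_sketch {
    RESET_PHB_COMPLETE,         /* complete PHB reset */
    RESET_ROOT_BRIDGE,          /* fundamental or hot reset on the root bridge */
    RESET_UPSTREAM_BRIDGE_HOT   /* hot reset on the upstream p2p bridge */
};

/* Hypothetical description of where a PE sits. */
struct pe_location_sketch {
    bool is_phb_pe;        /* the PE is the PHB PE */
    bool at_top_of_tree;   /* the PE sits at the top of the PCI hierarchy */
};

/* Pick the reset target following the order given in the commit message. */
enum reset_target_sketch
choose_reset_target_sketch(const struct pe_location_sketch *pe)
{
    assert(pe != NULL);

    if (pe->is_phb_pe)
        return RESET_PHB_COMPLETE;
    if (pe->at_top_of_tree)
        return RESET_ROOT_BRIDGE;
    return RESET_UPSTREAM_BRIDGE_HOT;
}
```

The PHB case is checked first, so a PHB PE always gets the complete reset even if it is also flagged as sitting at the top of the tree.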

powerpc/eeh: I/O chip EEH state retrieval · 8c41a7f3

Authored by Gavin Shan on Jun 20, 2013

The patch adds the I/O chip backend to retrieve the state of the
indicated PE. While the PE state is temporarily unavailable,
the upper layer (powernv platform) should return the default delay
(1 second).

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: I/O chip EEH enable option · eb005983

Authored by Gavin Shan on Jun 20, 2013

The patch adds the backend to enable or disable EEH functionality
for the specified PE. The backend is also used to enable the MMIO or
DMA path for the problematic PE. Note that all PEs on the
PowerNV platform support EEH functionality by default, and
disabling EEH for a specific PE is not allowed.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: I/O chip post initialization · 73370c66

Authored by Gavin Shan on Jun 20, 2013

The post initialization (struct eeh_ops::post_init) is called after
the EEH probe is done. On the PowerNV platform, the EEH core post
initialization is designed to call the platform backend and then the
I/O chip backend.

The patch adds the backend for the I/O chip to notify the platform
that the specific PHB is ready to supply EEH service.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: EEH backend for P7IOC · 8747f363

Authored by Gavin Shan on Jun 20, 2013

For EEH on the PowerNV platform, the overall architecture is different
from that on the pSeries platform. In order to support multiple I/O chips
in future, we split EEH into 3 layers for the PowerNV platform: EEH core,
platform layer, and I/O layer. This gives the EEH implementation on the
PowerNV platform much more flexibility in future.

The patch adds the EEH backend for P7IOC.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>