提交 · 0d5ee5205e62908172bf5e1a5fd171ba262fdb75 · openanolis / cloud-kernel

30 9月, 2014 2 次提交

powerpc/eeh: Freeze PE before PE reset · 0d5ee520

由 Gavin Shan 提交于 9月 30, 2014

The patch adds one more option (EEH_OPT_FREEZE_PE) to set_option()
method to proactively freeze PE, which will be issued before resetting
pass-throughed PE to drop MMIO access during reset because it's
always contributing to recursive EEH error.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

0d5ee520

powerpc/eeh: Drop unused argument in eeh_check_failure() · 3e938052

由 Gavin Shan 提交于 9月 30, 2014

eeh_check_failure() is used to check frozen state of the PE which
owns the indicated I/O address. The argument "val" of the function
isn't used. The patch drops it and return the frozen state of the
PE as expected.

Cc: Vishal Mansur <vmansur@linux.vnet.ibm.com>
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

3e938052

25 9月, 2014 1 次提交

powerpc/eeh: Fix kernel crash when passing through VF · 2a58222f

由 Wei Yang 提交于 9月 17, 2014

When doing vfio passthrough a VF, the kernel will crash with following
message:

[  442.656459] Unable to handle kernel paging request for data at address 0x00000060
[  442.656593] Faulting instruction address: 0xc000000000038b88
[  442.656706] Oops: Kernel access of bad area, sig: 11 [#1]
[  442.656798] SMP NR_CPUS=1024 NUMA PowerNV
[  442.656890] Modules linked in: vfio_pci mlx4_core nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw tg3 nfsd be2net nfs_acl ses lockd ptp enclosure pps_core kvm_hv kvm_pr shpchp binfmt_misc kvm sunrpc uinput lpfc scsi_transport_fc ipr scsi_tgt [last unloaded: mlx4_core]
[  442.658152] CPU: 40 PID: 14948 Comm: qemu-system-ppc Not tainted 3.10.42yw-pkvm+ #37
[  442.658219] task: c000000f7e2a9a00 ti: c000000f6dc3c000 task.ti: c000000f6dc3c000
[  442.658287] NIP: c000000000038b88 LR: c0000000004435a8 CTR: c000000000455bc0
[  442.658352] REGS: c000000f6dc3f580 TRAP: 0300   Not tainted  (3.10.42yw-pkvm+)
[  442.658419] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 28004882  XER: 20000000
[  442.658577] CFAR: c00000000000908c DAR: 0000000000000060 DSISR: 40000000 SOFTE: 1
GPR00: c0000000004435a8 c000000f6dc3f800 c0000000012b1c10 c00000000da24000
GPR04: 0000000000000003 0000000000001004 00000000000015b3 000000000000ffff
GPR08: c00000000127f5d8 0000000000000000 000000000000ffff 0000000000000000
GPR12: c000000000068078 c00000000fdd6800 000001003c320c80 000001003c3607f0
GPR16: 0000000000000001 00000000105480c8 000000001055aaa8 000001003c31ab18
GPR20: 000001003c10fb40 000001003c360ae8 000000001063bcf0 000000001063bdb0
GPR24: 000001003c15ed70 0000000010548f40 c000001fe5514c88 c000001fe5514cb0
GPR28: c00000000da24000 0000000000000000 c00000000da24000 0000000000000003
[  442.659471] NIP [c000000000038b88] .pcibios_set_pcie_reset_state+0x28/0x130
[  442.659530] LR [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
[  442.659585] Call Trace:
[  442.659610] [c000000f6dc3f800] [00000000000719e0] 0x719e0 (unreliable)
[  442.659677] [c000000f6dc3f880] [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
[  442.659757] [c000000f6dc3f900] [c000000000455bf8] .reset_fundamental+0x38/0x80
[  442.659835] [c000000f6dc3f980] [c0000000004562a8] .pci_dev_specific_reset+0xa8/0xf0
[  442.659913] [c000000f6dc3fa00] [c0000000004448c4] .__pci_dev_reset+0x44/0x430
[  442.659980] [c000000f6dc3fab0] [c000000000444d5c] .pci_reset_function+0x7c/0xc0
[  442.660059] [c000000f6dc3fb30] [d00000001c141ab8] .vfio_pci_open+0xe8/0x2b0 [vfio_pci]
[  442.660139] [c000000f6dc3fbd0] [c000000000586c30] .vfio_group_fops_unl_ioctl+0x3a0/0x630
[  442.660219] [c000000f6dc3fc90] [c000000000255fbc] .do_vfs_ioctl+0x4ec/0x7c0
[  442.660286] [c000000f6dc3fd80] [c000000000256364] .SyS_ioctl+0xd4/0xf0
[  442.660354] [c000000f6dc3fe30] [c000000000009e54] syscall_exit+0x0/0x98
[  442.660420] Instruction dump:
[  442.660454] 4bfffce9 4bfffee4 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7e1b78
[  442.660566] 7c9f2378 60000000 60000000 e93e02c8 <e8690060> 2fa30000 41de00c4 2b9f0002
[  442.660679] ---[ end trace a64ac9546bcf0328 ]---
[  442.660724]

The reason is current VF is not EEH enabled.

This patch introduces a macro to convert eeh_dev to eeh_pe. By doing so, it
will prevent converting with NULL pointer.
Signed-off-by: NWei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
CC: Michael Ellerman <mpe@ellerman.id.au>

V3 -> V4:
   1. move the macro definition from include/linux/pci.h to
      arch/powerpc/include/asm/eeh.h

V2 -> V3:
   1. rebased on 3.17-rc4
   2. introduce a macro
   3. use this macro in several other places

V1 -> V2:
   1. code style and patch subject adjustment
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

2a58222f

05 8月, 2014 5 次提交

powerpc/eeh: Aux PE data for error log · bb593c00

由 Gavin Shan 提交于 7月 17, 2014

The patch allows PE (struct eeh_pe) instance to have auxillary data,
whose size is configurable on basis of platform. For PowerNV, the
auxillary data will be used to cache PHB diag-data for that PE
(frozen PE or fenced PHB). In turn, we can retrieve the diag-data
at any later points.

It's useful for the case of VFIO PCI devices where the error log
should be cached, and then be retrieved by the guest at later point.
Also, it can avoid PHB diag-data overwritting if another frozen PE
reported and the previous diag-data isn't fetched by guest.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

bb593c00

powerpc/eeh: Selectively enable IO for error log · dc561fb9

由 Gavin Shan 提交于 7月 17, 2014

According to the experiment I did, PCI config access is blocked
on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That
means we always get 0xFF's while dumping PCI config space of the
frozen PE on P7IOC. We don't have the problem on PHB3. So we have
to enable I/O prioir to collecting error log. Otherwise, meaningless
0xFF's are always returned.

The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is
selectively set to indicate the case for: P7IOC on PowerNV platform,
pSeries platform.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

dc561fb9

powerpc/eeh: Refactor EEH flag accessors · 05b1721d

由 Gavin Shan 提交于 7月 17, 2014

There are multiple global EEH flags. Almost each flag has its own
accessor, which doesn't make sense. The patch refactors EEH flag
accessors so that they look unified:

  eeh_add_flag():   Add EEH flag
  eeh_clear_flag(): Clear EEH flag
  eeh_has_flag():   Check if one specific flag has been set
  eeh_enabled():    Check if EEH functionality has been enabled
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

05b1721d

powerpc/eeh: EEH support for VFIO PCI device · 212d16cd

由 Gavin Shan 提交于 6月 10, 2014

The patch exports functions to be used by new VFIO ioctl command,
which will be introduced in subsequent patch, to support EEH
functinality for VFIO PCI devices.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

212d16cd

powerpc/eeh: Avoid event on passed PE · 05ec424e

由 Gavin Shan 提交于 6月 10, 2014

We must not handle EEH error on devices which are passed to somebody
else. Instead, we expect that the frozen device owner detects an EEH
error and recovers from it.

This avoids EEH error handling on passed through devices so the device
owner gets a chance to handle them.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

05ec424e

11 6月, 2014 1 次提交

powerpc/eeh: Dump PE location code · 357b2f3d

由 Gavin Shan 提交于 6月 11, 2014

As Ben suggested, it's meaningful to dump PE's location code
for site engineers when hitting EEH errors. The patch introduces
function eeh_pe_loc_get() to retireve the location code from
dev-tree so that we can output it when hitting EEH errors.

If primary PE bus is root bus, the PHB's dev-node would be tried
prior to root port's dev-node. Otherwise, the upstream bridge's
dev-node of the primary PE bus will be check for the location code
directly.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

357b2f3d

20 5月, 2014 1 次提交

Revert "powerpc/powernv: Fundamental reset on PLX ports" · 965b5608

由 Benjamin Herrenschmidt 提交于 5月 20, 2014

This reverts commit b2b5efcf.

This code was way too board specific, there are quirks as to how
the PERST line is wired on different boards, we'll have to revisit
this using/creating appropriate firmware interfaces.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

965b5608

28 4月, 2014 7 次提交

powerpc/powernv: Fundamental reset on PLX ports · b2b5efcf

由 Gavin Shan 提交于 4月 24, 2014

The patch intends to support fundamental reset on PLX downstream
ports. If the PCI device matches any one of the internal table,
which includes PLX vendor ID, bridge device ID, register offset
for fundamental reset and bit, fundamental reset will be done
accordingly. Otherwise, it will fail back to hot reset.

Additional flag (EEH_DEV_FRESET) is introduced to record the last
reset type on the PCI bridge.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b2b5efcf

powerpc/eeh: Make the delay for PE reset unified · 26833a50

由 Gavin Shan 提交于 4月 24, 2014

Basically, we have 3 types of resets to fulfil PE reset: fundamental,
hot and PHB reset. For the later 2 cases, we need PCI bus reset hold
and settlement delay as specified by PCI spec. PowerNV and pSeries
platforms are running on top of different firmware and some of the
delays have been covered by underly firmware (PowerNV).

The patch makes the delays unified to be done in backend, instead of
EEH core.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

26833a50

powerpc/eeh: No hotplug on permanently removed dev · d2b0f6f7

由 Gavin Shan 提交于 4月 24, 2014

The issue was detected in a bit complicated test case where
we have multiple hierarchical PEs shown as following figure:

                +-----------------+
                | PE#3     p2p#0  |
                |          p2p#1  |
                +-----------------+
                        |
                +-----------------+
                | PE#4     pdev#0 |
                |          pdev#1 |
                +-----------------+

PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p
bridges. We accidentally had less-known scenario: PE#4 was removed
permanently from the system because of permanent failure (e.g.
exceeding the max allowd failure times in last hour), then we detects
EEH errors on PE#3 and tried to recover it. However, eeh_dev instances
for pdev#0/1 were not detached from PE#4, which was still connected to
PE#3. All of that was because of the fact that we rely on count-based
pcibios_release_device(), which isn't reliable enough. When doing
recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which
are not valid any more. Eventually, we run into kernel crash.

The patch fixes above issue from two aspects. For unplug, we simply
skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED
&& !EEH_PE_STATE_RECOVERING) and its frozen count should be greater
than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently
removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read
its PCI config so that PCI core will omit them.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

d2b0f6f7

powerpc/eeh: Cleanup EEH subsystem variables · 8a5ad356

由 Gavin Shan 提交于 4月 24, 2014

There're 2 EEH subsystem variables: eeh_subsystem_enabled and
eeh_probe_mode. We needn't maintain 2 variables and we can just
have one variable and introduce different flags. The patch also
introduces additional flag EEH_FORCE_DISABLE, which will be used
to disable EEH subsystem via boot parameter ("eeh=off") in future.
Besides, the patch also introduces flag EEH_ENABLED, which is
changed to disable or enable EEH functionality on the fly through
debugfs entry in future.

With the patch applied, the creteria to check the enabled EEH
functionality is changed to:

!EEH_FORCE_DISABLED && EEH_ENABLED : Enabled
                       Other cases : Disabled
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8a5ad356

powerpc/eeh: Use cached capability for log dump · 2a18dfc6

由 Gavin Shan 提交于 4月 24, 2014

When calling into eeh_gather_pci_data() on pSeries platform, we
possiblly don't have pci_dev instance yet, but eeh_dev is always
ready. So we use cached capability from eeh_dev instead of pci_dev
for log dump there. In order to keep things unified, we also cache
PCI capability positions to eeh_dev for PowerNV as well.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

2a18dfc6

powerpc/eeh: Block PCI-CFG access during PE reset · d0914f50

由 Gavin Shan 提交于 4月 24, 2014

We've observed multiple PE reset failures because of PCI-CFG
access during that period. Potentially, some device drivers
can't support EEH very well and they can't put the device to
motionless state before PE reset. So those device drivers might
produce PCI-CFG accesses during PE reset. Also, we could have
PCI-CFG access from user space (e.g. "lspci"). Since access to
frozen PE should return 0xFF's, we can block PCI-CFG access
during the period of PE reset so that we won't get recrusive EEH
errors.

The patch adds flag EEH_PE_RESET, which is kept during PE reset.
The PowerNV/pSeries PCI-CFG accessors reuse the flag to block
PCI-CFG accordingly.
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

d0914f50

powerpc/eeh: Remove EEH_PE_PHB_DEAD · 9e049375

由 Gavin Shan 提交于 4月 24, 2014

The PE state (for eeh_pe instance) EEH_PE_PHB_DEAD is duplicate to
EEH_PE_ISOLATED. Originally, those PHBs (PHB PE) with EEH_PE_PHB_DEAD
would be removed from the system. However, it's safe to replace
that with EEH_PE_ISOLATED.

The patch also clear EEH_PE_RECOVERING after fenced PHB has been handled,
either failure or success. It makes the PHB PE state consistent with:

	PHB functions normally		  NONE
	PHB has been removed		  EEH_PE_ISOLATED
	PHB fenced, recovery in progress  EEH_PE_ISOLATED | RECOVERING
Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9e049375

17 2月, 2014 1 次提交

powerpc/eeh: Cleanup on eeh_subsystem_enabled · 2ec5a0ad

由 Gavin Shan 提交于 2月 12, 2014

The patch cleans up variable eeh_subsystem_enabled so that we needn't
refer the variable directly from external. Instead, we will use
function eeh_enabled() and eeh_set_enable() to operate the variable.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

2ec5a0ad

15 1月, 2014 3 次提交

powerpc/eeh: Handle multiple EEH errors · 7e4e7867

由 Gavin Shan 提交于 1月 15, 2014

For one PCI error relevant OPAL event, we possibly have multiple
EEH errors for that. For example, multiple frozen PEs detected on
different PHBs. Unfortunately, we didn't cover the case. The patch
enumarates the return value from eeh_ops::next_error() and change
eeh_handle_special_event() and eeh_ops::next_error() to handle all
existing EEH errors.

As Ben pointed out, we needn't list_for_each_entry_safe() since we
are not deleting any PHB from the hose_list and the EEH serialized
lock should be held while purging EEH events. The patch covers those
suggestions as well.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

7e4e7867

powerpc/eeh: Hotplug improvement · f26c7a03

由 Gavin Shan 提交于 1月 12, 2014

When EEH error comes to one specific PCI device before its driver
is loaded, we will apply hotplug to recover the error. During the
plug time, the PCI device will be probed and its driver is loaded.
Then we wrongly calls to the error handlers if the driver supports
EEH explicitly.

The patch intends to fix by introducing flag EEH_DEV_NO_HANDLER and
set it before we remove the PCI device. In turn, we can avoid wrongly
calls the error handlers of the PCI device after its driver loaded.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f26c7a03

powerpc/eeh: Add restore_config operation · 1d350544

由 Gavin Shan 提交于 1月 03, 2014

After reset on the specific PE or PHB, we never configure AER
correctly on PowerNV platform. We needn't care it on pSeries
platform. The patch introduces additional EEH operation eeh_ops::
restore_config() so that we have chance to configure AER correctly
for PowerNV platform.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

1d350544

24 7月, 2013 6 次提交

powerpc/eeh: Introdce flag to protect sysfs · ab55d218

由 Gavin Shan 提交于 7月 24, 2013

The patch introduces flag EEH_DEV_SYSFS to keep track that the sysfs
entries for the corresponding EEH device (then PCI device) has been
added or removed, in order to avoid race condition.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

ab55d218

powerpc/eeh: Don't use pci_dev during BAR restore · 4b83bd45

由 Gavin Shan 提交于 7月 24, 2013

While restoring BARs for one specific PCI device, the pci_dev
instance should have been released. So it's not reliable to use
the pci_dev instance on restoring BARs. However, we still need
some information (e.g. PCIe capability position, header type) from
the pci_dev instance. So we have to store those information to
EEH device in advance.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

4b83bd45

powerpc/eeh: Use partial hotplug for EEH unaware drivers · f5c57710

由 Gavin Shan 提交于 7月 24, 2013

When EEH error happens to one specific PE, some devices with drivers
supporting EEH won't except hotplug on the device. However, there
might have other deivces without driver, or with driver without EEH
support. For the case, we need do partial hotplug in order to make
sure that the PE becomes absolutely quite during reset. Otherise,
the PE reset might fail and leads to failure of error recovery.

The current code doesn't handle that 'mixed' case properly, it either
uses the error callbacks to the drivers, or tries hotplug, but doesn't
handle a PE (EEH domain) composed of a combination of the two.

The patch intends to support so-called "partial" hotplug for EEH:
Before we do reset, we stop and remove those PCI devices without
EEH sensitive driver. The corresponding EEH devices are not detached
from its PE, but with special flag. After the reset is done, those
EEH devices with the special flag will be scanned one by one.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f5c57710

powerpc/eeh: Use safe list traversal when walking EEH devices · 9feed42e

由 Gavin Shan 提交于 7月 24, 2013

Currently, we're trasversing the EEH devices list using list_for_each_entry().
That's not safe enough because the EEH devices might be removed from
its parent PE while doing iteration. The patch replaces that with
list_for_each_entry_safe().
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9feed42e

powerpc/eeh: Keep PE during hotplug · 807a827d

由 Gavin Shan 提交于 7月 24, 2013

When we do normal hotplug, the PE (shadow EEH structure) shouldn't be
kept around.

However, we need to keep it if the hotplug an artifial one caused by
EEH errors recovery.

Since we remove EEH device through the PCI hook pcibios_release_device(),
the flag "purge_pe" passed to various functions is meaningless. So the patch
removes the meaningless flag and introduce new flag "EEH_PE_KEEP"
to save the PE while doing hotplug during EEH error recovery.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

807a827d

powerpc/eeh: Export functions for hotplug · f2856491

由 Gavin Shan 提交于 7月 24, 2013

Make some functions public in order to support hotplug on either specific
PCI bus or PCI device in future.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f2856491

01 7月, 2013 1 次提交

powerpc/eeh: Avoid build warnings · eeb6361f

由 Gavin Shan 提交于 6月 27, 2013

The patch is for avoiding following build warnings:

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_init().
   This is often because .pnv_pci_ioda_fixup lacks a __init

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_addr_cache_build().
   This is often because .pnv_pci_ioda_fixup lacks a __init
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

eeb6361f

25 6月, 2013 1 次提交

powerpc/eeh: Remove eeh_mutex · ef6a2857

由 Gavin Shan 提交于 6月 25, 2013

Originally, eeh_mutex was introduced to protect the PE hierarchy
tree and the attached EEH devices because EEH core was possiblly
running with multiple threads to access the PE hierarchy tree.
However, we now have only one kthread in EEH core. So we needn't
the eeh_mutex and just remove it.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

ef6a2857

20 6月, 2013 9 次提交

powerpc/eeh: EEH core to handle special event · 8a6b1bc7

由 Gavin Shan 提交于 6月 20, 2013

On PowerNV platform, the EEH event caused by interrupt won't have
binding PE. The patch enables EEH core to handle the special event.
To avoid the current logic we have, The eeh_handle_event() is renamed
to eeh_handle_normal_event(), and the eeh_handle_special_event() is
introduced. The function eeh_handle_event() dispatches to above two
functions according to the input parameter. Besides, new backend
"next_error" added to eeh_ops and it's expected to have following
return values:

        4 - Dead IOC           3 - Dead PHB
        2 - Fenced PHB         1 - Frozen PE
        0 - No error found
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8a6b1bc7

powerpc/eeh: Export confirm_error_lock · 4907581d

由 Gavin Shan 提交于 6月 20, 2013

An EEH event is created and queued to the event queue for each
ingress EEH error. When there're mutiple EEH errors, we need serialize
the process to keep consistent PE state (flags). The spinlock
"confirm_error_lock" was introduced for the purpose. We'll inject
EEH event upon error reporting interrupts on PowerNV platform. So
we export the spinlock for that to use for consistent PE state.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

4907581d

powerpc/eeh: Trace time on first error for PE · 5a71978e

由 Gavin Shan 提交于 6月 20, 2013

We're not expecting that one specific PE got frozen for over 5
times in last hour. Otherwise, the PE will be removed from the
system upon newly coming EEH errors. The patch introduces time
stamp to trace the first error on specific PE in last hour and
function to update that accordingly. Besides, the time stamp
is recovered during PE hotplug path as we did for frozen count.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

5a71978e

powerpc/eeh: EEH post initialization operation · 21fd21f5

由 Gavin Shan 提交于 6月 20, 2013

The patch adds new EEH operation post_init. It's used to notify
the platform that EEH core has completed the EEH probe. By that,
PowerNV platform starts to use the services supplied by EEH
functionality.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

21fd21f5

powerpc/eeh: Make eeh_init() public · 51fb5f56

由 Gavin Shan 提交于 6月 20, 2013

For EEH on PowerNV platform, we will do EEH probe based on the
real PCI devices. The PCI devices are available after PCI probe.
So we have to call eeh_init() explicitly on PowerNV platform
after PCI probe. The patch also does EEH probe for PowerNV platform
in eeh_init().
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

51fb5f56

powerpc/eeh: Trace PCI bus from PE · 8cdb2833

由 Gavin Shan 提交于 6月 20, 2013

There're several types of PEs can be supported for now: PHB, Bus
and Device dependent PE. For PCI bus dependent PE, tracing the
corresponding PCI bus from PE (struct eeh_pe) would make the code
more efficient. The patch also enables the retrieval of PCI bus based
on the PCI bus dependent PE.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8cdb2833

powerpc/eeh: Make eeh_pe_get() public · 01566808

由 Gavin Shan 提交于 6月 20, 2013

While processing EEH event interrupt from P7IOC, we need function
to retrieve the PE according to the indicated EEH device. The patch
makes function eeh_pe_get() public so that other source files can call
it for that purpose. Also, the patch fixes referring to wrong BDF
(Bus/Device/Function) address while searching PE in function
__eeh_pe_get().
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

01566808

powerpc/eeh: Make eeh_phb_pe_get() public · 9ff67433

由 Gavin Shan 提交于 6月 20, 2013

One of the possible cases indicated by P7IOC interrupt is fenced
PHB. For that case, we need fetch the PE corresponding to the PHB
and disable the PHB and all subordinate PCI buses/devices, recover
from the fenced state and eventually enable the whole PHB. We need
one function to fetch the PHB PE outside eeh_pe.c and the patch is
going to make eeh_phb_pe_get() public for that purpose.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9ff67433

powerpc/eeh: Enhance converting EEH dev · 2d5c1216

由 Gavin Shan 提交于 6月 05, 2013

Under some special circumstances, the EEH device doesn't have the
associated device tree node or PCI device. The patch enhances those
functions converting EEH device to device tree node or PCI device
accordingly to avoid unnecessary system crash.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

2d5c1216

10 1月, 2013 1 次提交

powerpc/eeh: Fix crash when adding a device in a slot with DDW · 6a040ce7

由 Thadeu Lima de Souza Cascardo 提交于 12月 28, 2012

The DDW code uses a eeh_dev struct from the pci_dev. However, this is
not set until eeh_add_device_late is called.

Since pci_bus_add_devices is called before eeh_add_device_late, the PCI
devices are added to the bus, making drivers' probe hooks to be called.
These will call set_dma_mask, which will call the DDW code, which will
require the eeh_dev struct from pci_dev. This would result in a crash,
due to a NULL dereference.

Calling eeh_add_device_late after pci_bus_add_devices would make the
system BUG, because device files shouldn't be added to devices there
were not added to the system. So, a new function is needed to add such
files only after pci_bus_add_devices have been called.

Cc: stable@vger.kernel.org
Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: NGavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

6a040ce7

04 1月, 2013 1 次提交

POWERPC: drivers: remove __dev* attributes. · cad5cef6

由 Greg Kroah-Hartman 提交于 12月 21, 2012

CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, __devinitdata,
__devinitconst, and __devexit from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

cad5cef6

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功