Commits · db819d60f6720080150a365080ff656cf239f88f · openanolis / cloud-kernel

12 May, 2016 1 commit

EDAC, mce_amd: Detect SMCA using X86_FEATURE_SMCA · a348ed83

Committed by Yazen Ghannam on May 11, 2016

Use X86_FEATURE_SMCA when detecting if SMCA is available instead of
directly using CPUID 0x80000007_EBX.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1462971509-3856-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
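The raw-CPUID style of check that this commit moves away from can be sketched in user-space C. This is a minimal illustration, not the kernel change itself: it assumes that SMCA is reported in bit 3 of EBX for CPUID leaf 0x80000007, and `smca_from_cpuid_ebx` is a hypothetical helper name. In the kernel, the feature-flag form would be a check along the lines of `boot_cpu_has(X86_FEATURE_SMCA)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: SMCA is bit 3 of EBX for CPUID leaf 0x80000007. */
#define CPUID_8000_0007_EBX_SMCA (UINT32_C(1) << 3)

/* Hypothetical helper: decide SMCA support from a raw EBX value. */
static bool smca_from_cpuid_ebx(uint32_t ebx)
{
	return (ebx & CPUID_8000_0007_EBX_SMCA) != 0;
}
```

Centralizing the bit test behind a named feature flag means callers no longer need to know which CPUID leaf and bit carry the information.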

08 Mar, 2016 1 commit

x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors · be0aec23

Committed by Aravind Gopalakrishnan on Mar 07, 2016

For Scalable MCA enabled processors, errors are listed per IP block. And
since it is not required for an IP to map to a particular bank, we need
to use HWID and McaType values from the MCx_IPID register to figure out
which IP a given bank represents.

We also have a new bit (TCC) in the MCx_STATUS register to indicate Task
context is corrupt.

Add logic here to decode errors from all known IP blocks for Fam17h
Model 00-0fh and to print TCC errors.

[ Minor fixups. ]
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
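The field extraction described in this commit message can be sketched as follows. The bit positions are assumptions made for illustration (HWID in MCx_IPID[43:32], McaType in MCx_IPID[63:48], TCC as bit 55 of MCx_STATUS), and the helper names are hypothetical; check them against the processor documentation before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: HWID in MCx_IPID[43:32]. */
static uint16_t ipid_hwid(uint64_t ipid)
{
	return (uint16_t)((ipid >> 32) & 0xFFF);
}

/* Assumed layout: McaType in MCx_IPID[63:48]. */
static uint16_t ipid_mca_type(uint64_t ipid)
{
	return (uint16_t)((ipid >> 48) & 0xFFFF);
}

/* Assumed position of TCC (task context corrupt) in MCx_STATUS: bit 55. */
static bool status_tcc(uint64_t status)
{
	return ((status >> 55) & 1) != 0;
}
```

A decoder would then look up the (HWID, McaType) pair in a table of known IP blocks instead of inferring the block from the bank number.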

13 Aug, 2015 2 commits

x86/mce: Kill drain_mcelog_buffer() · eef4dfa0

Committed by Borislav Petkov on Aug 12, 2015

This used to flush out MCEs logged during early boot and which
were in the MCA registers from a previous system run. No need
for that now, since we've moved to a genpool.
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/mce: Remove the MCE ring for Action Optional errors · fd4cf79f

Committed by Chen, Gong on Aug 12, 2015

Use unified genpool to save Action Optional error events and put
Action Optional error handling in the same notification chain as
MCE error decoding.
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
[ Fold in subsequent patch from Boris for early boot logging. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Correct a lot. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

14 Jul, 2015 1 commit

EDAC, mce_amd: Don't emit 'CE' for Deferred error · 99e1dfb7

Committed by Aravind Gopalakrishnan on Jul 13, 2015

Currently, when decoding an MCE, we display 'CE' for a Deferred error, like
this:

[Hardware Error]: CPU:0 (15:2:0) MC4_STATUS[Over|CE|MiscV|-|AddrV|Deferred|-|UECC]: 0xdc04b00095080813

When the 'UC' bit in the MCx_STATUS register is clear, the error status
is either a Corrected error or Deferred error as determined by the
'Deferred' bit. So do not print 'CE' on a deferred error.

Refer to AMD Error Scope Hierarchy table in a newer BKDG (example:
49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features").
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1436788382-6463-1-git-send-email-aravind.gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
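The classification rule from this commit message can be sketched as a small function. This is an illustration rather than the driver's code: `status_class` is a hypothetical name, and the bit positions (UC as bit 61, Deferred as bit 44 of MCx_STATUS) are assumptions that agree with the sample status values quoted in this log.

```c
#include <stdint.h>

/* Assumed MCx_STATUS bit positions. */
#define STATUS_UC       (UINT64_C(1) << 61)
#define STATUS_DEFERRED (UINT64_C(1) << 44)

/* Returns the label for the second field of the MCx_STATUS summary. */
static const char *status_class(uint64_t status)
{
	if (status & STATUS_UC)
		return "UE";	/* uncorrected error */
	if (status & STATUS_DEFERRED)
		return "-";	/* deferred: neither UE nor CE */
	return "CE";		/* corrected error */
}
```

With this rule, the status value 0xdc04b00095080813 from the message above has UC clear and Deferred set, so the field is printed as `-` rather than `CE`.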

25 Nov, 2014 1 commit

EDAC, MCE, AMD: Correct formatting of decoded text · 50872ccd

Committed by Borislav Petkov on Nov 22, 2014

Write out MCx_ADDR into the more humanly readable "MCx Error Address"
and remove double colon in the output.

Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>

05 Nov, 2014 1 commit

EDAC, MCE, AMD: Add decoding table for MC6 xec · bc4febe9

Committed by Aravind Gopalakrishnan on Nov 04, 2014

Extended error code meanings are tabulated for other banks. Extend that
tradition for MC6 too.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1415122868-10969-1-git-send-email-aravind.gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

14 Jul, 2014 1 commit

EDAC, MCE, AMD: Add MCE decoding for F15h M60h · eba4bfb3

Committed by Aravind Gopalakrishnan on Jul 14, 2014

Add decoding logic for new Fam15h model 60h.

Tested using mce_amd_inj module and works fine.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1405098795-4678-1-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Boris: simplify a bit. ]
Signed-off-by: Borislav Petkov <bp@suse.de>

09 May, 2014 1 commit

EDAC, MCE, AMD: Remove leftover unused mask · c5c0903b

Committed by Borislav Petkov on May 08, 2014

295d8cda ("EDAC, MCE, AMD: Drop local coreid reporting") removed the
code snippet which used that mask but forgot to drop the mask itself. Do
that now.
Signed-off-by: Borislav Petkov <bp@suse.de>

24 Feb, 2014 1 commit

MCE, AMD: Fix decoding module loading on unsupported hw · fd0f5fff

Committed by Borislav Petkov on Feb 17, 2014

We want to still be able to issue some error information on systems for
which there is no decoding support (think older distro kernels here,
for example). Therefore, we allow module registration but skip the
per-family bank-specific decoders and issue the general information
only, i.e.:

[   46.822828] [Hardware Error]: Error Status: Uncorrected, software containable error.
[   46.822846] [Hardware Error]: CPU:0 (15:30:0) MC0_STATUS[-|UE|-|-|-|-|-]: 0xa000000000010f0f
[   46.822858] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out)

with the hope that it still contains helpful useful bits.
Suggested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1392659391-2411-1-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

08 Jun, 2013 1 commit

EDAC, MCE, AMD: Add an MCE signature for new Fam15h models · aad19e51

Committed by Aravind Gopalakrishnan on Jun 05, 2013

Add a new error signature for Family 15h, models 30h-3fh. Patch has been
tested on Fam15h using mce_amd_inj facility and has been verified to
work correctly.
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
[ cleanup commit message and error string ]
Signed-off-by: Borislav Petkov <bp@suse.de>

23 Jan, 2013 3 commits

EDAC, MCE, AMD: Remove unneeded exports · 0f08669e

Committed by Borislav Petkov on Dec 23, 2012

Initially, those strings describing different parts of an MCE message
were shared with amd64_edac and were therefore exported to modules.
However, all except pp_msgs are used only in one place right now so hide
them and make them static.

No functionality change.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Borislav Petkov <bp@alien8.de>

EDAC, MCE, AMD: Add MCE decoding support for Family 16h · 980eec8b

Committed by Jacob Shin on Dec 18, 2012

Add MCE decoding logic for AMD Family 16h processors.

Boris:

- drop unneeded uu_msgs export
- exit early in cat_mc1_mce and save us an indentation level
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Borislav Petkov <bp@alien8.de>

EDAC, MCE, AMD: Make MC2 decoding per-family · 4a73d3de

Committed by Jacob Shin on Dec 18, 2012

Currently only AMD Family 15h processors have special handling for MC2
errors. Since upcoming Family 16h will also need unique handling, let's
make MC2 handling part of amd_decoder_ops.
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Borislav Petkov <bp@alien8.de>
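The idea of moving a bank decoder into a per-family ops table can be sketched with function pointers. Everything below is hypothetical: the struct is a reduced stand-in for `amd_decoder_ops` with only an MC2 hook, and the decoder bodies are made-up placeholder rules, not real error signatures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-family ops table; only an MC2 hook is sketched. */
struct decoder_ops {
	bool (*mc2_mce)(uint16_t ec, uint8_t xec);
};

/* Placeholder decoders: return true when the signature is recognized. */
static bool generic_mc2_mce(uint16_t ec, uint8_t xec)
{
	(void)ec;
	return xec == 0x0;	/* made-up rule, for illustration only */
}

static bool f15h_mc2_mce(uint16_t ec, uint8_t xec)
{
	(void)ec;
	return xec <= 0x3;	/* made-up rule, for illustration only */
}

static const struct decoder_ops generic_ops = { .mc2_mce = generic_mc2_mce };
static const struct decoder_ops f15h_ops    = { .mc2_mce = f15h_mc2_mce };

/* Pick the ops table once, based on the CPU family. */
static const struct decoder_ops *select_ops(unsigned int family)
{
	return family == 0x15 ? &f15h_ops : &generic_ops;
}
```

With this shape, supporting another family means adding one more table entry rather than adding family checks inside the MC2 decode path.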

28 Nov, 2012 4 commits

MCE, AMD: Dump error status · d5c6770d

Committed by Borislav Petkov on Sep 14, 2012

Dump error status after decoding the error which describes the error
disposition.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

MCE, AMD: Report decoded error type first · d824c771

Committed by Borislav Petkov on Sep 14, 2012

Instead of starting with the error details, report the decoded, readable
error type first.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

MCE, AMD: Dump CPU f/m/s triple with the error · f89f8388

Committed by Borislav Petkov on Sep 13, 2012

It is very useful to have the family/model/stepping with the reported
error so dump it. This saves us asking the bug reporter about it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

MCE, AMD: Remove functional unit references · f05c41a9

Committed by Borislav Petkov on Sep 11, 2012

Having the functional unit names in each bank decode is only misleading
as this code supports multiple families and there's no guarantee the
mapping between FUs and MCE banks will stay the same.

And also, knowing the functional unit name doesn't help much since you
end up looking at the respective BKDG anyway.

So drop all FU references and use the MC bank numbers instead.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

04 Apr, 2012 1 commit

MCE, AMD: Drop too granulary family model checks · ec3e82d6

Committed by Borislav Petkov on Apr 04, 2012

MCA details seldom change inbetween the models of a family so don't
be too conservative and enable decoding on everything starting from
K8 onwards. Minor adjustments can come in later but most importantly,
we have some decoding infrastructure in place for upcoming models by
default.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

19 Mar, 2012 6 commits

MCE, AMD: Constify error tables · ebe2aea8

Committed by Borislav Petkov on Nov 29, 2011

... so that checkpatch can chill out.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>

MCE, AMD: Correct bank 5 error signatures · ae615b4b

Committed by Borislav Petkov on Nov 25, 2011

... and remove superfluous ErrorCodeExt check.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>

MCE, AMD: Rework NB MCE signatures · 68782673

Committed by Borislav Petkov on Nov 24, 2011

Correct their formulation, replace per-family functions with a single,
unified lookup table.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>

MCE, AMD: Correct VB data error description · b64a99c1

Committed by Borislav Petkov on Nov 23, 2011

Sync with latest BKDG error types.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>

MCE, AMD: Correct ucode patch buffer description · 6c1173a6

Committed by Borislav Petkov on Nov 21, 2011

This MC1 error signature is called differently now, fix it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>

MCE, AMD: Correct some MC0 error types · 344f0a06

Committed by Borislav Petkov on Nov 15, 2011

Use "System Read Data Error" as a more general name for MC0 bus errors
on F15h and update some error definitions.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>

14 Dec, 2011 1 commit

x86, mce: Add wrappers for registering on the decode chain · 3653ada5

Committed by Borislav Petkov on Dec 04, 2011

No functionality change, this is done so that in a follow-on patch all
queued-up MCEs can be decoded after registering on the chain.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

06 Oct, 2011 4 commits

EDAC, MCE, AMD: Simplify NB MCE decoder interface · b0b07a2b

Committed by Borislav Petkov on Aug 24, 2011

Drop third nbcfg argument which is old remains and not required anymore.

No functionality change.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE, AMD: Drop local coreid reporting · 295d8cda

Committed by Borislav Petkov on Aug 24, 2011

MCE decoding code is reporting the core which encountered the error
unconditionally now so drop this piece. Besides, it reported the
coreid in the local processor package which is not that valuable as a
datapoint.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE, AMD: Print valid addr when reporting an error · 086be786

Committed by Borislav Petkov on Sep 30, 2011

The MCi_STATUS bank has a AddrV bit which, when set, denotes that the
corresponding MCi_ADDR MSR contains a valid address belonging to the
MCE currently being reported. Dump it since it is definitely relevant
information.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
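The AddrV gating described here can be sketched as follows. This is an illustration under stated assumptions: AddrV is taken to be bit 58 of MCi_STATUS, `format_mc_addr` is a hypothetical helper, and the `addr:` label is a placeholder rather than the driver's actual output string.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed position of AddrV in MCi_STATUS. */
#define STATUS_ADDRV (UINT64_C(1) << 58)

/*
 * Formats the address into buf only when AddrV is set.
 * Returns the snprintf() result, or 0 when the address is skipped.
 */
static int format_mc_addr(char *buf, size_t len, uint64_t status, uint64_t addr)
{
	if (!(status & STATUS_ADDRV))
		return 0;	/* MCi_ADDR contents are not valid for this MCE */
	return snprintf(buf, len, "addr: 0x%016" PRIx64, addr);
}
```

Skipping the address when AddrV is clear avoids printing a stale value left over in MCi_ADDR from an earlier error.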

EDAC, MCE, AMD: Print CPU number when reporting the error · bff7b812

Committed by Borislav Petkov on Aug 04, 2011

Currently, correctable ECCs go through mcelog and do not print the scary
MCE banner. In that case, however, reporting the core where the CECC
happened is important information so dump it along with the decoded
string albeit at risk of having a minor redundancy.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

17 Mar, 2011 2 commits

amd64_edac: Enable driver on F15h · df71a053

Committed by Borislav Petkov on Jan 19, 2011

Add the PCI device ids required for driver registration. Remove
pvt->ctl_name and use the family descriptor directly, instead. Then,
bump driver version and fixup its format. Finally, enable DRAM ECC
decoding on F15h.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

amd64_edac: Cleanup NBSH cruft · bcd781f4

Committed by Borislav Petkov on Jan 07, 2011

Remove reporting of errors with UC bit set - this is done by the MCE
decoding code anyway and this driver deals with DRAM ECC errors only. UC
(NB uncorrectable error) doesn't necessarily mean it is a DRAM error.
Remove unused macros while at it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

07 Jan, 2011 8 commits

EDAC, MCE: Fix NB error formatting · 6d5db466

Committed by Borislav Petkov on Nov 25, 2010

Minor formatting fixup since the information which core was associated
with the MCE is not always valid.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit · 50adbbd8

Committed by Randy Dunlap on Nov 13, 2010

Building for X86_32 produces shift count warnings, so use BIT_64() to
eliminate the warnings.

drivers/edac/mce_amd.c:778: warning: left shift count >= width of type
drivers/edac/mce_amd.c:778: warning: left shift count >= width of type
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: bluesmoke-devel@lists.sourceforge.net
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
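The warning arises when a 1 of a 32-bit type is shifted by 32 or more, which is what a plain `1UL << n` amounts to on a 32-bit build. A minimal user-space sketch of a 64-bit-safe macro follows; it uses `UINT64_C` from `<stdint.h>` as a stand-in for the kernel's own definition, and `status_bit61_set` is a hypothetical helper.

```c
#include <stdint.h>

/* The constant 1 is 64 bits wide regardless of the width of 'long'. */
#define BIT_64(n) (UINT64_C(1) << (n))

/* Example use: test a high bit of a 64-bit status value. */
static int status_bit61_set(uint64_t status)
{
	return (status & BIT_64(61)) != 0;
}
```

Because the shifted operand is always a 64-bit type, shift counts up to 63 stay within the width of the type on both 32-bit and 64-bit targets.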

EDAC, MCE: Enable MCE decoding on F15h · bad11e03

Committed by Borislav Petkov on Sep 22, 2010

Now that everything is inplace, enable MCE decoding on F15h. Make
initcall routine a bit more readable.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE: Shorten error report formatting · fa7ae8cc

Committed by Borislav Petkov on Sep 22, 2010

Shorten up MCi_STATUS flags and add BD's new deferred and poison types.
Also, simplify formatting.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE: Overhaul error fields extraction macros · 62452882

Committed by Borislav Petkov on Sep 22, 2010

Make macro names shorter thus making code shorter and more clear.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE: Add F15h FP MCE decoder · b8f85c47

Committed by Borislav Petkov on Sep 22, 2010

Add decoder for FP MCEs.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE: Add F15 EX MCE decoder · 8259a7e5

Committed by Borislav Petkov on Sep 22, 2010

Integrate the single FIROB signature into an expanded table along with
the new BD MCE types.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

EDAC, MCE: Add an F15h NB MCE decoder · 05cd667d

Committed by Borislav Petkov on Sep 22, 2010

by (almost) reusing the F10h one since the signatures are the same.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
