- 12 May, 2016 1 commit
-
-
Committed by Yazen Ghannam
Use X86_FEATURE_SMCA when detecting if SMCA is available instead of directly using CPUID 0x80000007_EBX. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1462971509-3856-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
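The raw check that this commit moves away from can be sketched in user space. This is only an illustration, not the kernel code: it assumes GCC or Clang on x86 (for `<cpuid.h>`) and assumes that the ScalableMca flag is bit 3 of CPUID Fn8000_0007 EBX. The names `smca_from_ebx`, `cpu_has_smca` and `smca_selfcheck` are hypothetical. Kernel code would query the cached feature flag instead, for example with `boot_cpu_has(X86_FEATURE_SMCA)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Assumed bit position of the ScalableMca flag in CPUID Fn8000_0007 EBX. */
#define SMCA_EBX_BIT 3u

/* Pure helper: test the SMCA bit in an EBX value that was already read. */
bool smca_from_ebx(uint32_t ebx)
{
    return ((ebx >> SMCA_EBX_BIT) & 1u) != 0;
}

/* Read the CPUID leaf directly; returns false when the leaf is unavailable. */
bool cpu_has_smca(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return false;
    return smca_from_ebx(ebx);
#else
    return false; /* not an x86 build: no CPUID instruction */
#endif
}

/* Sanity check of the bit test on made-up EBX values. */
void smca_selfcheck(void)
{
    assert(smca_from_ebx(0x8u));   /* only bit 3 set */
    assert(!smca_from_ebx(0x7u));  /* bits 0-2 set, bit 3 clear */
}
```

Keeping the bit test in one helper mirrors the motivation of the commit: callers ask "is SMCA present?" in a single place instead of repeating the CPUID decoding.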
-
- 08 March, 2016 1 commit
-
-
Committed by Aravind Gopalakrishnan
For Scalable MCA enabled processors, errors are listed per IP block. And since it is not required for an IP to map to a particular bank, we need to use HWID and McaType values from the MCx_IPID register to figure out which IP a given bank represents. We also have a new bit (TCC) in the MCx_STATUS register to indicate Task context is corrupt. Add logic here to decode errors from all known IP blocks for Fam17h Model 00-0fh and to print TCC errors. [ Minor fixups. ] Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1457021458-2522-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
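The field extraction described above can be sketched as follows. The bit positions are assumptions taken from my reading of the kernel's masks, not from this commit message: HWID in MCx_IPID bits 43:32, McaType in bits 63:48, and TCC as MCx_STATUS bit 55. The struct and function names are hypothetical, and the IPID value in the self-check is an arbitrary example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Identity of the IP block behind a bank, as read from MCx_IPID. */
struct smca_id {
    uint16_t hwid;      /* hardware ID of the IP block */
    uint16_t mca_type;  /* MCA type within that IP block */
};

/* Split a raw 64-bit MCx_IPID value into its two identifying fields. */
struct smca_id smca_decode_ipid(uint64_t ipid)
{
    struct smca_id id;

    id.hwid = (uint16_t)((ipid >> 32) & 0xFFFu);      /* assumed bits 43:32 */
    id.mca_type = (uint16_t)((ipid >> 48) & 0xFFFFu); /* assumed bits 63:48 */
    return id;
}

/* True when the assumed TCC (task context corrupt) bit is set in MCx_STATUS. */
bool status_has_tcc(uint64_t status)
{
    return ((status >> 55) & 1u) != 0;
}

/* Sanity check on made-up register values. */
void smca_ipid_selfcheck(void)
{
    struct smca_id id = smca_decode_ipid(UINT64_C(0x000100B000000000));

    assert(id.hwid == 0xB0);
    assert(id.mca_type == 1);
    assert(status_has_tcc(UINT64_C(1) << 55));
    assert(!status_has_tcc(0));
}
```

A decoder would then look up the `(hwid, mca_type)` pair in a table of known IP blocks rather than relying on the bank number.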
-
- 13 August, 2015 2 commits
-
-
Committed by Borislav Petkov
This used to flush out MCEs logged during early boot and which were in the MCA registers from a previous system run. No need for that now, since we've moved to a genpool. Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1439396985-12812-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Chen, Gong
Use unified genpool to save Action Optional error events and put Action Optional error handling in the same notification chain as MCE error decoding. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> [ Fold in subsequent patch from Boris for early boot logging. ] Signed-off-by: Tony Luck <tony.luck@intel.com> [ Correct a lot. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1439396985-12812-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 14 July, 2015 1 commit
-
-
Committed by Aravind Gopalakrishnan
Currently, when decoding an MCE, we display 'CE' for a Deferred error, like this: [Hardware Error]: CPU:0 (15:2:0) MC4_STATUS[Over|CE|MiscV|-|AddrV|Deferred|-|UECC]: 0xdc04b00095080813 When the 'UC' bit in the MCx_STATUS register is clear, the error status is either a Corrected error or Deferred error as determined by the 'Deferred' bit. So do not print 'CE' on a deferred error. Refer to AMD Error Scope Hierarchy table in a newer BKDG (example: 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"). Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1436788382-6463-1-git-send-email-aravind.gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
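The rule described above (UC clear means either Corrected or Deferred, so 'CE' must not be printed for a deferred error) can be sketched as a small classifier. It assumes UC is MCx_STATUS bit 61 and Deferred is bit 44; those positions agree with the flags printed for the status value quoted in the message. `severity_label` and `severity_selfcheck` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STATUS_UC       (UINT64_C(1) << 61) /* assumed: uncorrected error */
#define STATUS_DEFERRED (UINT64_C(1) << 44) /* assumed: deferred error */

/*
 * Return the severity token for the STATUS flag list:
 * "UE" when UC is set, "-" for a deferred error (the Deferred flag is
 * reported in its own field), and "CE" only for a corrected error.
 */
const char *severity_label(uint64_t status)
{
    if (status & STATUS_UC)
        return "UE";
    if (status & STATUS_DEFERRED)
        return "-";
    return "CE";
}

/* Check against the status value quoted in the commit message. */
void severity_selfcheck(void)
{
    /* UC clear, Deferred set: must no longer be labelled "CE". */
    assert(strcmp(severity_label(UINT64_C(0xdc04b00095080813)), "-") == 0);
}
```

With this ordering the quoted status value yields "-" in the severity slot while the separate "Deferred" field still identifies the error type.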
-
- 25 November, 2014 1 commit
-
-
Committed by Borislav Petkov
Write out MCx_ADDR into the more humanly readable "MCx Error Address" and remove double colon in the output. Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de>
-
- 05 November, 2014 1 commit
-
-
Committed by Aravind Gopalakrishnan
Extended error code meanings are tabulated for other banks. Extend that tradition for MC6 too. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1415122868-10969-1-git-send-email-aravind.gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
-
- 14 July, 2014 1 commit
-
-
Committed by Aravind Gopalakrishnan
Add decoding logic for new Fam15h model 60h. Tested using mce_amd_inj module and works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1405098795-4678-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: simplify a bit. ] Signed-off-by: Borislav Petkov <bp@suse.de>
-
- 09 May, 2014 1 commit
-
-
Committed by Borislav Petkov
295d8cda ("EDAC, MCE, AMD: Drop local coreid reporting") removed the code snippet which used that mask but forgot to drop the mask itself. Do that now. Signed-off-by: Borislav Petkov <bp@suse.de>
-
- 24 February, 2014 1 commit
-
-
Committed by Borislav Petkov
We want to still be able to issue some error information on systems for which there is no decoding support (think older distro kernels here, for example). Therefore, we allow module registration but skip the per-family bank-specific decoders and issue the general information only, i.e.: [ 46.822828] [Hardware Error]: Error Status: Uncorrected, software containable error. [ 46.822846] [Hardware Error]: CPU:0 (15:30:0) MC0_STATUS[-|UE|-|-|-|-|-]: 0xa000000000010f0f [ 46.822858] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out) with the hope that it still contains helpful useful bits. Suggested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1392659391-2411-1-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
-
- 08 June, 2013 1 commit
-
-
Committed by Aravind Gopalakrishnan
Add a new error signature for Family 15h, models 30h-3fh. Patch has been tested on Fam15h using mce_amd_inj facility and has been verified to work correctly. Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> [ cleanup commit message and error string ] Signed-off-by: Borislav Petkov <bp@suse.de>
-
- 23 January, 2013 3 commits
-
-
Committed by Borislav Petkov
Initially, those strings describing different parts of an MCE message were shared with amd64_edac and were therefore exported to modules. However, all except pp_msgs are used only in one place right now so hide them and make them static. No functionality change. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Borislav Petkov <bp@alien8.de>
-
Committed by Jacob Shin
Add MCE decoding logic for AMD Family 16h processors. Boris: - drop unneeded uu_msgs export - exit early in cat_mc1_mce and save us an indentation level Signed-off-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: Borislav Petkov <bp@alien8.de>
-
Committed by Jacob Shin
Currently only AMD Family 15h processors have special handling for MC2 errors. Since upcoming Family 16h will also need unique handling, let's make MC2 handling part of amd_decoder_ops. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: Borislav Petkov <bp@alien8.de>
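Making MC2 handling part of amd_decoder_ops amounts to dispatching through a per-family table of function pointers. The sketch below shows that pattern only. The struct layout is an assumption modelled on the name in the message, the `*_stub` handlers are hypothetical placeholders rather than the real decoders, and the `mc0_mce` and `mc1_mce` slots are left unset for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-family decoder table; each hook returns true if it decoded the error. */
struct amd_decoder_ops {
    bool (*mc0_mce)(uint16_t ec, uint8_t xec);
    bool (*mc1_mce)(uint16_t ec, uint8_t xec);
    bool (*mc2_mce)(uint16_t ec, uint8_t xec);
};

/* Hypothetical placeholder handlers standing in for family-specific logic. */
static bool generic_mc2_stub(uint16_t ec, uint8_t xec) { (void)ec; (void)xec; return false; }
static bool f15h_mc2_stub(uint16_t ec, uint8_t xec)    { (void)ec; (void)xec; return true; }
static bool f16h_mc2_stub(uint16_t ec, uint8_t xec)    { (void)ec; (void)xec; return true; }

static const struct amd_decoder_ops generic_ops = { .mc2_mce = generic_mc2_stub };
static const struct amd_decoder_ops f15h_ops    = { .mc2_mce = f15h_mc2_stub };
static const struct amd_decoder_ops f16h_ops    = { .mc2_mce = f16h_mc2_stub };

/* Pick the table once, by CPU family, instead of branching in every decoder. */
const struct amd_decoder_ops *select_ops(unsigned int family)
{
    switch (family) {
    case 0x15: return &f15h_ops;
    case 0x16: return &f16h_ops;
    default:   return &generic_ops;
    }
}

/* Bank-2 decode path: no family checks here, only an indirect call. */
bool decode_mc2(const struct amd_decoder_ops *ops, uint16_t ec, uint8_t xec)
{
    if (ops == NULL || ops->mc2_mce == NULL)
        return false;
    return ops->mc2_mce(ec, xec);
}

/* Sanity check of the dispatch on the placeholder handlers. */
void decoder_ops_selfcheck(void)
{
    assert(decode_mc2(select_ops(0x15), 0, 0));
    assert(!decode_mc2(select_ops(0x10), 0, 0));
}
```

Supporting a further family then means adding one table entry, with no change to the generic bank-2 decode path.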
-
- 28 November, 2012 4 commits
-
-
Committed by Borislav Petkov
Dump error status after decoding the error which describes the error disposition. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Instead of starting with the error details, report the decoded, readable error type first. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
It is very useful to have the family/model/stepping with the reported error so dump it. This saves us asking the bug reporter about it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Having the functional unit names in each bank decode is only misleading as this code supports multiple families and there's no guarantee the mapping between FUs and MCE banks will stay the same. And also, knowing the functional unit name doesn't help much since you end up looking at the respective BKDG anyway. So drop all FU references and use the MC bank numbers instead. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
- 04 April, 2012 1 commit
-
-
Committed by Borislav Petkov
MCA details seldom change in between the models of a family so don't be too conservative and enable decoding on everything starting from K8 onwards. Minor adjustments can come in later but most importantly, we have some decoding infrastructure in place for upcoming models by default. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
- 19 March, 2012 6 commits
-
-
Committed by Borislav Petkov
... so that checkpatch can chill out. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
-
Committed by Borislav Petkov
... and remove superfluous ErrorCodeExt check. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
-
Committed by Borislav Petkov
Correct their formulation, replace per-family functions with a single, unified lookup table. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
-
Committed by Borislav Petkov
Sync with latest BKDG error types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
-
Committed by Borislav Petkov
This MC1 error signature is called differently now, fix it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
-
Committed by Borislav Petkov
Use "System Read Data Error" as a more general name for MC0 bus errors on F15h and update some error definitions. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
-
- 14 December, 2011 1 commit
-
-
Committed by Borislav Petkov
No functionality change, this is done so that in a follow-on patch all queued-up MCEs can be decoded after registering on the chain. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
- 06 October, 2011 4 commits
-
-
Committed by Borislav Petkov
Drop third nbcfg argument which is old remains and not required anymore. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
MCE decoding code is reporting the core which encountered the error unconditionally now so drop this piece. Besides, it reported the coreid in the local processor package which is not that valuable as a datapoint. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
The MCi_STATUS bank has an AddrV bit which, when set, denotes that the corresponding MCi_ADDR MSR contains a valid address belonging to the MCE currently being reported. Dump it since it is definitely relevant information. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Currently, correctable ECCs go through mcelog and do not print the scary MCE banner. In that case, however, reporting the core where the CECC happened is important information so dump it along with the decoded string albeit at risk of having a minor redundancy. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
- 17 March, 2011 2 commits
-
-
Committed by Borislav Petkov
Add the PCI device ids required for driver registration. Remove pvt->ctl_name and use the family descriptor directly, instead. Then, bump driver version and fixup its format. Finally, enable DRAM ECC decoding on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Remove reporting of errors with UC bit set - this is done by the MCE decoding code anyway and this driver deals with DRAM ECC errors only. UC (NB uncorrectable error) doesn't necessarily mean it is a DRAM error. Remove unused macros while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
- 07 January, 2011 8 commits
-
-
Committed by Borislav Petkov
Minor formatting fixup since the information which core was associated with the MCE is not always valid. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Randy Dunlap
Building for X86_32 produces shift count warnings, so use BIT_64() to eliminate the warnings. drivers/edac/mce_amd.c:778: warning: left shift count >= width of type drivers/edac/mce_amd.c:778: warning: left shift count >= width of type Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: bluesmoke-devel@lists.sourceforge.net Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
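The warning arises because on a 32-bit build `unsigned long` is 32 bits wide, so an expression such as `1UL << 44` shifts by more than the width of its type. A minimal sketch of the fix follows, assuming a `BIT_64()` definition equivalent to a 64-bit one shifted left. The macro body here is written for user space and is not copied from the kernel headers, and `status_bit_set` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 64-bit-safe bit mask: the constant is 64 bits wide on every target. */
#define BIT_64(n) (UINT64_C(1) << (n))

/* Test bit n (0..63) of a 64-bit MCi_STATUS value. */
bool status_bit_set(uint64_t status, unsigned int n)
{
    return (status & BIT_64(n)) != 0;
}

/* Sanity check: bits above 31 are representable and tested correctly. */
void bit64_selfcheck(void)
{
    assert(BIT_64(44) == UINT64_C(0x100000000000));
    assert(status_bit_set(UINT64_C(0xdc04b00095080813), 44));
    assert(!status_bit_set(UINT64_C(0xdc04b00095080813), 43));
}
```

Because the left operand is always a 64-bit constant, the same source compiles without shift-count warnings on both 32-bit and 64-bit builds.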
-
Committed by Borislav Petkov
Now that everything is in place, enable MCE decoding on F15h. Make initcall routine a bit more readable. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Shorten up MCi_STATUS flags and add BD's new deferred and poison types. Also, simplify formatting. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Make macro names shorter thus making code shorter and more clear. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Add decoder for FP MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
Integrate the single FIROB signature into an expanded table along with the new BD MCE types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-
Committed by Borislav Petkov
by (almost) reusing the F10h one since the signatures are the same. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
-