- 17 1月, 2020 2 次提交
-
-
由 Yazen Ghannam 提交于
commit 3ad7e748c12cc771df6020a552def3e1727e8a17 upstream. The existing CS, PSP, and SMU SMCA bank types will see new versions (as indicated by their McaTypes) in future SMCA systems. Add the new (HWID, MCATYPE) tuples for these new versions. Reuse the same names as the older versions, since they are logically the same to the user. SMCA systems won't mix and match IP blocks with different McaType versions in the same system, so there isn't a need to distinguish them. The MCA_IPID register is saved when logging an MCA error, and that can be used to triage the error. Also, add the new error descriptions to edac_mce_amd. Some error types (positions in the list) are overloaded compared to the previous McaTypes. Therefore, just create new lists of the error descriptions to keep things simple even if some of the error descriptions are the same between versions. Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Shirish S <Shirish.S@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190201225534.8177-3-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Yazen Ghannam 提交于
commit cbfa447edd6a3825fdb8a4ffae74ff7208f2d2c0 upstream. Add the (HWID, MCATYPE) tuples and names for the new MP5, NBIO, and PCIE SMCA bank types. Also, add their respective error descriptions to the MCE decoding module edac_mce_amd. Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Shirish S <Shirish.S@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190201225534.8177-2-Yazen.Ghannam@amd.comSigned-off-by: NWANG Siyuan <Siyuan.Wang@amd.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
- 22 2月, 2018 1 次提交
-
-
由 Yazen Ghannam 提交于
Currently, bank 4 is reserved on Fam17h, so we chose not to initialize bank 4 in the smca_banks array. This means that when we check if a bank is initialized, like during boot or resume, we will see that bank 4 is not initialized and try to initialize it. This will cause a call trace, when resuming from suspend, due to rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue an IPI but we're running with interrupts disabled. This triggers: WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0 ... Reserved banks will be read-as-zero, so their MCA_IPID register will be zero. So, like the smca_banks array, the threshold_banks array will not have an entry for a reserved bank since all its MCA_MISC* registers will be zero. Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0. Use the "Reserved" type when checking if a bank is reserved. It's possible that other bank numbers may be reserved on future systems. Don't try to find the block address on reserved banks. Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.14.x Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 8月, 2017 3 次提交
-
-
由 Borislav Petkov 提交于
... and use the macro for that. No functionality change. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
struct mce.cpuid contains CPUID(1).EAX which contains family, model and stepping and thus has enough information for our purposes. Thus get rid of some external dependencies which are not really needed. No functionality change. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
Singular fits better because it decodes a single error. No functionality change. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 17 7月, 2017 1 次提交
-
-
由 Yazen Ghannam 提交于
Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine while the L3 to node mapping was 1:1. And Zen topology broke this. So let's start slowly moving away from it and use the topology interfaces instead. Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com [ Massage commit message. ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 13 6月, 2017 1 次提交
-
-
由 Yazen Ghannam 提交于
Fix typo in "poison consumption" error description. Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1497286703-62853-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 16 2月, 2017 1 次提交
-
-
由 Yazen Ghannam 提交于
Currently, the IPID and Syndrome are printed on the same line as the Address. There are cases when we can have a valid Syndrome but not a valid Address. For example, the MCA_SYND register can be used to hold more detailed error info that the hardware folks can use. It's not just DRAM ECC syndromes. There are some error types that aren't related to memory that may have valid syndromes, like some errors related to links in the Data Fabric, etc. In these cases, the IPID and Syndrome are not printed at the same log level as the rest of the stanza, so users won't see them on the console. Console: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Dmesg: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Print the IPID first and on a new line. The IPID should always be printed on SMCA systems. The Syndrome will then be printed with the IPID and at the same log level when valid: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 28 1月, 2017 1 次提交
-
-
由 Yazen Ghannam 提交于
Users may not be familiar with the concept of deferred errors. There is no action for users to take on this type of error, so give more context in the error message to make this more clear. Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 24 1月, 2017 3 次提交
-
-
由 Borislav Petkov 提交于
Assign all notifiers on the MCE decode chain a priority so that they get called in the correct order. Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Borislav Petkov 提交于
Dump the TSC value of the time when the MCE got logged. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-8-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Borislav Petkov 提交于
It is not used outside of the driver anymore. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-7-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 29 11月, 2016 1 次提交
-
-
由 Yazen Ghannam 提交于
MCA_STATUS[43] has been defined as "Poison" or "Reserved" for every bank since Fam15h except for Fam15h, bank 4 in which case it's defined as part of the McaStatSubCache bitfield. Filter out that case. Reported-by: NDean Liberty <Dean.Liberty@amd.com> Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479478222-19896-1-git-send-email-Yazen.Ghannam@amd.com [ Split an almost unparseable ternary conditional, add a comment. ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 21 11月, 2016 1 次提交
-
-
由 Yazen Ghannam 提交于
nb_bus_decoder() is only used for DRAM ECC errors so rename it so that the name is more generic and descriptive. Also, call it for DRAM ECC errors on SMCA systems. [ Boris: rename it to real function name with a verb in it. ] Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-4-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 09 11月, 2016 3 次提交
-
-
由 Borislav Petkov 提交于
Add accessor functions and hide the smca_names array. Also, add a sanity-check to bank HWID assignment in get_smca_bank_info(). Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20161104152317.5r276t35df53qk76@pd.tnicSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Borislav Petkov 提交于
Make it differ more from struct smca_bank_name for better readability. Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NYazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20161103125556.15482-3-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Borislav Petkov 提交于
Call it simply smca_hwid and call local variables "hwid". More readable. Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NYazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20161103125556.15482-2-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 13 9月, 2016 6 次提交
-
-
由 Yazen Ghannam 提交于
Bank 4 is reserved on family 0x17 and shouldn't generate any MCE records. However, broken hardware and software is not something unheard of so warn about bank 4 errors. They shouldn't be coming from bank 4 naturally but users can still use mce_amd_inj to simulate errors from it for testing purposed. Also, avoid special handling in the injector mce_amd_inj like it is being done on the older families. [ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ] Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NAravind Gopalakrishnan <aravindksg.lkml@gmail.com> Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yazen Ghannam 提交于
The MCA_SYND and MCA_IPID registers contain valuable information and should be included in MCE output. The MCA_SYND register contains syndrome and other error information, and the MCA_IPID register will uniquely identify the MCA bank's type without having to rely on system software. Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472680624-34221-2-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yazen Ghannam 提交于
Scalable MCA defines a number of IP types. An MCA bank on an SMCA system is defined as one of these IP types. A bank's type is uniquely identified by the combination of the HWID and MCATYPE values read from its MCA_IPID register. Add the required tables in order to be able to lookup error descriptions based on a bank's type and the error's extended error code. [ bp: Align comments, simplify a bit. ] Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472741832-1690-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yazen Ghannam 提交于
The error descriptions defined for Fam17h can be reused for other SMCA systems, so their names should reflect this. Change f17h prefix to smca for error descriptions. Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472673994-12235-4-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yazen Ghannam 提交于
Add missing SMCA error descriptions to the error descriptions arrays. Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472673994-12235-3-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Yazen Ghannam 提交于
Print SyndV bit status and print the raw value of the MCA_SYND register. Further decoding of the syndrome from struct mce.synd can be done in other places where appropriate, e.g. DRAM ECC. Boris: make the error stanza more compact by putting the error address and syndrome on the same line: [Hardware Error]: Corrected error, no action required. [Hardware Error]: CPU:2 (17:0:0) MC4_STATUS[-|CE|-|PCC|AddrV|-|-|SyndV|CECC]: 0x96204100001e0117 [Hardware Error]: Error Addr: 0x000000007f4c52e3, Syndrome: 0x0000000000000000 [Hardware Error]: Invalid IP block specified. [Hardware Error]: cache level: L3/GEN, tx: DATA, mem-tx: RD Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1467633035-32080-2-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 12 5月, 2016 1 次提交
-
-
由 Yazen Ghannam 提交于
Use X86_FEATURE_SMCA when detecting if SMCA is available instead of directly using CPUID 0x80000007_EBX. Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1462971509-3856-7-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 08 3月, 2016 1 次提交
-
-
由 Aravind Gopalakrishnan 提交于
For Scalable MCA enabled processors, errors are listed per IP block. And since it is not required for an IP to map to a particular bank, we need to use HWID and McaType values from the MCx_IPID register to figure out which IP a given bank represents. We also have a new bit (TCC) in the MCx_STATUS register to indicate Task context is corrupt. Add logic here to decode errors from all known IP blocks for Fam17h Model 00-0fh and to print TCC errors. [ Minor fixups. ] Signed-off-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1457021458-2522-3-git-send-email-Aravind.Gopalakrishnan@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 8月, 2015 2 次提交
-
-
由 Borislav Petkov 提交于
This used to flush out MCEs logged during early boot and which were in the MCA registers from a previous system run. No need for that now, since we've moved to a genpool. Suggested-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1439396985-12812-7-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Chen, Gong 提交于
Use unified genpool to save Action Optional error events and put Action Optional error handling in the same notification chain as MCE error decoding. Signed-off-by: NChen, Gong <gong.chen@linux.intel.com> [ Fold in subsequent patch from Boris for early boot logging. ] Signed-off-by: NTony Luck <tony.luck@intel.com> [ Correct a lot. ] Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1439396985-12812-5-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 7月, 2015 1 次提交
-
-
由 Aravind Gopalakrishnan 提交于
Currently, when decoding an MCE, we display 'CE' for a Deferred error, like this: [Hardware Error]: CPU:0 (15:2:0) MC4_STATUS[Over|CE|MiscV|-|AddrV|Deferred|-|UECC]: 0xdc04b00095080813 When the 'UC' bit in the MCx_STATUS register is clear, the error status is either a Corrected error or Deferred error as determined by the 'Deferred' bit. So do not print 'CE' on a deferred error. Refer to AMD Error Scope Hierarchy table in a newer BKDG (example: 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"). Signed-off-by: NAravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1436788382-6463-1-git-send-email-aravind.gopalakrishnan@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 25 11月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
Write out MCx_ADDR into the more humanly readable "MCx Error Address" and remove double colon in the output. Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 05 11月, 2014 1 次提交
-
-
由 Aravind Gopalakrishnan 提交于
Extended error code meanings are tabulated for other banks. Extend that tradition for MC6 too. Signed-off-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1415122868-10969-1-git-send-email-aravind.gopalakrishnan@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 14 7月, 2014 1 次提交
-
-
由 Aravind Gopalakrishnan 提交于
Add decoding logic for new Fam15h model 60h. Tested using mce_amd_inj module and works fine. Signed-off-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1405098795-4678-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: simplify a bit. ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 09 5月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
295d8cda ("EDAC, MCE, AMD: Drop local coreid reporting") removed the code snippet which used that mask but forgot to drop the mask itself. Do that now. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 24 2月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
We want to still be able to issue some error information on systems for which there is no decoding support (think older distro kernels here, for example). Therefore, we allow module registration but skip the per-family bank-specific decoders and issue the general information only, i.e.: [ 46.822828] [Hardware Error]: Error Status: Uncorrected, software containable error. [ 46.822846] [Hardware Error]: CPU:0 (15:30:0) MC0_STATUS[-|UE|-|-|-|-|-]: 0xa000000000010f0f [ 46.822858] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out) with the hope that it still contains helpful useful bits. Suggested-by: NAravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Tested-by: NAravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1392659391-2411-1-git-send-email-Aravind.Gopalakrishnan@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 08 6月, 2013 1 次提交
-
-
由 Aravind Gopalakrishnan 提交于
Add a new error signature for Family 15h, models 30h-3fh. Patch has been tested on Fam15h using mce_amd_inj facility and has been verified to work correctly. Signed-off-by: NAravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> [ cleanup commit message and error string ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 23 1月, 2013 3 次提交
-
-
由 Borislav Petkov 提交于
Initially, those strings describing different parts of an MCE message were shared with amd64_edac and were therefore exported to modules. However, all except pp_msgs are used only in one place right now so hide them and make them static. No functionality change. Reported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
由 Jacob Shin 提交于
Add MCE decoding logic for AMD Family 16h processors. Boris: - drop unneeded uu_msgs export - exit early in cat_mc1_mce and save us an indentation level Signed-off-by: NJacob Shin <jacob.shin@amd.com> Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
由 Jacob Shin 提交于
Currently only AMD Family 15h processors have special handling for MC2 errors. Since upcoming Family 16h will also need unique handling, let's make MC2 handling part of amd_decoder_ops. Signed-off-by: NJacob Shin <jacob.shin@amd.com> Signed-off-by: NBorislav Petkov <bp@alien8.de>
-
- 28 11月, 2012 2 次提交
-
-
由 Borislav Petkov 提交于
Dump error status after decoding the error which describes the error disposition. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Instead of starting with the error details, report the decoded, readable error type first. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-