Commits · a9a1c0ee04aa771e5523ae33e458c702261ab547 · openanolis / cloud-kernel

The MCA_ADDR registers on Scalable MCA systems contain the ErrorAddr
in bits [55:0] and the position of the least significant valid bit of
the address in bits [61:56]. We should extract the valid ErrorAddr bits
from the MCA_ADDR register rather than saving the raw value to struct mce.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1473275643-1721-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
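
The extraction described in this commit message can be sketched in userspace C roughly as follows. The helper names `genmask_u64()` and `smca_extract_error_addr()` are hypothetical; the sketch only mirrors the bit layout stated above (address in bits [55:0], position of the least significant valid bit in bits [61:56]) and is not the kernel's actual code.

```c
#include <stdint.h>

/* Build a mask with bits [hi:lo] set; a userspace stand-in for a GENMASK-style macro. */
static uint64_t genmask_u64(unsigned int hi, unsigned int lo)
{
    return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/* Hypothetical helper: keep only the valid ErrorAddr bits of a raw MCA_ADDR value. */
static uint64_t smca_extract_error_addr(uint64_t mca_addr)
{
    /* Bits [61:56] hold the position of the least significant valid address bit. */
    unsigned int lsb = (mca_addr >> 56) & 0x3f;

    /* Keep bits [55:lsb]; the result is 0 if lsb is above 55. */
    return mca_addr & genmask_u64(55, lsb);
}
```

With an LSB field of 6, for example, the low six address bits are treated as invalid and cleared, and the LSB field itself is stripped from the stored address.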

4f29b73b

x86/mce/AMD: Save MCA_IPID in MCE struct on SMCA systems · 5828c46f

Committed by Yazen Ghannam on Sep 12, 2016

The MCA_IPID register uniquely identifies a bank's type and instance
on Scalable MCA systems. We should save the value of this register
in struct mce along with the other relevant error information. This
ensures that we can decode errors without relying on system software to
correlate the bank to the type.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472680624-34221-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

5828c46f

x86/mce/AMD: Ensure the deferred error interrupt is of type APIC on SMCA systems · 66ef269d

Committed by Yazen Ghannam on Sep 12, 2016

The Deferred Error Interrupt Type is set per bank on Scalable MCA
systems. This is done in a bitfield in the MCA_CONFIG register of each
bank. We should set its type to APIC-based interrupt and not assume BIOS
has set it for us.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472737486-1720-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
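
A minimal sketch of setting the per-bank deferred error interrupt type might look like the following. The bit positions (a "supported" flag in bit 5, the type field in bits [38:37], and the value 0x1 meaning APIC) are assumptions made for illustration, and `smca_set_deferred_int_apic()` is a hypothetical helper; consult the processor documentation for the authoritative MCA_CONFIG layout.

```c
#include <stdint.h>

/* Assumed MCA_CONFIG layout, for illustration only. */
#define CFG_DEFERRED_INT_TYPE_SUPPORTED (1ULL << 5)
#define CFG_DEFERRED_INT_TYPE_SHIFT     37
#define CFG_DEFERRED_INT_TYPE_MASK      (0x3ULL << CFG_DEFERRED_INT_TYPE_SHIFT)
#define DEFERRED_INT_TYPE_APIC          0x1ULL

/* Hypothetical helper: return the MCA_CONFIG value with the deferred
 * error interrupt type set to APIC, if the bank supports the field. */
static uint64_t smca_set_deferred_int_apic(uint64_t mca_config)
{
    if (!(mca_config & CFG_DEFERRED_INT_TYPE_SUPPORTED))
        return mca_config;  /* field not available, leave the value untouched */

    mca_config &= ~CFG_DEFERRED_INT_TYPE_MASK;
    mca_config |= DEFERRED_INT_TYPE_APIC << CFG_DEFERRED_INT_TYPE_SHIFT;
    return mca_config;
}
```

The sketch overwrites whatever type firmware left in the field, which matches the intent of not relying on BIOS; real code may prefer to touch the field only when no type has been selected yet.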

66ef269d

x86/mce/AMD: Update sysfs bank names for SMCA systems · 87a6d409

Committed by Yazen Ghannam on Sep 12, 2016

Define a bank's sysfs filename based on its IP type and InstanceId.

Credits go to Aravind for:
 * The general idea and proto- get_name().
 * Defining smca_umc_block_names[] and buf_mcatype[].
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Link: http://lkml.kernel.org/r/1473193490-3291-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
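
Deriving a filename from an IP type and an instance ID could be sketched as below. The helper `smca_bank_sysfs_name()` and the `"<name>_<id in hex>"` format are assumptions chosen for illustration, not necessarily what the patch itself emits.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<ip_name>_<instance_id>" into buf.
 * Returns 0 on success, -1 if the name did not fit. */
static int smca_bank_sysfs_name(char *buf, size_t len,
                                const char *ip_name, unsigned int instance_id)
{
    int n = snprintf(buf, len, "%s_%x", ip_name, instance_id);

    /* snprintf() reports the length it wanted; treat truncation as an error. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking for truncation matters here because two banks whose names were cut at the same point would collide in sysfs.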

87a6d409

x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types · 5896820e

Committed by Yazen Ghannam on Sep 12, 2016

Scalable MCA defines a number of IP types. An MCA bank on an SMCA
system is defined as one of these IP types. A bank's type is uniquely
identified by the combination of the HWID and MCATYPE values read from
its MCA_IPID register.

Add the required tables in order to be able to lookup error descriptions
based on a bank's type and the error's extended error code.

[ bp: Align comments, simplify a bit. ]
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472741832-1690-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
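
The lookup by (HWID, MCATYPE) pair can be sketched as a small table search. Everything below is illustrative: the table entries, the names, and the assumed field positions within MCA_IPID (HWID in bits [43:32], MCATYPE in bits [63:48]) are assumptions, not values taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine the two identifying fields into one lookup key. */
#define HWID_MCATYPE(hwid, mcatype) (((uint32_t)(hwid) << 16) | (uint32_t)(mcatype))

struct ip_type {
    uint32_t hwid_mcatype;  /* key built with HWID_MCATYPE() */
    const char *name;       /* human-readable IP type */
};

/* Hypothetical table; the real one lists every known SMCA IP type. */
static const struct ip_type ip_types[] = {
    { HWID_MCATYPE(0xB0, 0x0), "load_store" },
    { HWID_MCATYPE(0x96, 0x0), "umc" },
};

/* Return the IP type name for a raw MCA_IPID value, or NULL if unknown. */
static const char *lookup_ip_name(uint64_t ipid)
{
    uint32_t high = (uint32_t)(ipid >> 32);
    uint32_t key = HWID_MCATYPE(high & 0xFFF, (high >> 16) & 0xFFFF);

    for (size_t i = 0; i < sizeof(ip_types) / sizeof(ip_types[0]); i++)
        if (ip_types[i].hwid_mcatype == key)
            return ip_types[i].name;

    return NULL;
}
```

Keying on both fields keeps the lookup independent of bank numbering, which is the point of identifying a bank by its MCA_IPID contents.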

5896820e

x86/mce/AMD: Read MSRs on the CPU allocating the threshold blocks · cfee4f6f

Committed by Yazen Ghannam on Sep 12, 2016

Scalable MCA systems allow non-core MCA banks to only be accessible by
certain CPUs. The MSRs for these banks are Read-as-Zero on other CPUs.

During allocate_threshold_blocks(), get_block_address() can be scheduled
on CPUs other than the one allocating the block. This causes the MSRs to
be read on the wrong CPU and results in incorrect behavior.

Add a @cpu parameter to get_block_address() and pass this in to ensure
that the MSRs are only read on the CPU that is allocating the block.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472673994-12235-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

cfee4f6f

x86/mce: Add support for new MCA_SYND register · db819d60

Committed by Yazen Ghannam on Sep 12, 2016

Syndrome information is no longer contained in MCA_STATUS for SMCA
systems but in a new register - MCA_SYND.

Add a synd field to struct mce to hold MCA_SYND register value. Add it
to the end of struct mce to maintain compatibility with old versions of
mcelog. Also, add it to the respective tracepoint.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1467633035-32080-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

db819d60

x86/mce/AMD: Use msr_ops.misc() in allocate_threshold_blocks() · 74ab0e7a

Committed by Yazen Ghannam on Sep 12, 2016

Change MSR_IA32_MCx_MISC() macro to msr_ops.misc() because SMCA machines
define a different set of MSRs and msr_ops will give you the correct
MISC register.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1468269447-8808-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

74ab0e7a

08 Jul, 2016 1 commit

x86/mce/AMD: Increase size of the bank_map type · 955d1427

Committed by Aravind Gopalakrishnan on Jul 08, 2016

Change bank_map type from 'char' to 'int' since we now have more than eight
banks in a system.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1467968983-4874-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
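
Why the wider type is needed can be shown with a small sketch. The helpers `mark_bank_narrow()` and `mark_bank_wide()` are hypothetical and assume an 8-bit `char` and an `int` of at least 32 bits; they only illustrate how a one-bit-per-bank map behaves for bank numbers of 8 and above.

```c
/* Mark a bank in an 8-bit map: bits for banks >= 8 are silently lost. */
static unsigned char mark_bank_narrow(unsigned char map, unsigned int bank)
{
    return (unsigned char)(map | (1U << bank));
}

/* Mark a bank in an int-sized map: banks up to 31 are representable. */
static unsigned int mark_bank_wide(unsigned int map, unsigned int bank)
{
    return map | (1U << bank);
}
```

Marking bank 9 in the narrow map leaves the map unchanged, so the bank would appear to have never been registered.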

955d1427

12 May, 2016 3 commits

x86/mce/AMD: Save an indentation level in prepare_threshold_block() · e128b4f4

Committed by Borislav Petkov on May 11, 2016

Do the !SMCA work first and then save us an indentation level for the
SMCA code.

No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1462971509-3856-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

e128b4f4

x86/mce/AMD: Disable LogDeferredInMcaStat for SMCA systems · 32544f06

Committed by Yazen Ghannam on May 11, 2016

Disable Deferred Error logging in MCA_{STATUS,ADDR} additionally for
SMCA systems as this information will be retrieved from MCA_DE{STAT,ADDR}
on those systems.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
[ Simplify, drop SMCA_MCAX_EN_OFF define too. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1462971509-3856-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

32544f06

x86/mce/AMD: Log Deferred Errors using SMCA MCA_DE{STAT,ADDR} registers · 34102009

Committed by Yazen Ghannam on May 11, 2016

Scalable MCA provides new registers for all banks for logging deferred
errors: MCA_DESTAT and MCA_DEADDR. Deferred errors are always logged to
these registers.

Update the AMD deferred error handler to use these registers, if
available.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
[ Sanity-check __log_error() args, massage a bit. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1462971509-3856-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

34102009

03 May, 2016 1 commit

x86/mce: Detect and use SMCA-specific msr_ops · d9d73fcc

Committed by Yazen Ghannam on Apr 30, 2016

Replace all calls to MCx_IA32_{CTL,ADDR,MISC,STATUS} with the
appropriate msr_ops.

Use SMCA-specific msr_ops when on an SMCA-enabled processor.

Carved out from a patch by Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1462019637-16474-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
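
The idea of routing every bank-register access through an ops table can be sketched as follows. The struct layout and function names are hypothetical, and the SMCA base address and per-bank stride used here (0xC0002000, 0x10) are stated as assumptions; the point is only that callers ask the table for an MSR number instead of computing it from a fixed macro.

```c
#include <stdint.h>

/* Per-bank MSR address helpers, chosen once at init time. */
struct mca_msr_ops {
    uint32_t (*ctl)(int bank);
    uint32_t (*status)(int bank);
    uint32_t (*addr)(int bank);
    uint32_t (*misc)(int bank);
};

/* Legacy MCA: four MSRs per bank, starting at 0x400. */
static uint32_t legacy_ctl(int bank)    { return 0x400u + 4u * (uint32_t)bank; }
static uint32_t legacy_status(int bank) { return 0x401u + 4u * (uint32_t)bank; }
static uint32_t legacy_addr(int bank)   { return 0x402u + 4u * (uint32_t)bank; }
static uint32_t legacy_misc(int bank)   { return 0x403u + 4u * (uint32_t)bank; }

/* SMCA: assumed to reserve sixteen MSRs per bank, starting at 0xC0002000. */
static uint32_t smca_ctl(int bank)    { return 0xC0002000u + 0x10u * (uint32_t)bank; }
static uint32_t smca_status(int bank) { return 0xC0002001u + 0x10u * (uint32_t)bank; }
static uint32_t smca_addr(int bank)   { return 0xC0002002u + 0x10u * (uint32_t)bank; }
static uint32_t smca_misc(int bank)   { return 0xC0002003u + 0x10u * (uint32_t)bank; }

static const struct mca_msr_ops legacy_ops = {
    legacy_ctl, legacy_status, legacy_addr, legacy_misc,
};

static const struct mca_msr_ops smca_ops = {
    smca_ctl, smca_status, smca_addr, smca_misc,
};

/* Pick the table matching the detected hardware feature. */
static const struct mca_msr_ops *select_msr_ops(int has_smca)
{
    return has_smca ? &smca_ops : &legacy_ops;
}
```

Call sites then read `ops->misc(bank)` and stay identical on both kinds of hardware.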

d9d73fcc

08 Mar, 2016 3 commits

x86/mce/AMD: Document some functionality · ea2ca36b

Committed by Aravind Gopalakrishnan on Mar 07, 2016

In an attempt to aid in understanding of what the threshold_block
structure holds, provide comments to describe the members here. Also,
trim comments around threshold_restart_bank() and update copyright info.

No functional change is introduced.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Shorten comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-6-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

ea2ca36b

x86/mce/AMD: Fix logic to obtain block address · 8dd1e17a

Committed by Aravind Gopalakrishnan on Mar 07, 2016

In upcoming processors, the BLKPTR field is no longer used to indicate
the MSR number of the additional register. Instead, it simply indicates
the presence of additional MSRs.

Fix the logic here to gather MSR address from MSR_AMD64_SMCA_MCx_MISC()
for newer processors and fall back to existing logic for older
processors.

[ Drop nextaddr_out label; style cleanups. ]
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-4-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

8dd1e17a

x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors · be0aec23

Committed by Aravind Gopalakrishnan on Mar 07, 2016

For Scalable MCA enabled processors, errors are listed per IP block. And
since it is not required for an IP to map to a particular bank, we need
to use HWID and McaType values from the MCx_IPID register to figure out
which IP a given bank represents.

We also have a new bit (TCC) in the MCx_STATUS register to indicate Task
context is corrupt.

Add logic here to decode errors from all known IP blocks for Fam17h
Model 00-0fh and to print TCC errors.

[ Minor fixups. ]
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1457021458-2522-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

be0aec23

01 Feb, 2016 5 commits

x86/mce/AMD: Set MCAX Enable bit · e6c8f187

Committed by Aravind Gopalakrishnan on Jan 25, 2016

It is required for the OS to acknowledge that it is using the
MCAX register set and its associated fields by setting the
'McaXEnable' bit in each bank's MCi_CONFIG register. If it is
not set, then all UC errors will cause a system panic.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1453750913-4781-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

e6c8f187

x86/mce/AMD: Carve out threshold block preparation · 429893b1

Committed by Borislav Petkov on Jan 25, 2016

mce_amd_feature_init() was getting pretty fat, carve out the
threshold_block setup into a separate function in order to
simplify flow and make it more understandable.

No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1453750913-4781-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

429893b1

x86/mce/AMD: Fix LVT offset configuration for thresholding · f57a1f3c

Committed by Aravind Gopalakrishnan on Jan 25, 2016

For processor families with the Scalable MCA feature, the LVT
offset for threshold interrupts is configured only in MSR
0xC0000410 and not in each per bank MISC register as was done in
earlier families.

Obtain the LVT offset from the correct MSR for those families.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1453750913-4781-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

f57a1f3c

x86/mce/AMD: Reduce number of blocks scanned per bank · 60f116fc

Committed by Aravind Gopalakrishnan on Jan 25, 2016

From Fam17h onwards, the number of extended MCx_MISC register blocks is
reduced to 4. It is an architectural change from what we had on
earlier processors.

Although theoretically the total number of extended MCx_MISC
registers was 8 in earlier processor families, in practice we
only had to use the extra registers for MC4. And only 2 of those
were used. So this change does not affect older processors.
Tested on Fam10h and Fam15h systems.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1453750913-4781-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

60f116fc

x86/mce/AMD: Do not perform shared bank check for future processors · 284b965c

Committed by Aravind Gopalakrishnan on Jan 25, 2016

Fam17h and above should not require a check to see if a bank is
shared or not. For shared banks, there will always be only one
core that has visibility over the MSRs and only that particular
core will be allowed to write to the MSRs.

Fix the code to return early if we have Scalable MCA support. No
change in functionality for earlier processors.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
[ Massaged the changelog text, fixed kbuild test robot build warning. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1453750913-4781-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

284b965c

07 May, 2015 5 commits

x86/mce/amd: Zap changelog · 3490c0e4

Committed by Borislav Petkov on May 07, 2015

It is useless and git history has it all detailed anyway. Update
copyright while at it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>

3490c0e4

x86/mce/amd: Rename setup_APIC_mce · 868c00bb

Committed by Aravind Gopalakrishnan on May 06, 2015

'setup_APIC_mce' doesn't give us an indication of why we are
going to program LVT. Make that explicit by renaming it to
setup_APIC_mce_threshold so we know.

No functional change is introduced.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1430913538-1415-7-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

868c00bb

x86/mce/amd: Introduce deferred error interrupt handler · 24fd78a8

Committed by Aravind Gopalakrishnan on May 06, 2015

Deferred errors indicate error conditions that were not corrected, but
require no action from S/W (or action is optional). These errors provide
info about a latent UC MCE that can occur when poisoned data is
consumed by the processor.

Processors that report these errors can be configured to generate APIC
interrupts to notify OS about the error.

Provide an interrupt handler in this patch so that OS can catch these
errors as and when they happen. Currently, we simply log the errors and
exit the handler as S/W action is not mandated.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1430913538-1415-5-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

24fd78a8

x86/mce/amd: Collect valid address before logging an error · 6e6e746e

Committed by Aravind Gopalakrishnan on May 06, 2015

amd_decode_mce() needs value in m->addr so it can report the error
address correctly. This should be setup in __log_error() before we call
mce_log(). We do this because the error address is an important bit of
information which should be conveyed to userspace.

The correct output then reports proper address, like this:

  [Hardware Error]: Corrected error, no action required.
  [Hardware Error]: CPU:0 (15:60:0) MC0_STATUS [-|CE|-|-|AddrV|-|-|CECC]: 0x840041000028017b
  [Hardware Error]: MC0 Error Address: 0x00001f808f0ff040
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1430913538-1415-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

6e6e746e

x86/mce/amd: Factor out logging mechanism · afdf344e

Committed by Aravind Gopalakrishnan on May 06, 2015

Refactor the code here to setup struct mce and call mce_log() to log
the error. We're going to reuse this in a later patch as part of the
deferred error interrupt enablement.

No functional change is introduced.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1430913538-1415-2-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

afdf344e

19 Feb, 2015 2 commits

x86/MCE/AMD: Enable thresholding interrupts by default if supported · d79f931f

Committed by Aravind Gopalakrishnan on Feb 02, 2015

We setup APIC vectors for threshold errors if interrupt_capable.
However, we don't set interrupt_enable by default. Rework
threshold_restart_bank() so that when we set up lvt_offset, we also set
IntType to APIC and also enable thresholding interrupts for banks which
support it by default.

User is still allowed to disable interrupts through sysfs.

While at it, check if status is valid before we proceed to log error
using mce_log. This is because, in multi-node platforms, only the NBC
(Node Base Core, i.e. the first core in the node) has valid status info
in its MCA registers. So, the decoding of status values on the non-NBC
leads to noise on kernel logs like so:

  EDAC DEBUG: amd64_inject_write_store: section=0x80000000 word_bits=0x10020001
  [Hardware Error]: Corrected error, no action required.
  [Hardware Error]: CPU:25 (15:2:0) MC4_STATUS[-|CE|-|-|-
  [Hardware Error]: Corrected error, no action required.
  [Hardware Error]: CPU:26 (15:2:0) MC4_STATUS[-|CE|-|-|-
  <...>
  WARNING: CPU: 25 PID: 0 at drivers/edac/amd64_edac.c:2147 decode_bus_error+0x1ba/0x2a0()
  WARNING: CPU: 26 PID: 0 at drivers/edac/amd64_edac.c:2147 decode_bus_error+0x1ba/0x2a0()
  Something is rotten in the state of Denmark.
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Link: http://lkml.kernel.org/r/1422896561-7695-1-git-send-email-aravind.gopalakrishnan@amd.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>

d79f931f

x86/MCE/AMD: Drop bogus const modifier from AMD's bank4_names() · 2cd4c303

Committed by Jan Beulich on Jan 23, 2015

The compiler validly warns about it being ignored.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/54C21511020000780005890E@mail.emea.novell.com
Signed-off-by: Borislav Petkov <bp@suse.de>

2cd4c303

01 Nov, 2014 1 commit

x86, MCE, AMD: Assign interrupt handler only when bank supports it · 8dcf32ea

Committed by Chen Yucong on Nov 01, 2014

There are some AMD CPU models which have thresholding banks but which
cannot generate a thresholding interrupt. This is denoted by the bit
MCi_MISC[IntP]. Make sure to check that bit before assigning the
thresholding interrupt handler.
Signed-off-by: Chen Yucong <slaoub@gmail.com>
[ Boris: save an indentation level and rewrite commit message. ]
Link: http://lkml.kernel.org/r/1412662128.28440.18.camel@debian
Signed-off-by: Borislav Petkov <bp@suse.de>
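
The check described here amounts to testing one bit of the bank's MISC register before wiring up the handler. In the sketch below, the position of IntP (bit 60 of the 64-bit register) is an assumption for illustration, and `bank_can_raise_threshold_irq()` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of MCi_MISC[IntP]; check the processor manual. */
#define MISC_INTP_BIT 60

/* Hypothetical helper: true if the bank can generate a thresholding interrupt. */
static bool bank_can_raise_threshold_irq(uint64_t mci_misc)
{
    return (mci_misc >> MISC_INTP_BIT) & 1;
}
```

A caller would install the thresholding interrupt handler only when this returns true, and otherwise leave the bank to be handled by polling.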

8dcf32ea

22 Oct, 2014 4 commits

x86, MCE, AMD: Drop software-defined bank in error thresholding · a3a529d1

Committed by Borislav Petkov on Oct 21, 2014

Aravind had the good question about why we're assigning a
software-defined bank when reporting error thresholding errors instead
of simply using the bank which reports the last error causing the
overflow.

Digging through git history, it pointed to

95268664 ("[PATCH] x86_64: mce_amd support for family 0x10 processors")

which added that functionality. The problem with this, however, is that
tools don't know about software-defined banks and get puzzled. So drop
that K8_MCE_THRESHOLD_BASE and simply use the hw bank reporting the
thresholding interrupt.

Save us a couple of MSR reads while at it.
Reported-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Link: https://lkml.kernel.org/r/5435B206.60402@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

a3a529d1

x86, MCE, AMD: Move invariant code out from loop body · 69b95758

Committed by Chen Yucong on Oct 02, 2014

Assigning to mce_threshold_vector is loop-invariant code in
mce_amd_feature_init(). So do it only once, out of loop body.
Signed-off-by: Chen Yucong <slaoub@gmail.com>
Link: http://lkml.kernel.org/r/1412263212.8085.6.camel@debian
[ Boris: commit message corrections. ]
Signed-off-by: Borislav Petkov <bp@suse.de>

69b95758

x86, MCE, AMD: Correct thresholding error logging · 44612a3a

Committed by Chen Yucong on Oct 02, 2014

mce_setup() does not gather the content of IA32_MCG_STATUS, so it
should be read explicitly. Moreover, we need to clear IA32_MCx_STATUS
so that mce_log() does not log the processed threshold event again
next time.

But we do the logging ourselves and machine_check_poll() is completely
useless there. So kill it.
Signed-off-by: Chen Yucong <slaoub@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>

44612a3a

x86, MCE, AMD: Use macros to compute bank MSRs · 4b737d78

Committed by Chen Yucong on Sep 23, 2014

Avoid open coded calculations for bank MSRs by hiding the index
of higher bank MSRs in well-defined macros.

No semantic changes.
Signed-off-by: Chen Yucong <slaoub@gmail.com>
Link: http://lkml.kernel.org/r/1411438561-24319-1-git-send-email-slaoub@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>

4b737d78

27 Aug, 2014 1 commit

x86: Replace __get_cpu_var uses · 89cbc767

Committed by Christoph Lameter on Aug 17, 2014

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.

This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y);

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++;

   Converts to

	__this_cpu_inc(y);

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

89cbc767

09 Aug, 2014 1 commit

arch/x86: replace strict_strto calls · 164109e3

Committed by Daniel Walter on Aug 08, 2014

Replace obsolete strict_strto calls with appropriate kstrto calls
Signed-off-by: Daniel Walter <dwalter@google.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

164109e3

15 Jul, 2013 1 commit

x86: delete __cpuinit usage from all x86 files · 148f9bb8

Committed by Paul Gortmaker on Jun 18, 2013

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up(), which are arch independent (kernel/cpu.c),
are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/x86 uses of the __cpuinit macros from
all C files.  x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

148f9bb8

22 Mar, 2013 1 commit

x86, MCE, AMD: Use MCG_CAP MSR to find out number of banks on AMD · bafcdd3b

Committed by Boris Ostrovsky on Mar 14, 2013

Currently number of error reporting register banks is hardcoded to
6 on AMD processors. This may break in virtualized scenarios when
a hypervisor prefers to report fewer banks than what the physical
HW provides.

Since number of supported banks is reported in MSR_IA32_MCG_CAP[7:0]
that's what we should use.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1363295441-1859-3-git-send-email-boris.ostrovsky@oracle.com
[ reverse NULL ptr test logic ]
Signed-off-by: Borislav Petkov <bp@suse.de>
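
Reading the bank count from the capability register, rather than hardcoding it, can be sketched like this. The helper name `mcg_cap_bank_count()` is hypothetical; the only fact taken from the commit message is that the count lives in bits [7:0] of MSR_IA32_MCG_CAP.

```c
#include <stdint.h>

/* Bits [7:0] of MCG_CAP report the number of error-reporting banks. */
#define MCG_BANKCNT_MASK 0xffULL

/* Hypothetical helper: extract the bank count from a raw MCG_CAP value. */
static unsigned int mcg_cap_bank_count(uint64_t mcg_cap)
{
    return (unsigned int)(mcg_cap & MCG_BANKCNT_MASK);
}
```

Under a hypervisor that exposes fewer banks than the physical hardware, this returns the smaller number, so the guest never touches banks it was not given.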

bafcdd3b

openanolis / cloud-kernel · synced successfully more than 1 year ago
