提交 · bc8e80d56c1ecb35e65df392d7601d1427d14efe · openanolis / cloud-kernel

14 6月, 2017 3 次提交

x86/mce: Merge mce_amd_inj into mce-inject · bc8e80d5

由 Borislav Petkov 提交于 6月 13, 2017

Reuse mce_amd_inj's debugfs interface so that mce-inject can
benefit from it too. The old functionality is still preserved under
CONFIG_X86_MCELOG_LEGACY.
Tested-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Acked-by: NYazen Ghannam <yazen.ghannam@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/20170613162835.30750-4-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

bc8e80d5

x86/mce/AMD: Use saved threshold block info in interrupt handler · 17ef4af0

由 Yazen Ghannam 提交于 6月 13, 2017

In the amd_threshold_interrupt() handler, we loop through every possible
block in each bank and rediscover the block's address and if it's valid,
e.g. valid, counter present and not locked.

However, we already have the address saved in the threshold blocks list
for each CPU and bank. The list only contains blocks that have passed
all the valid checks.

Besides the redundancy, there's also a smp_call_function* in
get_block_address() which causes a warning when servicing the interrupt:

 WARNING: CPU: 0 PID: 0 at kernel/smp.c:281 smp_call_function_single+0xdd/0xf0
 ...
 Call Trace:
  <IRQ>
  rdmsr_safe_on_cpu()
  get_block_address.isra.2()
  amd_threshold_interrupt()
  smp_threshold_interrupt()
  threshold_interrupt()

because we do get called in an interrupt handler *with* interrupts
disabled, which can result in a deadlock.

Drop the redundant valid checks and move the overflow check, logging and
block reset into a separate function.

Check the first block then iterate over the rest. This procedure is
needed since the first block is used as the head of the list.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170613162835.30750-3-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

17ef4af0

x86/mce/AMD: Use msr_stat when clearing MCA_STATUS · a24b8c34

由 Yazen Ghannam 提交于 6月 13, 2017

The value of MCA_STATUS is used as the MSR when clearing MCA_STATUS.

This may cause the following warning:

 unchecked MSR access error: WRMSR to 0x11b (tried to write 0x0000000000000000)
 Call Trace:
  <IRQ>
  smp_threshold_interrupt()
  threshold_interrupt()

Use msr_stat instead which has the MSR address.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: 37d43acf ("x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers")
Link: http://lkml.kernel.org/r/20170613162835.30750-2-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

a24b8c34

22 5月, 2017 4 次提交

x86/mce/AMD: Carve out SMCA bank configuration · 84bcc1d5

由 Yazen Ghannam 提交于 5月 19, 2017

Scalable MCA systems have a new MCA_CONFIG register that we use to
configure each bank. We currently use this when we set up thresholding.
However, this is logically separate.

Group all SMCA-related initialization into a single function.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1493147772-2721-2-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

84bcc1d5

x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers · 37d43acf

由 Yazen Ghannam 提交于 5月 19, 2017

We have support for the new SMCA MCA_DE{STAT,ADDR} registers in Linux.
So we've used these registers in place of MCA_{STATUS,ADDR} on SMCA
systems.

However, the guidance for current SMCA implementations of is to continue
using MCA_{STATUS,ADDR} and to use MCA_DE{STAT,ADDR} only if a Deferred
error was not found in the former registers. If we logged a Deferred
error in MCA_STATUS then we should also clear MCA_DESTAT. This also
means we shouldn't clear MCA_CONFIG[LogDeferredInMcaStat].

Rework __log_error() to only log an error and add helpers for the
different error types being logged from the corresponding interrupt
handlers.

Boris: carve out common functionality into a _log_error_bank(). Cleanup
comments, check MCi_STATUS bits before reading MSRs. Streamline flow.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1493147772-2721-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

37d43acf

x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t · 473e90b2

由 Elena Reshetova 提交于 5月 19, 2017

The refcount_t type and corresponding API should be used instead
of atomic_t when the variable is used as a reference counter. This
allows to avoid accidental refcounter overflows that might lead to
use-after-free situations.
Suggested-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1492695536-5947-1-git-send-email-elena.reshetova@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

473e90b2

x86/MCE: Export memory_error() · 2d1f4061

由 Borislav Petkov 提交于 5月 19, 2017

Export the function which checks whether an MCE is a memory error to
other users so that we can reuse the logic. Drop the boot_cpu_data use,
while at it, as mce.cpuvendor already has the CPU vendor in there.

Integrate a piece from a patch from Vishal Verma
<vishal.l.verma@intel.com> to export it for modules (nfit).

The main reason we're exporting it is that the nfit handler
nfit_handle_mce() needs to detect a memory error properly before doing
its recovery actions.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

2d1f4061

19 4月, 2017 2 次提交

x86/mce: Check MCi_STATUS[MISCV] for usable addr on Intel only · c6a9583f

由 Borislav Petkov 提交于 4月 18, 2017

mce_usable_address() does a bunch of basic sanity checks to verify
whether the address reported with the error is usable for further
processing. However, we do check MCi_STATUS[MISCV] and that is not
needed on AMD as that bit says that there's additional information about
the logged error in the MCi_MISCj banks.

But we don't need that to know whether the address is usable - we only
need to know whether the physical address is valid - i.e., ADDRV.

On Intel the MISCV bit is needed to perform additional checks to determine
whether the reported address is a physical one, etc.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170418183924.6agjkebilwqj26or@pd.tnicSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

c6a9583f

x86/mce: Make the MCE notifier a blocking one · 0dc9c639

由 Vishal Verma 提交于 4月 18, 2017

The NFIT MCE handler callback (for handling media errors on NVDIMMs)
takes a mutex to add the location of a memory error to a list. But since
the notifier call chain for machine checks (x86_mce_decoder_chain) is
atomic, we get a lockdep splat like:

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
  in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
  [..]
  Call Trace:
   dump_stack
   ___might_sleep
   __might_sleep
   mutex_lock_nested
   ? __lock_acquire
   nfit_handle_mce
   notifier_call_chain
   atomic_notifier_call_chain
   ? atomic_notifier_call_chain
   mce_gen_pool_process

Convert the notifier to a blocking one which gets to run only in process
context.

Boris: remove the notifier call in atomic context in print_mce(). For
now, let's print the MCE on the atomic path so that we can make sure
they go out and get logged at least.

Fixes: 6839a6d9 ("nfit: do an ARS scrub on hitting a latent media error")
Reported-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Acked-by: NTony Luck <tony.luck@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.comSigned-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

0dc9c639

18 4月, 2017 1 次提交

x86/mce: Update notifier priority check · 415601b1

由 Borislav Petkov 提交于 4月 18, 2017

Update the check which enforces the registration of MCE decoder notifier
callbacks with valid priority only, to include mcelog's priority.
Reported-by: Nkernel test robot <xiaolong.ye@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lkp@01.org
Link: http://lkml.kernel.org/r/20170418073820.i6kl5tggcntwlisa@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>

415601b1

14 4月, 2017 1 次提交

x86/mce: Enable PPIN for Knights Landing/Mill · 9ea74f7c

由 Piotr Luc 提交于 4月 13, 2017

Intel Xeon Phi processors (KNL and KNM) support PPIN as well, so add their
CPUIDs to the whitelist of supported processors.
Signed-off-by: NPiotr Luc <piotr.luc@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170408172004.8463-1-piotr.luc@intel.com
Link: http://lkml.kernel.org/r/20170413201056.10525-1-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

9ea74f7c

31 3月, 2017 1 次提交

x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs · 29f72ce3

由 Yazen Ghannam 提交于 3月 30, 2017

MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
However, MCA bank 3 is defined on Fam17h systems and can be accessed
using legacy MSRs. Without a name we get a stack trace on Fam17h systems
when trying to register sysfs files for bank 3 on kernels that don't
recognize Scalable MCA.

Call MCA bank 3 "decode_unit" since this is what it represents on
Fam17h. This will allow kernels without SMCA support to see this bank on
Fam17h+ and prevent the stack trace. This will not affect older systems
since this bank is reserved on them, i.e. it'll be ignored.

Tested on AMD Fam15h and Fam17h systems.

  WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
  kobject: (ffff88085bb256c0): attempted to be registered with empty name!
  ...
  Call Trace:
   kobject_add_internal
   kobject_add
   kobject_create_and_add
   threshold_create_device
   threshold_init_device
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

29f72ce3

28 3月, 2017 6 次提交

x86/mce: Do not register notifiers with invalid prio · 32b40a82

由 Borislav Petkov 提交于 3月 27, 2017

This is just a defensive precaution: do not register notifiers with a
priority which would disrupt the error handling in the notifiers with
prio higher than MCE_PRIO_EDAC.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170327093304.10683-7-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

32b40a82

x86/mce: Factor out and deprecate the /dev/mcelog driver · 5de97c9f

由 Tony Luck 提交于 3月 27, 2017

Move all code relating to /dev/mcelog to a separate source file.
/dev/mcelog driver can now operate from the machine check notifier with
lowest prio.
Signed-off-by: NTony Luck <tony.luck@intel.com>
[ Move the mce_helper and trigger functionality behind CONFIG_X86_MCELOG_LEGACY. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170327093304.10683-6-bp@alien8.de
[ Renamed CONFIG_X86_MCELOG to CONFIG_X86_MCELOG_LEGACY. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

5de97c9f

RAS: Add a Corrected Errors Collector · 011d8261

由 Borislav Petkov 提交于 3月 27, 2017

Introduce a simple data structure for collecting correctable errors
along with accessors. More detailed description in the code itself.

The error decoding is done with the decoding chain now and
mce_first_notifier() gets to see the error first and the CEC decides
whether to log it and then the rest of the chain doesn't hear about it -
basically the main reason for the CE collector - or to continue running
the notifiers.

When the CEC hits the action threshold, it will try to soft-offine the
page containing the ECC and then the whole decoding chain gets to see
the error.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170327093304.10683-5-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

011d8261

x86/mce: Rename mce_log to mce_log_buffer · e64edfcc

由 Borislav Petkov 提交于 3月 27, 2017

It is confusing when staring at "struct mce_log mcelog" and then there's
also a function called mce_log(). So call the buffer what it is.

No functionality change.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170327093304.10683-4-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

e64edfcc

x86/mce: Rename mce_log()'s argument · fe3ed20f

由 Borislav Petkov 提交于 3月 27, 2017

We call it everywhere "struct mce *m". Adjust that here too to avoid
confusion.

No functionality change.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170327093304.10683-3-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

fe3ed20f

x86/mce: Don't print MCEs when mcelog is active · cc66afea

由 Andi Kleen 提交于 3月 27, 2017

Since:

  cd9c57ca ("x86/MCE: Dump MCE to dmesg if no consumers")

all MCEs are printed even when mcelog is running. Fix the regression to
not print to dmesg when mcelog is running as it is a consumer too.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
[ Massage commit message. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.10..
Fixes: cd9c57ca ("x86/MCE: Dump MCE to dmesg if no consumers")
Link: http://lkml.kernel.org/r/20170327093304.10683-2-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

cc66afea

18 3月, 2017 1 次提交

x86/mce: Init some CPU features early · 5204bf17

由 Yazen Ghannam 提交于 3月 15, 2017

When the MCA banks in __mcheck_cpu_init_generic() are polled for leftover
errors logged during boot or from the previous boot, its required to have
CPU features detected sufficiently so that the reading out and handling of
those early errors is done correctly.

If those features are not available, the decoding may miss some information
and get incomplete errors logged. For example, on SMCA systems the MCA_IPID
and MCA_SYND registers are not logged and MCA_ADDR is not masked
appropriately.

To cure that, do a subset of the basic feature detection early while the
rest happens in its usual place in __mcheck_cpu_init_vendor().
Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1489599055-20756-1-git-send-email-Yazen.Ghannam@amd.com
[ Massage commit message and simplify. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

5204bf17

14 3月, 2017 1 次提交

x86/mce: Handle broadcasted MCE gracefully with kexec · 5bc32950

由 Xunlei Pang 提交于 3月 13, 2017

When we are about to kexec a crash kernel and right then and there a
broadcasted MCE fires while we're still in the first kernel and while
the other CPUs remain in a holding pattern, the #MC handler of the
first kernel will timeout and then panic due to never completing MCE
synchronization.

Handle this in a similar way as to when the CPUs are offlined when that
broadcasted MCE happens.

[ Boris: rewrote commit message and comments. ]
Suggested-by: NBorislav Petkov <bp@alien8.de>
Signed-off-by: NXunlei Pang <xlpang@redhat.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Acked-by: NTony Luck <tony.luck@intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: kexec@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1487857012-9059-1-git-send-email-xlpang@redhat.com
Link: http://lkml.kernel.org/r/20170313095019.19351-1-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

5bc32950

01 2月, 2017 1 次提交

x86/mce: Make timer handling more robust · 0becc0ae

由 Thomas Gleixner 提交于 1月 31, 2017

Erik reported that on a preproduction hardware a CMCI storm triggers the
BUG_ON in add_timer_on(). The reason is that the per CPU MCE timer is
started by the CMCI logic before the MCE CPU hotplug callback starts the
timer with add_timer_on(). So the timer is already queued which triggers
the BUG.

Using add_timer_on() is pretty pointless in this code because the timer is
strictlty per CPU, initialized as pinned and all operations which arm the
timer happen on the CPU to which the timer belongs.

Simplify the whole machinery by using mod_timer() instead of add_timer_on()
which avoids the problem because mod_timer() can handle already queued
timers. Use __start_timer() everywhere so the earliest armed expiry time is
preserved.
Reported-by: NErik Veijola <erik.veijola@intel.com>
Tested-by: NBorislav Petkov <bp@alien8.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NBorislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701310936080.3457@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

0becc0ae

24 1月, 2017 6 次提交

x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority · 9026cc82

由 Borislav Petkov 提交于 1月 23, 2017

Assign all notifiers on the MCE decode chain a priority so that they get
called in the correct order.
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

9026cc82

x86/ras: Get rid of mce_process_work() · cff4c039

由 Borislav Petkov 提交于 1月 23, 2017

Make mce_gen_pool_process() the workqueue function directly and save us
an indirection.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-9-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

cff4c039

x86/ras: Flip the TSC-adding logic · 669c00f0

由 Borislav Petkov 提交于 1月 23, 2017

Add the TSC value to the MCE record only when the MCE being logged is
precise, i.e., it is logged as an exception or an MCE-related interrupt.

So it doesn't look particularly easy to do without touching/changing a
bunch of places. That's why I'm trying tricks first.

For example, the mce-apei.c case I'm addressing by setting ->tsc only
for errors of panic severity. The idea there is, that, panic errors will
have raised an #MC and not polled.

And then instead of propagating a flag to mce_setup(), it seems
easier/less code to set ->tsc depending on the call sites, i.e.,
are we polling or are we preparing an MCE record in an exception
handler/thresholding interrupt.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-5-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

669c00f0

x86/ras/amd: Make sysfs names of banks more user-friendly · 0b737a9c

由 Yazen Ghannam 提交于 1月 23, 2017

Currently, we append the MCA_IPID[InstanceId] to the bank name to create
the sysfs filename. The InstanceId field uniquely identifies a bank
instance but it doesn't look very nice for most banks.

Replace the InstanceId with a simpler, ascending (0, 1, ..) value.
Only use this in the sysfs name when there is more than 1 instance.
Otherwise, just use the bank's name as the sysfs name.
Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1484322741-41884-3-git-send-email-Yazen.Ghannam@amd.com
Link: http://lkml.kernel.org/r/20170123183514.13356-4-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

0b737a9c

x86/ras/therm_throt: Do not log a fake MCE for thermal events · 9b052ea4

由 Borislav Petkov 提交于 1月 23, 2017

We log a fake bank 128 MCE to note that we're handling a CPU thermal
event. However, this confuses people into thinking that their hardware
generates MCEs. Hijacking MCA for logging thermal events is a gross
misuse anyway and it shouldn't have been done in the first place. And
besides we have other means for dealing with thermal events which are
much more suitable.

So let's kill the MCE logging part.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Acked-by: NAshok Raj <ashok.raj@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170105213846.GA12024@gmail.com
Link: http://lkml.kernel.org/r/20170123183514.13356-3-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

9b052ea4

x86/ras/inject: Make it depend on X86_LOCAL_APIC=y · d4b2ac63

由 Borislav Petkov 提交于 1月 23, 2017

... and get rid of the annoying:

  arch/x86/kernel/cpu/mcheck/mce-inject.c:97:13: warning: ‘mce_irq_ipi’ defined but not used [-Wunused-function]

when doing randconfig builds.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-2-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

d4b2ac63

05 1月, 2017 1 次提交

x86/irq, trace: Add __irq_entry annotation to x86's platform IRQ handlers · c4158ff5

由 Daniel Bristot de Oliveira 提交于 1月 04, 2017

This patch adds the __irq_entry annotation to the default x86
platform IRQ handlers. ftrace's function_graph tracer uses the
__irq_entry annotation to notify the entry and return of IRQ
handlers.

For example, before the patch:
  354549.667252 |   3)  d..1              |  default_idle_call() {
  354549.667252 |   3)  d..1              |    arch_cpu_idle() {
  354549.667253 |   3)  d..1              |      default_idle() {
  354549.696886 |   3)  d..1              |        smp_trace_reschedule_interrupt() {
  354549.696886 |   3)  d..1              |          irq_enter() {
  354549.696886 |   3)  d..1              |            rcu_irq_enter() {

After the patch:
  366416.254476 |   3)  d..1              |    arch_cpu_idle() {
  366416.254476 |   3)  d..1              |      default_idle() {
  366416.261566 |   3)  d..1  ==========> |
  366416.261566 |   3)  d..1              |        smp_trace_reschedule_interrupt() {
  366416.261566 |   3)  d..1              |          irq_enter() {
  366416.261566 |   3)  d..1              |            rcu_irq_enter() {

KASAN also uses this annotation. The smp_apic_timer_interrupt()
was already annotated.
Signed-off-by: NDaniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Claudio Fontana <claudio.fontana@huawei.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolai Stange <nicstange@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/059fdf437c2f0c09b13c18c8fe4e69999d3ffe69.1483528431.git.bristot@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c4158ff5

27 12月, 2016 1 次提交

x86/mce/AMD: Make the init code more robust · 0dad3a30

由 Thomas Gleixner 提交于 12月 26, 2016

If mce_device_init() fails then the mce device pointer is NULL and the
AMD mce code happily dereferences it.

Add a sanity check.
Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0dad3a30

25 12月, 2016 1 次提交

Replace <asm/uaccess.h> with <linux/uaccess.h> globally · 7c0f6ba6

由 Linus Torvalds 提交于 12月 24, 2016

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.
Requested-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c0f6ba6

10 12月, 2016 1 次提交

x86: Remove empty idle.h header · 34bc3560

由 Thomas Gleixner 提交于 12月 09, 2016

One include less is always a good thing(tm). Good riddance.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-6-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

34bc3560

23 11月, 2016 3 次提交

x86/mce: Include the PPIN in MCE records when available · 3f5a7896

由 Tony Luck 提交于 11月 18, 2016

Intel Xeons from Ivy Bridge onwards support a processor identification
number set in the factory. To the user this is a handy unique number to
identify a particular CPU. Intel can decode this to the fab/production
run to track errors. On systems that have it, include it in the machine
check record. I'm told that this would be helpful for users that run
large data centers with multi-socket servers to keep track of which CPUs
are seeing errors.

Boris:
* Add some clarifying comments and spacing.
* Mask out [63:2] in the disabled-but-not-locked case
* Call the MSR variable "val" for more readability.
Signed-off-by: NTony Luck <tony.luck@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20161123114855.njguoaygp3qnbkia@pd.tnicSigned-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

3f5a7896

x86/mce/therm_throt: Move hotplug callbacks to online · 33d97302

由 Thomas Gleixner 提交于 11月 21, 2016

No point to have the sysfs files around before the cpu is online and no
point to have them around until the cpu is dead. Get rid of the explicit
state.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>

33d97302

x86/mce/therm_throt: Convert to hotplug state machine · d6526e73

由 Sebastian Andrzej Siewior 提交于 11月 17, 2016

Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linuxtronix.de
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161117183541.8588-2-bigeasy@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

d6526e73

22 11月, 2016 1 次提交

x86/mce/AMD: Add system physical address translation for AMD Fam17h · f5382de9

由 Yazen Ghannam 提交于 11月 17, 2016

The Unified Memory Controllers (UMCs) on Fam17h log a normalized address
in their MCA_ADDR registers. We need to convert that normalized address
to a system physical address in order to support a few facilities:

1) To offline poisoned pages in DRAM proactively in the deferred error
   handler.

2) To print sysaddr and page info for DRAM ECC errors in EDAC.

[ Boris: fixes/cleanups ontop:

  * hi_addr_offset = 0 - no need for that branch. Stick it all under the
    HiAddrOffsetEn case. It confines hi_addr_offset's declaration too.

  * Move variables to the innermost scope they're used at so that we save
    on stack and not blow it up immediately on function entry.

  * Do not modify *sys_addr prematurely - we want to not exit early and
    have modified *sys_addr some, which callers get to see. We either
    convert to a sys_addr or we don't do anything. And we signal that with
    the retval of the function.

  * Rename label out -> out_err - because it is the error path.

  * No need to pr_err of the conversion failed case: imagine a
    sparsely-populated machine with UMCs which don't have DIMMs. Callers
    should look at the retval instead and issue a printk only when really
    necessary. No need for useless info in dmesg.

  * s/temp_reg/tmp/ and other variable names shortening => shorter code.

  * Use BIT() everywhere.

  * Make error messages more informative.

  *  Small build fix for the !CONFIG_X86_MCE_AMD case.

  * ... and more minor cleanups.
]
Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20161122111133.mjzpvzhf7o7yl2oa@pd.tnic
[ Typo fixes. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

f5382de9

21 11月, 2016 1 次提交

x86/MCE/AMD: Fix thinko about thresholding_en · 254fe9c7

由 Borislav Petkov 提交于 11月 19, 2016

So adding thresholding_en et al was a good thing for removing the
per-CPU thresholding callback, i.e., threshold_cpu_callback.

But, in order for it to work and especially that test in
mce_threshold_create_device() so that all thresholding banks get
properly created and not the whole thing to fail with a NULL ptr
dereference at mce_cpu_pre_down() when we offline the CPUs, we need to
set the thresholding_en flag *before* we start creating the devices.

Yap, it failed because thresholding_en wasn't set at the time
we were creating the banks so we didn't create any and then at
mce_cpu_pre_down() -> mce_threshold_remove_device() time, we would blow
up.

And the fix is actually easy: we have thresholding on the system when we
have managed to set the thresholding vector to amd_threshold_interrupt()
earlier in mce_amd_feature_init() while we were picking apart the
thresholding banks and what is set and what not.

So let's do that.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Fixes: 4d7b02d5 ("x86/mcheck: Split threshold_cpu_callback into two callbacks")
Link: http://lkml.kernel.org/r/20161119103402.5227-1-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

254fe9c7

16 11月, 2016 4 次提交

x86/mce/AMD: Reset Threshold Limit after logging error · 18807ddb

由 Yazen Ghannam 提交于 11月 15, 2016

The error count field in MCA_MISC does not get reset by hardware when the
threshold has been reached. Software is expected to reset it. Currently,
the threshold limit only gets reset during init or when a user writes to
sysfs.

If the user is not monitoring threshold interrupts and resetting
the limit then the user will only see 1 interrupt when the limit is first
hit. So if, for example, the limit is set to 10 then only 1 interrupt will
be recorded after 10 errors even if 100 errors have occurred. The user may
then assume that only 10 errors have occurred.
Signed-off-by: NYazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1479244433-69267-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

18807ddb

x86/mcheck: Move CPU_DEAD to hotplug state machine · 0e285d36

由 Sebastian Andrzej Siewior 提交于 11月 10, 2016

This moves the last piece of the old hotplug notifier code in MCE to the
new hotplug state machine.
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: NBorislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-8-bigeasy@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

0e285d36

x86/mcheck: Move CPU_ONLINE and CPU_DOWN_PREPARE to hotplug state machine · 8c0eeac8

由 Sebastian Andrzej Siewior 提交于 11月 10, 2016

The CPU_ONLINE and CPU_DOWN_PREPARE look fully symmetrical and could be move
to the hotplug state machine.
On a failure during registration we have the tear down callback invoked
(mce_cpu_pre_down()) so there should be no timer around and so no need to need
keep notifier installed (this was the reason according to the comment why the
notifier was registered despite of errors).
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: NBorislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-7-bigeasy@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8c0eeac8

x86/mcheck: Reorganize the hotplug callbacks · 39f152ff

由 Sebastian Andrzej Siewior 提交于 11月 10, 2016

Initially I wanted to remove mcheck_cpu_init() from identify_cpu() and let it
become an independent early hotplug callback. The main problem here was that
the init on the boot CPU may happen too late
(device_initcall_sync(mcheck_init_device)) and nobody wanted to risk receiving
and MCE event at boot time leading to a shutdown (if the MCE feature is not yet
enabled).

Here is attempt two: the timming stays as-is but the ordering of the functions
is changed:
- mcheck_cpu_init() (which is run from identify_cpu()) will setup the timer
  struct but won't fire the timer. This is moved to CPU_ONLINE since its
  cleanup part is in CPU_DOWN_PREPARE. So if it is okay to stop the timer early
  in the shutdown phase, it should be okay to start it late in the bring up phase.

- CPU_DOWN_PREPARE disables the MCE feature flags for !INTEL CPUs in
  mce_disable_cpu(). If a failure occures it would be re-enabled on all vendor
  CPUs (including Intel where it was not disabled during shutdown). To keep this
  working I am moving it to CPU_ONLINE. smp_call_function_single() is dropped
  beause the notifier runs nowdays on the target CPU.

- CPU_ONLINE is invoking mce_device_create() + mce_threshold_create_device()
  but its cleanup part is in CPU_DEAD (mce_threshold_remove_device() and
  mce_device_remove()). In order to keep this symmetrical I am moving the clean
  up from CPU_DEAD to CPU_DOWN_PREPARE.
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: NBorislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-6-bigeasy@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

39f152ff

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功