1. 07 May 2015, 1 commit
    • x86/mce: Add support for deferred errors on AMD · 7559e13f
      By Aravind Gopalakrishnan
      Deferred errors indicate error conditions that were not corrected, but
      whose data has not been consumed yet. They require no action from
      S/W (or the action is optional). These errors provide info about a
      latent uncorrectable MCE that can occur when the poisoned data is
      eventually consumed by the processor.
      
      Newer AMD processors can generate deferred errors and can be configured
      to generate APIC interrupts on such events.
      
      SUCCOR stands for S/W UnCorrectable error COntainment and Recovery.
      It indicates support for data poisoning in HW and deferred error
      interrupts.
      
      Add a new bitfield to mce_vendor_flags for this. We use it to verify
      the presence of deferred error interrupts before we enable them in
      mce_amd.c.
      
      While at it, clarify the comments in mce_vendor_flags to indicate
      how the bitfields are used.
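      
      As a rough illustration, the detection could look like the sketch
      below, assuming the CPUID(8000_0007) EBX bit layout from AMD's
      documentation (bit 1 = SUCCOR); the flag names are illustrative:
      
      	/*
      	 * Hedged sketch: read CPUID leaf 0x80000007 once and cache the
      	 * SUCCOR capability (EBX bit 1) in the vendor flags. Names are
      	 * illustrative, not necessarily the exact ones in mce.c.
      	 */
      	if (c->x86_vendor == X86_VENDOR_AMD) {
      		u32 ebx = cpuid_ebx(0x80000007);
      
      		mce_flags.succor = !!(ebx & BIT(1));
      	}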
      Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1430913538-1415-4-git-send-email-Aravind.Gopalakrishnan@amd.com
      [ beef up commit message, do CPUID(8000_0007) only once. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
  2. 24 Mar 2015, 2 commits
  3. 23 Mar 2015, 2 commits
  4. 19 Feb 2015, 2 commits
  5. 10 Feb 2015, 1 commit
    • x86/mce: Fix regression. All error records should report via /dev/mcelog · a2413d8b
      By Tony Luck
      I'm getting complaints from validation teams that have updated their
      Linux kernels from ancient versions to current. They don't see the
      error logs they expect. I tell them to unload any EDAC drivers[1], and
      things start working again.  The problem is that we short-circuit
      the logging process if any function on the decoder chain claims to
      have dealt with the problem:
      
      	ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
      	if (ret == NOTIFY_STOP)
      		return;
      
      The logic we used when we added this code was that we did not want
      to confuse users with double reports of the same error.
      
      But it turns out users are not confused - they are upset that they
      don't see a log where their tools used to find a log.
      
      I could also get into a long description of how the consumer of this
      log does more than just decode model specific details of the error.
      It keeps counts, tracks thresholds, takes actions and runs scripts
      that can alert administrators to problems.
      
      [1] We've recently compounded the problem because the acpi_extlog
      driver also registers for this notifier and also returns NOTIFY_STOP.
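      
      The shape of the fix, then, is to log unconditionally and only let
      the decoder chain decorate the record - a sketch under that
      assumption (the buffering helper named here is hypothetical):
      
      	/*
      	 * Hedged sketch: run the decoder chain for its side effects,
      	 * but ignore NOTIFY_STOP so the record always reaches the
      	 * /dev/mcelog buffer.
      	 */
      	atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
      	mce_log_buffer_add(m);		/* hypothetical buffering helper */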
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  6. 04 Feb 2015, 1 commit
  7. 07 Jan 2015, 1 commit
  8. 03 Jan 2015, 1 commit
    • x86, traps: Track entry into and exit from IST context · 95927475
      By Andy Lutomirski
      We currently pretend that IST context is like standard exception
      context, but this is incorrect.  IST entries from userspace are like
      standard exceptions except that they use per-cpu stacks, so they are
      atomic.  IST entries from kernel space are like NMIs from RCU's
      perspective -- they are not quiescent states even if they
      interrupted the kernel during a quiescent state.
      
      Add and use ist_enter and ist_exit to track IST context.  Even
      though x86_32 has no IST stacks, we track these interrupts the same
      way.
      
      This fixes two issues:
      
       - Scheduling from an IST interrupt handler will now warn.  It would
         previously appear to work as long as we got lucky and nothing
         overwrote the stack frame.  (I don't know of any bugs in this
         that would trigger the warning, but it's good to be on the safe
         side.)
      
       - RCU handling in IST context was dangerous.  As far as I know,
         only machine checks were likely to trigger this, but it's good to
         be on the safe side.
      
      Note that the machine check handler appears to have been missing
      any context tracking at all before this patch.
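      
      A minimal sketch of the tracking helpers, assuming the
      rcu_nmi_enter()/rcu_nmi_exit() and preempt-count APIs (simplified;
      not the exact patch):
      
      	/*
      	 * Hedged sketch: IST entries from the kernel are NMI-like for
      	 * RCU; bumping the preempt count by HARDIRQ_OFFSET makes any
      	 * attempt to schedule from IST context warn.
      	 */
      	void ist_enter(struct pt_regs *regs)
      	{
      		if (!user_mode(regs))
      			rcu_nmi_enter();
      
      		preempt_count_add(HARDIRQ_OFFSET);
      	}
      
      	void ist_exit(struct pt_regs *regs)
      	{
      		preempt_count_sub(HARDIRQ_OFFSET);
      
      		if (!user_mode(regs))
      			rcu_nmi_exit();
      	}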
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
  9. 23 Dec 2014, 2 commits
  10. 08 Dec 2014, 1 commit
  11. 20 Nov 2014, 2 commits
    • x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_poll · fa92c586
      By Chen Yucong
      Uncorrected no action required (UCNA) - is an uncorrected recoverable
      machine check error that is not signaled via a machine check exception
      and, instead, is reported to system software as a corrected machine
      check error. UCNA errors indicate that some data in the system is
      corrupted, but the data has not been consumed and the processor state
      is valid and you may continue execution on this processor. UCNA errors
      require no action from system software to continue execution. Note that
      UCNA errors are supported by the processor only when IA32_MCG_CAP[24]
      (MCG_SER_P) is set.
                                                     -- Intel SDM Volume 3B
      
      Deferred errors are errors that cannot be corrected by hardware, but
      do not cause an immediate interruption in program flow, loss of data
      integrity, or corruption of processor state. These errors indicate
      that data has been corrupted but not consumed. Hardware writes information
      to the status and address registers in the corresponding bank that
      identifies the source of the error if deferred errors are enabled for
      logging. Deferred errors are not reported via machine check exceptions;
      they can be seen by polling the MCi_STATUS registers.
                                                      -- AMD64 APM Volume 2
      
      As the two excerpts above show, both UCNA and Deferred errors are
      detected errors that cannot be corrected by hardware, which makes
      them very similar to Software Recoverable Action Optional (SRAO)
      errors. Therefore, we can reuse the actions already in place for
      handling SRAO errors to handle UCNA and Deferred errors.
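      
      For illustration, the classification might be sketched as below
      (MCi_STATUS bit positions per the quoted manuals: UC=61, PCC=57,
      S=56, AMD Deferred=44; the helper names are hypothetical):
      
      	/* Hedged sketch of UCNA/Deferred classification while polling. */
      	#define MCI_STATUS_UC		BIT_ULL(61)	/* uncorrected */
      	#define MCI_STATUS_PCC		BIT_ULL(57)	/* processor context corrupt */
      	#define MCI_STATUS_S		BIT_ULL(56)	/* signaled via #MC */
      	#define MCI_STATUS_DEFERRED	BIT_ULL(44)	/* AMD deferred error */
      
      	static bool mce_is_ucna(u64 status)	/* hypothetical helper */
      	{
      		/* Uncorrected, but unsignaled and context still valid. */
      		return (status & MCI_STATUS_UC) &&
      		       !(status & (MCI_STATUS_S | MCI_STATUS_PCC));
      	}
      
      	static bool mce_is_deferred(u64 status)	/* hypothetical helper */
      	{
      		return !!(status & MCI_STATUS_DEFERRED);
      	}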
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Chen Yucong <slaoub@gmail.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • x86, mce, severity: Extend the mce_severity mechanism to handle UCNA/DEFERRED error · e3480271
      By Chen Yucong
      Until now, the mce_severity mechanism could only identify the
      severity of a UCNA error as MCE_KEEP_SEVERITY. Moreover, it was not
      able to filter out DEFERRED errors on AMD platforms.
      
      This patch extends the mce_severity mechanism to handle
      UCNA/DEFERRED errors. In order to do this, it introduces a new
      severity level - MCE_UCNA/DEFERRED_SEVERITY.
      
      In addition, mce_severity is specific to machine check exceptions,
      since it checks the MCIP/EIPV/RIPV bits. In order to use the
      mce_severity mechanism in non-exception context, the patch also
      introduces a new argument (is_excp) for mce_severity. `is_excp' is
      used to explicitly specify the calling context of mce_severity.
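      
      The extended interface might look roughly like this (a sketch;
      is_excp comes from the commit message, the helpers and table-walk
      function are hypothetical):
      
      	/*
      	 * Hedged sketch: is_excp distinguishes #MC context, where
      	 * MCIP/EIPV/RIPV are meaningful, from the polling path.
      	 */
      	int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp)
      	{
      		if (!is_excp &&
      		    (mce_is_ucna(m->status) || mce_is_deferred(m->status)))
      			return MCE_DEFERRED_SEVERITY; /* == MCE_UCNA_SEVERITY */
      
      		/* Otherwise fall through to the usual severity table walk. */
      		return severity_table_walk(m, tolerant, msg); /* hypothetical */
      	}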
      Reviewed-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Signed-off-by: Chen Yucong <slaoub@gmail.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  12. 27 Aug 2014, 1 commit
    • x86: Replace __get_cpu_var uses · 89cbc767
      By Christoph Lameter
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are storing data to and retrieving data from the
      current processor's percpu area.  __get_cpu_var() can be used as an
      lvalue when writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() only ever performs an address determination. However,
      store and retrieve operations could use a segment prefix (or a global
      register on other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations
      that use the offset.  Thereby address calculations are avoided and
      fewer registers are used when code is generated.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per
      cpu variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	__this_cpu_inc(y);
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  13. 09 Aug 2014, 1 commit
  14. 22 Jul 2014, 1 commit
  15. 24 Jun 2014, 1 commit
  16. 23 Jun 2014, 1 commit
  17. 05 Jun 2014, 1 commit
  18. 31 May 2014, 2 commits
  19. 17 Apr 2014, 1 commit
  20. 29 Mar 2014, 1 commit
    • x86, CMCI: Add proper detection of end of CMCI storms · 27f6c573
      By Chen, Gong
      When a CMCI storm persists for a long time (at least beyond the
      predefined threshold, currently 30 seconds), we can observe that a
      new CMCI storm is detected immediately after the previous one
      subsides:
      
      ...
      Dec 10 22:04:29 kernel: CMCI storm detected: switching to poll mode
      Dec 10 22:04:59 kernel: CMCI storm subsided: switching to interrupt mode
      Dec 10 22:04:59 kernel: CMCI storm detected: switching to poll mode
      Dec 10 22:05:29 kernel: CMCI storm subsided: switching to interrupt mode
      ...
      
      The problem is that our logic that determines that the storm has
      ended is incorrect. We announce the end, re-enable interrupts and
      realize that the storm is still going on, so we switch back to
      polling mode. Rinse, repeat.
      
      When a storm happens we disable signaling of errors via CMCI and begin
      polling machine check banks instead. If we find any logged errors,
      then we need to set a per-cpu flag so that our per-cpu tests that
      check whether the storm is ongoing will see that errors are still
      being logged independently of whether mce_notify_irq() says that the
      error has been fully processed.
      
      cmci_clear() is not the right tool to disable a bank. It disables the
      interrupt for the bank as desired, but it also clears the bit for
      this bank in "mce_banks_owned", so we will skip the bank when polling
      (and thus fail to see that the storm continues because we stop
      looking). The new cmci_storm_disable_banks() disables only the
      interrupt, while allowing polling to continue.
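      
      A sketch of such a helper, modeled on the existing CMCI code (the
      MSR and lock names follow mce_intel.c, but the body is illustrative):
      
      	/*
      	 * Hedged sketch: turn off the CMCI interrupt for every bank
      	 * this CPU owns without clearing mce_banks_owned, so polling
      	 * keeps scanning the banks and can see the storm continue.
      	 */
      	static void cmci_storm_disable_banks(void)
      	{
      		unsigned long flags, *owned;
      		int bank;
      		u64 val;
      
      		raw_spin_lock_irqsave(&cmci_discover_lock, flags);
      		owned = this_cpu_ptr(mce_banks_owned);
      		for_each_set_bit(bank, owned, MAX_NR_BANKS) {
      			rdmsrl(MSR_IA32_MCx_CTL2(bank), val);
      			val &= ~MCI_CTL2_CMCI_EN;
      			wrmsrl(MSR_IA32_MCx_CTL2(bank), val);
      		}
      		raw_spin_unlock_irqrestore(&cmci_discover_lock, flags);
      	}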
      Reported-by: William Dauchy <wdauchy@gmail.com>
      Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  21. 20 Mar 2014, 1 commit
    • x86, mce: Fix CPU hotplug callback registration · 82a8f131
      By Srivatsa S. Bhat
      Subsystems that want to register CPU hotplug callbacks, as well as perform
      initialization for the CPUs that are already online, often do it as shown
      below:
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	put_online_cpus();
      
      This is wrong, since it is prone to ABBA deadlocks involving the
      cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
      with CPU hotplug operations).
      
      Instead, the correct and race-free way of performing the callback
      registration is:
      
      	cpu_notifier_register_begin();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	/* Note the use of the double underscored version of the API */
      	__register_cpu_notifier(&foobar_cpu_notifier);
      
      	cpu_notifier_register_done();
      
      Fix the mce code in x86 by using this latter form of callback registration.
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  22. 12 Jan 2014, 1 commit
    • x86, mce: Fix mce_start_timer semantics · 4f75d841
      By Borislav Petkov
      So mce_start_timer() has a 'cpu' argument which is supposed to start
      a timer on that cpu. However, the code currently starts a timer on
      the *current* cpu the function runs on, which causes the sanity check
      in mce_timer_fn to fire:
      
      	WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/mcheck/mce.c:1286 mce_timer_fn
      
      because it is running on the wrong cpu.
      
      This was triggered by Prarit Bhargava <prarit@redhat.com> by offlining
      all the cpus in succession.
      
      Then, we were fiddling with the CMCI storm settings when starting the
      timer, whereas there's no need for that - if there's a storm
      happening on this newly restarted cpu, we're going to be in normal
      CMCI mode initially, and then, when the CMCI interrupt starts firing,
      we're going to switch to polling mode with the timer real soon.
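      
      A sketch of the corrected helper under these assumptions (simplified
      from the actual patch; add_timer_on() pins the timer to the named
      cpu):
      
      	/* Hedged sketch: arm the polling timer on 'cpu', not on the
      	 * cpu this code happens to be running on. */
      	static void mce_start_timer(unsigned int cpu, struct timer_list *t)
      	{
      		unsigned long iv = check_interval * HZ;
      
      		if (!iv)
      			return;
      
      		per_cpu(mce_next_interval, cpu) = iv;
      		t->expires = round_jiffies(jiffies + iv);
      		add_timer_on(t, cpu);	/* the fix: explicit target cpu */
      	}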
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Tested-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Chen, Gong <gong.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/1387722156-5511-1-git-send-email-prarit@redhat.com
  23. 30 Nov 2013, 1 commit
  24. 15 Jul 2013, 1 commit
    • x86: delete __cpuinit usage from all x86 files · 148f9bb8
      By Paul Gortmaker
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up(), which are arch-independent
      (kernel/cpu.c), are flagged as __cpuinit -- so if we remove
      __cpuinit from the arch-specific callers, we will also get section
      mismatch warnings.  As an intermediate step, we intend to turn the
      linux/init.h cpuinit content into no-ops as early as possible, since
      that will get rid of these warnings.  In any case, they are temporary
      and harmless.
      
      This removes all the arch/x86 uses of the __cpuinit macros from
      all C files.  x86 only had the one __CPUINIT used in assembly files,
      and it wasn't paired off with a .previous or a __FINIT, so we can
      delete it directly w/o any corresponding additional change there.
      
      [1] https://lkml.org/lkml/2013/5/20/589
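      
      The conversion itself is mechanical; a representative hunk might
      look like this (illustrative, not taken from the actual diff):
      
      	/* Before: __cpuinit placed the function in a discardable section. */
      	static int __cpuinit mce_cpu_callback(struct notifier_block *nfb,
      					      unsigned long action, void *hcpu);
      
      	/* After: a plain function that stays in .text. */
      	static int mce_cpu_callback(struct notifier_block *nfb,
      				    unsigned long action, void *hcpu);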
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  25. 09 Jul 2013, 1 commit
    • mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC · c3d1fb56
      By Naveen N. Rao
      The Corrected Machine Check structure (CMC) in HEST has a flag which can be
      set by the firmware to indicate to the OS that it prefers to process the
      corrected error events first. In this scenario, the OS is expected to not
      monitor for corrected errors (through CMCI/polling). Instead, the firmware
      notifies the OS on corrected error events through GHES.
      
      Linux already has support for GHES. This patch adds support for
      parsing the CMC structure and for disabling CMCI/polling if the
      firmware-first flag is set.
      
      Further, the list of machine check bank structures at the end of the
      CMC structure is used to determine which MCA banks function in FF
      mode, so that we continue to monitor error events on the other banks.
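      
      The parsing step might be sketched as follows, assuming the ACPICA
      names struct acpi_hest_ia_corrected and ACPI_HEST_FIRMWARE_FIRST
      (the mce_disable_cmci_banks() helper is hypothetical):
      
      	/*
      	 * Hedged sketch: when HEST's Corrected Machine Check structure
      	 * requests firmware-first handling, stop CMCI/polling on the
      	 * MCA banks it lists.
      	 */
      	static int hest_parse_cmc(struct acpi_hest_header *hest_hdr, void *data)
      	{
      		struct acpi_hest_ia_corrected *cmc;
      
      		if (hest_hdr->type != ACPI_HEST_TYPE_IA32_CORRECTED_CHECK)
      			return 0;
      
      		cmc = (struct acpi_hest_ia_corrected *)hest_hdr;
      		if (!(cmc->flags & ACPI_HEST_FIRMWARE_FIRST))
      			return 0;
      
      		mce_disable_cmci_banks(cmc);	/* hypothetical helper */
      		return 0;
      	}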
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  26. 26 Jun 2013, 1 commit
  27. 03 Apr 2013, 1 commit
    • x86/mce: Rework cmci_rediscover() to play well with CPU hotplug · 7a0c819d
      By Srivatsa S. Bhat
      Dave Jones reports that offlining a CPU leads to this trace:
      
      numa_remove_cpu cpu 1 node 0: mask now 0,2-3
      smpboot: CPU 1 is now offline
      BUG: using smp_processor_id() in preemptible [00000000] code:
      cpu-offline.sh/10591
      caller is cmci_rediscover+0x6a/0xe0
      Pid: 10591, comm: cpu-offline.sh Not tainted 3.9.0-rc3+ #2
      Call Trace:
       [<ffffffff81333bbd>] debug_smp_processor_id+0xdd/0x100
       [<ffffffff8101edba>] cmci_rediscover+0x6a/0xe0
       [<ffffffff815f5b9f>] mce_cpu_callback+0x19d/0x1ae
       [<ffffffff8160ea66>] notifier_call_chain+0x66/0x150
       [<ffffffff8107ad7e>] __raw_notifier_call_chain+0xe/0x10
       [<ffffffff8104c2e3>] cpu_notify+0x23/0x50
       [<ffffffff8104c31e>] cpu_notify_nofail+0xe/0x20
       [<ffffffff815ef082>] _cpu_down+0x302/0x350
       [<ffffffff815ef106>] cpu_down+0x36/0x50
       [<ffffffff815f1c9d>] store_online+0x8d/0xd0
       [<ffffffff813edc48>] dev_attr_store+0x18/0x30
       [<ffffffff81226eeb>] sysfs_write_file+0xdb/0x150
       [<ffffffff811adfb2>] vfs_write+0xa2/0x170
       [<ffffffff811ae16c>] sys_write+0x4c/0xa0
       [<ffffffff81613019>] system_call_fastpath+0x16/0x1b
      
      However, a look at cmci_rediscover shows that it can be simplified
      quite a bit, besides solving the above issue. It invokes functions
      that take spin locks with interrupts disabled, and hence it can
      safely run in atomic context. Also, it is run in the CPU_POST_DEAD
      phase, so the dying CPU is already dead and out of the
      cpu_online_mask. So take these points into account and simplify the
      code, thereby also fixing the above issue.
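      
      The simplified result might look roughly like this (a sketch based
      on the description above; on_each_cpu() runs the callback on every
      online CPU with interrupts disabled):
      
      	/*
      	 * Hedged sketch: in CPU_POST_DEAD the dead CPU is already out
      	 * of cpu_online_mask, so rediscovery is just a broadcast -- no
      	 * smp_processor_id() in preemptible code.
      	 */
      	static void cmci_rediscover_work_func(void *arg)
      	{
      		int banks;
      
      		if (cmci_supported(&banks))
      			cmci_discover(banks);
      	}
      
      	void cmci_rediscover(void)
      	{
      		int banks;
      
      		if (!cmci_supported(&banks))
      			return;
      
      		on_each_cpu(cmci_rediscover_work_func, NULL, 1);
      	}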
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  28. 21 Jan 2013, 1 commit
  29. 29 Dec 2012, 1 commit
    • x86/mce: don't use [delayed_]work_pending() · 4d899be5
      By Tejun Heo
      There's no need to test whether a (delayed) work item is pending
      before queueing, flushing or cancelling it.  Most such tests are
      unnecessary and quite a few of them are buggy.
      
      Remove unnecessary pending tests from x86/mce.  Only compile tested.
      
      v2: Local var work removed from mce_schedule_work() as suggested by
          Borislav.
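      
      A representative conversion (illustrative; schedule_work() already
      ignores a request to queue an item that is still pending):
      
      	/* Before: a racy, redundant guard around queueing. */
      	if (!work_pending(&mce_trigger_work))
      		schedule_work(&mce_trigger_work);
      
      	/* After: just queue it; pending items are handled internally. */
      	schedule_work(&mce_trigger_work);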
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Borislav Petkov <bp@alien8.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-edac@vger.kernel.org
  30. 26 Oct 2012, 4 commits
  31. 19 Oct 2012, 1 commit