1. 24 11月, 2015 2 次提交
    • B
      x86/RAS: Remove mce.usable_addr · c0ec382e
      Borislav Petkov 提交于
      It is useless and we can use the function instead. Besides,
      mcelog(8) hasn't managed to make use of it yet. So kill it.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448350880-5573-3-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c0ec382e
    • T
      x86/mce: Do not enter deferred errors into the generic pool twice · 8b38937b
      Tony Luck 提交于
      We used to have a special ring buffer for deferred errors that
      was used to mark problem pages. We replaced that with a generic
      pool. Then later converted mce_log() to also use the same pool.
      As a result, we end up adding all deferred errors to the pool
      twice.
      
      Rearrange this code. Make sure to set the m.severity and
      m.usable_addr fields for deferred errors. Then if flags and
      mca_cfg.dont_log_ce mean we call mce_log() we are done, because
      that will add this entry to the generic pool.
      
      If we skipped mce_log(), then we still want to take action for
      the deferred error, so add to the pool.
      
      Change the name of the boolean "error_logged" to "error_seen",
      we should set it whether of not we logged an error because the
      return value from machine_check_poll() is used to decide whether
      storms have subsided or not.
      Reported-by: NGong Chen <gong.chen@linux.intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1448350880-5573-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8b38937b
  2. 01 11月, 2015 2 次提交
  3. 28 9月, 2015 1 次提交
    • A
      x86/mce: Don't clear shared banks on Intel when offlining CPUs · 6e06780a
      Ashok Raj 提交于
      It is not safe to clear global MCi_CTL banks during CPU offline
      or suspend/resume operations. These MSRs are either
      thread-scoped (meaning private to a thread), or core-scoped
      (private to threads in that core only), or with a socket scope:
      visible and controllable from all threads in the socket.
      
      When we offline a single CPU, clearing those MCi_CTL bits will
      stop signaling for all the shared, i.e., socket-wide resources,
      such as LLC, iMC, etc.
      
      In addition, it might be possible to compromise the integrity of
      an Intel Secure Guard eXtentions (SGX) system if the attacker
      has control of the host system and is able to inject errors
      which would be otherwise ignored when MCi_CTL bits are cleared.
      
      Hence on SGX enabled systems, if MCi_CTL is cleared, SGX gets
      disabled.
      Tested-by: NSerge Ayoun <serge.ayoun@intel.com>
      Signed-off-by: NAshok Raj <ashok.raj@intel.com>
      [ Cleanup text. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NTony Luck <tony.luck@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1441391390-16985-1-git-send-email-ashok.raj@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e06780a
  4. 13 8月, 2015 8 次提交
  5. 23 7月, 2015 1 次提交
  6. 07 7月, 2015 1 次提交
    • A
      x86/entry: Remove exception_enter() from most trap handlers · 8c84014f
      Andy Lutomirski 提交于
      On 64-bit kernels, we don't need it any more: we handle context
      tracking directly on entry from user mode and exit to user mode.
      
      On 32-bit kernels, we don't support context tracking at all, so
      these callbacks had no effect.
      
      Note: this doesn't change do_page_fault().  Before we do that,
      we need to make sure that there is no code that can page fault
      from kernel mode with CONTEXT_USER.  The 32-bit fast system call
      stack argument code is the only offender I'm aware of right now.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: paulmck@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/r/ae22f4dfebd799c916574089964592be218151f9.1435952415.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8c84014f
  7. 06 7月, 2015 2 次提交
  8. 07 6月, 2015 2 次提交
  9. 28 5月, 2015 2 次提交
  10. 27 5月, 2015 1 次提交
  11. 18 5月, 2015 1 次提交
    • B
      x86/mce: Fix MCE severity messages · 17fea54b
      Borislav Petkov 提交于
      Derek noticed that a critical MCE gets reported with the wrong
      error type description:
      
        [Hardware Error]: CPU 34: Machine Check Exception: 5 Bank 9: f200003f000100b0
        [Hardware Error]: RIP !INEXACT! 10:<ffffffff812e14c1> {intel_idle+0xb1/0x170}
        [Hardware Error]: TSC 49587b8e321cb
        [Hardware Error]: PROCESSOR 0:306e4 TIME 1431561296 SOCKET 1 APIC 29
        [Hardware Error]: Some CPUs didn't answer in synchronization
        [Hardware Error]: Machine check: Invalid
      				   ^^^^^^^
      
      The last line with 'Invalid' should have printed the high level
      MCE error type description we get from mce_severity, i.e.
      something like:
      
        [Hardware Error]: Machine check: Action required: data load error in a user process
      
      this happens due to the fact that mce_no_way_out() iterates over
      all MCA banks and possibly overwrites the @msg argument which is
      used in the panic printing later.
      
      Change behavior to take the message of only and the (last)
      critical MCE it detects.
      Reported-by: NDerek <denc716@gmail.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: http://lkml.kernel.org/r/1431936437-25286-3-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      17fea54b
  12. 07 5月, 2015 1 次提交
    • A
      x86/mce: Add support for deferred errors on AMD · 7559e13f
      Aravind Gopalakrishnan 提交于
      Deferred errors indicate error conditions that were not corrected, but
      those errors have not been consumed yet. They require no action from
      S/W (or action is optional). These errors provide info about a latent
      uncorrectable MCE that can occur when a poisoned data is consumed by the
      processor.
      
      Newer AMD processors can generate deferred errors and can be configured
      to generate APIC interrupts on such events.
      
      SUCCOR stands for S/W UnCorrectable error COntainment and Recovery.
      It indicates support for data poisoning in HW and deferred error
      interrupts.
      
      Add new bitfield to mce_vendor_flags for this. We use this to verify
      presence of deferred error interrupts before we enable them in mce_amd.c
      
      While at it, clarify comments in mce_vendor_flags to provide an
      indication of usages of the bitfields.
      Signed-off-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1430913538-1415-4-git-send-email-Aravind.Gopalakrishnan@amd.com
      [ beef up commit message, do CPUID(8000_0007) only once. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      7559e13f
  13. 24 3月, 2015 2 次提交
  14. 23 3月, 2015 2 次提交
  15. 19 2月, 2015 2 次提交
  16. 10 2月, 2015 1 次提交
    • T
      x86/mce: Fix regression. All error records should report via /dev/mcelog · a2413d8b
      Tony Luck 提交于
      I'm getting complaints from validation teams that have updated their
      Linux kernels from ancient versions to current. They don't see the
      error logs they expect. I tell the to unload any EDAC drivers[1], and
      things start working again.  The problem is that we short-circuit
      the logging process if any function on the decoder chain claims to
      have dealt with the problem:
      
      	ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
      	if (ret == NOTIFY_STOP)
      		return;
      
      The logic we used when we added this code was that we did not want
      to confuse users with double reports of the same error.
      
      But it turns out users are not confused - they are upset that they
      don't see a log where their tools used to find a log.
      
      I could also get into a long description of how the consumer of this
      log does more than just decode model specific details of the error.
      It keeps counts, tracks thresholds, takes actions and runs scripts
      that can alert administrators to problems.
      
      [1] We've recently compounded the problem because the acpi_extlog
      driver also registers for this notifier and also returns NOTIFY_STOP.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      a2413d8b
  17. 04 2月, 2015 1 次提交
  18. 07 1月, 2015 1 次提交
  19. 03 1月, 2015 1 次提交
    • A
      x86, traps: Track entry into and exit from IST context · 95927475
      Andy Lutomirski 提交于
      We currently pretend that IST context is like standard exception
      context, but this is incorrect.  IST entries from userspace are like
      standard exceptions except that they use per-cpu stacks, so they are
      atomic.  IST entries from kernel space are like NMIs from RCU's
      perspective -- they are not quiescent states even if they
      interrupted the kernel during a quiescent state.
      
      Add and use ist_enter and ist_exit to track IST context.  Even
      though x86_32 has no IST stacks, we track these interrupts the same
      way.
      
      This fixes two issues:
      
       - Scheduling from an IST interrupt handler will now warn.  It would
         previously appear to work as long as we got lucky and nothing
         overwrote the stack frame.  (I don't know of any bugs in this
         that would trigger the warning, but it's good to be on the safe
         side.)
      
       - RCU handling in IST context was dangerous.  As far as I know,
         only machine checks were likely to trigger this, but it's good to
         be on the safe side.
      
      Note that the machine check handlers appears to have been missing
      any context tracking at all before this patch.
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      95927475
  20. 23 12月, 2014 2 次提交
  21. 08 12月, 2014 1 次提交
  22. 20 11月, 2014 2 次提交
    • C
      x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_poll · fa92c586
      Chen Yucong 提交于
      Uncorrected no action required (UCNA) - is a uncorrected recoverable
      machine check error that is not signaled via a machine check exception
      and, instead, is reported to system software as a corrected machine
      check error. UCNA errors indicate that some data in the system is
      corrupted, but the data has not been consumed and the processor state
      is valid and you may continue execution on this processor. UCNA errors
      require no action from system software to continue execution. Note that
      UCNA errors are supported by the processor only when IA32_MCG_CAP[24]
      (MCG_SER_P) is set.
                                                     -- Intel SDM Volume 3B
      
      Deferred errors are errors that cannot be corrected by hardware, but
      do not cause an immediate interruption in program flow, loss of data
      integrity, or corruption of processor state. These errors indicate
      that data has been corrupted but not consumed. Hardware writes information
      to the status and address registers in the corresponding bank that
      identifies the source of the error if deferred errors are enabled for
      logging. Deferred errors are not reported via machine check exceptions;
      they can be seen by polling the MCi_STATUS registers.
                                                      -- AMD64 APM Volume 2
      
      Above two items, both UCNA and Deferred errors belong to detected
      errors, but they can't be corrected by hardware, and this is very
      similar to Software Recoverable Action Optional (SRAO) errors.
      Therefore, we can take some actions that have been used for handling
      SRAO errors to handle UCNA and Deferred errors.
      Acked-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NChen Yucong <slaoub@gmail.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      fa92c586
    • C
      x86, mce, severity: Extend the the mce_severity mechanism to handle UCNA/DEFERRED error · e3480271
      Chen Yucong 提交于
      Until now, the mce_severity mechanism can only identify the severity
      of UCNA error as MCE_KEEP_SEVERITY. Meanwhile, it is not able to filter
      out DEFERRED error for AMD platform.
      
      This patch extends the mce_severity mechanism for handling
      UCNA/DEFERRED error. In order to do this, the patch introduces a new
      severity level - MCE_UCNA/DEFERRED_SEVERITY.
      
      In addition, mce_severity is specific to machine check exception,
      and it will check MCIP/EIPV/RIPV bits. In order to use mce_severity
      mechanism in non-exception context, the patch also introduces a new
      argument (is_excp) for mce_severity. `is_excp' is used to explicitly
      specify the calling context of mce_severity.
      Reviewed-by: NAravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Signed-off-by: NChen Yucong <slaoub@gmail.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      e3480271
  23. 27 8月, 2014 1 次提交
    • C
      x86: Replace __get_cpu_var uses · 89cbc767
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      89cbc767