1. 13 Sep, 2016 1 commit
  2. 12 May, 2016 1 commit
  3. 03 May, 2016 1 commit
    • x86/mce: Define vendor-specific MSR accessors · a9750a31
      Yazen Ghannam committed
      Scalable MCA processors have a whole new range of MSR addresses to
      obtain bank related info such as CTL, MISC, ADDR, STATUS. Therefore, we
      need a way to abstract the MSR addresses per vendor.
      
      Carved out from a patch by Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>.
      Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1462019637-16474-5-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
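The per-vendor abstraction described in this commit message can be pictured as a table of function pointers that maps a bank number to an MSR address. The following is a minimal user-space sketch, not the patch itself: the struct layout, the helper names and `select_msr_regs()` are illustrative assumptions. The base addresses (0x400 with a stride of 4 for the legacy layout, 0xc0002000 with a stride of 0x10 for Scalable MCA) are the architectural values as far as I know; check them against the vendor manuals before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Legacy MCA layout: CTL, STATUS, ADDR, MISC, four MSRs per bank. */
#define LEGACY_MC0_CTL  0x00000400u
#define LEGACY_STRIDE   4u

/* Scalable MCA layout: same register order, sixteen MSR slots per bank. */
#define SMCA_MC0_CTL    0xc0002000u
#define SMCA_STRIDE     0x10u

/* One accessor per bank register (illustrative layout). */
struct mca_msr_regs {
    uint32_t (*ctl)(int bank);
    uint32_t (*status)(int bank);
    uint32_t (*addr)(int bank);
    uint32_t (*misc)(int bank);
};

/* Shared address arithmetic: base + stride * bank + register offset. */
static uint32_t bank_reg(uint32_t base, uint32_t stride, int bank, uint32_t off)
{
    assert(bank >= 0);
    return base + stride * (uint32_t)bank + off;
}

static uint32_t legacy_ctl(int b)    { return bank_reg(LEGACY_MC0_CTL, LEGACY_STRIDE, b, 0); }
static uint32_t legacy_status(int b) { return bank_reg(LEGACY_MC0_CTL, LEGACY_STRIDE, b, 1); }
static uint32_t legacy_addr(int b)   { return bank_reg(LEGACY_MC0_CTL, LEGACY_STRIDE, b, 2); }
static uint32_t legacy_misc(int b)   { return bank_reg(LEGACY_MC0_CTL, LEGACY_STRIDE, b, 3); }

static uint32_t smca_ctl(int b)      { return bank_reg(SMCA_MC0_CTL, SMCA_STRIDE, b, 0); }
static uint32_t smca_status(int b)   { return bank_reg(SMCA_MC0_CTL, SMCA_STRIDE, b, 1); }
static uint32_t smca_addr(int b)     { return bank_reg(SMCA_MC0_CTL, SMCA_STRIDE, b, 2); }
static uint32_t smca_misc(int b)     { return bank_reg(SMCA_MC0_CTL, SMCA_STRIDE, b, 3); }

static const struct mca_msr_regs legacy_regs = {
    legacy_ctl, legacy_status, legacy_addr, legacy_misc
};
static const struct mca_msr_regs smca_regs = {
    smca_ctl, smca_status, smca_addr, smca_misc
};

/* Hypothetical selector: pick the accessor table once, at init time. */
static const struct mca_msr_regs *select_msr_regs(int has_smca)
{
    return has_smca ? &smca_regs : &legacy_regs;
}
```

With such a table, generic code can call `select_msr_regs(has_smca)->status(bank)` without knowing which address range the processor uses.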
  4. 08 Mar, 2016 4 commits
  5. 18 Feb, 2016 1 commit
  6. 01 Nov, 2015 1 commit
  7. 13 Aug, 2015 4 commits
  8. 07 Jun, 2015 2 commits
  9. 07 May, 2015 2 commits
  10. 24 Mar, 2015 2 commits
  11. 19 Feb, 2015 1 commit
    • x86/MCE/intel: Cleanup CMCI storm logic · 3f2f0680
      Borislav Petkov committed
      Initially, this started with yet another report about a race
      condition in the CMCI storm adaptive period length thing. Yes, we have
      to admit, it is fragile and error prone. So let's simplify it.
      
      The simpler logic is: now, after we enter storm mode, we go straight to
      polling with CMCI_STORM_INTERVAL, i.e. once a second. We remain in storm
      mode as long as we see errors being logged while polling.
      
      Theoretically, if we see an uninterrupted error stream, we will remain
      in storm mode indefinitely and keep polling the MSRs.
      
      However, when the storm is actually a burst of errors, once we have
      logged them all, we back out of it after ~5 mins of polling and no more
      errors logged.
      
      If we encounter an error during those 5 minutes, we reset the polling
      interval to 5 mins.
      
      Making machine_check_poll() return a bool and denoting whether it has
      seen an error or not lets us simplify a bunch of code and move the storm
      handling private to mce_intel.c.
      
      Some minor cleanups while at it.
      Reported-by: Calvin Owens <calvinowens@fb.com>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Link: http://lkml.kernel.org/r/1417746575-23299-1-git-send-email-calvinowens@fb.com
      Signed-off-by: Borislav Petkov <bp@suse.de>
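The simplified storm logic described above amounts to a small state machine: poll once per interval, restart a quiet-period countdown whenever an error is seen, and leave storm mode once the countdown runs out. The sketch below is a user-space model under stated assumptions, not the code in mce_intel.c: `struct storm_state`, `storm_enter()`, `storm_poll_tick()` and both constants are hypothetical names, and the `error_seen` argument stands in for the bool that the commit message says machine_check_poll() now returns.

```c
#include <assert.h>
#include <stdbool.h>

#define STORM_POLL_INTERVAL_SEC 1    /* assumed: poll once a second */
#define STORM_QUIET_PERIOD_SEC  300  /* assumed: ~5 minutes without errors */

struct storm_state {
    bool in_storm;
    int  quiet_left_sec;  /* seconds of error-free polling still required */
};

/* Enter storm mode and start the quiet-period countdown. */
static void storm_enter(struct storm_state *s)
{
    assert(s);
    s->in_storm = true;
    s->quiet_left_sec = STORM_QUIET_PERIOD_SEC;
}

/*
 * One polling tick. Returns true while the storm continues.
 * An error restarts the countdown; a quiet tick shortens it.
 */
static bool storm_poll_tick(struct storm_state *s, bool error_seen)
{
    assert(s);
    if (!s->in_storm)
        return false;

    if (error_seen) {
        s->quiet_left_sec = STORM_QUIET_PERIOD_SEC;
        return true;
    }

    s->quiet_left_sec -= STORM_POLL_INTERVAL_SEC;
    if (s->quiet_left_sec <= 0)
        s->in_storm = false;

    return s->in_storm;
}
```

An uninterrupted error stream keeps resetting the countdown, so the model stays in storm mode indefinitely, which matches the behaviour the commit message describes.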
  12. 07 Jan, 2015 1 commit
  13. 20 Nov, 2014 1 commit
  14. 22 Oct, 2014 1 commit
  15. 05 Jun, 2014 1 commit
  16. 07 Jan, 2014 1 commit
  17. 24 Oct, 2013 1 commit
  18. 06 Aug, 2013 1 commit
  19. 09 Jul, 2013 1 commit
    • mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC · c3d1fb56
      Naveen N. Rao committed
      The Corrected Machine Check structure (CMC) in HEST has a flag which can be
      set by the firmware to indicate to the OS that it prefers to process the
      corrected error events first. In this scenario, the OS is expected to not
      monitor for corrected errors (through CMCI/polling). Instead, the firmware
      notifies the OS on corrected error events through GHES.
      
      Linux already has support for GHES. This patch adds support for parsing the CMC
      structure and for disabling CMCI/polling if the firmware first flag is set.
      
      Further, the list of machine check bank structures at the end of CMC is used
      to determine which MCA banks function in FF mode, so that we continue to
      monitor error events on the other banks.
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
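The per-bank decision described above can be sketched as building a bitmap of firmware-owned banks from the CMC structure and consulting it before monitoring a bank. This is a simplified user-space model, not the APEI parsing code: `struct cmc_desc`, `CMC_FLAG_FIRMWARE_FIRST`, `cmc_firmware_first_banks()` and `os_should_monitor_bank()` are hypothetical, and the flag is assumed to be bit 0 of the flags field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NR_BANKS            32
#define CMC_FLAG_FIRMWARE_FIRST 0x1u  /* assumed: bit 0 of the flags field */

/* Simplified stand-in for the HEST Corrected Machine Check structure. */
struct cmc_desc {
    uint8_t        flags;
    uint8_t        num_banks;     /* entries in bank_numbers */
    const uint8_t *bank_numbers;  /* banks listed at the end of the CMC */
};

/* Return a bitmap of the banks that firmware handles first (0 if none). */
static uint32_t cmc_firmware_first_banks(const struct cmc_desc *cmc)
{
    uint32_t owned = 0;

    if (!(cmc->flags & CMC_FLAG_FIRMWARE_FIRST))
        return 0;

    for (unsigned i = 0; i < cmc->num_banks; i++) {
        assert(cmc->bank_numbers[i] < MAX_NR_BANKS);
        owned |= 1u << cmc->bank_numbers[i];
    }
    return owned;
}

/* The OS keeps CMCI/polling only on banks that firmware does not own. */
static bool os_should_monitor_bank(uint32_t ff_owned, unsigned bank)
{
    assert(bank < MAX_NR_BANKS);
    return !(ff_owned & (1u << bank));
}
```

Keeping the result as a bitmap means the check on each bank is a single mask test, and banks not listed in the CMC structure continue to be monitored by the OS.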
  20. 13 Jun, 2013 1 commit
  21. 05 Jun, 2013 1 commit
  22. 03 Apr, 2013 1 commit
    • x86/mce: Rework cmci_rediscover() to play well with CPU hotplug · 7a0c819d
      Srivatsa S. Bhat committed
      Dave Jones reports that offlining a CPU leads to this trace:
      
      numa_remove_cpu cpu 1 node 0: mask now 0,2-3
      smpboot: CPU 1 is now offline
      BUG: using smp_processor_id() in preemptible [00000000] code:
      cpu-offline.sh/10591
      caller is cmci_rediscover+0x6a/0xe0
      Pid: 10591, comm: cpu-offline.sh Not tainted 3.9.0-rc3+ #2
      Call Trace:
       [<ffffffff81333bbd>] debug_smp_processor_id+0xdd/0x100
       [<ffffffff8101edba>] cmci_rediscover+0x6a/0xe0
       [<ffffffff815f5b9f>] mce_cpu_callback+0x19d/0x1ae
       [<ffffffff8160ea66>] notifier_call_chain+0x66/0x150
       [<ffffffff8107ad7e>] __raw_notifier_call_chain+0xe/0x10
       [<ffffffff8104c2e3>] cpu_notify+0x23/0x50
       [<ffffffff8104c31e>] cpu_notify_nofail+0xe/0x20
       [<ffffffff815ef082>] _cpu_down+0x302/0x350
       [<ffffffff815ef106>] cpu_down+0x36/0x50
       [<ffffffff815f1c9d>] store_online+0x8d/0xd0
       [<ffffffff813edc48>] dev_attr_store+0x18/0x30
       [<ffffffff81226eeb>] sysfs_write_file+0xdb/0x150
       [<ffffffff811adfb2>] vfs_write+0xa2/0x170
       [<ffffffff811ae16c>] sys_write+0x4c/0xa0
       [<ffffffff81613019>] system_call_fastpath+0x16/0x1b
      
      However, a look at cmci_rediscover shows that it can be simplified quite
      a bit, apart from solving the above issue. It invokes functions that
      take spin locks with interrupts disabled, and hence it can run in atomic
      context. Also, it is run in the CPU_POST_DEAD phase, so the dying CPU
      is already dead and out of the cpu_online_mask. So take these points into
      account and simplify the code, and thereby also fix the above issue.
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  23. 09 Jan, 2013 1 commit
    • x86, MCE: Retract most UAPI exports · f51bde6f
      Borislav Petkov committed
      Retract back most macro definitions which went into the
      user-visible mce.h header. Even though those bits are mostly
      hardware-defined/-architectural, their naming is not. If we export them
      to userspace, any kernel unification/renaming/cleanup cannot be done
      anymore since those are effectively cast in stone. Besides, if userspace
      wants those definitions, they can write their own defines and go crazy.
      Signed-off-by: Borislav Petkov <bp@suse.de>
  24. 15 Dec, 2012 1 commit
  25. 26 Oct, 2012 4 commits
  26. 28 Sep, 2012 1 commit
  27. 18 Sep, 2012 1 commit
  28. 26 Jul, 2012 1 commit