1. 11 6月, 2010 1 次提交
  2. 20 5月, 2010 1 次提交
    • H
      ACPI, APEI, Generic Hardware Error Source memory error support · d334a491
      Huang Ying 提交于
      Generic Hardware Error Source provides a way to report platform
      hardware errors (such as that from chipset). It works in so called
      "Firmware First" mode, that is, hardware errors are reported to
      firmware firstly, then reported to Linux by firmware. This way, some
      non-standard hardware error registers or non-standard hardware link
      can be checked by firmware to produce more valuable hardware error
      information for Linux.
      
      Now, only SCI notification type and memory errors are supported. More
      notification type and hardware error type will be added later. These
      memory errors are reported to user space through /dev/mcelog via
      faking a corrected Machine Check, so that the error memory page can be
      offlined by /sbin/mcelog if the error count for one page is beyond the
      threshold.
      
      On some machines, Machine Check can not report physical address for
      some corrected memory errors, but GHES can do that. So this simplified
      GHES is implemented firstly.
      Signed-off-by: NHuang Ying <ying.huang@intel.com>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      d334a491
  3. 13 1月, 2010 1 次提交
  4. 10 11月, 2009 1 次提交
    • Y
      x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP value · a2202aa2
      Yong Wang 提交于
      On platforms where the BIOS handles the thermal monitor interrupt,
      APIC_LVTTHMR on each logical CPU is programmed to generate a SMI
      and OS must not touch it.
      
      Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all
      the LVT entries except the mask bit. Essentially this results in
      all LVT entries including the thermal monitoring interrupt set
      to masked (clearing the bios programmed value for APIC_LVTTHMR).
      
      And this leads to kernel take over the thermal monitoring
      interrupt on AP's but not on BSP (leaving the bios programmed
      value only on BSP).
      
      As a result of this, we have seen system hangs when the thermal
      monitoring interrupt is generated.
      
      Fix this by reading the initial value of thermal LVT entry on
      BSP and if bios has taken over the control, then program the
      same value on all AP's and leave the thermal monitoring
      interrupt control on all the logical cpu's to the bios.
      Signed-off-by: NYong Wang <yong.y.wang@intel.com>
      Reviewed-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: stable@kernel.org
      a2202aa2
  5. 16 10月, 2009 1 次提交
    • B
      x86, mce: Fix up MCE naming nomenclature · 5e09954a
      Borislav Petkov 提交于
      Prefix global/setup routines with "mcheck_" thus differentiating
      from the internal facilities prefixed with "mce_". Also, prefix
      the per cpu calls with mcheck_cpu and rename them to reflect the
      MCE setup hierarchy of calls better.
      
      There should be no functionality change resulting from this
      patch.
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5e09954a
  6. 12 10月, 2009 1 次提交
  7. 02 10月, 2009 1 次提交
    • I
      x86: EDAC: MCE: Fix MCE decoding callback logic · f436f8bb
      Ingo Molnar 提交于
      Make decoding of MCEs happen only on AMD hardware by registering a
      non-default callback only on CPU families which support it.
      
      While looking at the interaction of decode_mce() with the other MCE
      code i also noticed a few other things and made the following
      cleanups/fixes:
      
       - Fixed the mce_decode() weak alias - a weak alias is really not
         good here, it should be a proper callback. A weak alias will be
         overriden if a piece of code is built into the kernel - not
         good, obviously.
      
       - The patch initializes the callback on AMD family 10h and 11h.
      
       - Added the more correct fallback printk of:
      
      	No support for human readable MCE decoding on this CPU type.
      	Transcribe the message and run it through 'mcelog --ascii' to decode.
      
         On CPUs that dont have a decoder.
      
       - Made the surrounding code more readable.
      
      Note that the callback allows us to have a default fallback -
      without having to check the CPU versions during the printout
      itself. When an EDAC module registers itself, it can install the
      decode-print function.
      
      (there's no unregister needed as this is core code.)
      
      version -v2 by Borislav Petkov:
      
       - add K8 to the set of supported CPUs
      
       - always build in edac_mce_amd since we use an early_initcall now
      
       - fix checkpatch warnings
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <20091001141432.GA11410@aftab>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f436f8bb
  8. 11 8月, 2009 2 次提交
  9. 10 7月, 2009 2 次提交
  10. 21 6月, 2009 1 次提交
  11. 17 6月, 2009 5 次提交
  12. 11 6月, 2009 1 次提交
    • H
      x86, mce: Add boot options for corrected errors · 62fdac59
      Hidetoshi Seto 提交于
      This patch introduces three boot options (no_cmci, dont_log_ce
      and ignore_ce) to control handling for corrected errors.
      
      The "mce=no_cmci" boot option disables the CMCI feature.
      
      Since CMCI is a new feature so having boot controls to disable
      it will be a help if the hardware is misbehaving.
      
      The "mce=dont_log_ce" boot option disables logging for corrected
      errors. All reported corrected errors will be cleared silently.
      This option will be useful if you never care about corrected
      errors.
      
      The "mce=ignore_ce" boot option disables features for corrected
      errors, i.e. polling timer and cmci.  All corrected events are
      not cleared and kept in bank MSRs.
      
      Usually this disablement is not recommended, however it will be
      a help if there are some conflict with the BIOS or hardware
      monitoring applications etc., that clears corrected events in
      banks instead of OS.
      
      [ And trivial cleanup (space -> tab) for doc is included. ]
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      LKML-Reference: <4A30ACDF.5030408@jp.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      62fdac59
  13. 04 6月, 2009 8 次提交
    • A
      x86, mce: support action-optional machine checks · 9b1beaf2
      Andi Kleen 提交于
      Newer Intel CPUs support a new class of machine checks called recoverable
      action optional.
      
      Action Optional means that the CPU detected some form of corruption in
      the background and tells the OS about using a machine check
      exception. The OS can then take appropiate action, like killing the
      process with the corrupted data or logging the event properly to disk.
      
      This is done by the new generic high level memory failure handler added
      in a earlier patch. The high level handler takes the address with the
      failed memory and does the appropiate action, like killing the process.
      
      In this version of the patch the high level handler is stubbed out
      with a weak function to not create a direct dependency on the hwpoison
      branch.
      
      The high level handler cannot be directly called from the machine check
      exception though, because it has to run in a defined process context to
      be able to sleep when taking VM locks (it is not expected to sleep for a
      long time, just do so in some exceptional cases like lock contention)
      
      Thus the MCE handler has to queue a work item for process context,
      trigger process context and then call the high level handler from there.
      
      This patch adds two path to process context: through a per thread kernel
      exit notify_user() callback or through a high priority work item.
      The first runs when the process exits back to user space, the other when
      it goes to sleep and there is no higher priority process.
      
      The machine check handler will schedule both, and whoever runs first
      will grab the event. This is done because quick reaction to this
      event is critical to avoid a potential more fatal machine check
      when the corruption is consumed.
      
      There is a simple lock less ring buffer to queue the corrupted
      addresses between the exception handler and the process context handler.
      Then in process context it just calls the high level VM code with
      the corrupted PFNs.
      
      The code adds the required code to extract the failed address from
      the CPU's machine check registers. It doesn't try to handle all
      possible cases -- the specification has 6 different ways to specify
      memory address -- but only the linear address.
      
      Most of the required checking has been already done earlier in the
      mce_severity rule checking engine.  Following the Intel
      recommendations Action Optional errors are only enabled for known
      situations (encoded in MCACODs). The errors are ignored otherwise,
      because they are action optional.
      
      v2: Improve comment, disable preemption while processing ring buffer
          (reported by Ying Huang)
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      9b1beaf2
    • A
      x86, mce: rename mce_notify_user to mce_notify_irq · 9ff36ee9
      Andi Kleen 提交于
      Rename the mce_notify_user function to mce_notify_irq. The next
      patch will split the wakeup handling of interrupt context
      and of process context and it's better to give it a clearer
      name for this.
      
      Contains a fix from Ying Huang
      
      [ Impact: cleanup ]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      9ff36ee9
    • A
      x86, mce: implement new status bits · ed7290d0
      Andi Kleen 提交于
      The x86 architecture recently added some new machine check status bits:
      S(ignalled) and AR (Action-Required). Signalled allows to check
      if a specific event caused an exception or was just logged through CMCI.
      AR allows the kernel to decide if an event needs immediate action
      or can be delayed or ignored.
      
      Implement support for these new status bits. mce_severity() uses
      the new bits to grade the machine check correctly and decide what
      to do. The exception handler uses AR to decide to kill or not.
      The S bit is used to separate events between the poll/CMCI handler
      and the exception handler.
      
      Classical UC always leads to panic. That was true before anyways
      because the existing CPUs always passed a PCC with it.
      
      Also corrects the rules whether to kill in user or kernel context
      and how to handle missing RIPV.
      
      The machine check handler largely uses the mce-severity grading
      engine now instead of making its own decisions. This means the logic
      is centralized in one place.  This is useful because it has to be
      evaluated multiple times.
      
      v2: Some rule fixes; Add AO events
      Fix RIPV, RIPV|EIPV order (Ying Huang)
      Fix UCNA with AR=1 message (Ying Huang)
      Add comment about panicing in m_c_p.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      ed7290d0
    • A
      x86, mce: extend struct mce user interface with more information. · 8ee08347
      Andi Kleen 提交于
      Experience has shown that struct mce which is used to pass an machine
      check to the user space daemon currently a few limitations.  Also some
      data which is useful to print at panic level is also missing.
      
      This patch addresses most of them. The same information is also
      printed out together with mce panic.
      
      struct mce can be painlessly extended in a compatible way, the mcelog
      user space code just ignores additional fields with a warning.
      
      - It doesn't provide a wall time timestamp. There have been a few
        complaints about that. Fix that by adding a 64bit time_t
      
      - It doesn't provide the exact CPU identification. This makes
        it awkward for mcelog to decode the event correctly, especially
        when there are variations in the supported MCE codes on different
        CPU models or when mcelog is running on a different host after a panic.
        Previously the administrator had to specify the correct CPU
        when mcelog ran on a different host, but with the more variation
        in machine checks now it's better to auto detect that.
        It's also useful for more detailed analysis of CPU events.
        Pass CPUID 1.EAX and the cpu vendor (as encoded in processor.h) instead.
      
      - Socket ID and initial APIC ID are useful to report because they
        allow to identify the failing CPU in some (not all) cases.
        This is also especially useful for the panic situation.
        This addresses one of the complaints from Thomas Gleixner earlier.
      
      - The MCG capabilities MSR needs to be reported for some advanced
        error processing in mcelog
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      8ee08347
    • A
      x86, mce: support more than 256 CPUs in struct mce · d620c67f
      Andi Kleen 提交于
      The old struct mce had a limitation to 256 CPUs. But x86 Linux supports
      more than that now with x2apic. Add a new field extcpu to report the
      extended number.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      d620c67f
    • A
      x86, mce: store record length into memory struct mce anchor · f6fb0ac0
      Andi Kleen 提交于
      This makes it easier for tools who want to extract the mcelog out of
      crash images or memory dumps to adapt to changing struct mce size.
      The length field replaces padding, so it's fully compatible.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      f6fb0ac0
    • A
      x86, mce: add MCE poll count to /proc/interrupts · ca84f696
      Andi Kleen 提交于
      Keep a count of the machine check polls (or CMCI events) in
      /proc/interrupts.
      
      Andi needs this for debugging, but it's also useful in general
      to see what's going in by the kernel.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      ca84f696
    • A
      x86, mce: add machine check exception count in /proc/interrupts · 01ca79f1
      Andi Kleen 提交于
      Useful for debugging, but it's also good general policy
      to have a counter for all special interrupts there. This makes it easier
      to diagnose where a CPU is spending its time.
      
      [ Impact: feature, debugging tool ]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      01ca79f1
  14. 29 5月, 2009 6 次提交
  15. 23 4月, 2009 1 次提交
  16. 25 2月, 2009 5 次提交
    • A
      x86, mce, cmci: add CMCI support · 88ccbedd
      Andi Kleen 提交于
      Impact: Major new feature
      
      Intel CMCI (Corrected Machine Check Interrupt) is a new
      feature on Nehalem CPUs. It allows the CPU to trigger
      interrupts on corrected events, which allows faster
      reaction to them instead of with the traditional
      polling timer.
      
      Also use CMCI to discover shared banks. Machine check banks
      can be shared by CPU threads or even cores. Using the CMCI enable
      bit it is possible to detect the fact that another CPU already
      saw a specific bank. Use this to assign shared banks only
      to one CPU to avoid reporting duplicated events.
      
      On CPU hot unplug bank sharing is re discovered. This is done
      using a thread that cycles through all the CPUs.
      
      To avoid races between the poller and CMCI we only poll
      for banks that are not CMCI capable and only check CMCI
      owned banks on a interrupt.
      
      The shared banks ownership information is currently only used for
      CMCI interrupts, not polled banks.
      
      The sharing discovery code follows the algorithm recommended in the
      IA32 SDM Vol3a 14.5.2.1
      
      The CMCI interrupt handler just calls the machine check poller to
      pick up the machine check event that caused the interrupt.
      
      I decided not to implement a separate threshold event like
      the AMD version has, because the threshold is always one currently
      and adding another event didn't seem to add any value.
      
      Some code inspired by Yunhong Jiang's Xen implementation,
      which was in term inspired by a earlier CMCI implementation
      by me.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      88ccbedd
    • A
      x86, mce, cmci: define MSR names and fields for new CMCI registers · 03195c6b
      Andi Kleen 提交于
      Impact: New register definitions only
      
      CMCI means support for raising an interrupt on a corrected machine
      check event instead of having to poll for it. It's a new feature in
      Intel Nehalem CPUs available on some machine check banks.
      
      For details see the IA32 SDM Vol3a 14.5
      
      Define the registers for it as a preparation for further patches.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      03195c6b
    • A
      x86, mce, cmci: use polled banks bitmap in machine check poller · ee031c31
      Andi Kleen 提交于
      Define a per cpu bitmap that contains the banks polled by the machine
      check poller. This is needed for the CMCI code in the next patches
      to be able to disable polling on specific banks.
      
      The bank by default contains all banks, so there is no behaviour
      change. Only future code will remove some banks from the polling
      set.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      ee031c31
    • A
      x86, mce, cmci: factor out threshold interrupt handler · b2762686
      Andi Kleen 提交于
      Impact: cleanup; preparation for feature
      
      The mce_amd_64 code has an own private MC threshold vector with an own
      interrupt handler. Since Intel needs a similar handler
      it makes sense to share the vector because both can not
      be active at the same time.
      
      I factored the common APIC handler code into a separate file which can
      be used by both the Intel or AMD MC code.
      
      This is needed for the next patch which adds an Intel specific
      CMCI handler.
      
      This patch should be a nop for AMD, it just moves some code
      around.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      b2762686
    • A
      x86, mce, cmci: export MAX_NR_BANKS · 41fdff32
      Andi Kleen 提交于
      Impact: Cleanup (code movement)
      
      Move MAX_NR_BANKS into mce.h because it's needed there
      for followup patches.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      41fdff32
  17. 20 2月, 2009 2 次提交
    • A
      x86, mce: separate correct machine check poller and fatal exception handler · b79109c3
      Andi Kleen 提交于
      Impact: cleanup, performance enhancement
      
      The machine check poller is diverging more and more from the fatal
      exception handler. Instead of adding more special cases separate the code
      paths completely. The corrected poll path is actually quite simple,
      and this doesn't result in much code duplication.
      
      This makes both handlers much easier to read and results in
      cleaner code flow.  The exception handler now only needs to care
      about uncorrected errors, which also simplifies the handling of multiple
      errors. The corrected poller also now always runs in standard interrupt
      context and does not need to do anything special to handle NMI context.
      
      Minor behaviour changes:
      - MCG status is now not cleared on polling.
      - Only the banks which had corrected errors get cleared on polling
      - The exception handler only clears banks with errors now
      
      v2: Forward port to new patch order. Add "uc" argument.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      b79109c3
    • A
      x86, mce: factor out duplicated struct mce setup into one function · b5f2fa4e
      Andi Kleen 提交于
      Impact: cleanup
      
      This merely factors out duplicated code to set up
      the initial struct mce state into a single function.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      b5f2fa4e