1. 29 3月, 2014 1 次提交
    • C
      x86, CMCI: Add proper detection of end of CMCI storms · 27f6c573
      Chen, Gong 提交于
      When CMCI storm persists for a long time(at least beyond predefined
      threshold. It's 30 seconds for now), we can watch CMCI storm is
      detected immediately after it subsides.
      
      ...
      Dec 10 22:04:29 kernel: CMCI storm detected: switching to poll mode
      Dec 10 22:04:59 kernel: CMCI storm subsided: switching to interrupt mode
      Dec 10 22:04:59 kernel: CMCI storm detected: switching to poll mode
      Dec 10 22:05:29 kernel: CMCI storm subsided: switching to interrupt mode
      ...
      
      The problem is that our logic that determines that the storm has
      ended is incorrect. We announce the end, re-enable interrupts and
      realize that the storm is still going on, so we switch back to
      polling mode. Rinse, repeat.
      
      When a storm happens we disable signaling of errors via CMCI and begin
      polling machine check banks instead. If we find any logged errors,
      then we need to set a per-cpu flag so that our per-cpu tests that
      check whether the storm is ongoing will see that errors are still
      being logged independently of whether mce_notify_irq() says that the
      error has been fully processed.
      
      cmci_clear() is not the right tool to disable a bank. It disables the
      interrupt for the bank as desired, but it also clears the bit for
      this bank in "mce_banks_owned" so we will skip the bank when polling
      (so we fail to see that the storm continues because we stop looking).
      New cmci_storm_disable_banks() just disables the interrupt while
      allowing polling to continue.
      Reported-by: NWilliam Dauchy <wdauchy@gmail.com>
      Signed-off-by: NChen, Gong <gong.chen@linux.intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      27f6c573
  2. 07 1月, 2014 1 次提交
  3. 09 7月, 2013 1 次提交
    • N
      mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC · c3d1fb56
      Naveen N. Rao 提交于
      The Corrected Machine Check structure (CMC) in HEST has a flag which can be
      set by the firmware to indicate to the OS that it prefers to process the
      corrected error events first. In this scenario, the OS is expected to not
      monitor for corrected errors (through CMCI/polling). Instead, the firmware
      notifies the OS on corrected error events through GHES.
      
      Linux already has support for GHES. This patch adds support for parsing CMC
      structure and to disable CMCI/polling if the firmware first flag is set.
      
      Further, the list of machine check bank structures at the end of CMC is used
      to determine which MCA banks function in FF mode, so that we continue to
      monitor error events on the other banks.
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      c3d1fb56
  4. 26 6月, 2013 1 次提交
  5. 03 4月, 2013 1 次提交
    • S
      x86/mce: Rework cmci_rediscover() to play well with CPU hotplug · 7a0c819d
      Srivatsa S. Bhat 提交于
      Dave Jones reports that offlining a CPU leads to this trace:
      
      numa_remove_cpu cpu 1 node 0: mask now 0,2-3
      smpboot: CPU 1 is now offline
      BUG: using smp_processor_id() in preemptible [00000000] code:
      cpu-offline.sh/10591
      caller is cmci_rediscover+0x6a/0xe0
      Pid: 10591, comm: cpu-offline.sh Not tainted 3.9.0-rc3+ #2
      Call Trace:
       [<ffffffff81333bbd>] debug_smp_processor_id+0xdd/0x100
       [<ffffffff8101edba>] cmci_rediscover+0x6a/0xe0
       [<ffffffff815f5b9f>] mce_cpu_callback+0x19d/0x1ae
       [<ffffffff8160ea66>] notifier_call_chain+0x66/0x150
       [<ffffffff8107ad7e>] __raw_notifier_call_chain+0xe/0x10
       [<ffffffff8104c2e3>] cpu_notify+0x23/0x50
       [<ffffffff8104c31e>] cpu_notify_nofail+0xe/0x20
       [<ffffffff815ef082>] _cpu_down+0x302/0x350
       [<ffffffff815ef106>] cpu_down+0x36/0x50
       [<ffffffff815f1c9d>] store_online+0x8d/0xd0
       [<ffffffff813edc48>] dev_attr_store+0x18/0x30
       [<ffffffff81226eeb>] sysfs_write_file+0xdb/0x150
       [<ffffffff811adfb2>] vfs_write+0xa2/0x170
       [<ffffffff811ae16c>] sys_write+0x4c/0xa0
       [<ffffffff81613019>] system_call_fastpath+0x16/0x1b
      
      However, a look at cmci_rediscover shows that it can be simplified quite
      a bit, apart from solving the above issue. It invokes functions that
      take spin locks with interrupts disabled, and hence it can run in atomic
      context. Also, it is run in the CPU_POST_DEAD phase, so the dying CPU
      is already dead and out of the cpu_online_mask. So take these points into
      account and simplify the code, and thereby also fix the above issue.
      Reported-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      7a0c819d
  6. 31 10月, 2012 1 次提交
    • T
      x86/mce: Do not change worker's running cpu in cmci_rediscover(). · 85b97637
      Tang Chen 提交于
      cmci_rediscover() used set_cpus_allowed_ptr() to change the current process's
      running cpu, and migrate itself to the dest cpu. But worker processes are not
      allowed to be migrated. If current is a worker, the worker will be migrated to
      another cpu, but the corresponding  worker_pool is still on the original cpu.
      
      In this case, the following BUG_ON in try_to_wake_up_local() will be triggered:
      BUG_ON(rq != this_rq());
      
      This will cause the kernel panic. The call trace is like the following:
      
      [ 6155.451107] ------------[ cut here ]------------
      [ 6155.452019] kernel BUG at kernel/sched/core.c:1654!
      ......
      [ 6155.452019] RIP: 0010:[<ffffffff810add15>]  [<ffffffff810add15>] try_to_wake_up_local+0x115/0x130
      ......
      [ 6155.452019] Call Trace:
      [ 6155.452019]  [<ffffffff8166fc14>] __schedule+0x764/0x880
      [ 6155.452019]  [<ffffffff81670059>] schedule+0x29/0x70
      [ 6155.452019]  [<ffffffff8166de65>] schedule_timeout+0x235/0x2d0
      [ 6155.452019]  [<ffffffff810db57d>] ? mark_held_locks+0x8d/0x140
      [ 6155.452019]  [<ffffffff810dd463>] ? __lock_release+0x133/0x1a0
      [ 6155.452019]  [<ffffffff81671c50>] ? _raw_spin_unlock_irq+0x30/0x50
      [ 6155.452019]  [<ffffffff810db8f5>] ? trace_hardirqs_on_caller+0x105/0x190
      [ 6155.452019]  [<ffffffff8166fefb>] wait_for_common+0x12b/0x180
      [ 6155.452019]  [<ffffffff810b0b30>] ? try_to_wake_up+0x2f0/0x2f0
      [ 6155.452019]  [<ffffffff8167002d>] wait_for_completion+0x1d/0x20
      [ 6155.452019]  [<ffffffff8110008a>] stop_one_cpu+0x8a/0xc0
      [ 6155.452019]  [<ffffffff810abd40>] ? __migrate_task+0x1a0/0x1a0
      [ 6155.452019]  [<ffffffff810a6ab8>] ? complete+0x28/0x60
      [ 6155.452019]  [<ffffffff810b0fd8>] set_cpus_allowed_ptr+0x128/0x130
      [ 6155.452019]  [<ffffffff81036785>] cmci_rediscover+0xf5/0x140
      [ 6155.452019]  [<ffffffff816643c0>] mce_cpu_callback+0x18d/0x19d
      [ 6155.452019]  [<ffffffff81676187>] notifier_call_chain+0x67/0x150
      [ 6155.452019]  [<ffffffff810a03de>] __raw_notifier_call_chain+0xe/0x10
      [ 6155.452019]  [<ffffffff81070470>] __cpu_notify+0x20/0x40
      [ 6155.452019]  [<ffffffff810704a5>] cpu_notify_nofail+0x15/0x30
      [ 6155.452019]  [<ffffffff81655182>] _cpu_down+0x262/0x2e0
      [ 6155.452019]  [<ffffffff81655236>] cpu_down+0x36/0x50
      [ 6155.452019]  [<ffffffff813d3eaa>] acpi_processor_remove+0x50/0x11e
      [ 6155.452019]  [<ffffffff813a6978>] acpi_device_remove+0x90/0xb2
      [ 6155.452019]  [<ffffffff8143cbec>] __device_release_driver+0x7c/0xf0
      [ 6155.452019]  [<ffffffff8143cd6f>] device_release_driver+0x2f/0x50
      [ 6155.452019]  [<ffffffff813a7870>] acpi_bus_remove+0x32/0x6d
      [ 6155.452019]  [<ffffffff813a7932>] acpi_bus_trim+0x87/0xee
      [ 6155.452019]  [<ffffffff813a7a21>] acpi_bus_hot_remove_device+0x88/0x16b
      [ 6155.452019]  [<ffffffff813a33ee>] acpi_os_execute_deferred+0x27/0x34
      [ 6155.452019]  [<ffffffff81090589>] process_one_work+0x219/0x680
      [ 6155.452019]  [<ffffffff81090528>] ? process_one_work+0x1b8/0x680
      [ 6155.452019]  [<ffffffff813a33c7>] ? acpi_os_wait_events_complete+0x23/0x23
      [ 6155.452019]  [<ffffffff810923be>] worker_thread+0x12e/0x320
      [ 6155.452019]  [<ffffffff81092290>] ? manage_workers+0x110/0x110
      [ 6155.452019]  [<ffffffff81098396>] kthread+0xc6/0xd0
      [ 6155.452019]  [<ffffffff8167c4c4>] kernel_thread_helper+0x4/0x10
      [ 6155.452019]  [<ffffffff81671f30>] ? retint_restore_args+0x13/0x13
      [ 6155.452019]  [<ffffffff810982d0>] ? __init_kthread_worker+0x70/0x70
      [ 6155.452019]  [<ffffffff8167c4c0>] ? gs_change+0x13/0x13
      
      This patch removes the set_cpus_allowed_ptr() call, and put the cmci rediscover
      jobs onto all the other cpus using system_wq. This could bring some delay for
      the jobs.
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      85b97637
  7. 26 10月, 2012 2 次提交
  8. 28 9月, 2012 1 次提交
  9. 10 8月, 2012 2 次提交
    • C
      x86/mce: Add CMCI poll mode · 55babd8f
      Chen Gong 提交于
      On Intel systems corrected machine check interrupts (CMCI) may be sent to
      multiple logical processors; possibly to all processors on the affected
      socket (SDM Volume 3B "15.5.1 CMCI Local APIC Interface").  This means
      that a persistent error (such as a stuck bit in ECC memory) may cause
      a storm of interrupts that greatly hinders or prevents forward progress
      (probably on many processors).
      
      To solve this we keep track of the rate at which each processor sees
      CMCI. If we exceed a threshold, we disable CMCI delivery and switch to
      polling the machine check banks. If the storm subsides (none of the
      affected processors see any more errors for a complete poll interval) we
      re-enable CMCI.
      
      [Tony: Added console messages when storm begins/ends and increased storm
      threshold from 5 to 15 so we have a few more logged entries before we
      disable interrupts and start dropping reports]
      Signed-off-by: NChen Gong <gong.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NChen Gong <gong.chen@linux.intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      55babd8f
    • T
      x86/mce: Make cmci_discover() quiet · 4670a300
      Tony Luck 提交于
      cmci_discover() works out which machine check banks support CMCI, and
      which of those are shared by multiple logical processors. It uses this
      information to ensure that exactly one cpu is designated the owner of
      each bank so that when interrupts are broadcast to multiple cpus, only one
      of them will look in a shared bank to log the error and clear the bank.
      
      At boot time cmci_discover() performs this task silently. But during
      certain cpu hotplug operations it prints out a set of summary lines
      like this:
      
      CPU 35 MCA banks CMCI:0 CMCI:1 CMCI:3 CMCI:5 CMCI:6 CMCI:7 CMCI:8 CMCI:9 CMCI:10 CMCI:11
      CPU 1 MCA banks CMCI:0 CMCI:1 CMCI:3
      CPU 39 MCA banks CMCI:0 CMCI:1 CMCI:3
      CPU 38 MCA banks CMCI:0 CMCI:1 CMCI:3
      CPU 32 MCA banks CMCI:0 CMCI:1 CMCI:3
      CPU 37 MCA banks CMCI:0 CMCI:1 CMCI:3
      CPU 36 MCA banks CMCI:0 CMCI:1 CMCI:3
      CPU 34 MCA banks CMCI:0 CMCI:1 CMCI:3
      
      The value of these messages seems very low. A user might painstakingly
      cross-check against the data sheet for a processor to ensure that all
      CMCI supported banks are correctly reported, but this seems improbable.
      If users really wanted to do this, we should print the information at
      boot time too.
      
      Remove the messages.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      4670a300
  10. 13 9月, 2011 1 次提交
  11. 30 12月, 2010 1 次提交
  12. 11 6月, 2010 2 次提交
  13. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  14. 11 3月, 2010 1 次提交
  15. 12 10月, 2009 1 次提交
  16. 10 7月, 2009 1 次提交
  17. 17 6月, 2009 8 次提交
  18. 11 6月, 2009 1 次提交
    • H
      x86, mce: Add boot options for corrected errors · 62fdac59
      Hidetoshi Seto 提交于
      This patch introduces three boot options (no_cmci, dont_log_ce
      and ignore_ce) to control handling for corrected errors.
      
      The "mce=no_cmci" boot option disables the CMCI feature.
      
      Since CMCI is a new feature so having boot controls to disable
      it will be a help if the hardware is misbehaving.
      
      The "mce=dont_log_ce" boot option disables logging for corrected
      errors. All reported corrected errors will be cleared silently.
      This option will be useful if you never care about corrected
      errors.
      
      The "mce=ignore_ce" boot option disables features for corrected
      errors, i.e. polling timer and cmci.  All corrected events are
      not cleared and kept in bank MSRs.
      
      Usually this disablement is not recommended, however it will be
      a help if there are some conflict with the BIOS or hardware
      monitoring applications etc., that clears corrected events in
      banks instead of OS.
      
      [ And trivial cleanup (space -> tab) for doc is included. ]
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      LKML-Reference: <4A30ACDF.5030408@jp.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      62fdac59
  19. 04 6月, 2009 1 次提交
  20. 29 5月, 2009 4 次提交
  21. 09 5月, 2009 1 次提交
  22. 08 5月, 2009 1 次提交
    • H
      x86: MCE: make cmci_discover_lock irq-safe · e5299926
      Hidetoshi Seto 提交于
      Lockdep reports the warning below when Li tries to offline one cpu:
      
      [  110.835487] =================================
      [  110.835616] [ INFO: inconsistent lock state ]
      [  110.835688] 2.6.30-rc4-00336-g8c9ed899 #52
      [  110.835757] ---------------------------------
      [  110.835828] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
      [  110.835908] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
      [  110.835982]  (cmci_discover_lock){?.+...}, at: [<ffffffff80236dc0>] cmci_clear+0x30/0x9b
      
      cmci_clear() can be called via smp_call_function_single().
      
      It is better to disable interrupt while holding cmci_discover_lock,
      to turn it into an irq-safe lock - we can deadlock otherwise.
      
      [ Impact: fix possible deadlock in the MCE code ]
      Reported-by: NShaohua Li <shaohua.li@intel.com>
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A03ED38.8000700@jp.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Reported-by: Shaohua Li<shaohua.li@intel.com>
      e5299926
  23. 16 3月, 2009 1 次提交
  24. 13 3月, 2009 1 次提交
  25. 25 2月, 2009 2 次提交
    • H
      x86, mce, cmci: remove incorrect __cpuinit/__cpuexit annotations · df20e2eb
      H. Peter Anvin 提交于
      Impact: Bug fix on UP
      
      The MCE code is reinitialized from resume, so we can't use
      __cpuinit/__cpuexit for most of the code.  Remove those annotations
      for anything downstream of mce_init().
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      df20e2eb
    • A
      x86, mce, cmci: add CMCI support · 88ccbedd
      Andi Kleen 提交于
      Impact: Major new feature
      
      Intel CMCI (Corrected Machine Check Interrupt) is a new
      feature on Nehalem CPUs. It allows the CPU to trigger
      interrupts on corrected events, which allows faster
      reaction to them instead of with the traditional
      polling timer.
      
      Also use CMCI to discover shared banks. Machine check banks
      can be shared by CPU threads or even cores. Using the CMCI enable
      bit it is possible to detect the fact that another CPU already
      saw a specific bank. Use this to assign shared banks only
      to one CPU to avoid reporting duplicated events.
      
      On CPU hot unplug bank sharing is re discovered. This is done
      using a thread that cycles through all the CPUs.
      
      To avoid races between the poller and CMCI we only poll
      for banks that are not CMCI capable and only check CMCI
      owned banks on a interrupt.
      
      The shared banks ownership information is currently only used for
      CMCI interrupts, not polled banks.
      
      The sharing discovery code follows the algorithm recommended in the
      IA32 SDM Vol3a 14.5.2.1
      
      The CMCI interrupt handler just calls the machine check poller to
      pick up the machine check event that caused the interrupt.
      
      I decided not to implement a separate threshold event like
      the AMD version has, because the threshold is always one currently
      and adding another event didn't seem to add any value.
      
      Some code inspired by Yunhong Jiang's Xen implementation,
      which was in term inspired by a earlier CMCI implementation
      by me.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      88ccbedd
  26. 21 2月, 2009 1 次提交
    • H
      x86, mce: remove incorrect __cpuinit for mce_cpu_features() · cc3ca220
      H. Peter Anvin 提交于
      Impact: Bug fix on UP
      
      Checkin 6ec68bff:
          x86, mce: reinitialize per cpu features on resume
      
      introduced a call to mce_cpu_features() in the resume path, in order
      for the MCE machinery to get properly reinitialized after a resume.
      However, this function (and its successors) was flagged __cpuinit,
      which becomes __init on UP configurations (on SMP suspend/resume
      requires CPU hotplug and so this would not be seen.)
      
      Remove the offending __cpuinit annotations for mce_cpu_features() and
      its successor functions.
      
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      cc3ca220