1. 18 Feb, 2009 6 commits
    • x86, mce: switch machine check polling to per CPU timer · 52d168e2
      Committed by Andi Kleen
      Impact: Higher priority bug fix
      
      The machine check poller ran a single timer and then broadcast an
      IPI to all CPUs to check them. This leads to unnecessary
      synchronization between CPUs. The original CPU running the timer has
      to wait, potentially a long time, for all other CPUs to answer. This is
      also real-time unfriendly and in general inefficient.
      
      This was especially a problem on systems with a lot of events, where
      the poller runs at a higher frequency after processing some events.
      More and more CPU time could be wasted this way, to
      the point of significantly slowing down machines.
      
      The machine check polling is actually fully independent per CPU, so
      there's no reason to not just do this all with per CPU timers.  This
      patch implements that.
      
      Also switch the poller to use standard timers instead of work
      queues. It was using work queues to be able to execute a user program
      on an event, but mce_notify_user() handles this case now with a
      separate callback. So instead always run the poll code in a
      standard per CPU timer, which means that in the common case of not
      having to execute a trigger there will be less overhead.
      
      This allows the initialization to be cleaned up significantly, because
      standard timers are already up when machine checks get init'ed.  There is
      no longer any need for multiple initialization functions.
      
      Thanks to Thomas Gleixner for some help.
      
      Cc: thockin@google.com
      v2: Use del_timer_sync() on cpu shutdown and don't try to handle
      migrated timers.
      v3: Add WARN_ON for timer running on unexpected CPU
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
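The per CPU polling idea described in this commit can be modelled in user space. The following is a minimal sketch, not the kernel code: `struct cpu_poller`, `poll_timer_fired()`, the tick rate, the interval bounds, and the halve-on-events / double-when-idle policy are all assumptions made for illustration. The point it shows is that each CPU's timer callback only touches that CPU's state, so no cross-CPU synchronization is needed.

```c
#include <assert.h>

#define HZ           1000            /* assumed tick rate */
#define MIN_INTERVAL (HZ / 100)      /* assumed fastest polling interval */
#define MAX_INTERVAL (5 * 60 * HZ)   /* assumed slowest polling interval */
#define NR_CPUS      4               /* assumed CPU count for the model */

/* Hypothetical per CPU poller state; one instance per CPU. */
struct cpu_poller {
    unsigned long next_interval;     /* delay until this CPU polls again */
    unsigned long polls;             /* number of polls done on this CPU */
};

static struct cpu_poller pollers[NR_CPUS];

static void poller_init(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        pollers[cpu].next_interval = MAX_INTERVAL;
        pollers[cpu].polls = 0;
    }
}

/*
 * Model of one CPU's timer callback. It reads and writes only the
 * state of the CPU it runs on, and returns the delay to re-arm with.
 */
static unsigned long poll_timer_fired(int cpu, int events_found)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    struct cpu_poller *p = &pollers[cpu];

    p->polls++;
    if (events_found) {
        /* Events seen: poll this CPU more often (assumed policy). */
        p->next_interval /= 2;
        if (p->next_interval < MIN_INTERVAL)
            p->next_interval = MIN_INTERVAL;
    } else {
        /* Quiet: back off again (assumed policy). */
        p->next_interval *= 2;
        if (p->next_interval > MAX_INTERVAL)
            p->next_interval = MAX_INTERVAL;
    }
    return p->next_interval;
}
```

In this model a burst of events on CPU 0 shortens only CPU 0's interval; the other CPUs keep their own schedule.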
    • x86, mce: always use separate work queue to run trigger · 9bd98405
      Committed by Andi Kleen
      Impact: Needed for bug fix in next patch
      
      This relaxes the requirement that mce_notify_user has to run in process
      context. Useful for future changes, but also leads to cleaner
      behaviour now. Now instead mce_notify_user can be called directly
      from interrupt (but not NMI) context.
      
      The work queue only uses a single global work struct, which can be done safely
      because it is always free to reuse before the trigger function is executed.
      This way no events can be lost.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, mce: don't disable machine checks during code patching · 123aa76e
      Committed by Andi Kleen
      Impact: low priority bug fix
      
      This removes part of a patch I added myself some time ago. After some
      consideration the patch was a bad idea. In particular it stopped machine check
      exceptions during code patching.
      
      To quote the comment:
      
              * MCEs only happen when something got corrupted and in this
              * case we must do something about the corruption.
              * Ignoring it is worse than a unlikely patching race.
              * Also machine checks tend to be broadcast and if one CPU
              * goes into machine check the others follow quickly, so we don't
              * expect a machine check to cause undue problems during to code
              * patching.
      
      So undo the machine check related parts of
      8f4e956b. NMIs are still disabled.
      
      This only removes code; the only addition is a new comment.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, mce: disable machine checks on suspend · 973a2dd1
      Committed by Andi Kleen
      Impact: Bug fix
      
      During suspend it is not reliable to process machine check
      exceptions, because CPUs disappear but can still get machine check
      broadcasts.  Also the system is slightly more likely to
      machine check then, but the handler is typically not in a position
      to handle them in a meaningful way.
      
      So disable them during suspend and enable them during resume.
      
      Also make sure they are always disabled on hot-unplugged CPUs.
      
      This new code assumes that suspend always hotunplugs all
      non BP CPUs.
      
      v2: Remove the WARN_ONs Thomas objected to.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, mce: use force_sig_info to kill process in machine check · 380851bc
      Committed by Andi Kleen
      Impact: bug fix (with tolerant == 3)
      
      do_exit cannot be called directly from the exception handler because
      it can sleep and the exception handler runs on the exception stack.
      Use force_sig() instead.
      
      Based on an earlier patch by Ying Huang who debugged the problem.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86, mce: reinitialize per cpu features on resume · 6ec68bff
      Committed by Andi Kleen
      Impact: Bug fix
      
      This fixes a long standing bug in the machine check code. On resume the
      boot CPU wouldn't get its vendor specific state like thermal handling
      reinitialized. This means the boot cpu wouldn't ever get any thermal
      events reported again.
      
      Call the respective initialization functions on resume.
      
      v2: Remove ancient init because they don't have a resume device anyways.
          Pointed out by Thomas Gleixner.
      v3: Now fix the Subject too to reflect v2 change
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  2. 17 Dec, 2008 1 commit
  3. 07 Sep, 2008 1 commit
  4. 23 Aug, 2008 1 commit
    • x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs · 8735728e
      Committed by Rafael J. Wysocki
      During CPU hot-remove the sysfs directory created by
      threshold_create_bank(), defined in
      arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before
      its parent directory, created by mce_create_device(), defined in
      arch/x86/kernel/cpu/mcheck/mce_64.c .  Moreover, when the CPU in
      question is hotplugged again, obviously the latter has to be created
      before the former.  At present, the right ordering is not enforced,
      because all of these operations are carried out by CPU hotplug
      notifiers which are not appropriately ordered with respect to each
      other.  This leads to serious problems on systems with two or more
      multicore AMD CPUs, among other things during suspend and hibernation.
      
      Fix the problem by placing threshold bank CPU hotplug callbacks in
      mce_cpu_callback(), so that they are invoked at the right places,
      if defined.  Additionally, use kobject_del() to remove the sysfs
      directory associated with the kobject created by
      kobject_create_and_add() in threshold_create_bank(), to prevent the
      kernel from crashing during CPU hotplug operations on systems with
      two or more multicore AMD CPUs.
      
      This patch fixes bug #11337.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Andi Kleen <andi@firstfloor.org>
      Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 22 Jul, 2008 2 commits
  6. 20 Jul, 2008 1 commit
  7. 03 Jul, 2008 2 commits
  8. 26 Jun, 2008 1 commit
  9. 18 Jun, 2008 1 commit
    • x86: correctly report NR_BANKS in mce_64.c · b4b3bd96
      Committed by Daniel Rahn
      Attached is a no-brainer that makes the kernel correctly report
      NR_BANKS for MCE. We are right now limited to NR_BANKS==6, but the
      error message currently uses the available number of banks instead of the
      defined maximum.
      
      For a Nehalem based system it will print:
      
      "MCE: warning: using only 9 banks"
      
      while the correct message would be
      
      "MCE: warning: using only 6 banks"
      Signed-off-by: Pavel Machek <pavel@suse.cz>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
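The reporting fix described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: the helper name `clamp_banks()` and its signature are hypothetical, and only the value `NR_BANKS == 6` and the message text come from the commit message. The warning prints the limit that is actually applied rather than the bank count the hardware advertises.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NR_BANKS 6   /* software limit quoted in the commit message */

/*
 * Hypothetical helper: clamp the hardware bank count to NR_BANKS.
 * When clamping happens, the warning in msg reports the limit in use
 * (NR_BANKS), not the hardware count. msg is left empty otherwise.
 */
static int clamp_banks(int hw_banks, char *msg, size_t len)
{
    assert(msg != NULL && len > 0);
    msg[0] = '\0';
    if (hw_banks > NR_BANKS) {
        snprintf(msg, len, "MCE: warning: using only %d banks", NR_BANKS);
        return NR_BANKS;
    }
    return hw_banks;
}
```

For a system advertising 9 banks, this sketch returns 6 and produces the message quoted above as the correct one.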
  10. 13 May, 2008 1 commit
    • x86: remove 6 bank limitation in 64 bit MCE reporting code · 8edc5cc5
      Committed by Venki Pallipadi
      Eliminate the 6 bank restriction in 64 bit mce reporting code. This
      restriction is artificial (due to static creation of sysfs files) and 32
      bit code does not have any such restriction.
      
      This change helps in reporting the details of machine checks on a
      machine check exception with errors in bank 6 and above on CPUs that
      support those banks. Without the patch, machine check errors in those
      banks are not reported.
      
      We still have 128 (MCE_EXTENDED_BANK) bank restriction instead of max
      256 supported in hardware. That is not changed in the patch below as it
      will have some user level mcelog utility dependency, with bank 128 being
      used for thermal reporting currently.
      
      The patch below does not create sysfs control (bankNctl) for banks
      higher than 6 as well. That needs some pre-cleanup in /sysfs mce layout,
      removal of per cpu /sysfs entries for bankctl as they are really global
      system level control today. That change will follow. This basic change
      is critical to report the detailed errors on banks higher than 6.
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 26 Apr, 2008 1 commit
    • x86-64: extend MCE CPU quirk handling · 911f6a7b
      Committed by Jan Beulich
      At least on my Barcelona, I see MCE log entries after cold boot caused
      by BIOS not properly clearing the respective registers. Therefore, this
      patch extends the workaround to families 0x10 and 0x11 (the latter just
      for completeness, I have nothing to verify this against).
      At the same time, provide a way to make these entries visible via the
      'mce=bootlog' command line option even on these machines.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 30 Jan, 2008 7 commits
  13. 25 Jan, 2008 1 commit
  14. 17 Nov, 2007 1 commit
    • x86: fix cpu-hotplug regression · 90367556
      Committed by Andreas Herrmann
      Commit d435d862
      ("cpu hotplug: mce: fix cpu hotplug error handling")
      changed the error handling in mce_cpu_callback.
      
      In cases where not all CPUs are brought up during
      boot (e.g. using maxcpus and additional_cpus parameters)
      mce_cpu_callback now returns NOTIFY_BAD because
      for such CPUs cpu_data is not completely filled when
      the notifier is called. Thus mce_create_device fails right
      at its beginning:
      
              if (!mce_available(&cpu_data[cpu]))
                      return -EIO;
      
      As a quick fix I suggest checking boot_cpu_data for MCE.
      
      To reproduce this regression:
      
      (1) boot with maxcpus=2 additional_cpus=2 on a 4 CPU x86-64 system
      (2) # echo 1 >/sys/devices/system/cpu/cpu2/online
        -bash: echo: write error: Invalid argument
      
      dmesg shows:
      
      _cpu_up: attempt to bring up CPU 2 failed
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  15. 15 Nov, 2007 1 commit
    • x86: don't call mce_create_device on CPU_UP_PREPARE · bae19fe0
      Committed by Andreas Herrmann
      Fix regression introduced with d435d862
      ("cpu hotplug: mce: fix cpu hotplug error handling").
      
      A CPU which was not brought up during boot (using maxcpus and
      additional_cpus parameters) couldn't be onlined anymore.  For such a CPU it
      seemed that MCE was not supported during CPU_UP_PREPARE-time which caused
      mce_cpu_callback to return NOTIFY_BAD to notifier_call_chain.  To fix this
      we:
      
       - call mce_create_device for CPU_ONLINE event (instead of CPU_UP_PREPARE),
       - avoid mce_remove_device() for the CPU that is not correctly initialized
         by mce_create_device() failure,
       - make mce_cpu_callback always return NOTIFY_OK for CPU_ONLINE event.
         Because CPU_ONLINE callback return value is always ignored.
      
      [akinobu.mita@gmail.com: avoid mce_remove_device() for not initialized device]
      [akinobu.mita@gmail.com: make mce_cpu_callback always return NOTIFY_OK]
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
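The three points of this fix can be sketched as a user-space model of the hotplug callback. This is an illustration under stated assumptions, not the kernel implementation: the enums, the arrays `mce_supported[]` and `device_created[]`, and the function names `create_device()`, `remove_device()` and `mce_callback_model()` are all hypothetical stand-ins. For simplicity the model returns `NOTIFY_OK` for every event, whereas the commit only states this for `CPU_ONLINE`.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's hotplug events and notifier results. */
enum cpu_event { CPU_UP_PREPARE, CPU_ONLINE, CPU_DEAD };
enum notify_result { NOTIFY_OK, NOTIFY_BAD };

#define MAX_CPUS 8

static bool mce_supported[MAX_CPUS];   /* known once the CPU is online */
static bool device_created[MAX_CPUS];  /* set only on successful create */

/* Models device creation; fails when MCE is not available on the CPU. */
static int create_device(int cpu)
{
    if (!mce_supported[cpu])
        return -1;
    device_created[cpu] = true;
    return 0;
}

static void remove_device(int cpu)
{
    device_created[cpu] = false;
}

static enum notify_result mce_callback_model(enum cpu_event ev, int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    switch (ev) {
    case CPU_ONLINE:
        /* Create at CPU_ONLINE, not CPU_UP_PREPARE; a failure is tolerated. */
        (void)create_device(cpu);
        break;
    case CPU_DEAD:
        /* Only tear down a device that was created successfully. */
        if (device_created[cpu])
            remove_device(cpu);
        break;
    default:
        break;
    }
    /* Never veto the hotplug operation from this callback. */
    return NOTIFY_OK;
}
```

With this shape, a CPU whose device creation failed can still be brought online and later taken down without touching a device that was never set up.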
  16. 24 Oct, 2007 2 commits
  17. 20 Oct, 2007 2 commits
  18. 19 Oct, 2007 1 commit
    • cpu hotplug: mce: fix cpu hotplug error handling · d435d862
      Committed by Akinobu Mita
      - Clear kobject in percpu device_mce before calling sysdev_register() with
      
        Because mce_create_device() may fail and leave the kobject filled with
        junk. This becomes a problem the next time mce_create_device() is
        called.
      
      - Fix error handling in mce_create_device()
      
        Error handling should not do sysdev_remove_file() with not yet added
        attributes.
      
      - Don't register hotcpu notifier when mce_create_device() returns error
      
      - Do mce_create_device() in CPU_UP_PREPARE instead of CPU_ONLINE
      
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 18 Oct, 2007 1 commit
  20. 17 Oct, 2007 1 commit
  21. 11 Oct, 2007 2 commits
  22. 23 Jul, 2007 1 commit
  23. 22 Jul, 2007 2 commits
    • x86: round_jiffies() for i386 and x86-64 non-critical/corrected MCE polling · 22293e58
      Committed by Venki Pallipadi
      This helps to reduce the frequency at which the CPU must be taken out of a
      lower-power state.
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Acked-by: Tim Hockin <thockin@hockin.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
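The power-saving effect of rounding timer expiries can be shown with a simplified sketch. This is not the kernel's round_jiffies(), which is more elaborate; the helper `round_up_to_second()` and the tick rate `HZ` below are assumptions for illustration. Expiry times are aligned to whole-second boundaries, so timers that would otherwise fire at scattered moments tend to fire together and the CPU wakes up less often.

```c
#include <assert.h>

#define HZ 250   /* assumed ticks per second */

/*
 * Hypothetical simplification: round an expiry time (in ticks) up to
 * the next whole-second boundary. Values already on a boundary are
 * returned unchanged.
 */
static unsigned long round_up_to_second(unsigned long j)
{
    unsigned long rem = j % HZ;

    assert(HZ > 0);
    return rem ? j + (HZ - rem) : j;
}
```

Two pollers that would expire at ticks 251 and 499 both end up at tick 500 in this model, so one wakeup serves both.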
    • x86_64: mcelog tolerant level cleanup · bd78432c
      Committed by Tim Hockin
      Background:
       The MCE handler has several paths that it can take, depending on various
       conditions of the MCE status and the value of the 'tolerant' knob.  The
       exact semantics are not well defined and the code is a bit twisty.
      
      Description:
       This patch makes the MCE handler's behavior more clear by documenting the
       behavior for various 'tolerant' levels.  It also fixes or enhances
       several small things in the handler.  Specifically:
           * If RIPV is not set it is not safe to restart, so set the 'no way out'
             flag rather than the 'kill it' flag.
           * Don't panic() on correctable MCEs.
           * If the _OVER bit is set *and* the _UC bit is set (meaning possibly
             dropped uncorrected errors), set the 'no way out' flag.
           * Use EIPV for testing whether an app can be killed (SIGBUS) rather
             than RIPV.  According to docs, EIPV indicates that the error is
             related to the IP, while RIPV simply means the IP is valid to
             restart from.
           * Don't clear the MCi_STATUS registers until after the panic() path.
             This leaves the status bits set after the panic() so clever BIOSes
             can find them (and dumb BIOSes can do nothing).
      
       This patch also calls nonseekable_open() in mce_open (as suggested by akpm).
      
      Result:
       Tolerant levels behave almost identically to how they always have, but
       now it's well defined.  There's a slightly higher chance of panic()ing
       when multiple errors happen (a good thing, IMHO).  If you take an MBE and
       panic(), the error status bits are not cleared.
      
      Alternatives:
       None.
      
      Testing:
       I used software to inject correctable and uncorrectable errors.  With
       tolerant = 3, the system usually survives.  With tolerant = 2, the system
       usually panic()s (PCC) but not always.  With tolerant = 1, the system
       always panic()s.  When the system panic()s, the BIOS is able to detect
       that the cause of death was an MC4.  I was not able to reproduce the
       case of a non-PCC error in userspace, with EIPV, with (tolerant < 3).
       That will be rare at best.
      Signed-off-by: Tim Hockin <thockin@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
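The status-bit rules listed in this commit can be sketched as a small decision function. This is a simplified user-space model, not the handler itself: `struct mce_verdict` and `classify()` are hypothetical names, the 'tolerant' knob and the PCC bit are deliberately left out, and the sketch assumes that a clear RIPV bit is what makes a restart unsafe, in line with the description of RIPV as meaning the IP is valid to restart from. The bit positions follow the architectural layout of the MCG_STATUS and MCi_STATUS registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions in MCG_STATUS and MCi_STATUS. */
#define MCG_STATUS_RIPV (1ULL << 0)   /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)   /* error IP valid */
#define MCI_STATUS_OVER (1ULL << 62)  /* previous errors were lost */
#define MCI_STATUS_UC   (1ULL << 61)  /* uncorrected error */

/* Hypothetical result type for the model. */
struct mce_verdict {
    bool no_way_out;   /* system cannot continue safely */
    bool kill_it;      /* the affected task may be killed instead */
};

static struct mce_verdict classify(uint64_t mcgstatus, uint64_t status)
{
    struct mce_verdict v = { false, false };
    bool uncorrected = (status & MCI_STATUS_UC) != 0;

    /* Without a valid restart IP there is nothing to return to. */
    if (!(mcgstatus & MCG_STATUS_RIPV))
        v.no_way_out = true;

    /* Overflow plus uncorrected: uncorrected errors may have been dropped. */
    if (uncorrected && (status & MCI_STATUS_OVER))
        v.no_way_out = true;

    /* EIPV ties the error to the interrupted IP, so the task is killable. */
    if (uncorrected && (mcgstatus & MCG_STATUS_EIPV))
        v.kill_it = true;

    /* Corrected errors with a valid restart IP set neither flag. */
    return v;
}
```

A caller would check `no_way_out` first and only consider `kill_it` when the system as a whole can continue.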