- 15 June 2009, 3 commits
-
-
Committed by Andrew Morton
This symbol doesn't need file-global scope. Cc: "Zhang, Rui" <rui.zhang@intel.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Langsdorf, Mark <mark.langsdorf@amd.com> Cc: Leo Milano <lmilano@gmx.net> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>
-
Committed by Luis Henriques
Clean up messy code in the powernow_k8_acpi_pst_values() function. Signed-off-by: Luis Henriques <henrix@sapo.pt> Signed-off-by: Dave Jones <davej@redhat.com>
-
Committed by Thomas Renninger
This doesn't fix anything, but a transition latency of 0 is expected to cause trouble in the future. Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: Langsdorf, Mark <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>
-
- 12 June 2009, 1 commit
-
-
Committed by Yong Wang
The fixed-function performance counters do not work on current Atom processors. Use the general-purpose ones instead. Signed-off-by: Yong Wang <yong.y.wang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090612080855.GA2286@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 11 June 2009, 8 commits
-
-
Committed by Peter Zijlstra
The top (fastest) and last-level (biggest) caches are the most interesting ones, performance-wise. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> [ Fixed the Nehalem LL table to LLC Reference/Miss events ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Peter Zijlstra
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Hidetoshi Seto
This patch introduces three boot options (no_cmci, dont_log_ce and ignore_ce) to control the handling of corrected errors. The "mce=no_cmci" boot option disables the CMCI feature. Since CMCI is a new feature, having a boot control to disable it helps if the hardware misbehaves. The "mce=dont_log_ce" boot option disables logging of corrected errors; all reported corrected errors are cleared silently. This option is useful if you never care about corrected errors. The "mce=ignore_ce" boot option disables the corrected-error machinery entirely, i.e. the polling timer and CMCI. Corrected events are then not cleared and are kept in the bank MSRs. This is usually not recommended, but it helps if something conflicts with the BIOS or with hardware-monitoring applications that clear corrected events in the banks instead of the OS. [ A trivial cleanup (space -> tab) for the documentation is included. ] Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4A30ACDF.5030408@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
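A minimal sketch of how boot flags of this kind are typically recognised; this is not the actual mce.c parser, and the function and variable names below are illustrative only.

    /* Illustrative only -- not the real mce.c code.  Each "mce=" token
     * flips one of the three flags described above. */
    #include <string.h>

    static int no_cmci, dont_log_ce, ignore_ce;

    static int parse_mce_option(const char *str)
    {
            if (!strcmp(str, "no_cmci"))
                    no_cmci = 1;            /* disable CMCI */
            else if (!strcmp(str, "dont_log_ce"))
                    dont_log_ce = 1;        /* clear corrected errors silently */
            else if (!strcmp(str, "ignore_ce"))
                    ignore_ce = 1;          /* no polling timer, no CMCI */
            else
                    return 0;               /* unrecognised option */
            return 1;
    }

Booting with, say, "mce=ignore_ce" would then leave corrected events untouched in the bank MSRs, as the commit message describes.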
-
Committed by Hidetoshi Seto
This patch: - Adds print_mce_head() instead of a 'first' flag - Makes the header always be printed - Stops double printing of corrected errors [ This portion originates from Huang Ying's patch ] Originally-From: Huang Ying <ying.huang@intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> LKML-Reference: <4A30AC83.5010708@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Peter Zijlstra
We currently log hw.sample_period for PERF_SAMPLE_PERIOD, but this is incorrect: an adjusted period only takes effect on the next cycle, yet we report it for the current cycle. So when we adjust the period on every cycle, we are always wrong. Solve this by keeping track of the last_period. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
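As an illustration of the fix (a sketch with made-up field names, not the actual perf code): remember the period the just-taken sample actually ran with, and report that, while the new period only applies from the next sample onwards.

    /* Sketch only -- field and function names are illustrative. */
    struct hw_state {
            unsigned long long sample_period;   /* period for the *next* sample */
            unsigned long long last_period;     /* period the current sample used */
    };

    static void set_new_period(struct hw_state *hw, unsigned long long period)
    {
            hw->last_period   = hw->sample_period;  /* what this sample ran with */
            hw->sample_period = period;             /* effective from the next one */
    }
    /* When emitting PERF_SAMPLE_PERIOD, log hw->last_period, not sample_period. */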
-
Committed by Peter Zijlstra
For easy extension of the sample data, put it in a structure. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Harald Welte
The e_powersaver driver for VIA's C7 CPUs needs to be marked as DANGEROUS, as it configures the CPU for power states that are out of specification. According to Centaur, all systems with C7 and Nano CPUs support the ACPI P-state method, so the acpi-cpufreq driver should be used instead. Signed-off-by: Harald Welte <HaraldWelte@viatech.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Harald Welte
The VIA/Centaur C7, C7-M and Nano CPUs all support ACPI-based CPU P-states using an MSR interface. The Linux driver just never made use of it, since in addition to checking the EST flag it also checked whether the vendor is Intel. Signed-off-by: Harald Welte <HaraldWelte@viatech.com> [ Removed the vendor checks entirely - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 June 2009, 3 commits
-
-
Committed by Peter Zijlstra
Also employ the overflow handler to adjust the frequency; this results in a stable frequency within about 40~50 samples, instead of that many ticks. This also means we can start sampling at a sample period of 1 without running head-first into the throttle. It relies on sched_clock() to accurately measure the time difference between the overflow NMIs. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
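A hedged sketch of the idea, not the kernel's actual arithmetic (all names are illustrative): from the time between two overflow NMIs, estimate the event rate and pick the period that would hit the requested sample frequency.

    /* Sketch only: derive the next sample period from the observed rate. */
    static unsigned long long next_period(unsigned long long events,
                                          unsigned long long delta_ns,
                                          unsigned long long target_hz)
    {
            unsigned long long rate, period;

            if (!delta_ns || !target_hz)
                    return 1;
            rate   = events * 1000000000ULL / delta_ns; /* events per second */
            period = rate / target_hz;                  /* events per sample */
            return period ? period : 1;
    }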
-
Committed by Yong Wang
Fix the model numbers of Intel Core2 processors according to the documentation "Intel Processor Identification with the CPUID Instruction": http://www.intel.com/support/processors/sb/cs-009861.htm Signed-off-by: Yong Wang <yong.y.wang@intel.com> Also-Reported-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090610090612.GA26580@ywang-moblin2.bj.intel.com> [ Added two more model numbers suggested by Arnd Bergmann ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Andi Kleen
VT-x needs an explicit MC vector intercept to handle machine checks in the hypervisor. It also has a special option to catch machine checks that happen during VT entry. Do these interceptions and forward them to the Linux machine check handler. Make it always look like user space is interrupted, because the machine check handler treats kernel and user space differently. Thanks to Jiang Yunhong for help and testing. Cc: stable@kernel.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 09 June 2009, 5 commits
-
-
Committed by Yong Wang
Correct some event and UMASK values in the Nehalem and Atom tables, according to the Intel SDM. Signed-off-by: Yong Wang <yong.y.wang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090609131553.GA12489@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Andreas Herrmann
Booting a 32-bit kernel on Magny-Cours results in the following panic: ... Using APIC driver default ... Overriding APIC driver with bigsmp ... Getting VERSION: 80050010 Getting VERSION: 80050010 Getting ID: 10000000 Getting ID: ef000000 Getting LVT0: 700 Getting LVT1: 10000 Kernel panic - not syncing: Boot APIC ID in local APIC unexpected (16 vs 0) Pid: 1, comm: swapper Not tainted 2.6.30-rcX #2 Call Trace: [<c05194da>] ? panic+0x38/0xd3 [<c0743102>] ? native_smp_prepare_cpus+0x259/0x31f [<c073b19d>] ? kernel_init+0x3e/0x141 [<c073b15f>] ? kernel_init+0x0/0x141 [<c020325f>] ? kernel_thread_helper+0x7/0x10 The reason is that default_get_apic_id() handled the extension of the local APIC ID field only in the XAPIC case. Thus for this AMD CPU, default_get_apic_id() returns 0 while bigsmp_get_apic_id() returns 16, which leads to the kernel panic above. This patch introduces a Linux-specific feature flag to indicate support for extended APIC IDs (8 bits wide instead of 4) and sets the flag on AMD CPUs where applicable. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: <stable@kernel.org> LKML-Reference: <20090608135509.GA12431@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
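For illustration, the difference boils down to how many bits of the APIC ID register (bits 31:24) are kept; a hedged sketch, not the exact default_get_apic_id() code:

    /* Sketch: the local APIC ID lives in bits 31:24 of the APIC ID register.
     * With the new feature flag (or XAPIC) all 8 bits are valid; otherwise
     * only the low 4 bits were kept, which truncated ID 16 to 0. */
    static unsigned int apic_id_from_reg(unsigned int reg, int has_extd_apicid)
    {
            if (has_extd_apicid)
                    return (reg >> 24) & 0xFF;   /* extended: 8-bit APIC ID */
            return (reg >> 24) & 0x0F;           /* legacy: 4-bit APIC ID */
    }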
-
Committed by Yinghai Lu
These are defined as static cpumask_var_t, so if MAXSMP is not used they are already cleared. Avoid surprises when MAXSMP is enabled. Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Committed by Thomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Thomas Gleixner
Fill in amd_hw_cache_event_id[] with the AMD CPU specific events, for families 0x0f, 0x10 and 0x11. There is apparently no distinction between load and store events, so we only fill in the load events. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 08 June 2009, 3 commits
-
-
Committed by Ingo Molnar
Standardize and tidy up all the messages we print during perfcounter initialization. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Thomas Gleixner
Fill in core2_hw_cache_event_id[] with the Atom model specific events. The events can be used in all the tools via the -e (--event) parameter, for example "-e l1-misses", "-e l2-accesses" or "-e l2-write-misses". ( Note: these are straight from the Intel manuals - not tested yet. ) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Thomas Gleixner
Fill in core2_hw_cache_event_id[] with the Core2 model specific events. The events can be used in all the tools via the -e (--event) parameter, for example "-e l1-misses", "-e l2-accesses" or "-e l2-write-misses". Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
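The shape of such a model-specific table, sketched with illustrative names and placeholder event codes (the real arrays and the exact codes live in the x86 perf counter code and the Intel manuals):

    /* Sketch: a 3-D table indexed by [cache][operation][result], holding the
     * raw event code for each combination; 0 marks combinations the model
     * does not support. */
    enum cache  { C_L1D, C_L1I, C_L2, C_DTLB, C_ITLB, C_BPU, C_MAX };
    enum op     { OP_READ, OP_WRITE, OP_PREFETCH, OP_MAX };
    enum result { RES_ACCESS, RES_MISS, RES_MAX };

    static const unsigned long long cache_event_ids[C_MAX][OP_MAX][RES_MAX] = {
            [C_L1D][OP_READ][RES_ACCESS] = 0x0140,  /* placeholder event code */
            [C_L1D][OP_READ][RES_MISS]   = 0x0141,  /* placeholder event code */
            /* ... remaining entries filled in per model ... */
    };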
-
- 07 June 2009, 1 commit
-
-
Committed by Jaswinder Singh Rajput
Remove model information and encoding/decoding, and reduce bookkeeping. Besides removing a lot of code and cleaning it up, this also enables these features on many more CPUs than were enumerated before. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> LKML-Reference: <1244224637.8212.6.camel@ht.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 06 June 2009, 3 commits
-
-
Committed by Ingo Molnar
Extend generic event enumeration with the PERF_TYPE_HW_CACHE method. This is a 3-dimensional space: { L1-D, L1-I, L2, ITLB, DTLB, BPU } x { load, store, prefetch } x { accesses, misses } User space passes in the 3 coordinates and the kernel provides a counter (if the hardware supports that type and if the combination makes sense). Combinations that make no sense produce a -EINVAL. Combinations that are not supported by the hardware produce -ENOTSUP. Extend the tools to deal with this, and rewrite the event symbol parsing code with various popular aliases for the units and access methods above, so 'l1-cache-miss' and 'l1d-read-ops' are both valid aliases. ( x86 is supported for now, with the Nehalem event table filled in, and with Core2 and Atom having placeholder tables. ) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
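A sketch of how the three coordinates can be packed into a single config value, one byte per dimension, which is the layout later perf ABI headers document; the enum names here are illustrative rather than the kernel's:

    /* Sketch: pack (cache, op, result) into one 64-bit config value. */
    #include <stdint.h>

    enum hw_cache     { L1D, L1I, L2, ITLB, DTLB, BPU };
    enum hw_cache_op  { OP_LOAD, OP_STORE, OP_PREFETCH };
    enum hw_cache_res { RES_ACCESS, RES_MISS };

    static uint64_t hw_cache_config(enum hw_cache c, enum hw_cache_op op,
                                    enum hw_cache_res res)
    {
            return (uint64_t)c | ((uint64_t)op << 8) | ((uint64_t)res << 16);
    }

    /* e.g. an alias like "l1d-read-miss" would map to
     * hw_cache_config(L1D, OP_LOAD, RES_MISS). */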
-
Committed by Ingo Molnar
Counter type is a frequently used value, and we do a lot of bit juggling by encoding and decoding it from attr->config. Clean this up by creating a separate attr->type field. Also clean up the various similarly complex user-space bits all around counter attribute management. The net improvement is significant, and it will be easier to add a new major type (which is what triggered this cleanup). (This changes the ABI; all tools are adapted.) (PowerPC build-tested.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Dave Jones
The powernow-k8 driver checks that the Performance Control/Status Registers are declared as FFH (functional fixed hardware) by the BIOS. However, this check got broken by commit 0e64a0c9 ("[CPUFREQ] checkpatch cleanups for powernow-k8"). Fix based on an original patch from Naga Chumbalkar. Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>
-
- 04 June 2009, 13 commits
-
-
Committed by Andi Kleen
Newer Intel CPUs support a new class of machine checks called recoverable action optional. Action Optional means that the CPU detected some form of corruption in the background and tells the OS about it using a machine check exception. The OS can then take appropriate action, like killing the process with the corrupted data or logging the event properly to disk. This is done by the new generic high-level memory failure handler added in an earlier patch. The high-level handler takes the address of the failed memory and takes the appropriate action, like killing the process. In this version of the patch the high-level handler is stubbed out with a weak function so as not to create a direct dependency on the hwpoison branch. The high-level handler cannot be called directly from the machine check exception though, because it has to run in a defined process context to be able to sleep when taking VM locks (it is not expected to sleep for a long time, just in some exceptional cases like lock contention). Thus the MCE handler has to queue a work item for process context, trigger process context and then call the high-level handler from there. This patch adds two paths to process context: through a per-thread kernel-exit notify_user() callback, or through a high-priority work item. The first runs when the process exits back to user space, the other when it goes to sleep and there is no higher-priority process. The machine check handler will schedule both, and whichever runs first grabs the event. This is done because a quick reaction to this event is critical to avoid a potentially more fatal machine check when the corruption is consumed. A simple lockless ring buffer queues the corrupted addresses between the exception handler and the process-context handler; in process context it just calls the high-level VM code with the corrupted PFNs. The patch adds the code required to extract the failed address from the CPU's machine check registers. It doesn't try to handle all possible cases -- the specification has 6 different ways to specify the memory address -- only the linear address. Most of the required checking has already been done earlier in the mce_severity rule-checking engine. Following the Intel recommendations, Action Optional errors are only enabled for known situations (encoded in MCACODs); the errors are ignored otherwise, because they are action optional. v2: Improve comment, disable preemption while processing ring buffer (reported by Ying Huang) Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
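The queueing between the exception handler and process context can be pictured as a single-producer/single-consumer ring of PFNs; a hedged sketch, not the kernel's actual structure or names:

    /* Sketch: lockless single-producer (MCE handler) / single-consumer
     * (process-context work) ring of failed PFNs.  Power-of-two size so the
     * indices can wrap with a mask. */
    #include <stdatomic.h>
    #include <stdbool.h>

    #define RING_SIZE 16   /* must be a power of two */

    struct pfn_ring {
            unsigned long entries[RING_SIZE];
            atomic_uint head;   /* advanced by the producer */
            atomic_uint tail;   /* advanced by the consumer */
    };

    static bool ring_add(struct pfn_ring *r, unsigned long pfn)
    {
            unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
            unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

            if (head - tail >= RING_SIZE)
                    return false;                   /* full: drop the event */
            r->entries[head & (RING_SIZE - 1)] = pfn;
            atomic_store_explicit(&r->head, head + 1, memory_order_release);
            return true;
    }

    static bool ring_get(struct pfn_ring *r, unsigned long *pfn)
    {
            unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
            unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

            if (tail == head)
                    return false;                   /* empty */
            *pfn = r->entries[tail & (RING_SIZE - 1)];
            atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
            return true;
    }

Process context would then drain the ring with ring_get() and hand each PFN to the high-level memory-failure handler, as described above.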
-
Committed by Andi Kleen
Rename the mce_notify_user function to mce_notify_irq. The next patch will split the wakeup handling for interrupt context and for process context, and it's better to give the function a clearer name first. Contains a fix from Ying Huang. [ Impact: cleanup ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Huang Ying
The MCE severity judgement code is data-driven, so code coverage tools such as gcov cannot be used to measure its coverage. Instead a dedicated coverage mechanism is implemented: the kernel keeps track of the rules executed and reports them in debugfs. This is useful for increasing the coverage of the mce-test test suite. Right now it is unconditionally enabled because it is very little code. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Andi Kleen
The x86 architecture recently added some new machine check status bits: S(ignalled) and AR (Action-Required). Signalled allows checking whether a specific event caused an exception or was just logged through CMCI. AR allows the kernel to decide whether an event needs immediate action or can be delayed or ignored. Implement support for these new status bits. mce_severity() uses the new bits to grade the machine check correctly and decide what to do. The exception handler uses AR to decide whether to kill or not. The S bit is used to separate events between the poll/CMCI handler and the exception handler. Classical UC always leads to panic. That was effectively true before anyway, because existing CPUs always passed a PCC with it. Also correct the rules for whether to kill in user or kernel context and how to handle a missing RIPV. The machine check handler now largely uses the mce-severity grading engine instead of making its own decisions, which means the logic is centralized in one place. This is useful because it has to be evaluated multiple times. v2: Some rule fixes; Add AO events Fix RIPV, RIPV|EIPV order (Ying Huang) Fix UCNA with AR=1 message (Ying Huang) Add comment about panicking in m_c_p. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Andi Kleen
When multiple MCEs are printed, print the "HARDWARE ERROR" header and the "This is not a software error" footer only once. This makes the output much more compact with many CPUs. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Andi Kleen
Fatal machine checks can be logged to disk after boot, but only if the system did a warm reboot. That is unfortunately difficult with the default panic behaviour, which waits forever, so the admin has to press the power button, because modern systems usually lack a reset button. This clears the machine checks in the registers and makes it impossible to log them. This patch changes the default for machine check panics to always reboot after 30s; the MCE can then be successfully logged after reboot. I believe this will improve the machine check experience for any system running the X server. This depends on successful boot logging of MCEs. It currently only works on Intel systems; on AMD there are quite a lot of systems around which leave junk in the machine check registers after boot, so it is disabled there. Those systems will continue to default to the endless-waiting panic. v2: Only force panic timeout when it's shorter (H.Seto) v3: Only force timeout when there is no timeout (based on comment from H.Seto) [ Fix changelog - HS ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Huang Ying
Assume the IP on the stack is valid when either EIPV or RIPV is set. This influences whether the machine check exception handler decides to return or to panic. It fixes a test case in the mce-test suite and is more compliant with the specification. Currently this only makes a difference in an artificial testing scenario with the mce-test test suite. In addition, do not force the EIPV to be valid with the exact register MSRs, and keep trusting the CS value on the stack even if the MSR is available. [AK: combination of patches from Huang Ying and Hidetoshi Seto, with new description by me] [add some description, no code changed - HS] Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Andi Kleen
... instead of "Machine check". This is for consistency with the Monarch panic message. Based on a report from Ying Huang. v2: But add a descriptive postfix so that the test suite can distinguish. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Andi Kleen
On Intel platforms machine check exceptions are always broadcast to all CPUs. This patch makes the machine check handler synchronize all these machine checks, elect a Monarch to handle the event, collect the worst event from all CPUs and then process it first. This has some advantages: - When there is a truly data-corrupting error the system panics as quickly as possible. This improves containment of corrupted data and makes sure the corrupted data never hits stable storage. - The panics are synchronized and do not re-enter the panic code on multiple CPUs (which currently does not handle this well). - All the errors are reported. Currently it often happens that another CPU does the panic first, but reports useless information (an empty machine check) because the real error happened on another CPU which came in later. This is a big advantage on Nehalem, where the 8 threads per CPU often lead to the wrong CPU winning the race and dumping useless information on a machine check. The problem also occurs in a less severe form on older CPUs. - The system can detect when no CPU detected a machine check and shut the system down. This can happen when one CPU is so badly hung that it cannot process a machine check anymore, or when some external agent wants to stop the system by asserting the machine check pin. This follows Intel hardware recommendations. - This matches the error model recommended by the CPU designers. - The events can be output in true severity order. - When a panic happens on another CPU, it makes sure it is actually able to process the stop IPI by enabling interrupts. The code is extremely careful to handle timeouts while waiting for other CPUs. It can't rely on the normal timing mechanisms (jiffies, ktime_get) because of its asynchronous/lockless nature, so it uses its own timeouts based on ndelay() and a "SPINUNIT". The timeout is configurable; by default it waits up to one second for the other CPUs, and it can also be disabled. From some informal testing, AMD systems do not seem to broadcast machine checks, so right now this is always disabled by default on non-Intel CPUs and also on very old Intel systems. Includes fixes from Ying Huang Fixed a "ecception" in a comment (H.Seto) Moved global_nwo reset later based on suggestion from H.Seto v2: Avoid duplicate messages [ Impact: feature, fixes long standing problems. ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
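The rendezvous can be sketched as an atomic call-in counter with a bounded spin; names are illustrative and the busy-wait stands in for the kernel's ndelay()-based SPINUNIT loop:

    /* Sketch: every CPU that takes the broadcast MCE calls in; the first one
     * becomes the Monarch.  All of them spin, with a timeout, until everyone
     * has arrived (or give up and carry on alone). */
    #include <stdatomic.h>

    #define SPINUNIT_NS 100

    static atomic_int callin;

    static void cpu_relax_ns(unsigned long ns)      /* stand-in for ndelay() */
    {
            for (volatile unsigned long i = 0; i < ns; i++)
                    ;                               /* crude busy wait */
    }

    static int mce_start_sketch(int num_cpus, long long timeout_ns)
    {
            int order = atomic_fetch_add(&callin, 1) + 1;   /* my arrival order */

            while (atomic_load(&callin) != num_cpus) {
                    if (timeout_ns <= 0)
                            return -1;              /* not everyone showed up */
                    cpu_relax_ns(SPINUNIT_NS);
                    timeout_ns -= SPINUNIT_NS;
            }
            return order;   /* order == 1: this CPU is the Monarch */
    }

The Monarch (arrival order 1) then collects the worst event from all CPUs and processes it first, as described above.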
-
Committed by Andi Kleen
In some circumstances multiple CPUs can enter mce_panic() in parallel. This gives quite confusing output, because they all dump the same machine check buffer. The other problem is that they would all panic in parallel, but not process each other's shutdown IPIs because interrupts are disabled. Detect this situation early on in mce_panic(). The first CPU entering will do the panic, the others will just wait to be killed. For paranoia reasons, in case the other CPU dies during the MCE, I added a 5 second timeout. If it expires, each CPU will panic on its own again. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
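The guard amounts to a test-and-set on entry plus a bounded wait for the losers; a hedged sketch with illustrative names, not the actual mce_panic() code:

    /* Sketch: only the first CPU into the panic path does the full panic;
     * the others wait up to ~5 seconds to be killed, then panic anyway. */
    #include <stdatomic.h>

    static atomic_flag panic_entered = ATOMIC_FLAG_INIT;

    static int enter_panic_once(void (*wait_one_ms)(void))
    {
            if (!atomic_flag_test_and_set(&panic_entered))
                    return 1;                       /* first CPU: do the panic */

            for (int ms = 0; ms < 5000; ms++)       /* ~5s paranoia timeout */
                    wait_one_ms();                  /* expect to be killed here */
            return 0;                               /* timed out: panic on our own */
    }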
-
Committed by Andi Kleen
Machine checks support waking up the mcelog daemon quickly. The original wakeup code for this was pretty ugly, relying on an idle notifier and a special process flag. The reason it did it this way is that the machine check handler is not subject to normal interrupt locking rules, so it is not safe to call wake_up(). Instead it set a process flag and then did the wakeup either on syscall return or in the idle notifier. This patch adds a new "bootstrapping" method as a replacement. The idea is that the handler checks whether it is in a state where it is unsafe to call wake_up(). If it is safe, it calls it directly. When it is not safe -- that is, it interrupted a critical section with interrupts disabled -- it uses a new "self IPI" to trigger an IPI to its own CPU. This can be done safely because IPI triggers are atomic with some care. The IPI is raised once interrupts are re-enabled and can then safely call wake_up(). When APICs are disabled the event is just queued and will be picked up eventually by the next polling timer. I think that is a reasonable compromise, since it should only happen quite rarely. Contains fixes from Ying Huang. [ solve conflict on irqinit, make it work on 32bit (entry_arch.h) - HS ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Andi Kleen
The exception handler should behave differently depending on whether the exception is fatal or one that can be returned from. In the first case it should never clear any registers, because these need to be preserved for logging after the next boot. Otherwise it should clear them on each CPU, step by step, so that other CPUs sharing the same bank don't see duplicate events. Otherwise we risk reporting events multiple times on any CPUs which have shared machine check banks, which is a common problem on Intel Nehalem, which has both SMT (two CPU threads sharing banks) and shared machine check banks in the uncore. Determine early, in a special pass, whether any event requires a panic. This uses the mce_severity() function added earlier and is needed for the next patch. Together with an earlier patch, this also fixes a problem where corrected events weren't logged on a fatal MCE. [ Impact: Feature ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Andi Kleen
The machine check grading (that is, deciding what should be done for a given register value) has to be done multiple times soon, and it is also getting more complicated. So it makes sense to consolidate it into a single function. To get smaller, more straightforward and possibly more extensible code, I opted for a new table-driven method: the various rules are put into a table which is then executed by a very simple interpreter. The grading engine is in a new file, mce-severity.c. I also added a private include file, mce-internal.h, because mce.h is already a bit too cluttered. This is dead code right now, but will be used in follow-on patches. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
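A hedged sketch of what such a table-driven grader looks like; the real rules, bit names and severity levels live in mce-severity.c, and everything here (including the example rules) is illustrative:

    /* Sketch: each rule is a mask/value pair over MCi_STATUS-style bits plus
     * a resulting severity; the interpreter returns the first match. */
    #include <stdint.h>

    enum severity { SEV_NONE, SEV_CORRECTED, SEV_RECOVERABLE, SEV_PANIC };

    struct severity_rule {
            uint64_t mask;          /* which status bits to look at */
            uint64_t value;         /* what they must equal */
            enum severity sev;      /* grade if the rule matches */
    };

    #define BIT_VAL   (1ULL << 63)  /* bit positions shown for illustration */
    #define BIT_UC    (1ULL << 61)
    #define BIT_PCC   (1ULL << 57)

    static const struct severity_rule rules[] = {
            { BIT_VAL,                    0,                      SEV_NONE        },
            { BIT_VAL|BIT_UC|BIT_PCC,     BIT_VAL|BIT_UC|BIT_PCC, SEV_PANIC       },
            { BIT_VAL|BIT_UC,             BIT_VAL|BIT_UC,         SEV_RECOVERABLE },
            { BIT_VAL,                    BIT_VAL,                SEV_CORRECTED   },
    };

    static enum severity grade(uint64_t status)
    {
            for (unsigned i = 0; i < sizeof(rules) / sizeof(rules[0]); i++)
                    if ((status & rules[i].mask) == rules[i].value)
                            return rules[i].sev;
            return SEV_PANIC;       /* unknown combinations: be conservative */
    }

Because the interpreter returns the first matching rule, rule order encodes precedence, which is what keeps the table small and easy to extend.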
-