- 17 6月, 2009 8 次提交
-
-
由 Hidetoshi Seto 提交于
The mce_disabled on 32bit is a tristate variable [1,0,-1], while 64bit version is boolean [0,1]. This patch makes mce_disabled always boolean, and use mce_p5_enabled to indicate the third state instead. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Hidetoshi Seto 提交于
There are 2 headers: arch/x86/include/asm/mce.h arch/x86/kernel/cpu/mcheck/mce.h and in the latter small header: #include <asm/mce.h> This patch move all contents in the latter header into the former, and fix all files using the latter to include the former instead. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Hidetoshi Seto 提交于
Add sysfs interface for admins who want to tweak these options without rebooting the system. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Hidetoshi Seto 提交于
"trigger" is not straight forward name for valiable that holds name of user mode helper program which triggered by machine check events. This patch renames this valiable and kins to more recognizable names. trigger => mce_helper trigger_argv => mce_helper_argv notify_user => mce_need_notify No functional changes. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Hidetoshi Seto 提交于
Add __read_mostly to data written during setup. Suggested-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Hidetoshi Seto 提交于
Simplify interface of mce_start(): - no_way_out = mce_start(no_way_out, &order); + order = mce_start(&no_way_out); Now Monarch and Subjects share same exit(return) in usual path. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Hidetoshi Seto 提交于
In mce_cpu_restart, mce_init_timer is called unconditionally. If !mce_available (e.g. mce is disabled), there are no useful work for timer. Stop running it. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Huang Ying 提交于
If one CPU has no_way_out == 1, all other CPUs should have no_way_out == 1. But despite global_nwo is read after mce_callin, global_nwo is updated after mce_callin too. So it is possible that some CPU read global_nwo before some other CPU update global_nwo, so that no_way_out == 1 for some CPU, while no_way_out == 0 for some other CPU. This patch fixes this race condition via moving mce_callin updating after global_nwo updating, with a smp_wmb in between. A smp_rmb is added between their reading too. Signed-off-by: NHuang Ying <ying.huang@intel.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
-
- 12 6月, 2009 1 次提交
-
-
由 Yong Wang 提交于
The fixed-function performance counters do not work on current Atom processors. Use the general-purpose ones instead. Signed-off-by: NYong Wang <yong.y.wang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090612080855.GA2286@ywang-moblin2.bj.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 6月, 2009 8 次提交
-
-
由 Peter Zijlstra 提交于
The top (fastest) and last level (biggest) caches are the most interesting ones, performance wise. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> [ Fixed the Nehalem LL table to LLC Reference/Miss events ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Hidetoshi Seto 提交于
This patch introduces three boot options (no_cmci, dont_log_ce and ignore_ce) to control handling for corrected errors. The "mce=no_cmci" boot option disables the CMCI feature. Since CMCI is a new feature so having boot controls to disable it will be a help if the hardware is misbehaving. The "mce=dont_log_ce" boot option disables logging for corrected errors. All reported corrected errors will be cleared silently. This option will be useful if you never care about corrected errors. The "mce=ignore_ce" boot option disables features for corrected errors, i.e. polling timer and cmci. All corrected events are not cleared and kept in bank MSRs. Usually this disablement is not recommended, however it will be a help if there are some conflict with the BIOS or hardware monitoring applications etc., that clears corrected events in banks instead of OS. [ And trivial cleanup (space -> tab) for doc is included. ] Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Reviewed-by: NAndi Kleen <ak@linux.intel.com> LKML-Reference: <4A30ACDF.5030408@jp.fujitsu.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Hidetoshi Seto 提交于
This patch: - Adds print_mce_head() instead of first flag - Makes the header to be printed always - Stops double printing of corrected errors [ This portion originates from Huang Ying's patch ] Originally-From: Huang Ying <ying.huang@intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> LKML-Reference: <4A30AC83.5010708@jp.fujitsu.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
We currently log hw.sample_period for PERF_SAMPLE_PERIOD, however this is incorrect. When we adjust the period, it will only take effect the next cycle but report it for the current cycle. So when we adjust the period for every cycle, we're always wrong. Solve this by keeping track of the last_period. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
For easy extension of the sample data, put it in a structure. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Harald Welte 提交于
The e_powersaver driver for VIA's C7 CPU's needs to be marked as DANGEROUS as it configures the CPU to power states that are out of specification. According to Centaur, all systems with C7 and Nano CPU's support the ACPI p-state method. Thus, the acpi-cpufreq driver should be used instead. Signed-off-by: NHarald Welte <HaraldWelte@viatech.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Harald Welte 提交于
The VIA/Centaur C7, C7-M and Nano CPU's all support ACPI based cpu p-states using a MSR interface. The Linux driver just never made use of it, since in addition to the check for the EST flag it also checked if the vendor is Intel. Signed-off-by: NHarald Welte <HaraldWelte@viatech.com> [ Removed the vendor checks entirely - Linus ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 6月, 2009 3 次提交
-
-
由 Peter Zijlstra 提交于
Also employ the overflow handler to adjust the frequency, this results in a stable frequency in about 40~50 samples, instead of that many ticks. This also means we can start sampling at a sample period of 1 without running head-first into the throttle. It relies on sched_clock() to accurately measure the time difference between the overflow NMIs. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yong Wang 提交于
Fix the model number of Intel Core2 processors according to the documentation: Intel Processor Identification with the CPUID Instruction: http://www.intel.com/support/processors/sb/cs-009861.htmSigned-off-by: NYong Wang <yong.y.wang@intel.com> Also-Reported-by: NArnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090610090612.GA26580@ywang-moblin2.bj.intel.com> [ Added two more model numbers suggested by Arnd Bergmann ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andi Kleen 提交于
VT-x needs an explicit MC vector intercept to handle machine checks in the hyper visor. It also has a special option to catch machine checks that happen during VT entry. Do these interceptions and forward them to the Linux machine check handler. Make it always look like user space is interrupted because the machine check handler treats kernel/user space differently. Thanks to Jiang Yunhong for help and testing. Cc: stable@kernel.org Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHuang Ying <ying.huang@intel.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
- 09 6月, 2009 5 次提交
-
-
由 Yong Wang 提交于
Correct some event and UMASK values according to Intel SDM, in the Nehalem and Atom tables. Signed-off-by: NYong Wang <yong.y.wang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090609131553.GA12489@ywang-moblin2.bj.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
Booting a 32-bit kernel on Magny-Cours results in the following panic: ... Using APIC driver default ... Overriding APIC driver with bigsmp ... Getting VERSION: 80050010 Getting VERSION: 80050010 Getting ID: 10000000 Getting ID: ef000000 Getting LVT0: 700 Getting LVT1: 10000 Kernel panic - not syncing: Boot APIC ID in local APIC unexpected (16 vs 0) Pid: 1, comm: swapper Not tainted 2.6.30-rcX #2 Call Trace: [<c05194da>] ? panic+0x38/0xd3 [<c0743102>] ? native_smp_prepare_cpus+0x259/0x31f [<c073b19d>] ? kernel_init+0x3e/0x141 [<c073b15f>] ? kernel_init+0x0/0x141 [<c020325f>] ? kernel_thread_helper+0x7/0x10 The reason is that default_get_apic_id handled extension of local APIC ID field just in case of XAPIC. Thus for this AMD CPU, default_get_apic_id() returns 0 and bigsmp_get_apic_id() returns 16 which leads to the respective kernel panic. This patch introduces a Linux specific feature flag to indicate support for extended APIC id (8 bits instead of 4 bits width) and sets the flag on AMD CPUs if applicable. Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: <stable@kernel.org> LKML-Reference: <20090608135509.GA12431@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
These are defined as static cpumask_var_t so if MAXSMP is not used, they are cleared already. Avoid surprises when MAXSMP is enabled. Signed-off-by: NYinghai Lu <yinghai.lu@kernel.org> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Fill in amd_hw_cache_event_id[] with the AMD CPU specific events, for family 0x0f, 0x10 and 0x11. There's apparently no distinction between load and store events, so we only fill in the load events. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 08 6月, 2009 3 次提交
-
-
由 Ingo Molnar 提交于
Standardize and tidy up all the messages we print during perfcounter initialization. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Fill in core2_hw_cache_event_id[] with the Atom model specific events. The events can be used in all the tools via the -e (--event) parameter, for example "-e l1-misses" or -"-e l2-accesses" or "-e l2-write-misses". ( Note: these are straight from the Intel manuals - not tested yet.) Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Fill in core2_hw_cache_event_id[] with the Core2 model specific events. The events can be used in all the tools via the -e (--event) parameter, for example "-e l1-misses" or -"-e l2-accesses" or "-e l2-write-misses". Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 6月, 2009 1 次提交
-
-
由 Jaswinder Singh Rajput 提交于
Remove model information, encoding/decoding and reduce bookkeeping. This, besides removing a lot of code and cleaning up the code, also enables these features on many more CPUs that were enumerated before. Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> LKML-Reference: <1244224637.8212.6.camel@ht.satnam> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 06 6月, 2009 3 次提交
-
-
由 Ingo Molnar 提交于
Extend generic event enumeration with the PERF_TYPE_HW_CACHE method. This is a 3-dimensional space: { L1-D, L1-I, L2, ITLB, DTLB, BPU } x { load, store, prefetch } x { accesses, misses } User-space passes in the 3 coordinates and the kernel provides a counter. (if the hardware supports that type and if the combination makes sense.) Combinations that make no sense produce a -EINVAL. Combinations that are not supported by the hardware produce -ENOTSUP. Extend the tools to deal with this, and rewrite the event symbol parsing code with various popular aliases for the units and access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are both valid aliases. ( x86 is supported for now, with the Nehalem event table filled in, and with Core2 and Atom having placeholder tables. ) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Counter type is a frequently used value and we do a lot of bit juggling by encoding and decoding it from attr->config. Clean this up by creating a separate attr->type field. Also clean up the various similarly complex user-space bits all around counter attribute management. The net improvement is significant, and it will be easier to add a new major type (which is what triggered this cleanup). (This changes the ABI, all tools are adapted.) (PowerPC build-tested.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dave Jones 提交于
The powernow-k8 driver checks to see that the Performance Control/Status Registers are declared as FFH (functional fixed hardware) by the BIOS. However, this check got broken in the commit: 0e64a0c9 [CPUFREQ] checkpatch cleanups for powernow-k8 Fix based on an original patch from Naga Chumbalkar. Signed-off-by: NNaga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: NDave Jones <davej@redhat.com>
-
- 04 6月, 2009 8 次提交
-
-
由 Andi Kleen 提交于
Newer Intel CPUs support a new class of machine checks called recoverable action optional. Action Optional means that the CPU detected some form of corruption in the background and tells the OS about using a machine check exception. The OS can then take appropiate action, like killing the process with the corrupted data or logging the event properly to disk. This is done by the new generic high level memory failure handler added in a earlier patch. The high level handler takes the address with the failed memory and does the appropiate action, like killing the process. In this version of the patch the high level handler is stubbed out with a weak function to not create a direct dependency on the hwpoison branch. The high level handler cannot be directly called from the machine check exception though, because it has to run in a defined process context to be able to sleep when taking VM locks (it is not expected to sleep for a long time, just do so in some exceptional cases like lock contention) Thus the MCE handler has to queue a work item for process context, trigger process context and then call the high level handler from there. This patch adds two path to process context: through a per thread kernel exit notify_user() callback or through a high priority work item. The first runs when the process exits back to user space, the other when it goes to sleep and there is no higher priority process. The machine check handler will schedule both, and whoever runs first will grab the event. This is done because quick reaction to this event is critical to avoid a potential more fatal machine check when the corruption is consumed. There is a simple lock less ring buffer to queue the corrupted addresses between the exception handler and the process context handler. Then in process context it just calls the high level VM code with the corrupted PFNs. The code adds the required code to extract the failed address from the CPU's machine check registers. It doesn't try to handle all possible cases -- the specification has 6 different ways to specify memory address -- but only the linear address. Most of the required checking has been already done earlier in the mce_severity rule checking engine. Following the Intel recommendations Action Optional errors are only enabled for known situations (encoded in MCACODs). The errors are ignored otherwise, because they are action optional. v2: Improve comment, disable preemption while processing ring buffer (reported by Ying Huang) Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andi Kleen 提交于
Rename the mce_notify_user function to mce_notify_irq. The next patch will split the wakeup handling of interrupt context and of process context and it's better to give it a clearer name for this. Contains a fix from Ying Huang [ Impact: cleanup ] Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Huang Ying 提交于
The MCE severity judgement code is data-driven, so code coverage tools such as gcov can not be used for measuring coverage. Instead a dedicated coverage mechanism is implemented. The kernel keeps track of rules executed and reports them in debugfs. This is useful for increasing coverage of the mce-test testsuite. Right now it's unconditionally enabled because it's very little code. Signed-off-by: NHuang Ying <ying.huang@intel.com> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andi Kleen 提交于
The x86 architecture recently added some new machine check status bits: S(ignalled) and AR (Action-Required). Signalled allows to check if a specific event caused an exception or was just logged through CMCI. AR allows the kernel to decide if an event needs immediate action or can be delayed or ignored. Implement support for these new status bits. mce_severity() uses the new bits to grade the machine check correctly and decide what to do. The exception handler uses AR to decide to kill or not. The S bit is used to separate events between the poll/CMCI handler and the exception handler. Classical UC always leads to panic. That was true before anyways because the existing CPUs always passed a PCC with it. Also corrects the rules whether to kill in user or kernel context and how to handle missing RIPV. The machine check handler largely uses the mce-severity grading engine now instead of making its own decisions. This means the logic is centralized in one place. This is useful because it has to be evaluated multiple times. v2: Some rule fixes; Add AO events Fix RIPV, RIPV|EIPV order (Ying Huang) Fix UCNA with AR=1 message (Ying Huang) Add comment about panicing in m_c_p. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andi Kleen 提交于
When multiple MCEs are printed print the "HARDWARE ERROR" header and "This is not a software error" footer only once. This makes the output much more compact with many CPUs. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andi Kleen 提交于
Fatal machine checks can be logged to disk after boot, but only if the system did a warm reboot. That's unfortunately difficult with the default panic behaviour, which waits forever and the admin has to press the power button because modern systems usually miss a reset button. This clears the machine checks in the registers and make it impossible to log them. This patch changes the default for machine check panic to always reboot after 30s. Then the mce can be successfully logged after reboot. I believe this will improve machine check experience for any system running the X server. This is dependent on successfull boot logging of MCEs. This currently only works on Intel systems, on AMD there are quite a lot of systems around which leave junk in the machine check registers after boot, so it's disabled here. These systems will continue to default to endless waiting panic. v2: Only force panic timeout when it's shorter (H.Seto) v3: Only force timeout when there is no timeout (based on comment H.Seto) [ Fix changelog - HS ] Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Huang Ying 提交于
Assume IP on the stack is valid when either EIPV or RIPV are set. This influences whether the machine check exception handler decides to return or panic. This fixes a test case in the mce-test suite and is more compliant to the specification. This currently only makes a difference in a artificial testing scenario with the mce-test test suite. Also in addition do not force the EIPV to be valid with the exact register MSRs, and keep in trust the CS value on stack even if MSR is available. [AK: combination of patches from Huang Ying and Hidetoshi Seto, with new description by me] [add some description, no code changed - HS] Signed-off-by: NHuang Ying <ying.huang@intel.com> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Andi Kleen 提交于
... instead of "Machine check". This is for consistency with the Monarch panic message. Based on a report from Ying Huang. v2: But add a descriptive postfix so that the test suite can distingush. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-