- 22 3月, 2013 1 次提交
-
-
由 Boris Ostrovsky 提交于
Use helper function instead of an array to report whether register bank is shared. Currently only bank 4 (northbridge) is shared. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1363295441-1859-2-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 21 1月, 2013 1 次提交
-
-
由 Rusty Russell 提交于
Fix up all callers as they were before, with make one change: an unsigned module taints the kernel, but doesn't turn off lockdep. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 29 12月, 2012 1 次提交
-
-
由 Tejun Heo 提交于
There's no need to test whether a (delayed) work item in pending before queueing, flushing or cancelling it. Most uses are unnecessary and quite a few of them are buggy. Remove unnecessary pending tests from x86/mce. Only compile tested. v2: Local var work removed from mce_schedule_work() as suggested by Borislav. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NBorislav Petkov <bp@alien8.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac@vger.kernel.org
-
- 31 10月, 2012 1 次提交
-
-
由 Tang Chen 提交于
cmci_rediscover() used set_cpus_allowed_ptr() to change the current process's running cpu, and migrate itself to the dest cpu. But worker processes are not allowed to be migrated. If current is a worker, the worker will be migrated to another cpu, but the corresponding worker_pool is still on the original cpu. In this case, the following BUG_ON in try_to_wake_up_local() will be triggered: BUG_ON(rq != this_rq()); This will cause the kernel panic. The call trace is like the following: [ 6155.451107] ------------[ cut here ]------------ [ 6155.452019] kernel BUG at kernel/sched/core.c:1654! ...... [ 6155.452019] RIP: 0010:[<ffffffff810add15>] [<ffffffff810add15>] try_to_wake_up_local+0x115/0x130 ...... [ 6155.452019] Call Trace: [ 6155.452019] [<ffffffff8166fc14>] __schedule+0x764/0x880 [ 6155.452019] [<ffffffff81670059>] schedule+0x29/0x70 [ 6155.452019] [<ffffffff8166de65>] schedule_timeout+0x235/0x2d0 [ 6155.452019] [<ffffffff810db57d>] ? mark_held_locks+0x8d/0x140 [ 6155.452019] [<ffffffff810dd463>] ? __lock_release+0x133/0x1a0 [ 6155.452019] [<ffffffff81671c50>] ? _raw_spin_unlock_irq+0x30/0x50 [ 6155.452019] [<ffffffff810db8f5>] ? trace_hardirqs_on_caller+0x105/0x190 [ 6155.452019] [<ffffffff8166fefb>] wait_for_common+0x12b/0x180 [ 6155.452019] [<ffffffff810b0b30>] ? try_to_wake_up+0x2f0/0x2f0 [ 6155.452019] [<ffffffff8167002d>] wait_for_completion+0x1d/0x20 [ 6155.452019] [<ffffffff8110008a>] stop_one_cpu+0x8a/0xc0 [ 6155.452019] [<ffffffff810abd40>] ? __migrate_task+0x1a0/0x1a0 [ 6155.452019] [<ffffffff810a6ab8>] ? complete+0x28/0x60 [ 6155.452019] [<ffffffff810b0fd8>] set_cpus_allowed_ptr+0x128/0x130 [ 6155.452019] [<ffffffff81036785>] cmci_rediscover+0xf5/0x140 [ 6155.452019] [<ffffffff816643c0>] mce_cpu_callback+0x18d/0x19d [ 6155.452019] [<ffffffff81676187>] notifier_call_chain+0x67/0x150 [ 6155.452019] [<ffffffff810a03de>] __raw_notifier_call_chain+0xe/0x10 [ 6155.452019] [<ffffffff81070470>] __cpu_notify+0x20/0x40 [ 6155.452019] [<ffffffff810704a5>] cpu_notify_nofail+0x15/0x30 [ 6155.452019] [<ffffffff81655182>] _cpu_down+0x262/0x2e0 [ 6155.452019] [<ffffffff81655236>] cpu_down+0x36/0x50 [ 6155.452019] [<ffffffff813d3eaa>] acpi_processor_remove+0x50/0x11e [ 6155.452019] [<ffffffff813a6978>] acpi_device_remove+0x90/0xb2 [ 6155.452019] [<ffffffff8143cbec>] __device_release_driver+0x7c/0xf0 [ 6155.452019] [<ffffffff8143cd6f>] device_release_driver+0x2f/0x50 [ 6155.452019] [<ffffffff813a7870>] acpi_bus_remove+0x32/0x6d [ 6155.452019] [<ffffffff813a7932>] acpi_bus_trim+0x87/0xee [ 6155.452019] [<ffffffff813a7a21>] acpi_bus_hot_remove_device+0x88/0x16b [ 6155.452019] [<ffffffff813a33ee>] acpi_os_execute_deferred+0x27/0x34 [ 6155.452019] [<ffffffff81090589>] process_one_work+0x219/0x680 [ 6155.452019] [<ffffffff81090528>] ? process_one_work+0x1b8/0x680 [ 6155.452019] [<ffffffff813a33c7>] ? acpi_os_wait_events_complete+0x23/0x23 [ 6155.452019] [<ffffffff810923be>] worker_thread+0x12e/0x320 [ 6155.452019] [<ffffffff81092290>] ? manage_workers+0x110/0x110 [ 6155.452019] [<ffffffff81098396>] kthread+0xc6/0xd0 [ 6155.452019] [<ffffffff8167c4c4>] kernel_thread_helper+0x4/0x10 [ 6155.452019] [<ffffffff81671f30>] ? retint_restore_args+0x13/0x13 [ 6155.452019] [<ffffffff810982d0>] ? __init_kthread_worker+0x70/0x70 [ 6155.452019] [<ffffffff8167c4c0>] ? gs_change+0x13/0x13 This patch removes the set_cpus_allowed_ptr() call, and put the cmci rediscover jobs onto all the other cpus using system_wq. This could bring some delay for the jobs. Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com> Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 30 10月, 2012 1 次提交
-
-
由 Borislav Petkov 提交于
Move to private email and put in maintained status. Signed-off-by: NBorislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1351532410-4887-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 26 10月, 2012 4 次提交
-
-
由 Borislav Petkov 提交于
mce_ser, mce_bios_cmci_threshold and mce_disabled are the last three bools which need conversion. Move them to the mca_config struct and adjust usage sites accordingly. Signed-off-by: NBorislav Petkov <bp@alien8.de> Acked-by: NTony Luck <tony.luck@intel.com>
-
由 Borislav Petkov 提交于
Move them into the mca_config struct and adjust code touching them accordingly. Signed-off-by: NBorislav Petkov <bp@alien8.de> Acked-by: NTony Luck <tony.luck@intel.com>
-
由 Borislav Petkov 提交于
Move above configuration variables into struct mca_config and adjust usage places accordingly. Signed-off-by: NBorislav Petkov <bp@alien8.de> Acked-by: NTony Luck <tony.luck@intel.com>
-
由 Borislav Petkov 提交于
Move those MCA configuration variables into struct mca_config and adjust the places they're used accordingly. Signed-off-by: NBorislav Petkov <bp@alien8.de> Acked-by: NTony Luck <tony.luck@intel.com>
-
- 19 10月, 2012 1 次提交
-
-
由 Borislav Petkov 提交于
450cc201 ("x86/mce: Provide boot argument to honour bios-set CMCI threshold") added the bios_cmci_threshold sysfs attribute which was supposed to communicate to userspace tools that BIOS CMCI threshold has been honoured. However, this info is not of any importance to userspace - it should rather get the actual error count it has been thresholded already from MCi_STATUS[38:52]. So drop this before it becomes a used interface (good thing we caught this early in 3.7-rc1, right after the merge window closed). Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NTony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/20121017105940.GA14590@x1.osrc.amd.comSigned-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 18 10月, 2012 1 次提交
-
-
由 Daniel J Blueman 提交于
When booting on a federated multi-server system (NumaScale), the processor Northbridge lookup returns NULL; add guards to prevent this causing an oops. On those systems, the northbridge is accessed through MMIO and the "normal" northbridge enumeration in amd_nb.c doesn't work since we're generating the northbridge ID from the initial APIC ID and the last is not unique on those systems. Long story short, we end up without northbridge descriptors. Signed-off-by: NDaniel J Blueman <daniel@numascale-asia.com> Cc: stable@vger.kernel.org # 3.6 Link: http://lkml.kernel.org/r/1349073725-14093-1-git-send-email-daniel@numascale-asia.com [ Boris: beef up commit message ] Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 28 9月, 2012 1 次提交
-
-
由 Naveen N. Rao 提交于
The ACPI spec doesn't provide for a way for the bios to pass down recommended thresholds to the OS on a _per-bank_ basis. This patch adds a new boot option, which if passed, tells Linux to use CMCI thresholds set by the bios. As fail-safe, we initialize threshold to 1 if some banks have not been initialized by the bios and warn the user. Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 10 8月, 2012 2 次提交
-
-
由 Chen Gong 提交于
On Intel systems corrected machine check interrupts (CMCI) may be sent to multiple logical processors; possibly to all processors on the affected socket (SDM Volume 3B "15.5.1 CMCI Local APIC Interface"). This means that a persistent error (such as a stuck bit in ECC memory) may cause a storm of interrupts that greatly hinders or prevents forward progress (probably on many processors). To solve this we keep track of the rate at which each processor sees CMCI. If we exceed a threshold, we disable CMCI delivery and switch to polling the machine check banks. If the storm subsides (none of the affected processors see any more errors for a complete poll interval) we re-enable CMCI. [Tony: Added console messages when storm begins/ends and increased storm threshold from 5 to 15 so we have a few more logged entries before we disable interrupts and start dropping reports] Signed-off-by: NChen Gong <gong.chen@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NChen Gong <gong.chen@linux.intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Tony Luck 提交于
cmci_discover() works out which machine check banks support CMCI, and which of those are shared by multiple logical processors. It uses this information to ensure that exactly one cpu is designated the owner of each bank so that when interrupts are broadcast to multiple cpus, only one of them will look in a shared bank to log the error and clear the bank. At boot time cmci_discover() performs this task silently. But during certain cpu hotplug operations it prints out a set of summary lines like this: CPU 35 MCA banks CMCI:0 CMCI:1 CMCI:3 CMCI:5 CMCI:6 CMCI:7 CMCI:8 CMCI:9 CMCI:10 CMCI:11 CPU 1 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 39 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 38 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 32 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 37 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 36 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 34 MCA banks CMCI:0 CMCI:1 CMCI:3 The value of these messages seems very low. A user might painstakingly cross-check against the data sheet for a processor to ensure that all CMCI supported banks are correctly reported, but this seems improbable. If users really wanted to do this, we should print the information at boot time too. Remove the messages. Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 04 8月, 2012 4 次提交
-
-
由 Thomas Gleixner 提交于
No point in having double cases if we can simply mask the FROZEN bit out. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChen Gong <gong.chen@linux.intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Thomas Gleixner 提交于
Split timer init function into the init and the start part, so the start part can replace the open coded version in CPU_DOWN_FAILED. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChen Gong <gong.chen@linux.intel.com> Acked-by: NBorislav Petkov <borislav.petkov@amd.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Thomas Gleixner 提交于
raise_mce() fiddles with global state, but lacks any kind of serialization. Add a mutex around the raise_mce() call, so concurrent writers do not stomp on each other toes. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChen Gong <gong.chen@linux.intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Thomas Gleixner 提交于
raise_mce() has a code path which does not disable preemption when the raise_local() is called. The per cpu variable access in raise_local() depends on preemption being disabled to be functional. So that code path was either never tested or never tested with CONFIG_DEBUG_PREEMPT enabled. Add the missing preempt_disable/enable() pair around the call. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChen Gong <gong.chen@linux.intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 26 7月, 2012 2 次提交
-
-
由 Tony Luck 提交于
Sandy Bridge processors follow the SDM (Vol 3B, Table 15-20) and set both the RIPV and EIPV bits in the MCG_STATUS register to zero for machine checks during instruction fetch. This is more than a little counter-intuitive and means that Linux cannot recover from these errors. Rather than insert special case code at several places in mce.c and mce-severity.c, we pretend the EIPV bit was set for just this case early in processing the machine check. Acked-by: NBorislav Petkov <bp@amd64.org> Signed-off-by: NTony Luck <tony.luck@intel.com> Cc: Chen Gong <gong.chen@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Link: http://lkml.kernel.org/r/180a06f3f357cf9f78259ae443a082b14a29535b.1343078495.git.tony.luck@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Tony Luck 提交于
We will need some of these values in mce.c. Move them to the appropriate header file so they are available. Acked-by: NBorislav Petkov <bp@amd64.org> Signed-off-by: NTony Luck <tony.luck@intel.com> Cc: Chen Gong <gong.chen@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Link: http://lkml.kernel.org/r/0ccfb1af5fe35e537b7cd8e4d448bf7d851dbfb9.1343078495.git.tony.luck@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 20 7月, 2012 2 次提交
-
-
由 Liu, Jinsong 提交于
there are 3 funcs which need to be _initcalled in a logic sequence: 1. xen_late_init_mcelog 2. mcheck_init_device 3. threshold_init_device xen_late_init_mcelog must register xen_mce_chrdev_device before native mce_chrdev_device registration if running under xen platform; mcheck_init_device should be inited before threshold_init_device to initialize mce_device, otherwise a a NULL ptr dereference will cause panic. so we use following _initcalls 1. device_initcall(xen_late_init_mcelog); 2. device_initcall_sync(mcheck_init_device); 3. late_initcall(threshold_init_device); when running under xen, the initcall order is 1,2,3; on baremetal, we skip 1 and we do only 2 and 3. Acked-and-tested-by: NBorislav Petkov <bp@amd64.org> Suggested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NLiu, Jinsong <jinsong.liu@intel.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Liu, Jinsong 提交于
When MCA error occurs, it would be handled by Xen hypervisor first, and then the error information would be sent to initial domain for logging. This patch gets error information from Xen hypervisor and convert Xen format error into Linux format mcelog. This logic is basically self-contained, not touching other kernel components. By using tools like mcelog tool users could read specific error information, like what they did under native Linux. To test follow directions outlined in Documentation/acpi/apei/einj.txt Acked-and-tested-by: NBorislav Petkov <borislav.petkov@amd.com> Signed-off-by: NKe, Liping <liping.ke@intel.com> Signed-off-by: NJiang, Yunhong <yunhong.jiang@intel.com> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NLiu, Jinsong <jinsong.liu@intel.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 12 7月, 2012 1 次提交
-
-
由 Tony Luck 提交于
In commit dad1743e ("x86/mce: Only restart instruction after machine check recovery if it is safe") we fixed mce_notify_process() to force a signal to the current process if it was not restartable (RIPV bit not set in MCG_STATUS). But doing it here means that the process doesn't get told the virtual address of the fault via siginfo_t->si_addr. This would prevent application level recovery from the fault. Make a new MF_MUST_KILL flag bit for memory_failure() et al. to use so that we will provide the right information with the signal. Signed-off-by: NTony Luck <tony.luck@intel.com> Acked-by: NBorislav Petkov <borislav.petkov@amd.com> Cc: stable@kernel.org # 3.4+
-
- 07 6月, 2012 8 次提交
-
-
由 Borislav Petkov 提交于
Jacob is doing something else now so add myself as the loser who provides support. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Having the banks numbered is ok but having real names which mean something to the user makes a lot more sense: /sys/devices/system/machinecheck/machinecheck0/ |-- bank0 |-- bank1 |-- bank2 |-- bank3 |-- bank4 |-- bank5 |-- bank6 |-- check_interval |-- cmci_disabled |-- combined_unit | |-- combined_unit | |-- error_count | |-- threshold_limit |-- dont_log_ce |-- execution_unit | |-- execution_unit | |-- error_count | |-- threshold_limit |-- ignore_ce |-- insn_fetch | |-- insn_fetch | |-- error_count | |-- threshold_limit |-- load_store | |-- load_store | |-- error_count | |-- threshold_limit |-- monarch_timeout |-- northbridge | |-- dram | | |-- error_count | | |-- interrupt_enable | | |-- threshold_limit | |-- ht_links | | |-- error_count | | |-- interrupt_enable | | |-- threshold_limit | |-- l3_cache | |-- error_count | |-- interrupt_enable | |-- threshold_limit ... Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Until now, writing to error count caused the code to reset the thresholding bank to the current thresholding limit and start counting errors from the beginning. This is misleading and unclear, and can be accomplished by writing the old thresholding limit into ->threshold_limit. Make error_count read-only with the functionality to show the current error count. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
We have rdmsr_on_cpu() now so remove locally defined solution in favor of the generic one. No functionality change. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
If one sets the threshold limit, say to 25: $ echo 25 > machinecheck0/threshold_bank4/misc0/threshold_limit and then reads it back again, it gives $ cat machinecheck0/threshold_bank4/misc0/threshold_limit 19 which is actually 0x19 but we don't know that. Make all output decimal. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
Well, instead of having a real bank 4 on the BSP of each node and symlinks on the remaining cores, we push it up into the amd_northbridge descriptor which now contains a pointer to the northbridge bank 4 because the bank is one per northbridge and, as such, belongs in the NB descriptor anyway. Each time we hotplug CPUs, we use the northbridge pointer to copy the shared bank into the per-CPU array of threshold_banks pointers, or destroy it when the last CPU on the node goes offline, or create it when the first comes online. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
It is unneeded now so drop it. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
由 Borislav Petkov 提交于
The code used to create a symlink on all non-BSP cores of a node when the MCi_MISCj bank is present once per node. (This is generally the case with bank 4 on AMD). However, these sysfs links cause a bunch of problems with cpu off-/onlining testing and are, as such, a bit overengineered. IOW, there's nothing wrong with having normal sysfs files for the shared banks since the corresponding MSRs are replicated across each core anyway. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
-
- 06 6月, 2012 4 次提交
-
-
由 Thomas Gleixner 提交于
commit 82f7af09 ("x86/mce: Cleanup timer mess) dropped the initialization of the per cpu timer interval. Duh :( Restore the previous behaviour. Reported-by: NChen Gong <gong.chen@linux.intel.com> Cc: bp@amd64.org Cc: tony.luck@intel.com Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Joe Perches 提交于
Use a more current logging style: - Bare printks should have a KERN_<LEVEL> for consistency's sake - Add pr_fmt where appropriate - Neaten some macro definitions - Convert some Ok output to OK - Use "%s: ", __func__ in pr_fmt for summit - Convert some printks to pr_<level> Message output is not identical in all cases. Signed-off-by: NJoe Perches <joe@perches.com> Cc: levinsasha928@gmail.com Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop [ merged two similar patches, tidied up the changelog ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Chen Gong 提交于
In commit 82f7af09 ("x86/mce: Cleanup timer mess), Thomas just forgot the "/ 2" there while cleaning up. Signed-off-by: NChen Gong <gong.chen@linux.intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: bp@amd64.org Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/1338863702-9245-1-git-send-email-gong.chen@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Chen Gong 提交于
In commit 82f7af09 (x86/mce: Cleanup timer mess), Thomas just forgot the "/ 2" there while cleaning up. Signed-off-by: NChen Gong <gong.chen@linux.intel.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 31 5月, 2012 1 次提交
-
-
由 Thomas Gleixner 提交于
Use unsigned long for dealing with jiffies not int. Rename the callback to something sensible. Use __this_cpu_read/write for accessing per cpu data. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NBorislav Petkov <borislav.petkov@amd.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 24 5月, 2012 3 次提交
-
-
由 Tony Luck 提交于
Instruction recovery cases are very similar to the data recovery one we already have. Just trade out for a new MCACOD value. Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Tony Luck 提交于
Linus pointed out that there was no value is checking whether m->ip was zero - because zero is a legimate value. If we have a reliable (or faked in the VM86 case) "m->cs" we can use it to tell whether we were in user mode or kernelwhen the machine check hit. Reported-by: NLinus Torvalds <torvalds@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Andi Kleen 提交于
When running on 32bit the mce handler could misinterpret vm86 mode as ring 0. This can affect whether it does recovery or not; it was possible to panic when recovery was actually possible. Fix this by always forcing vm86 to look like ring 3. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 23 5月, 2012 1 次提交
-
-
由 Borislav Petkov 提交于
Got bitten again by the BIT() macro: arch/x86/kernel/cpu/mcheck/mce.c: In function '__mcheck_cpu_apply_quirks': arch/x86/kernel/cpu/mcheck/mce.c:1453:6: warning: left shift count >= width of type arch/x86/kernel/cpu/mcheck/mce.c:1454:7: warning: left shift count >= width of type Fix it already. Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com> Cc: Frank Arnold <frank.arnold@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337684026-19740-2-git-send-email-bp@amd64.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-