- 15 12月, 2016 3 次提交
-
-
由 Boris Ostrovsky 提交于
acpi_map_pxm_to_node() unconditially maps nodes even when NUMA is turned off. So acpi_get_node() might return a node > 0, which is fatal when NUMA is disabled as the rest of the kernel assumes that only node 0 exists. Expose numa_off to the acpi code and return NUMA_NO_NODE when it's set. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: linux-ia64@vger.kernel.org Cc: catalin.marinas@arm.com Cc: rjw@rjwysocki.net Cc: will.deacon@arm.com Cc: linux-acpi@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1481602709-18260-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Boris Ostrovsky 提交于
Use NUMA_NO_NODE instead of -1. Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: linux-acpi@vger.kernel.org Cc: pavel@ucw.cz Link: http://lkml.kernel.org/r/1481570993-13941-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
prefill_possible_map() reinitializes the cpu_possible_map by setting the possible cpu bits and clearing all other bits up to NR_CPUS. This is technically always correct because cpu_possible_map is statically allocated and sized NR_CPUS. With CPUMASK_OFFSTACK and DEBUG_PER_CPU_MAPS enabled the bounds check of cpu masks happens on nr_cpu_ids. nr_cpu_ids is initialized to NR_CPUS and only limited after the set/clear bit loops have been executed. But if the system was booted with "nr_cpus=N" on the command line, where N is < NR_CPUS then nr_cpu_ids is limited in the parameter parsing function before prefill_possible_map() is invoked. As a consequence the cpumask bounds check triggers when clearing the bits past nr_cpu_ids. Add a helper which allows to reset cpu_possible_map w/o the bounds check and then set only the possible bits which are well inside bounds. Reported-by: NDmitry Safonov <dsafonov@virtuozzo.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: 0x7f454c46@gmail.com Cc: Jan Beulich <JBeulich@novell.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612131836050.3415@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 14 12月, 2016 2 次提交
-
-
由 Josh Poimboeuf 提交于
start_cpu() pushes a text address on the stack so that stack traces from idle tasks will show start_cpu() at the end. But it currently shows the wrong function offset. It's more correct to show the address immediately after the 'lretq' instruction. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2cadd9f16c77da7ee7957bfc5e1c67928c23ca48.1481685203.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Josh Poimboeuf 提交于
start_cpu() pushes a text address on the stack so that stack traces from idle tasks will show start_cpu() at the end. But it uses a call instruction to do that, which is rather obtuse. Use a straightforward push instead. Suggested-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4d8a1952759721d42d1e62ba9e4a7e3ac5df8574.1481685203.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 12月, 2016 1 次提交
-
-
由 Thomas Gleixner 提交于
The logical package management has several issues: - The APIC ids provided by ACPI are not required to be the same as the initial APIC id which can be retrieved by CPUID. The APIC ids provided by ACPI are those which are written by the BIOS into the APIC. The initial id is set by hardware and can not be changed. The hardware provided ids contain the real hardware package information. Especially AMD sets the effective APIC id different from the hardware id as they need to reserve space for the IOAPIC ids starting at id 0. As a consequence those machines trigger the currently active firmware bug printouts in dmesg, These are obviously wrong. - Virtual machines have their own interesting of enumerating APICs and packages which are not reliably covered by the current implementation. The sizing of the mapping array has been tweaked to be generously large to handle systems which provide a wrong core count when HT is disabled so the whole magic which checks for space in the physical hotplug case is not needed anymore. Simplify the whole machinery and do the mapping when the CPU starts and the CPUID derived physical package information is available. This solves the observed problems on AMD machines and works for the virtualization issues as well. Remove the extra call from XEN cpu bringup code as it is not longer required. Fixes: d49597fd ("x86/cpu: Deal with broken firmware (VMWare/XEN)") Reported-and-tested-by: NBorislav Petkov <bp@suse.de> Tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: M. Vefa Bicakci <m.v.b@runbox.com> Cc: xen-devel <xen-devel@lists.xen.org> Cc: Charles (Chas) Williams <ciwillia@brocade.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Alok Kataria <akataria@vmware.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 12 12月, 2016 22 次提交
-
-
由 Stefan Kristiansson 提交于
During page fault handling we check the last instruction to understand if the fault was for a read or for a write. By default we fall back to read. New instructions were added to the openrisc 1.1 spec for an atomic load/store pair (l.lwa/l.swa). This patch adds the opcode for l.swa (0x33) allowing it to be treated as a write operation. Signed-off-by: NStefan Kristiansson <stefan.kristiansson@saunalahti.fi> [shorne@gmail.com: expanded a bit on the comment] Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Stafford Horne 提交于
The openrisc.net domain expired and was taken over by squatters. These updates point documentation to the new domain, mailing lists and git repos. Also, Jonas is not the main maintainer anylonger, he reviews changes but does not maintain a repo or sent pull requests. Updating this to add Stafford and Stefan who are the active maintainers. Acked-by: NOlof Kindgren <olof.kindgren@gmail.com> Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Stafford Horne 提交于
Clearing out one todo item. Use the memblock boot time memory which is the current standard. Tested-by: NGuenter Roeck <linux@roeck-us.net> Acked-by: NJonas <jonas@southpole.se> Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Rob Herring 提交于
The of_platform_populate call in the openrisc arch code is now redundant as the DT core provides a default call. Openrisc has a NULL match table which means only top level nodes with compatible strings will have devices creates. The default version will also descend nodes in the match table such as "simple-bus" which should be fine as openrisc doesn't have any of these (though it is preferred that memory-mapped peripherals be grouped under a bus node(s)). Signed-off-by: NRob Herring <robh@kernel.org> Cc: Jonas Bonn <jonas@southpole.se> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Stafford Horne 提交于
The build system now expects that NR_CPUS is defined. Follow 4cbbbb43 ("microblaze: Fix missing NR_CPUS in menuconfig") Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Guenter Roeck 提交于
The output file format for or1k has changed from "elf32-or32" to "elf32-or1k". Select the correct output format automatically to be able to compile the kernel with both toolchain variants. Signed-off-by: NGuenter Roeck <linux@roeck-us.net> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Christian Svensson 提交于
Historically OpenRISC GCC has reserved r10 which we now use to hold the thread pointer for thread-local storage (TLS). Signed-off-by: NChristian Svensson <blue@cmd.nu> Signed-off-by: NStefan Kristiansson <stefan.kristiansson@saunalahti.fi> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Jonas Bonn 提交于
Fix signal handling for when signals are handled as the result of timers or exceptions, previous code assumed syscalls. This was noticeable with X crashing where it uses SIGALRM. This patch restores all regs before returning to userspace via _resume_userspace instead of via syscall return path. The rt_sigreturn syscall is more like a context switch than a function call; it entails a return from one context (the signal handler) to another (the process in question). For a context switch like this there are effectively no call-saved regs that remain constant across the transition. Reported-by: NSebastian Macke <sebastian@macke.de> Signed-off-by: NJonas Bonn <jonas@southpole.se> Tested-by: NGuenter Roeck <linux@roeck-us.net> [shorne@gmail.com: Updated comment better reflect change and issue] Signed-off-by: NStafford Horne <shorne@gmail.com>
-
由 Stefan Kristiansson 提交于
On OpenRISC, with its 8k pages, PAGE_SHIFT is defined to be 13. That makes the expression (1UL << (PAGE_SHIFT-2)) evaluate to 2048. The correct value for PTRS_PER_PGD should be 256. Correcting the PTRS_PER_PGD define unveiled a bug in map_ram(), where PTRS_PER_PGD was used when the intent was to iterate over a set of page table entries. This patch corrects that issue as well. Signed-off-by: NStefan Kristiansson <stefan.kristiansson@saunalahti.fi> Acked-by: NJonas Bonn <jonas@southpole.se> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NStafford Horne <shorne@gmail.com>
-
This patch wires up the new pkey_mprotect, pkey_alloc and pkey_free syscalls on AVR32.
-
由 Markus Elfring 提交于
Strings which did not contain data format specifications should be put into a sequence. Thus use the corresponding function "seq_puts". This issue was detected by using the Coccinelle software. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net>
-
由 Markus Elfring 提交于
A single character (line break) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net>
-
由 Markus Elfring 提交于
Some data were printed into a sequence by nine separate function calls. Print the same data by a single function call instead. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net>
-
由 Markus Elfring 提交于
A single character (line break) should be put into two sequences. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net>
-
由 Gonglei \(Arei\) 提交于
>> arch/sparc/include/asm/topology_64.h:44:44: error: implicit declaration of function 'cpu_data' [-Werror=implicit-function-declaration] #define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id) ^ Let's include cpudata.h in topology_64.h. Cc: Sam Ravnborg <sam@ravnborg.org> Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Suggested-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NGonglei <arei.gonglei@huawei.com> Acked-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill A. Shutemov 提交于
It really has to be pgdp, not pgd. It just happend to work since all callers have 'pgd' as an argument. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
There are some error paths where we should restore IRQs but we don't. Fixes: bb620c3d ("sparc: Make sparc64 use scalable lib/iommu-common.c functions") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
The original code causes a static checker warning because it has a continue inside a do { } while (0); loop. In that context, a continue and a break are equivalent. The intent was to go back to the start of the loop so the continue was a bug. I've added a retry label at the start and changed the continue to a goto retry. Then I removed the do { } while (0) loop and pulled the code in one indent level. Fixes: 2791c1a4 ("SPARC/LEON: added support for selecting Timer Core and Timer within core") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
My static checker complains that if "lvl" is ULONG_MAX (this is 64 bit) then some of the strings will overflow. I don't know if that's possible but it seems simple enough to make the buffers slightly larger. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
We shouldn't dereference "iommu" until after we have checked that it is non-NULL. Fixes: f08978b0 ("sparc64: Enable sun4v dma ops to use IOMMU v2 APIs") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Geliang Tang 提交于
Use builtin_platform_driver() helper to simplify the code. Signed-off-by: NGeliang Tang <geliangtang@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Allen Pais 提交于
Signed-off-by: NEric Saint Etienne <eric.saint.etienne@oracle.com> Signed-off-by: NAllen Pais <allen.pais@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 12月, 2016 5 次提交
-
-
由 Peter Zijlstra 提交于
Commit: 3cded417 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()") introduced a paravirt op with bool return type [*] It turns out that the PVOP_CALL*() macros miscompile when rettype is bool. Code that looked like: 83 ef 01 sub $0x1,%edi ff 15 32 a0 d8 00 callq *0xd8a032(%rip) # ffffffff81e28120 <pv_lock_ops+0x20> 84 c0 test %al,%al ended up looking like so after PVOP_CALL1() was applied: 83 ef 01 sub $0x1,%edi 48 63 ff movslq %edi,%rdi ff 14 25 20 81 e2 81 callq *0xffffffff81e28120 48 85 c0 test %rax,%rax Note how it tests the whole of %rax, even though a typical bool return function only sets %al, like: 0f 95 c0 setne %al c3 retq This is because ____PVOP_CALL() does: __ret = (rettype)__eax; and while regular integer type casts truncate the result, a cast to bool tests for any !0 value. Fix this by explicitly truncating to sizeof(rettype) before casting. [*] The actual bug should've been exposed in commit: 446f3dc8 ("locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests") but that didn't properly implement the paravirt call. Reported-by: Nkernel test robot <xiaolong.ye@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 3cded417 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()") Link: http://lkml.kernel.org/r/20161208154349.346057680@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
While chasing a regression I noticed we potentially patch the wrong code in native_patch(). If we do not select the native code sequence, we must use the default patcher, not fall-through the switch case. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel test robot <xiaolong.ye@intel.com> Fixes: 3cded417 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()") Link: http://lkml.kernel.org/r/20161208154349.270616999@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
An earlier patch allowed enabling PT and LBR at the same time on Goldmont. However it also allowed enabling BTS and LBR at the same time, which is still not supported. Fix this by bypassing the check only for PT. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: alexander.shishkin@intel.com Cc: kan.liang@intel.com Cc: <stable@vger.kernel.org> Fixes: ccbebba4 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it") Link: http://lkml.kernel.org/r/20161209001417.4713-1-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Hauke Mehrtens 提交于
The hardware documentation says bit 11:10 are used for the GPE frequency selection. Fix the mask in the define to match these bits. Signed-off-by: NHauke Mehrtens <hauke@hauke-m.de> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NThomas Langer <thomas.langer@intel.com> Cc: linux-mips@linux-mips.org Cc: john@phrozen.org Patchwork: https://patchwork.linux-mips.org/patch/14648/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Luuk Paulussen 提交于
The sync_cmos_clock function in kernel/time/ntp.c first tries to update the internal clock of the cpu by calling the "update_persistent_clock64" architecture specific function. If this returns -ENODEV, it then tries to update an external RTC using "rtc_set_ntp_time". On the mips architecture, the weak implementation of the underlying function would return 0 if it wasn't overridden. This meant that the sync_cmos_clock function would never try to update an external RTC (if both CONFIG_GENERIC_CMOS_UPDATE and CONFIG_RTC_SYSTOHC are configured) Returning -ENODEV instead, means that an external RTC will be tried. Signed-off-by: NLuuk Paulussen <luuk.paulussen@alliedtelesis.co.nz> Reviewed-by: NRichard Laing <richard.laing@alliedtelesis.co.nz> Reviewed-by: NScott Parlane <scott.parlane@alliedtelesis.co.nz> Reviewed-by: NChris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/14649/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
- 10 12月, 2016 7 次提交
-
-
由 Thomas Gleixner 提交于
ldt->size can never be negative. The helper functions take 'unsigned int' arguments which are assigned from ldt->size. The related user space user_desc struct member entry_number is unsigned as well. But ldt->size itself and a few local variables which are related to ldt->size are type 'int' which makes no sense whatsoever and results in typecasts which make the eyes bleed. Clean it up and convert everything which is related to ldt->size to unsigned it. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dan Carpenter <dan.carpenter@oracle.com>
-
由 Dan Carpenter 提交于
My static checker complains that we put an upper bound on the "size" argument but not a lower bound. The checker is not smart enough to know the possible ranges of "old_mm->context.ldt->size" from init_new_context_ldt() so it thinks maybe it could be negative. Let's make it unsigned to silence the warning and future proof the code a bit. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NAndy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: kernel-janitors@vger.kernel.org Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161208105602.GA11382@elgon.mountainSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
One include less is always a good thing(tm). Good riddance. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-6-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Borislav Petkov 提交于
Reorganize the E400 detection now that we have everything in place: switch the CPUs to broadcast mode after the LAPIC has been initialized and remove the facilities that were used previously on the idle path. Unfortunately static_cpu_has_bug() cannpt be used in the E400 idle routine because alternatives have been applied when the actual detection happens, so the static switching does not take effect and the test will stay false. Use boot_cpu_has_bug() instead which is definitely an improvement over the RDMSR and the cpumask handling. Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-5-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
AMD CPUs affected by the E400 erratum suffer from the issue that the local APIC timer stops when the CPU goes into C1E. Unfortunately there is no way to detect the affected CPUs on early boot. It's only possible to determine the range of possibly affected CPUs from the family/model range. The actual decision whether to enter C1E and thus cause the bug is done by the firmware and we need to detect that case late, after ACPI has been initialized. The current solution is to check in the idle routine whether the CPU is affected by reading the MSR_K8_INT_PENDING_MSG MSR and checking for the K8_INTP_C1E_ACTIVE_MASK bits. If one of the bits is set then the CPU is affected and the system is switched into forced broadcast mode. This is ineffective and on non-affected CPUs every entry to idle does the extra RDMSR. After doing some research it turns out that the bits are visible on the boot CPU right after the ACPI subsystem is initialized in the early boot process. So instead of polling for the bits in the idle loop, add a detection function after acpi_subsystem_init() and check for the MSR bits. If set, then the X86_BUG_AMD_APIC_C1E is set on the boot CPU and the TSC is marked unstable when X86_FEATURE_NONSTOP_TSC is not set as it will stop in C1E state as well. The switch to broadcast mode cannot be done at this point because the boot CPU still uses HPET as a clockevent device and the local APIC timer is not yet calibrated and installed. The switch to broadcast mode on the affected CPUs needs to be done when the local APIC timer is actually set up. This allows to cleanup the amd_e400_idle() function in the next step. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-4-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
The workaround for the AMD Erratum E400 (Local APIC timer stops in C1E state) is a two step process: - Selection of the E400 aware idle routine - Detection whether the platform is affected The idle routine selection happens for possibly affected CPUs depending on family/model/stepping information. These range of CPUs is not necessarily affected as the decision whether to enable the C1E feature is made by the firmware. Unfortunately there is no way to query this at early boot. The current implementation polls a MSR in the E400 aware idle routine to detect whether the CPU is affected. This is inefficient on non affected CPUs because every idle entry has to do the MSR read. There is a better way to detect this before going idle for the first time which requires to seperate the bug flags: X86_BUG_AMD_E400 - Selects the E400 aware idle routine and enables the detection X86_BUG_AMD_APIC_C1E - Set when the platform is affected by E400 Replace the current X86_BUG_AMD_APIC_C1E usage by the new X86_BUG_AMD_E400 bug bit to select the idle routine which currently does an unconditional detection poll. X86_BUG_AMD_APIC_C1E is going to be used in later patches to remove the MSR polling and simplify the handling of this misfeature. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-3-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Borislav Petkov 提交于
Will be used in a later patch to set bug bits for bugs which need late detection. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-2-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-