- 11 Dec 2016, 1 commit
-
-
Committed by Peter Zijlstra
Commit 3cded417 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()") introduced a paravirt op with bool return type [*].

It turns out that the PVOP_CALL*() macros miscompile when rettype is bool. Code that looked like:

  83 ef 01                sub    $0x1,%edi
  ff 15 32 a0 d8 00       callq  *0xd8a032(%rip)    # ffffffff81e28120 <pv_lock_ops+0x20>
  84 c0                   test   %al,%al

ended up looking like so after PVOP_CALL1() was applied:

  83 ef 01                sub    $0x1,%edi
  48 63 ff                movslq %edi,%rdi
  ff 14 25 20 81 e2 81    callq  *0xffffffff81e28120
  48 85 c0                test   %rax,%rax

Note how it tests the whole of %rax, even though a typical bool return function only sets %al, like:

  0f 95 c0                setne  %al
  c3                      retq

This is because ____PVOP_CALL() does:

  __ret = (rettype)__eax;

and while regular integer type casts truncate the result, a cast to bool tests for any !0 value. Fix this by explicitly truncating to sizeof(rettype) before casting.

[*] The actual bug should've been exposed in commit 446f3dc8 ("locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests"), but that didn't properly implement the paravirt call.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3cded417 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")
Link: http://lkml.kernel.org/r/20161208154349.346057680@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
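To see the cast semantics concretely — a small user-space C sketch of the failure mode described above (the stale-register value is made up; the paravirt macros themselves are not reproduced here):

  #include <stdbool.h>
  #include <stdio.h>

  int main(void)
  {
          /* %rax with stale high bits but %al == 0, as in the disassembly above */
          unsigned long eax = 0xffffffffffffff00UL;

          bool broken = (bool)eax;                /* tests all 64 bits -> true  */
          bool fixed  = (bool)(unsigned char)eax; /* truncate to 8 bits -> false */

          printf("broken=%d fixed=%d\n", broken, fixed);
          return 0;
  }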
-
- 10 Dec 2016, 5 commits
-
-
Committed by Thomas Gleixner
ldt->size can never be negative. The helper functions take 'unsigned int' arguments which are assigned from ldt->size. The related user space user_desc struct member entry_number is unsigned as well. But ldt->size itself and a few local variables which are related to ldt->size are type 'int', which makes no sense whatsoever and results in typecasts which make the eyes bleed.

Clean it up and convert everything which is related to ldt->size to unsigned int.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
-
Committed by Thomas Gleixner
One include less is always a good thing(tm). Good riddance.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-6-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Borislav Petkov
Reorganize the E400 detection now that we have everything in place: switch the CPUs to broadcast mode after the LAPIC has been initialized and remove the facilities that were used previously on the idle path.

Unfortunately, static_cpu_has_bug() cannot be used in the E400 idle routine because alternatives have been applied when the actual detection happens, so the static switching does not take effect and the test will stay false. Use boot_cpu_has_bug() instead, which is definitely an improvement over the RDMSR and the cpumask handling.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-5-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner
The workaround for the AMD Erratum E400 (Local APIC timer stops in C1E state) is a two step process:

 - Selection of the E400 aware idle routine
 - Detection whether the platform is affected

The idle routine selection happens for possibly affected CPUs depending on family/model/stepping information. This range of CPUs is not necessarily affected, as the decision whether to enable the C1E feature is made by the firmware. Unfortunately there is no way to query this at early boot.

The current implementation polls an MSR in the E400 aware idle routine to detect whether the CPU is affected. This is inefficient on non affected CPUs because every idle entry has to do the MSR read.

There is a better way to detect this before going idle for the first time, which requires separating the bug flags:

  X86_BUG_AMD_E400     - Selects the E400 aware idle routine and enables the detection
  X86_BUG_AMD_APIC_C1E - Set when the platform is affected by E400

Replace the current X86_BUG_AMD_APIC_C1E usage by the new X86_BUG_AMD_E400 bug bit to select the idle routine, which currently does an unconditional detection poll. X86_BUG_AMD_APIC_C1E is going to be used in later patches to remove the MSR polling and simplify the handling of this misfeature.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-3-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Borislav Petkov
Will be used in a later patch to set bug bits for bugs which need late detection.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 06 Dec 2016, 1 commit
-
-
Committed by Peter Zijlstra
I recently encountered wreckage because access_ok() was used where it should not be; add an explicit WARN when access_ok() is used wrongly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 30 Nov 2016, 1 commit
-
-
Committed by Tim Chen
Rename CONFIG_SCHED_ITMT for Intel Turbo Boost Max Technology 3.0 to CONFIG_SCHED_MC_PRIO. This makes the configuration extensible in the future to other architectures that wish to similarly establish CPU core priority support in the scheduler.

The description in Kconfig is updated to reflect this change, with added details for better clarity. The configuration is explicitly default-y, to enable the feature on CPUs that have it. It has no effect on non-TBM3 CPUs.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: jolsa@redhat.com
Cc: linux-acpi@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/2b2ee29d93e3f162922d72d0165a1405864fbb23.1480444902.git.tim.c.chen@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 28 Nov 2016, 1 commit
-
-
Committed by Paul Bolle
In x86's include/asm/Kbuild three entries are appended to the genhdr-y make variable:

  genhdr-y += unistd_32.h
  genhdr-y += unistd_64.h
  genhdr-y += unistd_x32.h

The same entries are also appended to that variable in include/uapi/asm/Kbuild. So commit 10b63956 ("UAPI: Plumb the UAPI Kbuilds into the user header installation and checking") removed these three entries from include/asm/Kbuild. But, apparently, some merge conflict resolution re-added them.

The net effect is, in short, that the genhdr-y make variable contains these file names twice and, as a consequence, that the corresponding headers get installed twice. And so the build prints:

  INSTALL usr/include/asm/ (65 files)

while in reality only 62 files are installed in that directory.

Nothing breaks because of all that, but it's a good idea to finally remove these unneeded entries nevertheless.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1480077707-2837-1-git-send-email-pebolle@tiscali.nl
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 25 Nov 2016, 3 commits
-
-
Committed by Tim Chen
The Intel Turbo Boost Max Technology 3.0 (ITMT) feature allows some cores to be boosted to a higher turbo frequency than others. Add /proc/sys/kernel/sched_itmt_enabled so the operator can enable/disable scheduling of tasks that favor cores with higher turbo boost frequency potential.

By default, a system that is ITMT-capable and single-socket has this feature turned on, since such a system is more likely to be lightly loaded and to operate in the turbo range.

When there is a change in the desired ITMT scheduling operation, a rebuild of the sched domains is initiated so the scheduler can set up the sched domains with the appropriate flag to enable/disable ITMT scheduling operations.

Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Co-developed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: linux-pm@vger.kernel.org
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: rjw@rjwysocki.net
Cc: linux-acpi@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: bp@suse.de
Link: http://lkml.kernel.org/r/07cc62426a28bad57b01ab16bb903a9c84fa5421.1479844244.git.tim.c.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Tim Chen
On platforms supporting Intel Turbo Boost Max Technology 3.0, the maximum turbo frequencies of some cores in a CPU package may be higher than for the other cores in the same package. In that case, better performance (and possibly lower energy consumption as well) can be achieved by making the scheduler prefer to run tasks on the CPUs with higher max turbo frequencies.

To that end, set up a core priority metric to abstract the core preferences based on the maximum turbo frequency. In that metric, the cores with higher maximum turbo frequencies are higher-priority than the other cores in the same package, and that causes the scheduler to favor them when making load-balancing decisions using the asymmetric packing approach. At the same time, the priority of SMT threads with a higher CPU number is reduced so as to avoid scheduling tasks on all of the threads that belong to a favored core before all of the other cores have been given a task to run.

The priority metric will be initialized by the P-state driver with the help of the sched_set_itmt_core_prio() function. The P-state driver will also determine whether or not ITMT is supported by the platform and will call sched_set_itmt_support() to indicate that.

Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Co-developed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: linux-pm@vger.kernel.org
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: rjw@rjwysocki.net
Cc: linux-acpi@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: bp@suse.de
Link: http://lkml.kernel.org/r/cd401ccdff88f88c8349314febdc25d51f7c48f7.1479844244.git.tim.c.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Tim Chen
The scheduler calls arch_update_cpu_topology() to check whether the scheduler domains have to be rebuilt. So far x86 has no requirement for this, but the upcoming ITMT support makes this necessary. Request the rebuild when the x86 internal update flag is set.

Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: linux-pm@vger.kernel.org
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: rjw@rjwysocki.net
Cc: linux-acpi@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: bp@suse.de
Link: http://lkml.kernel.org/r/bfbf5591276ec60b2af2da798adc1060df1e2a5f.1479844244.git.tim.c.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 24 Nov 2016, 1 commit
-
-
Committed by Dmitry Safonov
Commit 90954e7b ("x86/coredump: Use pr_reg size, rather that TIF_IA32 flag") changed the coredumping code to construct the ELF coredump file according to register set size - and that's good: if a binary crashes with a 32-bit code selector, generate a 32-bit ELF core, otherwise a 64-bit core.

That was made for restoring 32-bit applications on x86_64: we want a 32-bit application after restore to generate a 32-bit ELF dump on crash.

All was quite good, and recently I started reworking the 32-bit application dumping part of CRIU: now it has two parasites (32 and 64) for seizing compat/native tasks; after the rework it'll have one parasite, working in 64-bit mode, to which a 32-bit prologue long-jumps during infection.

And while it has worked on my work machine, during the rework I found in a VM with !CONFIG_X86_X32_ABI that a segfault in a 32-bit binary that has long-jumped to 64-bit mode results in a dereference of garbage:

  32-victim[19266]: segfault at f775ef65 ip 00000000f775ef65 sp 00000000f776aa50 error 14
  BUG: unable to handle kernel paging request at ffffffffffffffff
  IP: [<ffffffff81332ce0>] strlen+0x0/0x20
  [...]
  Call Trace:
   [] elf_core_dump+0x11a9/0x1480
   [] do_coredump+0xa6b/0xe60
   [] get_signal+0x1a8/0x5c0
   [] do_signal+0x23/0x660
   [] exit_to_usermode_loop+0x34/0x65
   [] prepare_exit_to_usermode+0x2f/0x40
   [] retint_user+0x8/0x10

That's because we have a 64-bit register set (with a correspondingly larger total size) and we're writing it to elf_thread_core_info, which has a smaller size on !CONFIG_X86_X32_ABI. That led to overwriting the ELF notes part.

Tested on 32- and 64-bit ELF crashes and on 32-bit binaries that have jumped with a 64-bit code selector - all is readable with gdb.

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Fixes: 90954e7b ("x86/coredump: Use pr_reg size, rather that TIF_IA32 flag")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 23 Nov 2016, 1 commit
-
-
Committed by Tony Luck
Intel Xeons from Ivy Bridge onwards support a processor identification number set in the factory. To the user this is a handy unique number to identify a particular CPU. Intel can decode this to the fab/production run to track errors. On systems that have it, include it in the machine check record. I'm told that this would be helpful for users that run large data centers with multi-socket servers to keep track of which CPUs are seeing errors.

Boris:
  * Add some clarifying comments and spacing.
  * Mask out [63:2] in the disabled-but-not-locked case.
  * Call the MSR variable "val" for more readability.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20161123114855.njguoaygp3qnbkia@pd.tnic
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 22 Nov 2016, 3 commits
-
-
Committed by Peter Zijlstra
Avoid the pointless function call to pv_lock_ops.vcpu_is_preempted() when a paravirt spinlock enabled kernel is run on native hardware. Do this by patching out the CALL instruction with "XOR %RAX,%RAX", which has the same effect (0 return value).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
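For reference, a sketch of the replacement's encoding (padding of the remaining call-site bytes is left to the paravirt patching machinery and is assumed here):

  /* xor %rax,%rax: 3 bytes, leaves 0 ("not preempted") in RAX. */
  static const unsigned char xor_rax_rax[] = { 0x48, 0x31, 0xc0 };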
-
Committed by Pan Xinhui
Optimize spinlock and mutex busy-loops by providing a vcpu_is_preempted(cpu) function on KVM and Xen platforms. Extend the pv_lock_ops interface accordingly and implement the callbacks on KVM and Xen.

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Translated to English. ]
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-7-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
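An illustrative fragment of the busy-loop optimization (the lock/owner bookkeeping here is hypothetical; this is not the kernel's actual mutex/rwsem spinning code):

  /* Stop spinning once the holder's vCPU is preempted: it cannot
   * release the lock until the hypervisor runs it again. */
  while (!spin_trylock(&lock)) {
          if (vcpu_is_preempted(owner_cpu))   /* owner_cpu: hypothetical */
                  break;                      /* sleep instead of wasting cycles */
          cpu_relax();
  }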
-
Committed by Yazen Ghannam
The Unified Memory Controllers (UMCs) on Fam17h log a normalized address in their MCA_ADDR registers. We need to convert that normalized address to a system physical address in order to support a few facilities:

1) To offline poisoned pages in DRAM proactively in the deferred error handler.

2) To print sysaddr and page info for DRAM ECC errors in EDAC.

[ Boris: fixes/cleanups on top:

  * hi_addr_offset = 0 - no need for that branch. Stick it all under the HiAddrOffsetEn case. It confines hi_addr_offset's declaration too.

  * Move variables to the innermost scope they're used at so that we save on stack and not blow it up immediately on function entry.

  * Do not modify *sys_addr prematurely - we don't want to exit early having already modified *sys_addr partially, which callers would get to see. We either convert to a sys_addr or we don't do anything. And we signal that with the retval of the function.

  * Rename label out -> out_err - because it is the error path.

  * No need to pr_err() in the failed-conversion case: imagine a sparsely-populated machine with UMCs which don't have DIMMs. Callers should look at the retval instead and issue a printk only when really necessary. No need for useless info in dmesg.

  * s/temp_reg/tmp/ and other variable-name shortening => shorter code.

  * Use BIT() everywhere.

  * Make error messages more informative.

  * Small build fix for the !CONFIG_X86_MCE_AMD case.

  * ... and more minor cleanups. ]

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20161122111133.mjzpvzhf7o7yl2oa@pd.tnic
[ Typo fixes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 21 Nov 2016, 1 commit
-
-
Committed by Josh Poimboeuf
NMI stack dumps are bracketed by the following tags:

  <NMI>
  ...
  <EOE>

The ending tag is kind of confusing if you don't already know what "EOE" means (end of exception). The same ending tag is also used to mark the end of all other exceptions' stacks. For example:

  <#DF>
  ...
  <EOE>

And similarly, "EOI" is used as the ending tag for interrupts:

  <IRQ>
  ...
  <EOI>

Change the tags to be more comprehensible by making them symmetrical and more XML-esque:

  <NMI>
  ...
  </NMI>

  <#DF>
  ...
  </#DF>

  <IRQ>
  ...
  </IRQ>

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/180196e3754572540b595bc56b947d43658979a7.1479491159.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 18 Nov 2016, 3 commits
-
-
Committed by Len Brown
Upon removal of the is_idle flag, these routines became NOPs.

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/822f2c22cc5890f7b8ea0eeec60277eb44505b4e.1479449716.git.len.brown@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Len Brown
Upon removal of the "is_idle" flag, x86_test_and_clear_bit_percpu() is no longer used. Signed-off-by: NLen Brown <len.brown@intel.com> Acked-by: NIngo Molnar <mingo@kernel.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/b334ae6819507e3dfc0a4b33ed974714d067eb4a.1479449716.git.len.brown@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
Committed by Len Brown
Upon removal of the i7300_idle driver, the idle_notifier is unused.

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/f15385a82ec4bf51f4f06777193d83f03b28cfdd.1479449716.git.len.brown@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 17 Nov 2016, 5 commits
-
-
Committed by Andy Lutomirski
RDPID is a new instruction that reads MSR_TSC_AUX quickly. This should be considerably faster than reading the GDT. Add a cpufeature for it and use it from __vdso_getcpu() when available.

Tested-by: Megha Dey <megha.dey@intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4f6c3a22012d10f1c65b9ca15800e01b42c7d39d.1479320367.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
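For illustration — a sketch of issuing the instruction from C, assuming a toolchain that knows the rdpid mnemonic (the actual vDSO may encode the instruction as raw bytes for older assemblers):

  /* RDPID reads the value the kernel keeps in MSR_TSC_AUX
   * (CPU and node numbers, as used by getcpu) without touching the GDT. */
  static inline unsigned long long rdpid(void)
  {
          unsigned long long p;

          asm volatile("rdpid %0" : "=r" (p));
          return p;
  }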
-
Committed by Christian Borntraeger
No need to duplicate the same define everywhere. Since the only user is stop-machine and the only provider is s390, we can use a default implementation of cpu_relax_yield() in sched.h.

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
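A minimal sketch of the deduplication described above (the guard shown is an assumption; upstream may key off the architecture instead):

  /* In a common header: fall back to cpu_relax() unless the
   * architecture (s390) provides a stronger implementation. */
  #ifndef cpu_relax_yield
  #define cpu_relax_yield() cpu_relax()
  #endif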
-
Committed by Gayatri Kammela
Add a few new AVX512 instruction groups/features for enumeration in /proc/cpuinfo: AVX512IFMA and AVX512VBMI. Clear the flags in fpu_xstate_clear_all_cpu_caps().

  CPUID.(EAX=7,ECX=0):EBX[bit 21] AVX512IFMA
  CPUID.(EAX=7,ECX=0):ECX[bit 1]  AVX512VBMI

Detailed information on the cpuid bits for these features can be found at https://bugzilla.kernel.org/show_bug.cgi?id=187891

Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: mingo@elte.hu
Link: http://lkml.kernel.org/r/1479327060-18668-1-git-send-email-gayatri.kammela@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Yazen Ghannam
Some devices on Fam17h can only be accessed through the System Management Network (SMN). The SMN is accessed by a pair of index/data registers in PCI config space. Add a pair of functions to read from and write to the SMN.

The Data Fabric on Fam17h allows multiple devices to use the same register space. The registers of a specific device are accessed indirectly using the device's DF InstanceId. Currently, we only need to read from these devices, so only define a read function for now.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1478812257-5424-5-git-send-email-Yazen.Ghannam@amd.com
[ Boris: make __amd_smn_rw() even more compact. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
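To make the index/data indirection concrete — a sketch only, with placeholder register offsets (SMN_INDEX_OFF and SMN_DATA_OFF are hypothetical names, not the real Fam17h config-space offsets):

  /* Write the target SMN address to the index register in PCI
   * config space, then read the result from the data register. */
  static int smn_read_sketch(struct pci_dev *root, u32 smn_addr, u32 *val)
  {
          int err;

          err = pci_write_config_dword(root, SMN_INDEX_OFF, smn_addr);
          if (err)
                  return err;

          return pci_read_config_dword(root, SMN_DATA_OFF, val);
  }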
-
Committed by Yazen Ghannam
Hide amd_northbridges in amd_nb.c so that external callers will have to use the exported accessor functions. Also, fix some checkpatch.pl warnings.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1478812257-5424-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 16 Nov 2016, 6 commits
-
-
Committed by He Chen
Sparsely populated CPUID leafs are collected in a software provided leaf to avoid bloat of the x86_capability array, but there is no way to rebuild the real leafs (e.g. for KVM CPUID enumeration) other than rereading the CPUID leaf from the CPU. While this is possible, it is problematic as it does not take software disabled features into account. If a feature is disabled on the host it should not be exposed to a guest either.

Add get_scattered_cpuid_leaf() which rebuilds the leaf from the scattered cpuid table information and the active CPU features.

[ tglx: Rewrote changelog ]

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-3-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by He Chen
cpuid_regs is defined multiple times as structure and enum. Rename the enum and move all of it to processor.h so we don't end up with more instances. Rename the misnamed register enumeration from CR_* to the obvious CPUID_*.

[ tglx: Rewrote changelog ]

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-2-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Martin Schwidefsky
The per-cpu preempt count of x86 contains two values: the actual preempt count and the inverted PREEMPT_NEED_RESCHED bit. If a corrupted preempt count is detected, the preempt_count_set() function is used to reset the preempt count. In case the inverted PREEMPT_NEED_RESCHED bit is zero at the time of the reset, the preemption indication is lost. Use raw_cpu_cmpxchg_4() to reset only the count part and leave the PREEMPT_NEED_RESCHED bit as it is. This improves the kernel's behavior when it runs into preempt count leaks and tries to fix them up.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1478523660-733-1-git-send-email-schwidefsky@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
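A sketch of the cmpxchg-based reset described above, assuming the x86 per-cpu accessor names (close in spirit to the patch, not a verbatim copy):

  static void preempt_count_set_sketch(int pc)
  {
          int old, new;

          do {
                  old = raw_cpu_read_4(__preempt_count);
                  /* keep the inverted need-resched bit, replace only the count */
                  new = (old & PREEMPT_NEED_RESCHED) |
                        (pc & ~PREEMPT_NEED_RESCHED);
          } while (raw_cpu_cmpxchg_4(__preempt_count, old, new) != old);
  }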
-
Committed by Borislav Petkov
Make the MSR argument an unsigned int and both low and high u32, and put "notrace" last in the function signature. Reflow function signatures for better readability and clean up whitespace.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Christian Borntraeger
As there are no users left, we can remove cpu_relax_lowlatency() implementations from every architecture.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Cc: <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Christian Borntraeger
For spinning loops, people often use barrier() or cpu_relax(). For most architectures, cpu_relax() and barrier() are the same, but on some architectures cpu_relax() can add some latency. For example, on power, sparc64 and arc, cpu_relax() can shift the CPU towards other hardware threads in an SMT environment. On s390, cpu_relax() does even more: it uses a hypercall to the hypervisor to give up the timeslice. In contrast to the SMT yielding, this can result in larger latencies.

In some places this latency is unwanted, so another variant, cpu_relax_lowlatency(), was introduced. Before this is used in more and more places, let's revert the logic and provide a cpu_relax_yield() that can be called in places where yielding is more important than latency. By default this is the same as cpu_relax() on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
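An illustrative call site (the flag is hypothetical; the stop-machine rendezvous loop spins in a similar fashion):

  /* Waiting for the other CPUs to reach the next state: giving up
   * the timeslice matters more here than raw wakeup latency. */
  while (!READ_ONCE(all_cpus_arrived))
          cpu_relax_yield();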
-
- 13 Nov 2016, 1 commit
-
-
Committed by Lukas Wunner
We already have a macro to invoke boot services which on x86 adapts automatically to the bitness of the EFI firmware: efi_call_early(). The macro allows sharing of functions across arches and bitness variants as long as those functions only call boot services. However, in practice functions in the EFI stub contain a mix of boot services calls and protocol calls.

Add an efi_call_proto() macro for bitness-agnostic protocol calls to allow sharing more code across arches as well as deduplicating 32-bit and 64-bit code paths. On x86, implement it using a new efi_table_attr() macro for bitness-agnostic table lookups. Refactor efi_call_early() to make use of the same macro. (The resulting object code remains identical.)

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20161112213237.8804-8-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 11 Nov 2016, 1 commit
-
-
Committed by Yazen Ghannam
The calculation of the hwid_mcatype value in get_smca_bank_info() became incorrect after applying the following commit:

  1ce9cd7f ("x86/RAS: Simplify SMCA HWID descriptor struct")

This causes the function to not match a bank to its type.

Disassembly of the hwid_mcatype calculation after the change:

  db:   8b 45 e0                mov    -0x20(%rbp),%eax
  de:   41 89 c4                mov    %eax,%r12d
  e1:   25 00 00 ff 0f          and    $0xfff0000,%eax
  e6:   41 c1 ec 10             shr    $0x10,%r12d
  ea:   41 09 c4                or     %eax,%r12d

Disassembly of the hwid_mcatype calculation in the original code:

  286:  8b 45 d0                mov    -0x30(%rbp),%eax
  289:  41 89 c5                mov    %eax,%r13d
  28c:  c1 e8 10                shr    $0x10,%eax
  28f:  41 81 e5 ff 0f 00 00    and    $0xfff,%r13d
  296:  41 c1 e5 10             shl    $0x10,%r13d
  29a:  41 09 c5                or     %eax,%r13d

Grouping the arguments to the HWID_MCATYPE() macro fixes the issue. (Boris suggested adding parentheses in the macro.)

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
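A sketch of the macro pitfall implied above (the exact upstream call site is assumed; in C, '<<' binds tighter than '&', so an unparenthesized argument like "high & 0xfff" expands incorrectly):

  /* Broken: with hwid passed as "high & 0xfff", the expansion becomes
   * (high & (0xfff << 16)) | mcatype - not what was intended. */
  #define HWID_MCATYPE_BROKEN(hwid, mcatype)  ((hwid << 16) | mcatype)

  /* Fixed: group each argument so the expansion is robust. */
  #define HWID_MCATYPE(hwid, mcatype)         (((hwid) << 16) | (mcatype))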
-
- 10 Nov 2016, 2 commits
-
-
Committed by Wanpeng Li
The following RCU lockdep warning led to adding irq_enter()/irq_exit() into smp_reschedule_interrupt():

  RCU used illegally from idle CPU!
  rcu_scheduler_active = 1, debug_locks = 0
  RCU used illegally from extended quiescent state!
  no locks held by swapper/1/0.

  do_trace_write_msr
  native_write_msr
  native_apic_msr_eoi_write
  smp_reschedule_interrupt
  reschedule_interrupt

As Peterz pointed out:

| So now we're making a very frequent interrupt slower because of debug
| code.
|
| The thing is, many many smp_reschedule_interrupt() invocations don't
| actually execute anything much at all and are only sent to tickle the
| return to user path (which does the actual preemption).
|
| Having to do the whole irq_enter/irq_exit dance just for this unlikely
| debug case totally blows.

Use the wrmsr_notrace() variant in native_apic_msr_write_eoi(), annotate the kvm variant with notrace and add a native_apic_eoi callback to the apic structure so KVM guests are covered as well.

This allows reverting the irq_enter/irq_exit dance in smp_reschedule_interrupt().

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Mike Galbraith <efault@gmx.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1478488420-5982-3-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Wanpeng Li
Required to remove the extra irq_enter()/irq_exit() in smp_reschedule_interrupt().

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: kvm@vger.kernel.org
Cc: Mike Galbraith <efault@gmx.de>
Link: http://lkml.kernel.org/r/1478488420-5982-2-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 09 Nov 2016, 4 commits
-
-
Committed by Borislav Petkov
Add accessor functions and hide the smca_names array. Also, add a sanity check to the bank HWID assignment in get_smca_bank_info().

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20161104152317.5r276t35df53qk76@pd.tnic
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Borislav Petkov
Make it differ more from struct smca_bank_name for better readability.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: http://lkml.kernel.org/r/20161103125556.15482-3-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Borislav Petkov
Call it simply smca_hwid and call local variables "hwid". More readable.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: http://lkml.kernel.org/r/20161103125556.15482-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Borislav Petkov
Call the struct simply smca_bank; its instance ID can then be simply ->id. Makes the code much more readable.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: http://lkml.kernel.org/r/20161103125556.15482-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-