- 01 11月, 2014 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
The current method of handling multiple function callbacks is to register a list function callback that calls all the other callbacks based on their hash tables and compare it to the function that the callback was called on. But this is very inefficient. For example, if you are tracing all functions in the kernel and then add a kprobe to a function such that the kprobe uses ftrace, the mcount trampoline will switch from calling the function trace callback to calling the list callback that will iterate over all registered ftrace_ops (in this case, the function tracer and the kprobes callback). That means for every function being traced it checks the hash of the ftrace_ops for function tracing and kprobes, even though the kprobes is only set at a single function. The kprobes ftrace_ops is checked for every function being traced! Instead of calling the list function for functions that are only being traced by a single callback, we can call a dynamically allocated trampoline that calls the callback directly. The function graph tracer already uses a direct call trampoline when it is being traced by itself but it is not dynamically allocated. It's trampoline is static in the kernel core. The infrastructure that called the function graph trampoline can also be used to call a dynamically allocated one. For now, only ftrace_ops that are not dynamically allocated can have a trampoline. That is, users such as function tracer or stack tracer. kprobes and perf allocate their ftrace_ops, and until there's a safe way to free the trampoline, it can not be used. The dynamically allocated ftrace_ops may, although, use the trampoline if the kernel is not compiled with CONFIG_PREEMPT. But that will come later. Tested-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 19 10月, 2014 1 次提交
-
-
由 Andy Lutomirski 提交于
CR4 isn't constant; at least the TSD and PCE bits can vary. TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks like it's correct. This adds a branch and a read from cr4 to each vm entry. Because it is extremely likely that consecutive entries into the same vcpu will have the same host cr4 value, this fixes up the vmcs instead of restoring cr4 after the fact. A subsequent patch will add a kernel-wide cr4 shadow, reducing the overhead in the common case to just two memory reads and a branch. Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Cc: Petr Matousek <pmatouse@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2014 1 次提交
-
-
由 Anton Blanchard 提交于
Commit e7dbfe34 ("kprobes/x86: Move ftrace-based kprobe code into kprobes-ftrace.c") switched from using ARCH_SUPPORTS_KPROBES_ON_FTRACE to CONFIG_KPROBES_ON_FTRACE but missed removing the define. Signed-off-by: NAnton Blanchard <anton@samba.org> Cc: masami.hiramatsu.pt@hitachi.com Cc: ananth@in.ibm.com Cc: a.p.zijlstra@chello.nl Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 15 10月, 2014 1 次提交
-
-
由 Alexei Starovoitov 提交于
1. JIT compiler using multi-pass approach to converge to final image size, since x86 instructions are variable length. It starts with large gaps between instructions (so some jumps may use imm32 instead of imm8) and iterates until total program size is the same as in previous pass. This algorithm works only if program size is strictly decreasing. Programs that use LD_ABS insn need additional code in prologue, but it was not emitted during 1st pass, so there was a chance that 2nd pass would adjust imm32->imm8 jump offsets to the same number of bytes as increase in prologue, which may cause algorithm to erroneously decide that size converged. Fix it by always emitting largest prologue in the first pass which is detected by oldproglen==0 check. Also change error check condition 'proglen != oldproglen' to fail gracefully. 2. while staring at the code realized that 64-byte buffer may not be enough when 1st insn is large, so increase it to 128 to avoid buffer overflow (theoretical maximum size of prologue+div is 109) and add runtime check. Fixes: 62258278 ("net: filter: x86: internal BPF JIT") Reported-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Tested-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 10月, 2014 7 次提交
-
-
由 Ulrich Obergfell 提交于
Use watchdog_enable_hardlockup_detector() to set hard lockup detection's default value to false. It's risky to run this detection in a guest, as false positives are easy to trigger, especially if the host is overcommitted. Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com> Signed-off-by: NAndrew Jones <drjones@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xishi Qiu 提交于
If all the nodes are marked hotpluggable, alloc node data will fail. Because __next_mem_range_rev() will skip the hotpluggable memory regions. numa_clear_kernel_node_hotplug() is called after alloc node data. numa_init() ... ret = init_func(); // this will mark hotpluggable flag from SRAT ... memblock_set_bottom_up(false); ... ret = numa_register_memblks(&numa_meminfo); // this will alloc node data(pglist_data) ... numa_clear_kernel_node_hotplug(); // in case all the nodes are hotpluggable ... numa_register_memblks() setup_node_data() memblock_find_in_range_node() __memblock_find_range_top_down() for_each_mem_range_rev() __next_mem_range_rev() This patch moves numa_clear_kernel_node_hotplug() into numa_register_memblks(), clear kernel node hotpluggable flag before alloc node data, then alloc node data won't fail even all the nodes are hotpluggable. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NXishi Qiu <qiuxishi@huawei.com> Cc: Dave Jones <davej@redhat.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Morton 提交于
x86_64 allnoconfig: arch/x86/kernel/cpu/common.c:968: warning: 'syscall32_cpu_init' defined but not used Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Travis 提交于
Use the optimized ioresource lookup, "region_is_ram", for the ioremap function. If the region is not found, it falls back to the "page_is_ram" function. If it is found and it is RAM, then the usual warning message is issued, and the ioremap operation is aborted. Otherwise, the ioremap operation continues. Signed-off-by: NMike Travis <travis@sgi.com> Acked-by: NAlex Thorlton <athorlton@sgi.com> Reviewed-by: NCliff Wickman <cpw@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mark Salter <msalter@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vivek Goyal 提交于
David Howells brought to my attention the mails generated by kbuild test bot and following sparse warnings were present. This patch fixes these warnings. arch/x86/kernel/kexec-bzimage64.c:270:5: warning: symbol 'bzImage64_probe' was not declared. Should it be static? arch/x86/kernel/kexec-bzimage64.c:328:6: warning: symbol 'bzImage64_load' was not declared. Should it be static? arch/x86/kernel/kexec-bzimage64.c:517:5: warning: symbol 'bzImage64_cleanup' was not declared. Should it be static? arch/x86/kernel/kexec-bzimage64.c:531:5: warning: symbol 'bzImage64_verify_sig' was not declared. Should it be static? arch/x86/kernel/kexec-bzimage64.c:546:23: warning: symbol 'kexec_bzImage64_ops' was not declared. Should it be static? Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Reported-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Baoquan He 提交于
Add a check if crashk_res_low exists just like GART region does. If crashk_res_low doesn't exist, calling exclude_mem_range is unnecessary. Meanwhile, since crashk_res_low has been initialized at definition, it's safe just use "if (crashk_low_res.end)" to check if it's exist. And this can make it consistent with other places of check. Signed-off-by: NBaoquan He <bhe@redhat.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Baoquan He 提交于
Make the Makefile of kexec purgatory be consistent with others in linux src tree, and make it look generic and simple. Signed-off-by: NBaoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 10月, 2014 1 次提交
-
-
由 Honggang Li 提交于
arch/x86/um/checksum_32.S had been copy & paste from x86. When build x86 uml, csum_partial_copy_generic_i386 mess up the exception table. In fact, exception table dose not work in uml kernel. And csum_partial_copy_generic_i386 never been called. So, delete it. Signed-off-by: NHonggang Li <enjoymindful@gmail.com> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
- 10 10月, 2014 2 次提交
-
-
由 Geert Uytterhoeven 提交于
The different architectures used their own (and different) declarations: extern __visible const void __nosave_begin, __nosave_end; extern const void __nosave_begin, __nosave_end; extern long __nosave_begin, __nosave_end; Consolidate them using the first variant in <asm/sections.h>. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented _PAGE_NUMA using _PROT_NONE. This saved using an additional PTE bit and relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting fault scanner. This was found to be conceptually confusing with a lot of implicit assumptions and it was asked that an alternative be found. Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE bits and shrunk the maximum possible swap size but it did not go far enough. There are no architectures that reuse _PROT_NONE as _PROT_NUMA but the relics still exist. This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary duplication in powerpc vs the generic implementation by defining the types the core NUMA helpers expected to exist from x86 with their ppc64 equivalent. This necessitated that a PTE bit mask be created that identified the bits that distinguish present from NUMA pte entries but it is expected this will only differ between arches based on _PAGE_PROTNONE. The naming for the generic helpers was taken from x86 originally but ppc64 has types that are equivalent for the purposes of the helper so they are mapped instead of duplicating code. Signed-off-by: NMel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 10月, 2014 2 次提交
-
-
由 Shuah Khan 提交于
The following generated files are missing from gitignore and show up in git status after x86_64 build. Add them to gitignore. arch/x86/purgatory/kexec-purgatory.c arch/x86/purgatory/purgatory.ro Signed-off-by: NShuah Khan <shuahkh@osg.samsung.com> Link: http://lkml.kernel.org/r/1412016116-7213-1-git-send-email-shuahkh@osg.samsung.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Al Viro 提交于
... rather than doing that in the guts of ->load_binary(). [updated to fix the bug spotted by Shentino - for SIGSEGV we really need something stronger than send_sig_info(); again, better do that in one place] Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 08 10月, 2014 7 次提交
-
-
由 Jan Beulich 提交于
Signed-off-by: NJan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/542291CA0200007800038085@mail.emea.novell.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Andi Kleen 提交于
A variable cannot be both __read_mostly and const. This is a meaningless combination. Just make it only const. This fixes the LTO build with numachip enabled. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1411533139-25708-1-git-send-email-andi@firstfloor.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ben Hutchings 提交于
It is currently possible to execve() an x32 executable on an x86_64 kernel that has only ia32 compat enabled. However all its syscalls will fail, even _exit(). This usually causes it to segfault. Change the ELF compat architecture check so that x32 executables are rejected if we don't support the x32 ABI. Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Link: http://lkml.kernel.org/r/1410120305.6822.9.camel@decadent.org.uk Cc: stable@vger.kernel.org Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Bryan O'Donoghue 提交于
Intel processors which don't report cache information via cpuid(2) or cpuid(4) need quirk code in the legacy_cache_size callback to report this data. For Intel that callback is is intel_size_cache(). This patch enables calling of cpu_detect_cache_sizes() inside of init_intel() and hence the calling of the legacy_cache callback in intel_size_cache(). Adding this call will ensure that PIII Tualatin currently in intel_size_cache() and Quark SoC X1000 being added to intel_size_cache() in this patch will report their respective cache sizes. This model of calling cpu_detect_cache_sizes() is consistent with AMD/Via/Cirix/Transmeta and Centaur. Also added is a string to idenitfy the Quark as Quark SoC X1000 giving better and more descriptive output via /proc/cpuinfo Adding cpu_detect_cache_sizes to init_intel() will enable calling of intel_size_cache() on Intel processors which currently no code can reach. Therefore this patch will also re-enable reporting of PIII Tualatin cache size information as well as add Quark SoC X1000 support. Comment text and cache flow logic suggested by Thomas Gleixner Signed-off-by: NBryan O'Donoghue <pure.logic@nexus-software.ie> Cc: davej@redhat.com Cc: hmh@hmh.eng.br Link: http://lkml.kernel.org/r/1412641189-12415-3-git-send-email-pure.logic@nexus-software.ieSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Bryan O'Donoghue 提交于
Quark SoC X1000 advertises Page Global Enable for it's Translation Lookaside Buffer via cpuid. The silicon does not in fact support PGE and hence will not flush the TLB when CR4.PGE is rewritten. The Quark documentation makes clear the necessity to instead rewrite CR3 in order to flush any TLB entries, irrespective of the state of CR4.PGE or an individual PTE.PGE See Intel Quark Core DevMan_001.pdf section 6.4.11 In setup.c setup_arch() the code will load_cr3() and then do a __flush_tlb_all(). On Quark the entire TLB will be flushed at the load_cr3(). The __flush_tlb_all() have no effect and can be safely ignored. Later on in the boot process we switch off the flag for cpu_has_pge() which means that subsequent calls to __flush_tlb_all() will call __flush_tlb() not __flush_tlb_global() flushing the TLB in the correct way via load_cr3() not CR4.PGE rewrite This patch documents the behaviour of flushing the TLB for Quark in setup_arch() Comment text suggested by Thomas Gleixner Signed-off-by: NBryan O'Donoghue <pure.logic@nexus-software.ie> Cc: davej@redhat.com Cc: hmh@hmh.eng.br Link: http://lkml.kernel.org/r/1412641189-12415-2-git-send-email-pure.logic@nexus-software.ieSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jan Beulich 提交于
- don't include unneeded headers - drop redundant entry point label - complete unwind annotations - use .L prefix on local labels to not clutter the symbol table Signed-off-by: NJan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/5422917E0200007800038081@mail.emea.novell.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jan Beulich 提交于
- don't include unneeded headers - don't open-code PER_CPU_VAR() - drop redundant entry point label - complete unwind annotations - use .L prefix on local label to not clutter the symbol table Signed-off-by: NJan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/542290BC020000780003807D@mail.emea.novell.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 07 10月, 2014 1 次提交
-
-
由 Andy Lutomirski 提交于
The NT flag doesn't do anything in long mode other than causing IRET to #GP. Oddly, CPL3 code can still set NT using popf. Entry via hardware or software interrupt clears NT automatically, so the only relevant entries are fast syscalls. If user code causes kernel code to run with NT set, then there's at least some (small) chance that it could cause trouble. For example, user code could cause a call to EFI code with NT set, and who knows what would happen? Apparently some games on Wine sometimes do this (!), and, if an IRET return happens, they will segfault. That segfault cannot be handled, because signal delivery fails, too. This patch programs the CPU to clear NT on entry via SYSCALL (both 32-bit and 64-bit, by my reading of the AMD APM), and it clears NT in software on entry via SYSENTER. To save a few cycles, this borrows a trick from Jan Beulich in Xen: it checks whether NT is set before trying to clear it. As a result, it seems to have very little effect on SYSENTER performance on my machine. There's another minor bug fix in here: it looks like the CFI annotations were wrong if CONFIG_AUDITSYSCALL=n. Testers beware: on Xen, SYSENTER with NT set turns into a GPF. I haven't touched anything on 32-bit kernels. The syscall mask change comes from a variant of this patch by Anish Bhatt. Note to stable maintainers: there is no known security issue here. A misguided program can set NT and cause the kernel to try and fail to deliver SIGSEGV, crashing the program. This patch fixes Far Cry on Wine: https://bugs.winehq.org/show_bug.cgi?id=33275 Cc: <stable@vger.kernel.org> Reported-by: NAnish Bhatt <anish@chelsio.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/395749a5d39a29bd3e4b35899cf3a3c1340e5595.1412189265.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 06 10月, 2014 1 次提交
-
-
由 Mukesh Rathor 提交于
This fixes two bugs in PVH guests: - Not setting EFER.NX means the NX bit in page table entries is ignored on Intel processors and causes reserved bit page faults on AMD processors. - After the Xen commit 7645640d6ff1 ("x86/PVH: don't set EFER_SCE for pvh guest") PVH guests are required to set EFER.SCE to enable the SYSCALL instruction. Secondary VCPUs are started with pagetables with the NX bit set so EFER.NX must be set before using any stack or data segment. xen_pvh_cpu_early_init() is the new secondary VCPU entry point that sets EFER before jumping to cpu_bringup_and_idle(). Signed-off-by: NMukesh Rathor <mukesh.rathor@oracle.com> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
- 03 10月, 2014 6 次提交
-
-
由 Juergen Gross 提交于
Size restrictions native kernels wouldn't have resulted from the initrd getting mapped into the initial mapping. The kernel doesn't really need the initrd to be mapped, so use infrastructure available in Xen to avoid the mapping and hence the restriction. Signed-off-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
由 Pranith Kumar 提交于
Use the much more reader friendly ACCESS_ONCE() instead of the cast to volatile. This is purely a stylistic change. Signed-off-by: NPranith Kumar <bobby.prani@gmail.com> Acked-by: NJesper Nilsson <jesper.nilsson@axis.com> Acked-by: NHans-Christian Egtvedt <egtvedt@samfundet.no> Acked-by: NMax Filippov <jcmvbkbc@gmail.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/1411482607-20948-1-git-send-email-bobby.prani@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Wei Huang 提交于
PMU checking can fail due to various reasons. On native machine, this is mostly caused by faulty hardware and it is reasonable to use KERN_ERR in reporting. However, when kernel is running on virtualized environment, this checking can fail if virtual PMU is not supported (e.g. KVM on AMD host). It is annoying to see an error message on splash screen, even though we know such failure is benign on virtualized environment. This patch checks if the kernel is running in a virtualized environment. If so, it will use KERN_INFO in reporting, which reduces the syslog priority of them. This patch was tested successfully on KVM. Signed-off-by: NWei Huang <wei@redhat.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: http://lkml.kernel.org/r/1411617314-24659-1-git-send-email-wei@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
I was looking for the trinity oops cause in the uncore driver. (so far didn't found it) However I found this tiny race: when a box is set up two threads on the same CPU, they may be setting up the box in parallel (e.g. with kernel preemption). This could lead to the reference count being increasing too much. Always recheck there is no existing cpu reference inside the lock. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: eranian@google.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: http://lkml.kernel.org/r/1411424826-15629-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Hansen 提交于
Commit: cebf15eb ("x86, sched: Add new topology for multi-NUMA-node CPUs") some code to try to detect the situation where we have a NUMA node inside of the "DIE" sched domain. It detected this by looking for cpus which match_die() but do not match NUMA nodes via topology_same_node(). I wrote it up as: if (match_die(c, o) == !topology_same_node(c, o)) which actually seemed to work some of the time, albiet accidentally. It should have been doing an &&, not an ==. This code essentially chopped off the "DIE" domain on one of Andrew Morton's systems. He reported that this patch fixed his issue. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reported-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Dave Hansen <dave@sr71.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Lan Tianyu <tianyu.lan@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/20140930214546.FD481CFF@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Paolo Bonzini 提交于
This fixes the following OOPS: loaded kvm module (v3.17-rc1-168-gcec26bc3) BUG: unable to handle kernel paging request at fffffffffffffffe IP: [<ffffffff81168449>] put_page+0x9/0x30 PGD 1e15067 PUD 1e17067 PMD 0 Oops: 0000 [#1] PREEMPT SMP [<ffffffffa063271d>] ? kvm_vcpu_reload_apic_access_page+0x5d/0x70 [kvm] [<ffffffffa013b6db>] vmx_vcpu_reset+0x21b/0x470 [kvm_intel] [<ffffffffa0658816>] ? kvm_pmu_reset+0x76/0xb0 [kvm] [<ffffffffa064032a>] kvm_vcpu_reset+0x15a/0x1b0 [kvm] [<ffffffffa06403ac>] kvm_arch_vcpu_setup+0x2c/0x50 [kvm] [<ffffffffa062e540>] kvm_vm_ioctl+0x200/0x780 [kvm] [<ffffffff81212170>] do_vfs_ioctl+0x2d0/0x4b0 [<ffffffff8108bd99>] ? __mmdrop+0x69/0xb0 [<ffffffff812123d1>] SyS_ioctl+0x81/0xa0 [<ffffffff8112a6f6>] ? __audit_syscall_exit+0x1f6/0x2a0 [<ffffffff817229e9>] system_call_fastpath+0x16/0x1b Code: c6 78 ce a3 81 4c 89 e7 e8 d9 80 ff ff 0f 0b 4c 89 e7 e8 8f f6 ff ff e9 fa fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 <48> f7 07 00 c0 00 00 55 48 89 e5 75 1e 8b 47 1c 85 c0 74 27 f0 RIP [<ffffffff81193045>] put_page+0x5/0x50 when not using the in-kernel irqchip ("-machine kernel_irqchip=off" with QEMU). The fix is to make the same check in kvm_vcpu_reload_apic_access_page that we already have in vmx.c's vm_need_virtualize_apic_accesses(). Reported-by: NJan Kiszka <jan.kiszka@siemens.com> Tested-by: NJan Kiszka <jan.kiszka@siemens.com> Fixes: 4256f43fSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 02 10月, 2014 4 次提交
-
-
由 Mathias Krause 提交于
This reverts commit 7da4b29d. Now, that the issue is fixed, we can re-enable the code. Signed-off-by: NMathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Mathias Krause 提交于
The defines for xkey3, xkey6 and xkey9 are not used in the code. They're probably left overs from merging the three source files for 128, 192 and 256 bit AES. They can safely be removed. Signed-off-by: NMathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Mathias Krause 提交于
The "by8" CTR AVX implementation fails to propperly handle counter overflows. That was the reason it got disabled in commit 7da4b29d ("crypto: aesni - disable "by8" AVX CTR optimization"). Fix the overflow handling by incrementing the counter block as a double quad word, i.e. a 128 bit, and testing for overflows afterwards. We need to use VPTEST to do so as VPADD* does not set the flags itself and silently drops the carry bit. As this change adds branches to the hot path, minor performance regressions might be a side effect. But, OTOH, we now have a conforming implementation -- the preferable goal. A tcrypt test on a SandyBridge system (i7-2620M) showed almost identical numbers for the old and this version with differences within the noise range. A dm-crypt test with the fixed version gave even slightly better results for this version. So the performance impact might not be as big as expected. Tested-by: NRomain Francoise <romain@orebokech.com> Signed-off-by: NMathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Kees Cook 提交于
Building 32-bit threw a warning on kASLR enabled builds: arch/x86/boot/compressed/aslr.c: In function ‘mem_avoid_overlap’: arch/x86/boot/compressed/aslr.c:198:17: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] avoid.start = (u64)ptr; ^ This fixes the warning; unsigned long should have been used here. Signed-off-by: NKees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20141001183632.GA11431@www.outflux.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 27 9月, 2014 1 次提交
-
-
由 Alexei Starovoitov 提交于
done as separate commit to ease conflict resolution Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 9月, 2014 1 次提交
-
-
由 Matt Fleming 提交于
If we're executing the 32-bit efi_char16_printk() code path (i.e. running on top of 32-bit firmware) we know that efi_early->text_output will be a 32-bit value, even though ->text_output has type u64. Unfortunately, we currently pass ->text_output directly to efi_early->call() so for CONFIG_X86_32 the compiler will push a 64-bit value onto the stack, causing the other parameters to be misaligned. The way we handle this in the rest of the EFI boot stub is to pass pointers as arguments to efi_early->call(), which automatically do the right thing (pointers are 32-bit on CONFIG_X86_32, and we simply ignore the upper 32-bits of the argument register if running in 64-bit mode with 32-bit firmware). This fixes a corruption bug when printing strings from the 32-bit EFI boot stub. Link: https://bugzilla.kernel.org/show_bug.cgi?id=84241Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
- 24 9月, 2014 3 次提交
-
-
由 Ben Hutchings 提交于
per_cpu_load_addr is only used for 64-bit relocations, but is declared in both configurations of relocs.c - with different types. This has undefined behaviour in general. GNU ld is documented to use the larger size in this case, but other tools may differ and some warn about this. References: https://bugs.debian.org/748577Reported-by: NMichael Tautschnig <mt@debian.org> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Cc: 748577@bugs.debian.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1411561812.3659.23.camel@decadent.org.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
Trivial. We have "lib-y += thunk_$(BITS).o" at the start, no need to add thunk_64.o if !CONFIG_X86_32. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NAndy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140921184232.GB23727@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
___preempt_schedule() does SAVE_ALL/RESTORE_ALL but this is suboptimal, we do not need to save/restore the callee-saved register. And we already have arch/x86/lib/thunk_*.S which implements the similar asm wrappers, so it makes sense to redefine ___preempt_schedule() as "THUNK ..." and remove preempt.S altogether. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NAndy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140921184153.GA23727@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-