- 18 Dec 2014, 1 commit
-
-
Committed by Linus Torvalds
My commit 26178ec1 ("x86: mm: consolidate VM_FAULT_RETRY handling") had a really stupid typo: the FAULT_FLAG_USER bit is in the 'flags' variable, not the 'fault' variable. Duh.

The one silver lining in this is that Dave finding this at least confirms that trinity actually triggers this special path easily, in a way normal use does not.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 Dec 2014, 2 commits
-
-
Committed by Linus Torvalds
The VM_FAULT_RETRY handling was confusing and incorrect for the case of returning to kernel mode. We need to handle the exception table fixup if we return to kernel mode due to a fatal signal - it will basically look, to the kernel's user-mode access, like the access failed due to the VM going away from under it. Which is correct - the process is dying - and avoids the whole "repeat endless kernel page faults" case.

Handling VM_FAULT_RETRY early and in just one place also simplifies the mmap_sem handling, since once we've taken care of VM_FAULT_RETRY we know that we can just drop the lock. The remaining accounting and possible error handling is thread-local and does not need the mmap_sem.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Linus Torvalds
This replaces four copies in various stages of mm_fault_error() handling with just a single one. It will also allow for more natural placement of the unlocking after some further cleanup.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 14 Dec 2014, 5 commits
-
-
Committed by Andy Lutomirski
Users have no business installing custom code segments into the GDT, and segments that are not present but are otherwise valid are a historical source of interesting attacks.

For completeness, block attempts to set the L bit. (Prior to this patch, the L bit would have been silently dropped.)

This is an ABI break. I've checked glibc, musl, and Wine, and none of them look like they'll have any trouble.

Note to stable maintainers: this is a hardening patch that fixes no known bugs. Given the possibility of ABI issues, this probably shouldn't be backported quickly.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: stable@vger.kernel.org # optional
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: security@kernel.org <security@kernel.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
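As an illustration only (not the patch itself), a validation of the shape described above could look like the sketch below. The field names follow struct user_desc from <asm/ldt.h>; the exact set of checks, and how an empty "clear this slot" descriptor is special-cased, are assumptions here.

    /* sketch: reject descriptors with the properties called out above */
    static bool tls_desc_okay(const struct user_desc *info)
    {
            if (LDT_empty(info))            /* clearing a slot stays allowed */
                    return true;
            if (info->seg_not_present)      /* no present-bit games */
                    return false;
            if (info->contents >= 2)        /* no custom code segments */
                    return false;
    #ifdef CONFIG_X86_64
            if (info->lm)                   /* the L bit must not be set */
                    return false;
    #endif
            return true;
    }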
-
Committed by Andy Lutomirski
Installing a 16-bit RW data segment into the GDT defeats espfix. AFAICT this will not affect glibc, Wine, or dosemu at all.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: stable@vger.kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: security@kernel.org <security@kernel.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Riku Voipio
Following the suggestions from Andrew Morton and Stephen Rothwell, don't expand the ARCH list in kernel/gcov/Kconfig. Instead, define an ARCH_HAS_GCOV_PROFILE_ALL bool which architectures can enable. Set ARCH_HAS_GCOV_PROFILE_ALL on the architectures where it was previously allowed, plus ARM64, which I tested.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by David Drysdale
Hook up the x86-64, i386 and x32 ABIs.

Signed-off-by: David Drysdale <drysdale@google.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: Shuah Khan <shuah.kh@samsung.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rich Felker <dalias@aerifal.cx>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joonsoo Kim
We are now prepared to avoid using debug-pagealloc at boot time, so introduce a new kernel parameter to disable debug-pagealloc at boot and disable the related functions in that case. The only non-intuitive part is the change to the guard page functions: because guard pages are effective only if debug-pagealloc is enabled, turning them off along with debug-pagealloc is the reasonable thing to do.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
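For illustration only, wiring up such a boot parameter typically looks like the sketch below. The parameter name matches the description; the variable and function names, and the exact accepted values, are assumptions, not the patch's code.

    #include <linux/init.h>
    #include <linux/string.h>

    static bool __read_mostly _debug_pagealloc_enabled;

    /* sketch: accept "debug_pagealloc=on" on the kernel command line */
    static int __init early_debug_pagealloc(char *buf)
    {
            if (!buf)
                    return -EINVAL;
            if (strcmp(buf, "on") == 0)
                    _debug_pagealloc_enabled = true;
            return 0;
    }
    early_param("debug_pagealloc", early_debug_pagealloc);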
-
- 12 Dec 2014, 3 commits
-
-
Committed by Steven Rostedt (Red Hat)
The parameters for prepare_ftrace_return() used by the function graph tracer were swapped to simplify the code on x86_64, but the i386 function graph trampoline also calls this function, and it did not have its parameters swapped.

Link: http://lkml.kernel.org/r/20141210231732.GA24163@wfg-t540p.sh.intel.com
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Fixes: 6a06bdbf ("ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Alexander Duyck
There are a number of situations where the mandatory barriers rmb() and wmb() are used to order memory/memory operations in device drivers, and those barriers are much heavier than they actually need to be. For example, in the case of PowerPC, wmb() calls the heavy-weight sync instruction when for coherent memory operations all that is really needed is an lwsync or eieio instruction.

This commit adds coherent-only versions of the mandatory memory barriers rmb() and wmb(). In most cases this should result in the barrier being the same as the SMP barriers for the SMP case; however, in some cases we use a barrier that is somewhere in between rmb() and smp_rmb(). For example, on ARM the rmb barriers break down as follows:

  Barrier     Call        Explanation
  ---------   ---------   ----------------------------------------
  rmb()       dsb()       Data synchronization barrier - system
  dma_rmb()   dmb(osh)    Data memory barrier - outer shareable
  smp_rmb()   dmb(ish)    Data memory barrier - inner shareable

These new barriers are not as safe as the standard rmb() and wmb(). Specifically, they do not guarantee ordering between coherent and incoherent memories. The primary use case for these would be to enforce ordering of reads and writes when accessing coherent memory that is shared between the CPU and a device.

It may also be noted that there is no dma_mb(). Most architectures don't provide a good mechanism for performing a coherent-only full barrier without resorting to the same mechanism used in mb(). As such, there isn't much to be gained in trying to define such a function.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
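A hedged example of the intended use case: a driver polling a descriptor that a device updates through coherent DMA. The descriptor layout and function names below are made up for illustration; only dma_rmb()/dma_wmb() are the interfaces the commit introduces.

    #include <linux/types.h>
    #include <linux/compiler.h>
    #include <asm/barrier.h>

    struct rx_desc {                 /* hypothetical device descriptor */
            u32 status;              /* device sets DESC_DONE when data is valid */
            u32 length;
    };
    #define DESC_DONE 0x1

    static int example_poll_rx(struct rx_desc *desc)
    {
            if (!(READ_ONCE(desc->status) & DESC_DONE))
                    return 0;

            /*
             * Order the status read against the following payload reads.
             * Both live in coherent memory shared with the device, so the
             * lighter dma_rmb() is enough here; a full rmb() is not needed.
             */
            dma_rmb();

            return desc->length;
    }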
-
Committed by Alexander Duyck
This patch is meant to clean up the handling of read_barrier_depends and smp_read_barrier_depends. In multiple spots in the kernel headers read_barrier_depends is defined as "do {} while (0)"; however, we then go into the SMP vs. non-SMP sections and have the SMP version reference read_barrier_depends while the non-SMP version defines it as yet another empty do/while.

With this commit I went through and cleaned out the duplicate definitions and reduced the number of definitions down to 2 per header. In addition, I moved the 50-line comment for the macro from the x86 and mips headers, which define it as an empty do/while, to the headers that actually define the macro: alpha and blackfin.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
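The resulting per-header layout, as described, reduces to two definitions. A sketch of that shape is below; the empty expansion is the x86/mips case, and treating it as representative of other architectures is an assumption (alpha, for instance, needs a real barrier here).

    /* architectures with no address-dependency ordering requirement */
    #define read_barrier_depends()          do { } while (0)

    #ifdef CONFIG_SMP
    #define smp_read_barrier_depends()      read_barrier_depends()
    #else
    #define smp_read_barrier_depends()      do { } while (0)
    #endif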
-
- 11 Dec 2014, 8 commits
-
-
Committed by Juergen Gross
With the virtually mapped linear p2m list, the post-init mmu operations must be used for setting up the p2m mappings, as in the case of CONFIG_FLATMEM the init routines may trigger BUGs.

paging_init() sets up all the infrastructure needed to switch to the post-init mmu ops done by xen_post_allocator_init(). With the virtually mapped linear p2m list we need some mmu ops during setup of this list, so we have to switch to the correct mmu ops as soon as possible.

The p2m list is usable from the beginning; only expansion requires the new linear mapping to have been established. So the call to xen_remap_memory() had to be introduced, but this is not because the mmu ops require it.

Summing it up: calling xen_post_allocator_init() not directly after paging_init() was conceptually wrong from the beginning; it just didn't matter until now, as no functions used between the two calls needed any critical mmu ops (e.g. alloc_pte). This has changed now, so I corrected it.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
-
Committed by Borislav Petkov
Those are identical on 32-bit and 64-bit, so unify them. No functional change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1418127959-29902-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Borislav Petkov
Sometimes it is helpful to build a kernel compilation unit directly, i.e.:

  make .../<filename>.i

in order to look at the compiler output. Since asm-offsets_{32,64}.c are included by asm-offsets.c and building them directly doesn't evaluate the macros used (making the preprocessor output not very useful), error out when an attempt is made to build them, and issue a hint for the user to build asm-offsets.c instead.

Suggested-by: Michael Matz <matz@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1418139917-12722-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Andy Lutomirski
Otherwise, if buggy user code points DS or ES into the TLS array, they would be corrupted after a context switch. This also significantly improves the comments and documents some gotchas in the code.

Before this patch, both tests below failed. With this patch, the es test passes, although the gsbase test still fails.

----- begin es test -----
/*
 * Copyright (c) 2014 Andy Lutomirski
 * GPL v2
 */

static unsigned short GDT3(int idx)
{
    return (idx << 3) | 3;
}

static int create_tls(int idx, unsigned int base)
{
    struct user_desc desc = {
        .entry_number    = idx,
        .base_addr       = base,
        .limit           = 0xfffff,
        .seg_32bit       = 1,
        .contents        = 0, /* Data, grow-up */
        .read_exec_only  = 0,
        .limit_in_pages  = 1,
        .seg_not_present = 0,
        .useable         = 0,
    };

    if (syscall(SYS_set_thread_area, &desc) != 0)
        err(1, "set_thread_area");

    return desc.entry_number;
}

int main()
{
    int idx = create_tls(-1, 0);
    printf("Allocated GDT index %d\n", idx);

    unsigned short orig_es;
    asm volatile ("mov %%es,%0" : "=rm" (orig_es));

    int errors = 0;
    int total = 1000;
    for (int i = 0; i < total; i++) {
        asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
        usleep(100);

        unsigned short es;
        asm volatile ("mov %%es,%0" : "=rm" (es));
        asm volatile ("mov %0,%%es" : : "rm" (orig_es));
        if (es != GDT3(idx)) {
            if (errors == 0)
                printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
                       GDT3(idx), es);
            errors++;
        }
    }

    if (errors) {
        printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
        return 1;
    } else {
        printf("[OK]\tES was preserved\n");
        return 0;
    }
}
----- end es test -----

----- begin gsbase test -----
/*
 * gsbase.c, a gsbase test
 * Copyright (c) 2014 Andy Lutomirski
 * GPL v2
 */

static unsigned char *testptr, *testptr2;

static unsigned char read_gs_testvals(void)
{
    unsigned char ret;
    asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
    return ret;
}

int main()
{
    int errors = 0;

    testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
    if (testptr == MAP_FAILED)
        err(1, "mmap");

    testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
    if (testptr2 == MAP_FAILED)
        err(1, "mmap");

    *testptr = 0;
    *testptr2 = 1;

    if (syscall(SYS_arch_prctl, ARCH_SET_GS,
                (unsigned long)testptr2 - (unsigned long)testptr) != 0)
        err(1, "ARCH_SET_GS");

    usleep(100);

    if (read_gs_testvals() == 1) {
        printf("[OK]\tARCH_SET_GS worked\n");
    } else {
        printf("[FAIL]\tARCH_SET_GS failed\n");
        errors++;
    }

    asm volatile ("mov %0,%%gs" : : "r" (0));

    if (read_gs_testvals() == 0) {
        printf("[OK]\tWriting 0 to gs worked\n");
    } else {
        printf("[FAIL]\tWriting 0 to gs failed\n");
        errors++;
    }

    usleep(100);

    if (read_gs_testvals() == 0) {
        printf("[OK]\tgsbase is still zero\n");
    } else {
        printf("[FAIL]\tgsbase was corrupted\n");
        errors++;
    }

    return errors == 0 ? 0 : 1;
}
----- end gsbase test -----

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: <stable@vger.kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Xishi Qiu
The types of MAX_DMA_PFN and the various *_pfn variables are both unsigned long now, so use min() instead of min_t().

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Linux MM <linux-mm@kvack.org>
Cc: <dave@sr71.net>
Cc: Rik van Riel <riel@redhat.com>
Link: http://lkml.kernel.org/r/5487AB3F.7050807@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Xishi Qiu
This is the usual physical memory layout boot printout:

...
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
[    0.000000]   Normal   [mem 0x100000000-0xc3fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
[    0.000000]   node   0: [mem 0x00100000-0xbf78ffff]
[    0.000000]   node   0: [mem 0x100000000-0x63fffffff]
[    0.000000]   node   1: [mem 0x640000000-0xc3fffffff]
...

This is the log when we set "mem=2G" on the boot cmdline:

...
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]   // should be 0x7fffffff, right?
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
...

This patch fixes the printout; the following log shows the correct ranges:

...
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0x7fffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
...

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Linux MM <linux-mm@kvack.org>
Cc: <dave@sr71.net>
Cc: Rik van Riel <riel@redhat.com>
Link: http://lkml.kernel.org/r/5487AB3D.6070306@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Kirill A. Shutemov
Like the small zero page, the huge zero page should not be accounted in the smaps report as a normal page. For small pages we rely on vm_normal_page() to filter out the zero page, but vm_normal_page() is not designed to handle pmds. We only get there due to a hackish cast of pmd to pte in smaps_pte_range() -- the pte and pmd formats are not necessarily compatible on each and every architecture.

Let's add a separate codepath to handle pmds. follow_trans_huge_pmd() will detect the huge zero page for us. We need a pmd_dirty() helper to do this properly. The patch adds it to the THP-enabled architectures which don't yet have one.

[akpm@linux-foundation.org: use do_div to fix 32-bit build]
Signed-off-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengwei Yin <yfw.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
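On architectures that model a pmd much like a pte, the missing helper can plausibly be a thin wrapper; the following is a sketch under that assumption (it presupposes a pmd_pte() accessor) and not necessarily what the patch adds on each architecture.

    /* sketch: reuse the pte accessor where pmd_pte() is available */
    static inline int pmd_dirty(pmd_t pmd)
    {
            return pte_dirty(pmd_pte(pmd));
    }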
-
Committed by Daniel Borkmann
As there are now no remaining users of arch_fast_hash(), let's kill it entirely.

This basically reverts commit 71ae8aac ("lib: introduce arch optimized hash library") and follow-up work, that is, e.g., commit 23721754 ("lib: hash: follow-up fixups for arch hash"), commit e3fec2f7 ("lib: Add missing arch generic-y entries for asm-generic/hash.h") and, last but not least, commit 6a02652d ("perf tools: Fix include for non x86 architectures").

Cc: Francesco Fusco <fusco@ntop.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Dec 2014, 3 commits
-
-
Committed by Borislav Petkov
I'm such a moron! The simple solution of saving the BSP patch for use on resume was too simple (and wrong!); hint: sizeof(struct microcode_intel).

What needs to be done instead is to fish out the microcode patch we have stashed previously and apply that on the BSP in case the late loader hasn't been utilized. So do that instead.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20141208110820.GB20057@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Joe Perches
Let the compiler decide instead. No change in object size (x86-64, -O2, no profiling).

Signed-off-by: Joe Perches <joe@perches.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Joe Perches
Use the (1 << reg) & mask trick to reduce code size.

x86-64 size difference, -O2 without profiling, for various gcc versions:

$ size arch/x86/net/bpf_jit_comp.o*
   text   data   bss    dec    hex   filename
   9266      4     0   9270   2436   arch/x86/net/bpf_jit_comp.o.4.4.new
  10042      4     0  10046   273e   arch/x86/net/bpf_jit_comp.o.4.4.old
   9109      4     0   9113   2399   arch/x86/net/bpf_jit_comp.o.4.6.new
   9717      4     0   9721   25f9   arch/x86/net/bpf_jit_comp.o.4.6.old
   8789      4     0   8793   2259   arch/x86/net/bpf_jit_comp.o.4.7.new
  10245      4     0  10249   2809   arch/x86/net/bpf_jit_comp.o.4.7.old
   9671      4     0   9675   25cb   arch/x86/net/bpf_jit_comp.o.4.9.new
  10679      4     0  10683   29bb   arch/x86/net/bpf_jit_comp.o.4.9.old

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
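The trick itself is ordinary C and can be shown in isolation: membership in a small fixed set of register numbers becomes one shift and one AND instead of a chain of comparisons. The register numbers below are arbitrary placeholders, not the JIT's actual set.

    #include <stdbool.h>

    /* before: a chain of comparisons */
    static bool in_set_cmp(unsigned int reg)
    {
            return reg == 3 || reg == 5 || reg == 7 || reg == 9;
    }

    /* after: a single mask test -- assumes reg is known to be < 32 */
    static bool in_set_mask(unsigned int reg)
    {
            return (1u << reg) & ((1u << 3) | (1u << 5) | (1u << 7) | (1u << 9));
    }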
-
- 08 Dec 2014, 6 commits
-
-
Committed by Richard Weinberger
systemd has a hard dependency on CONFIG_FHANDLE. If you run systemd with CONFIG_FHANDLE=n it will somehow boot but fail to spawn a getty or other basic services. As systemd is now used by most x86 distributions, it makes sense to enable this by default and save kernel hackers a lot of valuable debugging time.

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: gregkh@linuxfoundation.org
Cc: rafael.j.wysocki@intel.com
Cc: pebolle@tiscali.nl
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1416958612-7448-1-git-send-email-richard@nod.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Juergen Gross
Commit 5b8e7d80 removed the __init annotation from xen_set_identity_and_remap_chunk(). Add it again.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
-
Committed by Juergen Gross
Introduce two helper functions to safely read and write unsigned long values from or to memory when the access may fault because the mapping is non-present or read-only.

These helpers can be used instead of open-coded uses of __get_user() and __put_user(), avoiding the need for casts to fix sparse warnings.

Use the helpers in page.h and p2m.c. This will fix the sparse warnings when doing "make C=1".

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
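A minimal sketch of what such helpers can look like, hiding the __user cast in one place so callers stay clean; the names mirror the description, but the exact signatures in the patch are assumptions.

    static inline int xen_safe_write_ulong(unsigned long *addr, unsigned long val)
    {
            /* the access may fault; __put_user() returns -EFAULT in that case */
            return __put_user(val, (unsigned long __user *)addr);
    }

    static inline int xen_safe_read_ulong(const unsigned long *addr,
                                          unsigned long *val)
    {
            return __get_user(*val, (unsigned long __user *)addr);
    }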
-
Committed by Rasmus Villemoes
seq_puts() is a lot cheaper than seq_printf(), so use it to print literal strings.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: http://lkml.kernel.org/r/1417208622-12264-1-git-send-email-linux@rasmusvillemoes.dk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
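The change pattern is mechanical; for a literal string the format-string machinery of seq_printf() buys nothing. The string below is a placeholder, not a line from the affected file.

    #include <linux/seq_file.h>

    static void example_show(struct seq_file *m)
    {
            seq_printf(m, "literal status line\n"); /* before: parses the format */
            seq_puts(m, "literal status line\n");   /* after: plain copy, cheaper */
    }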
-
Committed by Oleg Nesterov
arch_spin_unlock_wait() looks very suboptimal, to the point where I think it is simply wrong and can lead to livelock: if the lock is heavily contended we may never see head == tail.

But we do not need to wait for arch_spin_is_locked() == F. If it is locked, we only need to wait until the current owner drops the lock. So we could simply spin until old_head != lock->tickets.head in this case, but .head can overflow and thus we can't check "unlocked" only once before the main loop.

Also, the "unlocked" check can ignore the TICKET_SLOWPATH_FLAG bit.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <Waiman.Long@hp.com>
Link: http://lkml.kernel.org/r/20141201213417.GA5842@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
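A rough sketch of the loop the text describes: sample the head once, then spin until the lock is free or the head has moved past the owner we saw. The ticket-lock field names follow the x86 layout, but the sketch is simplified (it reads head and tail separately and ignores the TICKET_SLOWPATH_FLAG masking discussed above).

    static void example_spin_unlock_wait(arch_spinlock_t *lock)
    {
            __ticket_t old_head = READ_ONCE(lock->tickets.head);

            for (;;) {
                    __ticket_t head = READ_ONCE(lock->tickets.head);
                    __ticket_t tail = READ_ONCE(lock->tickets.tail);

                    if (head == tail)       /* currently unlocked */
                            break;
                    if (head != old_head)   /* the owner we saw has released it */
                            break;
                    cpu_relax();
            }
    }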
-
Committed by Borislav Petkov
We need the additional "k" to make it a hard c: https://en.wiktionary.org/wiki/panicked

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1417642605-15730-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 06 Dec 2014, 4 commits
-
-
Committed by Borislav Petkov
Normally, we do reapply microcode on resume. However, in the cases where that microcode comes from the early loader and the late loader hasn't been utilized yet, there's no easy way for us to go and apply the patch that the early loader applied during boot. Thus, reuse the patch stashed by the early loader for the BSP.

Signed-off-by: Borislav Petkov <bp@suse.de>
-
Committed by Boris Ostrovsky
Paravirtual guests are not expected to load microcode into processors, and therefore it is not necessary to initialize the microcode loading logic. In fact, under certain circumstances initializing this logic may cause the guest to crash. Specifically, 32-bit kernels use the __pa_nodebug() macro, which does not work in Xen (the code path that leads to this macro happens during resume, when we call mc_bp_resume()->load_ucode_ap()->check_loader_disabled_ap()).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1417469264-31470-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
-
Committed by Borislav Petkov
apply_microcode_early() doesn't use mc_saved_data, so kill it.

Signed-off-by: Borislav Petkov <bp@suse.de>
-
Committed by Alexei Starovoitov
Classic BPF has the restriction that the last insn is always BPF_RET. eBPF doesn't have a BPF_RET instruction or this restriction; it has a BPF_EXIT insn which can appear anywhere in the program, one or more times, and it doesn't have to be the last insn.

Fix the eBPF JIT to emit the epilogue when the first BPF_EXIT is seen, and emit all other BPF_EXIT instructions as jumps. Since the jump offset to the epilogue is computed as:

  jmp_offset = ctx->cleanup_addr - addrs[i]

we need to change the type of cleanup_addr to signed, so that the offset is computed as:

  (long long) ((int)20 - (int)30)

instead of:

  (long long) ((unsigned int)20 - (int)30)

Fixes: 62258278 ("net: filter: x86: internal BPF JIT")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
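The type issue is plain C integer promotion and can be demonstrated standalone; this small program is illustrative only.

    #include <stdio.h>

    int main(void)
    {
            int cleanup_addr = 20, addr = 30;

            /* signed operands: the expected small negative offset, -10 */
            long long good = (long long)(cleanup_addr - addr);

            /* an unsigned operand drags the subtraction into unsigned
             * arithmetic, yielding a huge positive value instead */
            long long bad = (long long)((unsigned int)cleanup_addr - addr);

            printf("signed: %lld, unsigned: %lld\n", good, bad);
            return 0;
    }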
-
- 04 Dec 2014, 8 commits
-
-
Committed by Juergen Gross
Instead of checking at each call of set_phys_to_machine() whether a new p2m page has to be allocated due to writing an entry in a large invalid or identity area, just map those areas read-only and react to a page fault on write by allocating the new page.

This change makes the common path with no allocation much faster, as it only requires a single write of the new mfn instead of walking the address translation tables and checking for the special cases.

Suggested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
-
Committed by Juergen Gross
At start of day the Xen hypervisor presents a contiguous mfn list to a pv-domain. In order to support sparse memory, this mfn list is accessed via a three-level p2m tree built early in the boot process. Whenever the system needs the mfn associated with a pfn, this tree is used to find the mfn.

Instead of using a software-walked tree for accessing a specific mfn list entry, this patch creates a virtual address area for the entire possible mfn list, including memory holes. The holes are covered by mapping a pre-defined page consisting only of "invalid mfn" entries. Access to an mfn entry is possible by just using the virtual base address of the mfn list and the pfn as an index into that list. This speeds up the (hot) path of determining the mfn of a pfn.

A kernel build on a Dell Latitude E6440 (2 cores, HT) in a 64-bit Dom0 showed the following improvements:

  Elapsed time: 32:50 -> 32:35
  System:       18:07 -> 17:47
  User:        104:00 -> 103:30

Tested with the following configurations:
- 64-bit dom0, 8 GB RAM
- 64-bit dom0, 128 GB RAM, PCI area above 4 GB
- 32-bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without the patch)
- 32-bit domU, ballooning up and down
- 32-bit domU, save and restore
- 32-bit domU with PCI passthrough
- 64-bit domU, 8 GB, 2049 MB, 5000 MB
- 64-bit domU, ballooning up and down
- 64-bit domU, save and restore
- 64-bit domU with PCI passthrough

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
-
Committed by Juergen Gross
Today get_phys_to_machine() is always called when the mfn for a pfn is to be obtained. Add a wrapper __pfn_to_mfn() as an inline function to be able to avoid calling get_phys_to_machine() when possible, as soon as the switch to a linearly mapped p2m list has been done.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
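A sketch of such a wrapper in its initial, trivial form, before the linear-list shortcut is wired in; the exact definition in the patch is an assumption.

    static inline unsigned long __pfn_to_mfn(unsigned long pfn)
    {
            /* for now simply forward; a later patch can short-circuit this
             * lookup through the virtually mapped linear p2m list */
            return get_phys_to_machine(pfn);
    }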
-
Committed by Juergen Gross
Introduce lookup_pmd_address() to get the address of the pmd entry related to a virtual address in the current address space. This function is needed to support a virtually mapped sparse p2m list in xen pv domains, as in that case we need the address of the pmd entry, not that of the pte.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
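A sketch of what such a lookup can look like with the page-table accessors of that era (no p4d level yet); treat it as an approximation of the patch, not a copy of it.

    pmd_t *lookup_pmd_address(unsigned long address)
    {
            pgd_t *pgd;
            pud_t *pud;

            pgd = pgd_offset_k(address);        /* kernel address space */
            if (pgd_none(*pgd))
                    return NULL;

            pud = pud_offset(pgd, address);
            if (pud_none(*pud))
                    return NULL;

            return pmd_offset(pud, address);    /* address of the pmd entry */
    }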
-
Committed by Juergen Gross
When the physical memory configuration is initialized, the p2m entries for not yet populated memory pages are set to "invalid". As those pages are beyond the hypervisor-built p2m list, the p2m tree has to be extended.

This patch delays processing the extra-memory-related p2m entries during the boot process until some more basic memory management functions are callable. This removes the need to create new p2m entries until virtual memory management is available.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
-
Committed by Juergen Gross
The m2p overrides are used to be able to find the local pfn for a foreign mfn mapped into the domain. They are used by driver backends that have to access frontend data.

As this functionality isn't used in early boot, it makes no sense to initialize the m2p override functions very early. It can be done later without doing any harm, removing the need to allocate memory via extend_brk().

While at it, make some m2p override functions static, as they are only used internally.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
-
Committed by Juergen Gross
Early in the boot process the memory layout of a pv-domain is changed to match the E820 map (either the host one for Dom0 or the Xen one) regarding the placement of RAM and PCI holes. This requires removing memory pages initially located at positions not suitable for RAM and adding them later at higher addresses where no restrictions apply.

To be able to operate on the hypervisor-supported p2m list until a virtually mapped linear p2m list can be constructed, remapping must be delayed until virtual memory management is initialized, as the initial p2m list can't be extended without limit at physical memory initialization time due to its fixed structure.

A further advantage is the reduction in complexity and code volume, as we don't have to be careful about memory restrictions during p2m updates.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
-
Committed by Juergen Gross
In arch/x86/xen/p2m.c three different allocation functions for obtaining a memory page are used: extend_brk(), alloc_bootmem_align() or __get_free_page(). Which of those functions is used depends on how far the boot process of the system has progressed.

Introduce a common allocation routine that selects the allocation function to call dynamically based on the boot progress. This allows moving initialization steps without having to care about changing allocation calls.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
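A sketch of the idea: one helper that picks the allocator according to how far boot has progressed. The helper name and the slab_is_available() boot-stage check are assumptions; the patch's actual routine and stage detection may differ.

    static void * __ref example_alloc_p2m_page(void)
    {
            /* early in boot only the bootmem allocator is usable */
            if (unlikely(!slab_is_available()))
                    return alloc_bootmem_align(PAGE_SIZE, PAGE_SIZE);

            return (void *)__get_free_page(GFP_KERNEL);
    }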
-