- 08 4月, 2014 18 次提交
-
-
由 Catalin Marinas 提交于
If the buffer needing cache invalidation for inbound DMA does start or end on a cache line aligned address, we need to use the non-destructive clean&invalidate operation. This issue was introduced by commit 7363590d (arm64: Implement coherent DMA API based on swiotlb). Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Reported-by: NJon Medhurst (Tixy) <tixy@linaro.org>
-
由 Mark Salter 提交于
Add support for early IO or memory mappings which are needed before the normal ioremap() is usable. This also adds fixmap support for permanent fixed mappings such as that used by the earlyprintk device register region. Signed-off-by: NMark Salter <msalter@redhat.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mark Salter 提交于
Presently, paging_init() calls init_mem_pgprot() to initialize pgprot values used by macros such as PAGE_KERNEL, PAGE_KERNEL_EXEC, etc. The new fixmap and early_ioremap support also needs to use these macros before paging_init() is called. This patch moves the init_mem_pgprot() call out of paging_init() and into setup_arch() so that pgprot_default gets initialized in time for fixmap and early_ioremap. Signed-off-by: NMark Salter <msalter@redhat.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mark Salter 提交于
Move x86 over to the generic early ioremap implementation. Signed-off-by: NMark Salter <msalter@redhat.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dave Young 提交于
This patch series takes the common bits from the x86 early ioremap implementation and creates a generic implementation which may be used by other architectures. The early ioremap interfaces are intended for situations where boot code needs to make temporary virtual mappings before the normal ioremap interfaces are available. Typically, this means before paging_init() has run. This patch (of 6): There's a lot of sparse warnings for code like below: void *a = early_memremap(phys_addr, size); early_memremap intend to map kernel memory with ioremap facility, the return pointer should be a kernel ram pointer instead of iomem one. For making the function clearer and supressing sparse warnings this patch do below two things: 1. cast to (__force void *) for the return value of early_memremap 2. add early_memunmap function and pass (__force void __iomem *) to iounmap From Boris: "Ingo told me yesterday, it makes sense too. I'd guess we can try it. FWIW, all callers of early_memremap use the memory they get remapped as normal memory so we should be safe" Signed-off-by: NDave Young <dyoung@redhat.com> Signed-off-by: NMark Salter <msalter@redhat.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
The kernel has never been audited to ensure that this_cpu operations are consistently used throughout the kernel. The code generated in many places can be improved through the use of this_cpu operations (which uses a segment register for relocation of per cpu offsets instead of performing address calculations). The patch set also addresses various consistency issues in general with the per cpu macros. A. The semantics of __this_cpu_ptr() differs from this_cpu_ptr only because checks are skipped. This is typically shown through a raw_ prefix. So this patch set changes the places where __this_cpu_ptr() is used to raw_cpu_ptr(). B. There has been the long term wish by some that __this_cpu operations would check for preemption. However, there are cases where preemption checks need to be skipped. This patch set adds raw_cpu operations that do not check for preemption and then adds preemption checks to the __this_cpu operations. C. The use of __get_cpu_var is always a reference to a percpu variable that can also be handled via a this_cpu operation. This patch set replaces all uses of __get_cpu_var with this_cpu operations. D. We can then use this_cpu RMW operations in various places replacing sequences of instructions by a single one. E. The use of this_cpu operations throughout will allow other arches than x86 to implement optimized references and RMV operations to work with per cpu local data. F. The use of this_cpu operations opens up the possibility to further optimize code that relies on synchronization through per cpu data. The patch set works in a couple of stages: I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr(). Also converts the existing __this_cpu_xx_# primitive in the x86 code to raw_cpu_xx_#. II. Patch 2-4 use the raw_cpu operations in places that would give us false positives once they are enabled. III. Patch 5 adds preemption checks to __this_cpu operations to allow checking if preemption is properly disabled when these functions are used. IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var with this_cpu_ptr. They do not depend on any changes to the percpu code. No preemption tests are skipped if they are applied. V. Patches 21-46 are conversion patches that use this_cpu operations in various kernel subsystems/drivers or arch code. VI. Patches 47/48 (not included in this series) remove no longer used functions (__this_cpu_ptr and __get_cpu_var). These should only be applied after all the conversion patches have made it and after we have done additional passes through the kernel to ensure that none of the uses of these functions remain. This patch (of 46): The patches following this one will add preemption checks to __this_cpu ops so we need to have an alternative way to use this_cpu operations without preemption checks. raw_cpu_ops will be the basis for all other ops since these will be the operations that do not implement any checks. Primitive operations are renamed by this patch from __this_cpu_xxx to raw_cpu_xxxx. Also change the uses of the x86 percpu primitives in preempt.h. These depend directly on asm/percpu.h (header #include nesting issue). Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NChristoph Lameter <cl@linux.com> Acked-by: NIngo Molnar <mingo@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Alex Shi <alex.shi@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bryan Wu <cooloney@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: David Daney <david.daney@cavium.com> Cc: David Miller <davem@davemloft.net> Cc: David S. Miller <davem@davemloft.net> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Hedi Berriche <hedi@sgi.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: John Stultz <john.stultz@linaro.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mike Travis <travis@sgi.com> Cc: Neil Brown <neilb@suse.de> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Robert Richter <rric@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
arch_align_stack() moved to asm/exec.h, so change the comment referring to asm/system.h which no longer exists. Signed-off-by: NDavid Howells <dhowells@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Uwe Kleine-König 提交于
If the renamed symbol is defined lib/iomap.c implements ioport_map and ioport_unmap and currently (nearly) all platforms define the port accessor functions outb/inb and friend unconditionally. So HAS_IOPORT_MAP is the better name for this. Consequently NO_IOPORT is renamed to NO_IOPORT_MAP. The motivation for this change is to reintroduce a symbol HAS_IOPORT that signals if outb/int et al are available. I will address that at least one merge window later though to keep surprises to a minimum and catch new introductions of (HAS|NO)_IOPORT. The changes in this commit were done using: $ git grep -l -E '(NO|HAS)_IOPORT' | xargs perl -p -i -e 's/\b((?:CONFIG_)?(?:NO|HAS)_IOPORT)\b/$1_MAP/' Signed-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Josh Triplett 提交于
This ensures that BUG() always has a definition that causes a trap (via an undefined instruction), and that the compiler still recognizes the code following BUG() as unreachable, avoiding warnings that would otherwise appear (such as on non-void functions that don't return a value after BUG()). In addition to saving a few bytes over the generic infinite-loop implementation, this implementation traps rather than looping, which potentially allows for better error-recovery behavior (such as by rebooting). Signed-off-by: NJosh Triplett <josh@joshtriplett.org> Reported-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NArnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Josh Triplett 提交于
Fix breakage which will be exposed by the patch "kconfig: make allnoconfig disable options behind EMBEDDED and EXPERT". arch/powerpc/kernel/mce.c, compiled in for PPC_BOOK3S_64, calls functions only built when IRQ_WORK, so select it. Fixes the following build error: arch/powerpc/kernel/built-in.o: In function `.machine_check_queue_event': (.text+0x11260): undefined reference to `.irq_work_queue' Signed-off-by: NJosh Triplett <josh@joshtriplett.org> Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Josh Triplett 提交于
Fix breakage which will be exposed by the patch "kconfig: make allnoconfig disable options behind EMBEDDED and EXPERT". arch/ia64/kernel/unaligned.c uses tty_write_message to print an unaligned access exception to the TTY of the current user process. Enable TTY to prevent a build error. Minimal fix, on the basis that few people on ia64 will care deeply about kernel size enough to turn off TTY. Ideally, I'd instead suggest dropping the tty_write_message entirely, and just leaving the printk. Bonus: no need to sprintf first. Signed-off-by: NJosh Triplett <josh@joshtriplett.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Geert Uytterhoeven 提交于
Fix breakage which will be exposed by the patch "kconfig: make allnoconfig disable options behind EMBEDDED and EXPERT". Now allnoconfig started disabling CONFIG_PROC_FS: arch/cris/kernel/built-in.o:(.rodata+0xc): undefined reference to `show_cpuinfo' make: *** [vmlinux] Error 1 Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Josh Triplett 提交于
Fix breakage which will be exposed by the patch "kconfig: make allnoconfig disable options behind EMBEDDED and EXPERT". arch/cris/arch-v10/kernel/debugport.c, compiled in unconditionally with ETRAX_ARCH_V10, requires TTY, so select TTY to avoid a build failure. Signed-off-by: NJosh Triplett <josh@joshtriplett.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexandre Bounine 提交于
This patch removes an artificial RapidIO bus root device and establishes actual device hierarchy by providing reference to real parent devices. It also introduces device class for RapidIO controller devices (on-chip or an eternal bridge, known as "mport"). Existing implementation was sufficient for SoC-based platforms that have a single RapidIO controller. With introduction of devices using multiple RapidIO controllers and PCIe-to-RapidIO bridges the old scheme is very limiting or does not work at all. The implemented changes allow to properly reference platform's local RapidIO mport devices and provide device details needed for upper layers. This change to RapidIO device hierarchy does not break any known existing kernel or user space interfaces. Signed-off-by: NAlexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com> Cc: Stef van Os <stef.van.os@prodrive-technologies.com> Cc: Jerry Jacobs <jerry.jacobs@prodrive-technologies.com> Cc: Arno Tiemersma <arno.tiemersma@prodrive-technologies.com> Cc: Rob Landley <rob@landley.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rashika Kheria 提交于
Eliminate the following warning in proc/vmcore.c: fs/proc/vmcore.c:1088:6: warning: no previous prototype for `vmcore_cleanup' [-Wmissing-prototypes] [akpm@linux-foundation.org: clean up powerpc, remove unneeded EXPORT_SYMBOL] Signed-off-by: NRashika Kheria <rashika.kheria@gmail.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davidlohr Bueso 提交于
This patch is a continuation of efforts trying to optimize find_vma(), avoiding potentially expensive rbtree walks to locate a vma upon faults. The original approach (https://lkml.org/lkml/2013/11/1/410), where the largest vma was also cached, ended up being too specific and random, thus further comparison with other approaches were needed. There are two things to consider when dealing with this, the cache hit rate and the latency of find_vma(). Improving the hit-rate does not necessarily translate in finding the vma any faster, as the overhead of any fancy caching schemes can be too high to consider. We currently cache the last used vma for the whole address space, which provides a nice optimization, reducing the total cycles in find_vma() by up to 250%, for workloads with good locality. On the other hand, this simple scheme is pretty much useless for workloads with poor locality. Analyzing ebizzy runs shows that, no matter how many threads are running, the mmap_cache hit rate is less than 2%, and in many situations below 1%. The proposed approach is to replace this scheme with a small per-thread cache, maximizing hit rates at a very low maintenance cost. Invalidations are performed by simply bumping up a 32-bit sequence number. The only expensive operation is in the rare case of a seq number overflow, where all caches that share the same address space are flushed. Upon a miss, the proposed replacement policy is based on the page number that contains the virtual address in question. Concretely, the following results are seen on an 80 core, 8 socket x86-64 box: 1) System bootup: Most programs are single threaded, so the per-thread scheme does improve ~50% hit rate by just adding a few more slots to the cache. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 50.61% | 19.90 | | patched | 73.45% | 13.58 | +----------------+----------+------------------+ 2) Kernel build: This one is already pretty good with the current approach as we're dealing with good locality. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 75.28% | 11.03 | | patched | 88.09% | 9.31 | +----------------+----------+------------------+ 3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload. +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 70.66% | 17.14 | | patched | 91.15% | 12.57 | +----------------+----------+------------------+ 4) Ebizzy: There's a fair amount of variation from run to run, but this approach always shows nearly perfect hit rates, while baseline is just about non-existent. The amounts of cycles can fluctuate between anywhere from ~60 to ~116 for the baseline scheme, but this approach reduces it considerably. For instance, with 80 threads: +----------------+----------+------------------+ | caching scheme | hit-rate | cycles (billion) | +----------------+----------+------------------+ | baseline | 1.06% | 91.54 | | patched | 99.97% | 14.18 | +----------------+----------+------------------+ [akpm@linux-foundation.org: fix nommu build, per Davidlohr] [akpm@linux-foundation.org: document vmacache_valid() logic] [akpm@linux-foundation.org: attempt to untangle header files] [akpm@linux-foundation.org: add vmacache_find() BUG_ON] [hughd@google.com: add vmacache_valid_mm() (from Oleg)] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: adjust and enhance comments] Signed-off-by: NDavidlohr Bueso <davidlohr@hp.com> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NMichel Lespinasse <walken@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Tested-by: NHugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alex Thorlton 提交于
The main motivation behind this patch is to provide a way to disable THP for jobs where the code cannot be modified, and using a malloc hook with madvise is not an option (i.e. statically allocated data). This patch allows us to do just that, without affecting other jobs running on the system. We need to do this sort of thing for jobs where THP hurts performance, due to the possibility of increased remote memory accesses that can be created by situations such as the following: When you touch 1 byte of an untouched, contiguous 2MB chunk, a THP will be handed out, and the THP will be stuck on whatever node the chunk was originally referenced from. If many remote nodes need to do work on that same chunk, they'll be making remote accesses. With THP disabled, 4K pages can be handed out to separate nodes as they're needed, greatly reducing the amount of remote accesses to memory. This patch is based on some of my work combined with some suggestions/patches given by Oleg Nesterov. The main goal here is to add a prctl switch to allow us to disable to THP on a per mm_struct basis. Here's a bit of test data with the new patch in place... First with the flag unset: # perf stat -a ./prctl_wrapper_mmv3 0 ./thp_pthread -C 0 -m 0 -c 512 -b 256g Setting thp_disabled for this task... thp_disable: 0 Set thp_disabled state to 0 Process pid = 18027 PF/ MAX MIN TOTCPU/ TOT_PF/ TOT_PF/ WSEC/ TYPE: CPUS WALL WALL SYS USER TOTCPU CPU WALL_SEC SYS_SEC CPU NODES 512 1.120 0.060 0.000 0.110 0.110 0.000 28571428864 -9223372036854775808 55803572 23 Performance counter stats for './prctl_wrapper_mmv3_hack 0 ./thp_pthread -C 0 -m 0 -c 512 -b 256g': 273719072.841402 task-clock # 641.026 CPUs utilized [100.00%] 1,008,986 context-switches # 0.000 M/sec [100.00%] 7,717 CPU-migrations # 0.000 M/sec [100.00%] 1,698,932 page-faults # 0.000 M/sec 355,222,544,890,379 cycles # 1.298 GHz [100.00%] 536,445,412,234,588 stalled-cycles-frontend # 151.02% frontend cycles idle [100.00%] 409,110,531,310,223 stalled-cycles-backend # 115.17% backend cycles idle [100.00%] 148,286,797,266,411 instructions # 0.42 insns per cycle # 3.62 stalled cycles per insn [100.00%] 27,061,793,159,503 branches # 98.867 M/sec [100.00%] 1,188,655,196 branch-misses # 0.00% of all branches 427.001706337 seconds time elapsed Now with the flag set: # perf stat -a ./prctl_wrapper_mmv3 1 ./thp_pthread -C 0 -m 0 -c 512 -b 256g Setting thp_disabled for this task... thp_disable: 1 Set thp_disabled state to 1 Process pid = 144957 PF/ MAX MIN TOTCPU/ TOT_PF/ TOT_PF/ WSEC/ TYPE: CPUS WALL WALL SYS USER TOTCPU CPU WALL_SEC SYS_SEC CPU NODES 512 0.620 0.260 0.250 0.320 0.570 0.001 51612901376 128000000000 100806448 23 Performance counter stats for './prctl_wrapper_mmv3_hack 1 ./thp_pthread -C 0 -m 0 -c 512 -b 256g': 138789390.540183 task-clock # 641.959 CPUs utilized [100.00%] 534,205 context-switches # 0.000 M/sec [100.00%] 4,595 CPU-migrations # 0.000 M/sec [100.00%] 63,133,119 page-faults # 0.000 M/sec 147,977,747,269,768 cycles # 1.066 GHz [100.00%] 200,524,196,493,108 stalled-cycles-frontend # 135.51% frontend cycles idle [100.00%] 105,175,163,716,388 stalled-cycles-backend # 71.07% backend cycles idle [100.00%] 180,916,213,503,160 instructions # 1.22 insns per cycle # 1.11 stalled cycles per insn [100.00%] 26,999,511,005,868 branches # 194.536 M/sec [100.00%] 714,066,351 branch-misses # 0.00% of all branches 216.196778807 seconds time elapsed As with previous versions of the patch, We're getting about a 2x performance increase here. Here's a link to the test case I used, along with the little wrapper to activate the flag: http://oss.sgi.com/projects/memtests/thp_pthread_mmprctlv3.tar.gz This patch (of 3): Revert commit 8e72033f and add in code to fix up any issues caused by the revert. The revert is necessary because hugepage_madvise would return -EINVAL when VM_NOHUGEPAGE is set, which will break subsequent chunks of this patch set. Here's a snip of an e-mail from Gerald detailing the original purpose of this code, and providing justification for the revert: "The intent of commit 8e72033f was to guard against any future programming errors that may result in an madvice(MADV_HUGEPAGE) on guest mappings, which would crash the kernel. Martin suggested adding the bit to arch/s390/mm/pgtable.c, if 8e72033f was to be reverted, because that check will also prevent a kernel crash in the case described above, it will now send a SIGSEGV instead. This would now also allow to do the madvise on other parts, if needed, so it is a more flexible approach. One could also say that it would have been better to do it this way right from the beginning..." Signed-off-by: NAlex Thorlton <athorlton@sgi.com> Suggested-by: NOleg Nesterov <oleg@redhat.com> Tested-by: NChristian Borntraeger <borntraeger@de.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Laura Abbott 提交于
The Kconfig for CONFIG_STRICT_DEVMEM is missing despite being used in mmap.c. Add it. Signed-off-by: NLaura Abbott <lauraa@codeaurora.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 07 4月, 2014 1 次提交
-
-
由 Mark Salter 提交于
Recent arm64 builds using CONFIG_ARM64_64K_PAGES are failing with: arch/arm64/kernel/perf_regs.c: In function ‘perf_reg_abi’: arch/arm64/kernel/perf_regs.c:41:2: error: implicit declaration of function ‘is_compat_thread’ arch/arm64/kernel/perf_event.c:1398:2: error: unknown type name ‘compat_uptr_t’ This is due to some recent arm64 perf commits with compat support: commit 23c7d70d: ARM64: perf: add support for frame pointer unwinding in compat mode commit 2ee0d7fd: ARM64: perf: add support for perf registers API Those patches make the arm64 kernel unbuildable if CONFIG_COMPAT is not defined and CONFIG_ARM64_64K_PAGES depends on !CONFIG_COMPAT. This patch allows the arm64 kernel to build with and without CONFIG_COMPAT. Signed-off-by: NMark Salter <msalter@redhat.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 05 4月, 2014 21 次提交
-
-
由 Vineet Gupta 提交于
Despite the switch to right UART driver (prev patch), serial console still doesn't work due to missing CONFIG_SERIAL_OF_PLATFORM Also fix the default cmdline in DT to not refer to out-of-tree ARC framebuffer driver for console. Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: <stable@vger.kernel.org> #3.10, 3.12, 3.13, 3.14 Cc: Francois Bedard <Francois.Bedard@synopsys.com>
-
由 Mischa Jonker 提交于
The Synopsys APB DW UART has a couple of special features that are not in the System C model. In 3.8, the 8250_dw driver didn't really use these features, but from 3.9 onwards, the 8250_dw driver has become incompatible with our model. Signed-off-by: NMischa Jonker <mjonker@synopsys.com> Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Cc: <stable@vger.kernel.org> #3.10, 3.12, 3.13, 3.14 Cc: Francois Bedard <Francois.Bedard@synopsys.com>
-
由 Vineet Gupta 提交于
-Pass the expected arg to non-boot park'ing routine (It worked so far because existing SMP backends don't use the arg) -CONFIG_DEBUG_PREEMPT warning
-
由 Catalin Marinas 提交于
This reverts commit 82b2f495. The __boot_cpu_mode variable is flushed in head.S after being written, therefore the additional cache flushing is no longer required. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Catalin Marinas 提交于
With system caches for the host OS or architected caches for guest OS we cannot easily guarantee that there are no dirty or stale cache lines for the areas of memory written by the kernel during boot with the MMU off (therefore non-cacheable accesses). This patch adds the necessary cache maintenance during boot and relaxes the booting requirements. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Richard Kuo 提交于
Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Richard Kuo 提交于
The SP/r29 macro wasn't used anywhere else and was causing conflicts with another module, so just remove it. Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Richard Kuo 提交于
Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Richard Kuo 提交于
Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Richard Kuo 提交于
Normal writes in our our architecture don't invalidate lock reservations. Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Richard Kuo 提交于
Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Ilia Mirkin 提交于
swapper_pg_dir is an array of pgd_t, not pgd_t*. This has no actual effect since sizeof(pgd_t) == sizeof(pgd_t*), but unconfuses tools that check types. Signed-off-by: NIlia Mirkin <imirkin@alum.mit.edu> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Jiang Liu 提交于
Commit 9a46ad6d "smp: make smp_call_function_many() use logic similar to smp_call_function_single()" has unified the way to handle single and multiple cross-CPU function calls. Now only one intterupt is needed for architecture specific code to support generic SMP function call interfaces, so kill the redundant single function call interrupt. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Shaohua Li <shli@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jiri Kosina <trivial@kernel.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-hexagon@vger.kernel.org Signed-off-by: NJiang Liu <liuj97@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
Need dumy mmiowb(), or can not pass compiling, the related error with allmodconfig: CC [M] drivers/mmc/host/sdhci.o drivers/mmc/host/sdhci.c: In function 'sdhci_request': drivers/mmc/host/sdhci.c:1409:2: error: implicit declaration of function 'mmiowb' [-Werror=implicit-function-declaration] Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
Need export all related functions and symbols for various modules with allmodconfig. The related errors: MODPOST 2879 modules ERROR: "__vmyield" [sound/sound_firmware.ko] undefined! ERROR: "__phys_offset" [sound/drivers/snd-dummy.ko] undefined! ERROR: "ioremap_nocache" [drivers/char/ipmi/ipmi_si.ko] undefined! ERROR: "__iounmap" [drivers/char/ipmi/ipmi_si.ko] undefined! ... For including files, need "linux/*" first, then "asm/*". All related included files and symbols need be sorted by alphabetical order. Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
arch: hexagon: kernel: reset.c: use function pointer instead of function for pm_power_off and export it 'pm_power_off' is a function pointer, not a function, so need change its type, and also need export it, or can not pass compiling with allmodconfig. The related error: MODPOST 2879 modules ERROR: "pm_power_off" [drivers/char/ipmi/ipmi_poweroff.ko] undefined! Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
Need include generic "vga.h", or can not pass compiling with allmodconfig, the related error: CC [M] drivers/gpu/drm/drm_irq.o In file included from include/linux/vgaarb.h:34:0, from drivers/gpu/drm/drm_irq.c:42: include/video/vga.h:22:21: fatal error: asm/vga.h: No such file or directory Also move "preempt.h" upper to match sort order. Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
Add "serial.h" in Kbuild, or can not pass compiling with allmodconfig, the related error: CC [M] drivers/staging/speakup/speakup_acntpc.o In file included from drivers/staging/speakup/speakup_acntpc.c:33:0: drivers/staging/speakup/serialio.h:7:24: fatal error: asm/serial.h: No such file or directory Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
Define dummy '__init' instead of include "linux/init.h" if !__KERNEL__, or can not pass checking. The related error (with allmodconfig under hexagon): CHECK include/asm (34 files) usr/include/asm/setup.h:22: included file 'linux/init.h' is not exported Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
Append "hvmc_" or "hvmi_" to all related enum members (which are too common to make conflict with another sub-systems). The related error with allmodconfig: CC [M] drivers/md/raid1.o drivers/md/raid1.c:1440:13: error: 'status' redeclared as different kind of symbol arch/hexagon/include/asm/hexagon_vm.h:76:2: note: previous definition of 'status' was here Also use 'affinity' instead of 'locdis' for __vmintop_affinity(). Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-
由 Chen Gang 提交于
arch: hexagon: Kconfig: add HAVE_DMA_ATTR in Kconfig and remove "linux/dma-mapping.h" from "asm/dma-mapping.h" When HAS_DMA, and also need use generic implementation, HAVE_DMA_ATTR must be enabled, or can not pass compiling with allmodconfig, the related error: CC [M] drivers/ata/libata-core.o drivers/ata/libata-core.c: In function 'ata_sg_clean': drivers/ata/libata-core.c:4598:3: error: implicit declaration of function 'dma_unmap_sg' [-Werror=implicit-function-declaration] drivers/ata/libata-core.c: In function 'ata_sg_setup': drivers/ata/libata-core.c:4708:2: error: implicit declaration of function 'dma_map_sg' [-Werror=implicit-function-declaration] "linux/dma-mapping.h" will include "asm/dma-mapping.h", so need remove "linux/dma-mapping.h" from "asm/dma-mapping.h", Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: NRichard Kuo <rkuo@codeaurora.org>
-