- 30 1月, 2016 1 次提交
-
-
由 Borislav Petkov 提交于
Move them to a separate header and have the following dependency: x86/cpufeatures.h <- x86/processor.h <- x86/cpufeature.h This makes it easier to use the header in asm code and not include the whole cpufeature.h and add guards for asm. Suggested-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1453842730-28463-5-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 02 6月, 2015 1 次提交
-
-
由 Ingo Molnar 提交于
So the dwarf2 annotations in low level assembly code have become an increasing hindrance: unreadable, messy macros mixed into some of the most security sensitive code paths of the Linux kernel. These debug info annotations don't even buy the upstream kernel anything: dwarf driven stack unwinding has caused problems in the past so it's out of tree, and the upstream kernel only uses the much more robust framepointers based stack unwinding method. In addition to that there's a steady, slow bitrot going on with these annotations, requiring frequent fixups. There's no tooling and no functionality upstream that keeps it correct. So burn down the sick forest, allowing new, healthier growth: 27 files changed, 350 insertions(+), 1101 deletions(-) Someone who has the willingness and time to do this properly can attempt to reintroduce dwarf debuginfo in x86 assembly code plus dwarf unwinding from first principles, with the following conditions: - it should be maximally readable, and maximally low-key to 'ordinary' code reading and maintenance. - find a build time method to insert dwarf annotations automatically in the most common cases, for pop/push instructions that manipulate the stack pointer. This could be done for example via a preprocessing step that just looks for common patterns - plus special annotations for the few cases where we want to depart from the default. We have hundreds of CFI annotations, so automating most of that makes sense. - it should come with build tooling checks that ensure that CFI annotations are sensible. We've seen such efforts from the framepointer side, and there's no reason it couldn't be done on the dwarf side. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 23 2月, 2015 2 次提交
-
-
由 Borislav Petkov 提交于
Make REP_GOOD variant the default after alternatives have run. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Borislav Petkov 提交于
Up until now we have always paid attention to make sure the length of the new instruction replacing the old one is at least less or equal to the length of the old instruction. If the new instruction is longer, at the time it replaces the old instruction it will overwrite the beginning of the next instruction in the kernel image and cause your pants to catch fire. So instead of having to pay attention, teach the alternatives framework to pad shorter old instructions with NOPs at buildtime - but only in the case when len(old instruction(s)) < len(new instruction(s)) and add nothing in the >= case. (In that case we do add_nops() when patching). This way the alternatives user shouldn't have to care about instruction sizes and simply use the macros. Add asm ALTERNATIVE* flavor macros too, while at it. Also, we need to save the pad length in a separate struct alt_instr member for NOP optimization and the way to do that reliably is to carry the pad length instead of trying to detect whether we're looking at single-byte NOPs or at pathological instruction offsets like e9 90 90 90 90, for example, which is a valid instruction. Thanks to Michael Matz for the great help with toolchain questions. Signed-off-by: NBorislav Petkov <bp@suse.de>
-
- 14 2月, 2015 1 次提交
-
-
由 Andrey Ryabinin 提交于
Recently instrumentation of builtin functions calls was removed from GCC 5.0. To check the memory accessed by such functions, userspace asan always uses interceptors for them. So now we should do this as well. This patch declares memset/memmove/memcpy as weak symbols. In mm/kasan/kasan.c we have our own implementation of those functions which checks memory before accessing it. Default memset/memmove/memcpy now now always have aliases with '__' prefix. For files that built without kasan instrumentation (e.g. mm/slub.c) original mem* replaced (via #define) with prefixed variants, cause we don't want to check memory accesses there. Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com> Cc: Yuri Gribov <tetra2005@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 4月, 2013 1 次提交
-
-
由 Andy Shevchenko 提交于
As suggested by Peter Anvin. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: H . Peter Anvin <hpa@linux.intel.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 1月, 2012 2 次提交
-
-
由 Jan Beulich 提交于
While hard to measure, reducing the number of possibly/likely mis-predicted branches can generally be expected to be slightly better. Other than apparent at the first glance, this also doesn't grow the function size (the alignment gap to the next function just gets smaller). Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/4F218584020000780006F422@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jan Beulich 提交于
While currently there doesn't appear to be any reachable in-tree case where such large memory blocks may be passed to memcpy(), we already had hit the problem in our Xen kernels. Just like done recently for mmeset(), rather than working around it, prevent others from falling into the same trap by fixing this long standing limitation. Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/4F21846F020000780006F3FA@nat28.tlf.novell.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 5月, 2011 1 次提交
-
-
由 Fenghua Yu 提交于
Support memcpy() with enhanced rep movsb. On processors supporting enhanced rep movsb, the alternative memcpy() function using enhanced rep movsb overrides the original function and the fast string function. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1305671358-14478-8-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 02 5月, 2011 1 次提交
-
-
由 Bart Van Assche 提交于
Signed-off-by: NBart Van Assche <bvanassche@acm.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/r/201105011409.21629.bvanassche@acm.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 24 8月, 2010 1 次提交
-
-
由 Ma Ling 提交于
All read operations after allocation stage can run speculatively, all write operation will run in program order, and if addresses are different read may run before older write operation, otherwise wait until write commit. However CPU don't check each address bit, so read could fail to recognize different address even they are in different page.For example if rsi is 0xf004, rdi is 0xe008, in following operation there will generate big performance latency. 1. movq (%rsi), %rax 2. movq %rax, (%rdi) 3. movq 8(%rsi), %rax 4. movq %rax, 8(%rdi) If %rsi and rdi were in really the same meory page, there are TRUE read-after-write dependence because instruction 2 write 0x008 and instruction 3 read 0x00c, the two address are overlap partially. Actually there are in different page and no any issues, but without checking each address bit CPU could think they are in the same page, and instruction 3 have to wait for instruction 2 to write data into cache from write buffer, then load data from cache, the cost time read spent is equal to mfence instruction. We may avoid it by tuning operation sequence as follow. 1. movq 8(%rsi), %rax 2. movq %rax, 8(%rdi) 3. movq (%rsi), %rax 4. movq %rax, (%rdi) Instruction 3 read 0x004, instruction 2 write address 0x010, no any dependence. At last on Core2 we gain 1.83x speedup compared with original instruction sequence. In this patch we first handle small size(less 20bytes), then jump to different copy mode. Based on our micro-benchmark small bytes from 1 to 127 bytes, we got up to 2X improvement, and up to 1.5X improvement for 1024 bytes on Corei7. (We use our micro-benchmark, and will do further test according to your requirment) Signed-off-by: NMa Ling <ling.ma@intel.com> LKML-Reference: <1277753065-18610-1-git-send-email-ling.ma@intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 08 7月, 2010 1 次提交
-
-
由 H. Peter Anvin 提交于
We already have cpufeature indicies above 255, so use a 16-bit number for the alternatives index. This consumes a padding field and so doesn't add any size, but it means that abusing the padding field to create assembly errors on overflow no longer works. We can retain the test simply by redirecting it to the .discard section, however. [ v3: updated to include open-coded locations ] Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <tip-f88731e3068f9d1392ba71cc9f50f035d26a0d4f@git.kernel.org> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 30 12月, 2009 1 次提交
-
-
由 Jan Beulich 提交于
In order to avoid unnecessary chains of branches, rather than implementing memcpy()/memset()'s access to their alternative implementations via a jump, patch the (larger) original function directly. The memcpy() part of this is slightly subtle: while alternative instruction patching does itself use memcpy(), with the replacement block being less than 64-bytes in size the main loop of the original function doesn't get used for copying memcpy_c() over memcpy(), and hence we can safely write over its beginning. Also note that the CFI annotations are fine for both variants of each of the functions. Signed-off-by: NJan Beulich <jbeulich@novell.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB8D30200007800026AF2@vpn.id2.novell.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 3月, 2009 2 次提交
-
-
由 Ingo Molnar 提交于
Impact: cleanup Make this file more readable by bringing it more in line with the usual kernel style. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Jan Beulich 提交于
Impact: micro-optimization This should slightly improve its performance. Signed-off-by: NJan Beulich <jbeulich@novell.com> LKML-Reference: <49B8F641.76E4.0078.0@novell.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 10月, 2007 2 次提交
-
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 8月, 2007 1 次提交
-
-
由 Petr Vandrovec 提交于
apply_alternatives uses memcpy() to apply alternatives. Which has the unfortunate effect that while applying memcpy alternative to memcpy itself it tries to overwrite itself with nops - which causes #UD fault as it overwrites half of an instruction in copy loop, and from this point on only possible outcome is triplefault and reboot. So let's overwrite only first two instructions of memcpy - as long as the main memcpy loop is not in first two bytes it will work fine. Signed-off-by: NPetr Vandrovec <petr@vandrovec.name> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 10月, 2006 1 次提交
-
-
由 Dave Jones 提交于
kbuild explicitly includes this at build time. Signed-off-by: NDave Jones <davej@redhat.com>
-
- 26 9月, 2006 1 次提交
-
-
由 Jan Beulich 提交于
Add unwind annotations to arch/x86_64/lib/*.S, and also use the macros provided by linux/linkage.h where-ever possible. Some of the alternative instructions handling needed to be adjusted so that the replacement code would also have valid unwind information. Signed-off-by: NJan Beulich <jbeulich@novell.com> Signed-off-by: NAndi Kleen <ak@suse.de>
-
- 05 2月, 2006 1 次提交
-
-
由 Andi Kleen 提交于
They cause quite bad performance regressions on Netburst This is temporary until we can get new optimized functions for these CPUs. This undoes changes that were done in 2.6.15 and in 2.6.16-rc1, essentially bringing the code back to 2.6.14 level. Only change is I renamed the X86_FEATURE_K8_C flag to X86_FEATURE_REP_GOOD and fixed the check for the flag and also fixed some comments. Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 15 11月, 2005 1 次提交
-
-
由 Andi Kleen 提交于
B stepping were the first shipping Opterons. memcpy/memset/copy_page/ clear_page had special optimized version for them. These are really old and in the minority now and the difference to the generic versions (using rep microcode) is not that big anyways. So just remove them. TODO: figure out optimized versions for Intel Netburst based EM64T Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 17 4月, 2005 1 次提交
-
-
由 Linus Torvalds 提交于
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
-