- 13 June 2012, 1 commit
-
-
Committed by Stephane Eranian

I noticed that the LBR fixups were not working anymore on programs where they used to. I tracked this down to a recent change to copy_from_user_nmi(): db0dc75d ("perf/x86: Check user address explicitly in copy_from_user_nmi()").

This commit added a call to __range_not_ok() to the copy_from_user_nmi() routine. The problem is that the logic of the test must be reversed: __range_not_ok() returns 0 if the range is VALID, while we want to return early from copy_from_user_nmi() if the range is NOT valid.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arun Sharma <asharma@fb.com>
Link: http://lkml.kernel.org/r/20120611134426.GA7542@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
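For illustration, a minimal hedged sketch of the corrected guard (not the verbatim kernel code; the exact signature of __range_not_ok() has varied across kernel versions):

	/* __range_not_ok() returns 0 when the range is VALID, so the
	 * early return must trigger on a non-zero (NOT ok) result. */
	if (__range_not_ok(from, n))
		return 0;	/* nothing copied: invalid user range */
	/* ... the page-table based copy proceeds only for valid ranges ... */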
-
- 06 June 2012, 2 commits
-
-
Committed by Arun Sharma

Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-5-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Masami Hiramatsu

Fix the x86 instruction decoder to decode bsr/bsf/jmpe with the operand-size prefix (66h). This fixes the test case failure reported by Linus, attached below.

bsf/bsr/jmpe have a special encoding. The opcode map in the Intel Software Developer's Manual vol. 2 says they have TZCNT/LZCNT variants with the F3h prefix. However, it gives no information about what happens with the 66h or F2h prefixes. The current instruction decoder assumes those are bad instructions, but the hardware actually accepts at least the operand-size prefix.

H. Peter Anvin further explains:

" TZCNT/LZCNT are F3 + BSF/BSR exactly because the F2 and F3 prefixes have historically been no-ops with most instructions. This allows software to unconditionally use the prefixed versions and get TZCNT/LZCNT on the processors that have them if they don't care about the difference. "

This fixes errors reported by test_get_len:

Warning: arch/x86/tools/test_get_len found difference at <em_bsf>:ffffffff81036d87
Warning: ffffffff81036de5: 66 0f bc c2 bsf %dx,%ax
Warning: objdump says 4 bytes, but insn_get_length() says 3
Warning: arch/x86/tools/test_get_len found difference at <em_bsr>:ffffffff81036ea6
Warning: ffffffff81036f04: 66 0f bd c2 bsr %dx,%ax
Warning: objdump says 4 bytes, but insn_get_length() says 3
Warning: decoded and checked 13298882 instructions with 2 warnings

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <yrl.pp-manager.tt@hitachi.com>
Link: http://lkml.kernel.org/r/20120604150911.22338.43296.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 27 May 2012, 2 commits
-
-
Committed by Linus Torvalds

This throws away the old x86-specific functions in favor of the generic optimized version.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Linus Torvalds

The generic strncpy_from_user() is not really optimal, since it is designed to work on both little-endian and big-endian. On little-endian you can simplify much of the logic to find the first zero byte, since little-endian arithmetic doesn't have to worry about the carry bit propagating into earlier bytes (only later bytes, which we don't care about).

I have patches to make the generic routines use the architecture-specific <asm/word-at-a-time.h> infrastructure, so that we can regain the little-endian optimizations. But before we do that, switch over to the generic routines to make the patches each do just one well-defined thing.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
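A hedged sketch of the little-endian zero-byte test this alludes to, using the classic word-at-a-time magic constants (64-bit shown):

	/* Non-zero iff 'word' contains a NUL byte. Borrows from the
	 * subtraction propagate only toward higher (later) bytes on a
	 * little-endian machine, which is why the earlier bytes need
	 * no carry handling. */
	static inline unsigned long has_zero(unsigned long word)
	{
		return (word - 0x0101010101010101ul) & ~word &
			0x8080808080808080ul;
	}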
-
- 29 April 2012, 1 commit
-
-
Committed by Linus Torvalds

This makes the newly optimized x86 strncpy_from_user() clear the final bytes in the word past the final NUL character, rather than copy them as the word they were in the source.

NOTE! Unlike the silly semantics of the libc 'strncpy()' function, the kernel strncpy_from_user() has never cleared all of the end of the destination buffer. And neither does it do so now: it only clears the bytes at the end of the last word it copied.

So why make this change at all? It doesn't really cost us anything extra (we have to calculate the mask to get the length anyway), and it means that *if* any user actually cares about zeroing the whole buffer, they can do a memset() before the strncpy_from_user(), and we will no longer write random bytes after the NUL character. In particular, the buffer contents will now at no point contain random source data from beyond the end of the string.

In other words, it makes behavior a bit more repeatable at no new cost, so it's a small cleanup. I've been carrying this as a patch for the last few weeks or so in my tree (done at the same time the sign error was fixed in commit 12e993b8), so I might as well commit it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
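As a hedged sketch of the masking idea (hypothetical variable names, not the exact kernel code): once the loaded word 'c' containing the NUL has been found, keep only the bytes up to and including the NUL:

	unsigned long zbits = (c - 0x0101010101010101ul) & ~c &
			       0x8080808080808080ul; /* 0x80 marks zero bytes */
	c &= zbits ^ (zbits - 1);	/* ones up through the first NUL's
					 * marker bit: preceding bytes survive,
					 * trailing bytes are cleared to zero */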
-
- 21 April 2012, 7 commits
-
-
Committed by H. Peter Anvin

Remove open-coded exception table entries in arch/x86/lib/usercopy_32.c, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
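For illustration, a hedged before/after sketch of one such conversion, written as inline assembly in C ('err' and 'uaddr' are hypothetical; the .S files in this series use the macro the same way):

	/* Before: open-coded exception table entry */
	asm volatile("1:	movl %1,%0\n"
		     "2:\n"
		     ".section __ex_table,\"a\"\n"
		     _ASM_ALIGN "\n"
		     _ASM_PTR " 1b,2b\n"
		     ".previous"
		     : "=r" (err) : "m" (*uaddr));

	/* After: the same fixup via the macro, so the table format can
	 * later be changed in a single place */
	asm volatile("1:	movl %1,%0\n"
		     "2:\n"
		     _ASM_EXTABLE(1b, 2b)
		     : "=r" (err) : "m" (*uaddr));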
-
Committed by H. Peter Anvin

Remove open-coded exception table entries in arch/x86/lib/putuser.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
-
Committed by H. Peter Anvin

Remove open-coded exception table entries in arch/x86/lib/getuser.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
-
Committed by H. Peter Anvin

Remove open-coded exception table entries in arch/x86/lib/csum-copy_64.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
-
Committed by H. Peter Anvin

Remove open-coded exception table entries in arch/x86/lib/copy_user_nocache_64.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
-
Committed by H. Peter Anvin

Remove open-coded exception table entries in arch/x86/lib/copy_user_64.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
-
Committed by H. Peter Anvin

Remove open-coded exception table entries in arch/x86/lib/checksum_32.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
-
- 16 April 2012, 2 commits
-
-
Committed by Masami Hiramatsu

This can happen if the instruction is much longer than the maximum length, or if insn->opnd_bytes is manually changed. This patch also fixes warnings from the -Wswitch-default flag.

Reported-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
Cc: Linux-mm <linux-mm@kvack.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Anton Arapov <anton@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120413032427.32577.42602.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Linus Torvalds

The 'max' range needs to be unsigned, since the size of the user address space is bigger than 2GB. We know that 'count' is positive in 'long' (that is checked in the caller), so we will truncate 'max' down to something that fits in a signed long; but before we actually do that, the comparison needs to be done in unsigned.

Bug introduced in commit 92ae03f2 ("x86: merge 32/64-bit versions of 'strncpy_from_user()' and speed it up"). On x86-64 you can't trigger this, since the user address space is much smaller than 63 bits, and on x86-32 it works in practice, since you would seldom hit the strncpy limits anyway.

I had actually tested the corner-cases, but only on x86-64. Besides, I had only worried about the case of a pointer *close* to the end of the address space, rather than really far away from it ;)

This also changes the "we hit the user-specified maximum" case to return 'res', for the trivial reason that gcc seems to generate better code that way. 'res' and 'count' are the same in that case, so it really doesn't matter which one we return.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
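A hedged sketch of the fixed sequence (close to, but not necessarily verbatim, lib/strncpy_from_user.c of that era):

	unsigned long max_addr = user_addr_max();	/* > 2GB on 32-bit */
	unsigned long src_addr = (unsigned long)src;

	if (likely(src_addr < max_addr)) {
		unsigned long max = max_addr - src_addr; /* unsigned compare
							    done first */
		if (max > count)	/* only now truncate toward 'count',
					   which fits in a signed long */
			max = count;
		/* ... word-at-a-time copy limited by 'max' ... */
	}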
-
- 12 April 2012, 1 commit
-
-
Committed by Linus Torvalds

This merges the 32- and 64-bit versions of the x86 strncpy_from_user() by just rewriting it in C rather than the ancient inline asm versions that used lodsb/stosb and had been duplicated for (trivial) differences between the 32-bit and 64-bit versions.

While doing that, it also speeds them up by doing the accesses a word at a time. Finally, the new routines also properly handle the case of hitting the end of the address space, which we have never done correctly before (fs/namei.c has a hack around it for that reason).

Despite all these improvements, it actually removes more lines than it adds, due to the de-duplication. Also, we no longer export (or define) the legacy __strncpy_from_user() function (that was defined to not do the user permission checks), since it's not actually used anywhere, and the user address space checks are built in to the new code.

Other architecture maintainers have been notified that the old hack in fs/namei.c will be going away in the 3.5 merge window, in case they copied the x86 approach of being a bit cavalier about the end of the address space.

Cc: linux-arch@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 20 March 2012, 1 commit
-
-
Committed by Cong Wang

Acked-by: Avi Kivity <avi@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Cong Wang <amwang@redhat.com>
-
- 10 March 2012, 1 commit
-
-
Committed by Thomas Gleixner

Commit f0fbf0ab ("x86: integrate delay functions") converted delay_tsc() into a random delay generator for 64 bit. The reason is that it merged the mostly identical versions of delay_32.c and delay_64.c. The subtle difference in the result was:

 static void delay_tsc(unsigned long loops)
 {
-	unsigned bclock, now;
+	unsigned long bclock, now;

Now the function uses rdtscl(), which returns the lower 32 bits of the TSC. On 32 bit that's not problematic, as unsigned long is 32 bits. On 64 bit this fails when the lower 32 bits are close to wrapping around when bclock is read, because the following check

	if ((now - bclock) >= loops)
		break;

evaluates to true on 64 bit for e.g. bclock = 0xffffffff and now = 0, because the unsigned long (now - bclock) of these values results in 0xffffffff00000001, which is definitely larger than the loops value.

That explains Tvrtko's observation: "Because I am seeing udelay(500) (_occasionally_) being short, and that by delaying for some duration between 0us (yep) and 491us."

Make those variables explicitly u32 again, so this works for both 32 and 64 bit.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # >= 2.6.27
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
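A hedged demonstration of the wraparound in plain C:

	u32 bclock = 0xffffffff, now = 0;  /* TSC low word just wrapped */
	/* In u32 arithmetic: now - bclock == 1, so the loop exits on time.
	 * In 64-bit unsigned long arithmetic: now - bclock ==
	 * 0xffffffff00000001, which is >= any plausible 'loops' value, so
	 * the delay loop exits almost immediately -- the short udelay(). */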
-
- 11 February 2012, 1 commit
-
-
Committed by Masami Hiramatsu

Fix the decoding of grouped AVX instructions with VEX pp bits, which should be handled the same way as last-prefixes. This fixes the below warnings in posttest with CONFIG_CRYPTO_SHA1_SSSE3=y:

Warning: arch/x86/tools/test_get_len found difference at <sha1_transform_avx>:ffffffff810d5fc0
Warning: ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0
Warning: objdump says 5 bytes, but insn_get_length() says 4
...

With this change, test_get_len can decode it correctly:

$ arch/x86/tools/test_get_len -v -y
ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0
Succeed: decoded and checked 1 instructions

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20120210053340.30429.73410.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 27 January 2012, 2 commits
-
-
Committed by Jan Beulich

While hard to measure, reducing the number of possibly/likely mis-predicted branches can generally be expected to be slightly beneficial. Contrary to what a first glance might suggest, this also doesn't grow the function size (the alignment gap to the next function just gets smaller).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F218584020000780006F422@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Jan Beulich

While currently there doesn't appear to be any reachable in-tree case where such large memory blocks may be passed to memcpy(), we have already hit the problem in our Xen kernels. Just as was done recently for memset(), rather than working around it, prevent others from falling into the same trap by fixing this long-standing limitation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F21846F020000780006F3FA@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 26 January 2012, 1 commit
-
-
Committed by Jan Beulich

While currently there doesn't appear to be any reachable in-tree case where such large memory blocks may be passed to memset() (alloc_bootmem() being the primary non-reachable one, as it gets called with suitably large sizes in FLATMEM configurations), we have recently hit the problem a second time in our Xen kernels. Rather than working around it a second time, prevent others from falling into the same trap by fixing this long-standing limitation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F05D992020000780006AA09@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
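A hedged illustration of how such a size limitation typically bites (the actual code path is in the x86-64 memset assembly; this just shows the truncation the "trap" amounts to):

	size_t n = 0x100000001ul;	/* 4 GiB + 1 byte */
	unsigned int n32 = n;		/* a count squeezed through a 32-bit
					 * register becomes 1 */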
-
- 21 January 2012, 2 commits
-
-
Committed by Jan Beulich

In the "xchg" implementation, %ebx and %ecx don't need to be copied into %eax and %edx respectively (this is only necessary when desiring to only read the stored value).

In the "add_unless" implementation, swapping the use of %ecx and %esi for passing arguments allows %esi to become an input only (i.e. permitting the register to be re-used to address the same object without reload).

In "{add,sub}_return", doing the initial read64 through the passed-in %ecx decreases a register dependency.

In "inc_not_zero", a branch can be eliminated by or-ing together the two halves of the current (64-bit) value, and code size can be further reduced by adjusting the arithmetic slightly.

v2: Undo the folding of "xchg" and "set".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/4F19A2BC020000780006E0DC@nat28.tlf.novell.com
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
Committed by Jan Beulich

Eric pointed out overly restrictive constraints in atomic64_set(), but there are issues throughout the file. In the cited case, %ebx and %ecx are inputs only (they don't get changed by either of the two low-level implementations). This was also the case elsewhere. Further, in many cases early-clobber indicators were missing.

Finally, the previous implementation rolled a custom alternative-instruction macro from scratch, rather than using alternative_call() (which was introduced with the commit that the description of the change in question actually refers to). Adjusting this has the benefit of not hiding referenced symbols from the compiler, which however requires them to be declared not just in the exporting source file (which, as a desirable side effect, in turn allows that exporting file to become a real 5-line stub).

This patch does not eliminate the overly restrictive memory clobbers, however: doing so would occasionally make the compiler set up a second register for accessing the memory object (to satisfy the added "m" constraint), and it's not clear which of the two non-optimal alternatives is better.

v2: Re-do the declaration and exporting of the internal symbols.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/4F19A2A5020000780006E0D9@nat28.tlf.novell.com
Cc: Luca Barbieri <luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 18 January 2012, 1 commit
-
-
Committed by Ulrich Drepper

The Intel documentation at http://software.intel.com/file/36945 shows the ANDN opcode and Group 17 with encodings f2 and f3 respectively. The current version of x86-opcode-map.txt shows them with f3 and f4. Unless someone can point to documentation which justifies the currently used encodings, the following patch should be applied.

Signed-off-by: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/CAOPLpQdq5SuVo9=023CYhbFLAX9rONyjmYq7jJkqc5xwctW5eA@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 16 January 2012, 1 commit
-
-
Committed by Ulrich Drepper

The arch/x86/lib/x86-opcode-map.txt file [used by the kprobes instruction decoder] contains the line:

	af: SCAS/W/D/Q rAX,Xv

This is what the Intel manuals show, but it's not correct. The 'X' stands for:

	Memory addressed by the DS:rSI register pair (for example, MOVS, CMPS, OUTS, or LODS).

On the other hand, 'Y' means (also see the ae byte entry for SCASB):

	Memory addressed by the ES:rDI register pair (for example, MOVS, CMPS, INS, STOS, or SCAS).

Signed-off-by: Ulrich Drepper <drepper@gmail.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/CAOPLpQfytPyDEBF1Hbkpo7ovUerEsstVGxBr%3DEpDL-BKEMaqLA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
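Assuming the fix is the straightforward operand-class swap described above, the corrected map line would read:

	af: SCAS/W/D/Q rAX,Yv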
-
- 06 January 2012, 1 commit
-
-
Committed by Jan Beulich

%r13 got saved and restored without ever getting touched, so there's no need to do so.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F05D9F9020000780006AA0D@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 13 December 2011, 1 commit
-
-
Committed by Alexey Dobriyan

The current i386 strlen() hardcodes a NOT/DEC sequence. DEC is mentioned to be suboptimal on Core2. So, put only the REPNE SCASB sequence in assembly; the compiler can do the rest. The difference in generated code is like below (MCORE2=y):

 <strlen>:
 	push   %edi
 	mov    $0xffffffff,%ecx
 	mov    %eax,%edi
 	xor    %eax,%eax
 	repnz scas %es:(%edi),%al
 	not    %ecx
-	dec    %ecx
-	mov    %ecx,%eax
+	lea    -0x1(%ecx),%eax
 	pop    %edi
 	ret

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Beulich <JBeulich@suse.com>
Link: http://lkml.kernel.org/r/20111211181319.GA17097@p183.telecom.by
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 05 December 2011, 2 commits
-
-
Committed by Masami Hiramatsu

Since the new Intel Software Developer's Manual introduces a new format for the AVX instruction set (including AVX2), update x86-opcode-map.txt to fit those changes.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20111205120557.15475.13236.stgit@cloud
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Masami Hiramatsu

To reduce the memory usage of the attribute tables, the x86 instruction decoder puts the "Group" attribute only in the "no-last-prefix" attribute table (the same as the vex_p == 0 case). Thus, the decoder should look at the no-last-prefix table first, and then, only if the entry is not a group, move on to the "with-last-prefix" table (vex_p != 0).

However, the current implementation, inat_get_avx_attribute(), looks up the with-last-prefix table directly. So, when decoding a grouped AVX instruction, the decoder fails to find the correct group because there is no "Group" attribute in that table. This ends up with the mis-decoding of instructions, as Ingo reported in http://thread.gmane.org/gmane.linux.kernel/1214103

This patch fixes it to check the no-last-prefix table first even if it is an AVX instruction, and to get an attribute from the "with-last-prefix" table only if the entry is not a group.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20111205120539.15475.91428.stgit@cloud
Signed-off-by: Ingo Molnar <mingo@elte.hu>
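A hedged sketch of the corrected lookup order (simplified; table names are illustrative, not the decoder's exact ones):

	/* First consult the no-last-prefix table, where Group attributes live */
	attr = table_no_lpfx[opcode];
	if (!inat_is_group(attr) && vex_p) {
		/* Not a group: only then is the with-last-prefix table
		 * (vex_p != 0) authoritative for this opcode */
		attr = table_lpfx[vex_p][opcode];
	}
	return attr;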
-
- 10 October 2011, 1 commit
-
-
Committed by Masami Hiramatsu

Harden the x86 insn decoder against invalid-length instructions. This adds a length check at each byte-read site; if the read would exceed MAX_INSN_SIZE, the decoder returns immediately. This can happen when decoding a user-space binary. Callers can check whether it happened by testing whether the insn.*.got members are set.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: acme@redhat.com
Cc: ming.m.lin@intel.com
Cc: robert.richter@amd.com
Cc: ravitillo@lbl.gov
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20111007133155.10933.58577.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
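A hedged sketch of the check-before-read pattern such hardening adds (names are modeled on the decoder's style but not guaranteed verbatim):

	/* Refuse to read past MAX_INSN_SIZE from the buffer start */
	#define validate_next(t, insn, n) \
		((insn)->next_byte + sizeof(t) + n - (insn)->kaddr <= MAX_INSN_SIZE)

	#define get_next(t, insn)					\
	({								\
		if (unlikely(!validate_next(t, insn, 0)))		\
			goto err_out;	/* leaves insn.*.got unset */	\
		__get_next(t, insn);					\
	})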
-
- 27 July 2011, 1 commit
-
-
Committed by Arun Sharma

This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h>.

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 22 July 2011, 1 commit
-
-
Committed by Robert Richter

copy_from_user_nmi() is used in oprofile and perf. Move it next to the other user-copy library functions like copy_from_user(). As this is x86 code shared between 32 and 64 bit, create a new file usercopy.c for the unified code.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110607172413.GJ20052@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 21 July 2011, 3 commits
-
-
Committed by Jan Beulich

With the write-lock path simply subtracting RW_LOCK_BIAS, there is, on large systems, the theoretical possibility of overflowing the 32-bit value that was used so far (namely if 128 or more CPUs manage to do the subtraction, but don't get to do the inverse addition in the failure path quickly enough).

A first measure is to modify RW_LOCK_BIAS itself: with the new value chosen, it is good for up to 2048 CPUs, each allowed to nest over 2048 times on the read path, without causing an issue. Quite possibly it would even be sufficient to adjust the bias a little further, assuming that allowing for significantly less nesting would suffice.

However, as the original value chosen allowed for even more nesting levels, to support more than 2048 CPUs (possible currently only for 64-bit kernels) the lock itself gets widened to 64 bits.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4E258E0D020000780004E3F0@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
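As a hedged back-of-the-envelope check on the 128-CPU figure, assuming the historical x86 RW_LOCK_BIAS of 0x01000000: 128 concurrent writers subtract 128 * 0x01000000 = 0x80000000, which already leaves the range a signed 32-bit counter can represent.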
-
Committed by Jan Beulich

Rather than having two functionally identical implementations for 32- and 64-bit configurations, use the previously extended assembly abstractions to fold the two rwsem implementations into a shared one.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4E258DF3020000780004E3ED@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Jan Beulich

Rather than having two functionally identical implementations for 32- and 64-bit configurations, extend the existing assembly abstractions enough to fold the two rwlock implementations into a shared one.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4E258DD7020000780004E3EA@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 14 July 2011, 1 commit
-
-
Committed by Andy Lutomirski

This saves a few bytes on x86-64 and means that future patches can apply alternatives to unrelocated code.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/ff64a6b9a1a3860ca4a7b8b6dc7b4754f9491cd7.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 04 June 2011, 1 commit
-
-
Committed by Borislav Petkov

Drop the thunk_ra macro in favor of an additional argument to the thunk macro, since their bodies are almost identical. Do a whitespace scrubbing and use CFI-aware macros for full annotation.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1306873314-32523-5-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
- 18 May 2011, 2 commits
-
-
Committed by Jiri Olsa

As reported in BZ #30352:

https://bugzilla.kernel.org/show_bug.cgi?id=30352

there's a kernel bug related to reading the last allowed page on x86_64. The _copy_to_user() and _copy_from_user() functions use the following check for the address limit:

	if (buf + size >= limit)
		fail();

while it should be more permissive:

	if (buf + size > limit)
		fail();

That's because the size represents the number of bytes being read/written from/to the buf address AND including the buf address itself. So the copy function will actually never touch the limit address, even if "buf + size == limit".

The following program fails to use the last page as a buffer due to the wrong limit check:

#include <sys/mman.h>
#include <sys/socket.h>
#include <assert.h>

#define PAGE_SIZE	(4096)
#define LAST_PAGE	((void*)(0x7fffffffe000))

int main()
{
	int fds[2], err;
	void *ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);

	assert(ptr == LAST_PAGE);
	err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
	assert(err == 0);
	err = send(fds[0], ptr, PAGE_SIZE, 0);
	perror("send");
	assert(err == PAGE_SIZE);
	err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
	perror("recv");
	assert(err == PAGE_SIZE);
	return 0;
}

The other place checking the addr limit is the access_ok() function, which is working properly. There's just a misleading comment for the __range_not_ok() macro, which this patch fixes as well.

The last page of the user-space address range is a guard page, and Brian Gerst observed that the guard page itself is due to an erratum on K8 CPUs (#121: Sequential Execution Across Non-Canonical Boundary Causes Processor Hang). However, the test code is using the last valid page before the guard page. The bug is that the last byte before the guard page can't be read because of the off-by-one error. The guard page is left in place.

This bug would normally not show up, because the last page is part of the process stack and never accessed via syscalls.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Fenghua Yu

Support memset() with enhanced rep stosb. On processors supporting enhanced REP MOVSB/STOSB, the alternative memset_c_e function using enhanced rep stosb overrides the fast-string alternative memset_c and the original function.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-10-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
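A hedged sketch of what an enhanced-REP-STOSB memset core looks like as inline assembly (the actual kernel version lives in memset_64.S and is selected via the alternatives mechanism; the function name here is illustrative):

	static inline void *memset_erms(void *dst, int c, size_t n)
	{
		void *ret = dst;

		/* AL holds the fill byte, RDI the destination, RCX the count */
		asm volatile("rep stosb"
			     : "+D" (dst), "+c" (n)
			     : "a" (c)
			     : "memory");
		return ret;
	}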
-