1. 13 June 2012 (1 commit)
  2. 06 June 2012 (2 commits)
  3. 27 May 2012 (2 commits)
    • x86: use the new generic strnlen_user() function · 5723aa99
      Committed by Linus Torvalds
      This throws away the old x86-specific functions in favor of the generic
      optimized version.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86: use generic strncpy_from_user routine · 4ae73f2d
      Committed by Linus Torvalds
      The generic strncpy_from_user() is not really optimal, since it is
      designed to work on both little-endian and big-endian.  And on
      little-endian you can simplify much of the logic to find the first zero
      byte, since little-endian arithmetic doesn't have to worry about the
      carry bit propagating into earlier bytes (only later bytes, which we
      don't care about).
      
      But I have patches to make the generic routines use the architecture-
      specific <asm/word-at-a-time.h> infrastructure, so that we can regain
      the little-endian optimizations.  But before we do that, switch over to
      the generic routines to make the patches each do just one well-defined
      thing.  (An illustrative sketch of the little-endian zero-byte trick
      follows this entry.)
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
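      For illustration, here is a minimal user-space sketch of the little-endian
      zero-byte trick the commit refers to. The constants are the classic
      word-at-a-time bit pattern; the helper names are made up for this example
      and are not the kernel's <asm/word-at-a-time.h> API.

       #include <stdint.h>

       /* Bytes of the result have 0x80 set where the input byte was zero.
        * Subtraction borrows only propagate toward higher (later) bytes, so
        * bytes past the first zero may be garbled, but the lowest set 0x80
        * bit still marks the first zero byte - which is all a strncpy-style
        * copy needs. */
       static inline uint64_t has_zero(uint64_t word)
       {
               return (word - 0x0101010101010101ULL) & ~word & 0x8080808080808080ULL;
       }

       /* Index (0..7) of the first zero byte, given a non-zero has_zero() mask. */
       static inline unsigned first_zero_byte(uint64_t mask)
       {
               return __builtin_ctzll(mask) >> 3;
       }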
  4. 29 April 2012 (1 commit)
    • x86: make word-at-a-time strncpy_from_user clear bytes at the end · 07497083
      Committed by Linus Torvalds
      This makes the newly optimized x86 strncpy_from_user clear the final
      bytes in the word past the final NUL character, rather than copy them as
      the word they were in the source.
      
      NOTE! Unlike the silly semantics of the libc 'strncpy()' function, the
      kernel strncpy_from_user() has never cleared all of the end of the
      destination buffer.  And neither does it do so now: it only clears the
      bytes at the end of the last word it copied.
      
      So why make this change at all? It doesn't really cost us anything extra
      (we have to calculate the mask to get the length anyway), and it means
      that *if* any user actually cares about zeroing the whole buffer, they
      can do a "memset()" before the strncpy_from_user(), and we will no
      longer write random bytes after the NUL character.
      
      In particular, the buffer contents will now at no point contain random
      source data from beyond the end of the string.
      
      In other words, it makes behavior a bit more repeatable at no new cost,
      so it's a small cleanup.  I've been carrying this as a patch for the
      last few weeks or so in my tree (done at the same time the sign error
      was fixed in commit 12e993b8), so I might as well commit it.  (A masking
      sketch follows this entry.)
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
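      A sketch of the masking step described above, reusing the has_zero() and
      first_zero_byte() helpers from the earlier sketch. It assumes a
      little-endian machine and is illustrative only, not the kernel code.

       #include <stdint.h>
       #include <string.h>

       /* Copy one 8-byte source word into dst, keeping the bytes up to and
        * including the first NUL and zeroing the bytes after it (instead of
        * copying whatever source bytes happened to follow the NUL).
        * Returns the number of bytes before the NUL. */
       static size_t copy_last_word(char *dst, uint64_t src_word)
       {
               uint64_t zmask = has_zero(src_word);   /* a NUL is assumed present */
               unsigned len = first_zero_byte(zmask);
               uint64_t keep = len ? ~0ULL >> (64 - 8 * len) : 0;

               src_word &= keep;                      /* clear bytes past the NUL */
               memcpy(dst, &src_word, sizeof(src_word));
               return len;
       }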
  5. 21 April 2012 (7 commits)
  6. 16 April 2012 (2 commits)
    • x86: Handle failures of parsing immediate operands in the instruction decoder · 6c7b8e82
      Committed by Masami Hiramatsu
      This can happen if the instruction is much longer than the maximum length,
      or if insn->opnd_bytes is manually changed.
      
      This patch also fixes warnings from the -Wswitch-default flag.  (A generic
      bounds-check sketch follows this entry.)
      Reported-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120413032427.32577.42602.stgit@localhost.localdomain
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
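      The check the commit describes can be illustrated with a generic,
      stand-alone decoder fragment. The struct and function below are invented
      for this sketch; they are not the kernel's struct insn / insn.h API.

       #include <stdbool.h>
       #include <stddef.h>

       struct toy_insn {
               const unsigned char *kaddr;    /* start of the instruction bytes */
               const unsigned char *next;     /* next byte to consume           */
               size_t max_len;                /* longest legal encoding (15 on x86) */
       };

       /* Refuse to read an immediate that would run past the longest legal
        * instruction, instead of silently consuming out-of-bounds bytes. */
       static bool toy_get_immediate(struct toy_insn *insn, unsigned nbytes, long *imm)
       {
               if ((size_t)(insn->next - insn->kaddr) + nbytes > insn->max_len)
                       return false;          /* caller treats the insn as invalid */

               *imm = 0;
               for (unsigned i = 0; i < nbytes; i++)   /* x86 immediates are little-endian */
                       *imm |= (long)insn->next[i] << (8 * i);
               insn->next += nbytes;
               return true;
       }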
    • x86-32: fix up strncpy_from_user() sign error · 12e993b8
      Committed by Linus Torvalds
      The 'max' range needs to be unsigned, since the size of the user address
      space is bigger than 2GB.
      
      We know that 'count' is positive in 'long' (that is checked in the
      caller), so we will truncate 'max' down to something that fits in a
      signed long, but before we actually do that, that comparison needs to be
      done in unsigned.
      
      Bug introduced in commit 92ae03f2 ("x86: merge 32/64-bit versions of
      'strncpy_from_user()' and speed it up").  On x86-64 you can't trigger
      this, since the user address space is much smaller than 63 bits, and on
      x86-32 it works in practice, since you would seldom hit the strncpy
      limits anyway.
      
      While I had actually tested the corner-cases, I had only tested them on
      x86-64.  Besides, I had only worried about the case of a pointer *close*
      to the end of the address space, rather than really far away from it ;)
      
      This also changes the "we hit the user-specified maximum" case to return
      'res', for the trivial reason that gcc seems to generate better code
      that way.  'res' and 'count' are the same in that case, so it really
      doesn't matter which one we return.  (A small demonstration of the sign
      issue follows this entry.)
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
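      The sign issue is easy to demonstrate in isolation. The program below uses
      32-bit types to mirror x86-32's 'long'; the addresses are made up.

       #include <stdint.h>
       #include <stdio.h>

       int main(void)
       {
               uint32_t limit = 0xC0000000u;   /* user address-space limit (3GB split) */
               uint32_t src   = 0x00001000u;   /* string near the bottom of memory     */
               int32_t  count = 64;            /* positive, checked by the caller      */

               int32_t bad_max = (int32_t)(limit - src);   /* > 2GB, so it goes negative */
               if (bad_max > count)                        /* signed compare is false    */
                       bad_max = count;
               printf("buggy max = %d\n", bad_max);        /* still a bogus negative     */

               uint32_t max = limit - src;                 /* keep the range unsigned    */
               if (max > (uint32_t)count)                  /* unsigned compare is true   */
                       max = count;
               printf("fixed max = %u\n", max);            /* 64, as intended            */
               return 0;
       }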
  7. 12 April 2012 (1 commit)
    • x86: merge 32/64-bit versions of 'strncpy_from_user()' and speed it up · 92ae03f2
      Committed by Linus Torvalds
      This merges the 32- and 64-bit versions of the x86 strncpy_from_user()
      by just rewriting it in C rather than the ancient inline asm versions
      that used lodsb/stosb and had been duplicated for (trivial) differences
      between the 32-bit and 64-bit versions.
      
      While doing that, it also speeds them up by doing the accesses a word at
      a time.  Finally, the new routines also properly handle the case of
      hitting the end of the address space, which we have never done correctly
      before (fs/namei.c has a hack around it for that reason).
      
      Despite all these improvements, it actually removes more lines than it
      adds, due to the de-duplication.  Also, we no longer export (or define)
      the legacy __strncpy_from_user() function (that was defined to not do
      the user permission checks), since it's not actually used anywhere, and
      the user address space checks are built in to the new code.
      
      Other architecture maintainers have been notified that the old hack in
      fs/namei.c will be going away in the 3.5 merge window, in case they
      copied the x86 approach of being a bit cavalier about the end of the
      address space.  (A condensed sketch of the word-at-a-time copy loop
      follows this entry.)
      
      Cc: linux-arch@vger.kernel.org
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
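      A condensed user-space sketch of the word-at-a-time copy pattern, reusing
      the has_zero()/first_zero_byte() helpers from the sketch earlier on this
      page. The real kernel routine also deals with user-access faults,
      alignment and the tail of the buffer, all of which are omitted here.

       #include <stddef.h>
       #include <stdint.h>
       #include <string.h>

       /* Copy at most 'max' bytes of the NUL-terminated string 'src' into 'dst',
        * reading the source one 8-byte word at a time.  Returns the string
        * length, or -1 if no NUL was found in the first 'max' bytes (loosely
        * mirroring the strncpy_from_user() return convention). */
       static long copy_string_words(char *dst, const char *src, size_t max)
       {
               size_t done = 0;

               while (max >= sizeof(uint64_t)) {
                       uint64_t w;

                       memcpy(&w, src + done, sizeof(w));   /* one word-sized load */
                       uint64_t zmask = has_zero(w);
                       if (zmask) {
                               unsigned len = first_zero_byte(zmask);
                               memcpy(dst + done, &w, len + 1);   /* include the NUL */
                               return done + len;
                       }
                       memcpy(dst + done, &w, sizeof(w));
                       done += sizeof(w);
                       max  -= sizeof(w);
               }
               return -1;                      /* byte-wise tail handling omitted */
       }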
  8. 20 March 2012 (1 commit)
  9. 10 March 2012 (1 commit)
    • x86: Derandom delay_tsc for 64 bit · a7f4255f
      Committed by Thomas Gleixner
      Commit f0fbf0ab ("x86: integrate delay functions") converted
      delay_tsc() into a random delay generator for 64 bit.  The reason is
      that it merged the mostly identical versions of delay_32.c and
      delay_64.c.  The subtle difference in the result, though, was:
      
       static void delay_tsc(unsigned long loops)
       {
      -	unsigned bclock, now;
      +	unsigned long bclock, now;
      
      Now the function uses rdtscl() which returns the lower 32bit of the
      TSC. On 32bit that's not problematic as unsigned long is 32bit. On 64
      bit this fails when the lower 32bit are close to wrap around when
      bclock is read, because the following check
      
             if ((now - bclock) >= loops)
                     break;
      
      evaluated to true on 64bit for e.g. bclock = 0xffffffff and now = 0
      because the unsigned long (now - bclock) of these values results in
      0xffffffff00000001 which is definitely larger than the loops
      value. That explains Tvrtko's observation:
      
      "Because I am seeing udelay(500) (_occasionally_) being short, and
       that by delaying for some duration between 0us (yep) and 491us."
      
      Make those variables explicitly u32 again, so this works for both 32-bit
      and 64-bit.  (A standalone demonstration follows this entry.)
      Reported-by: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org # >= 2.6.27
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
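      The arithmetic point in stand-alone form: with u32 variables the
      subtraction wraps correctly across a roll-over of the TSC low word,
      while widening both values to a 64-bit unsigned long does not. The
      values below are illustrative.

       #include <stdint.h>
       #include <stdio.h>

       int main(void)
       {
               uint32_t bclock32 = 0xFFFFFFFFu;   /* TSC low word just before wrap */
               uint32_t now32    = 0x00000005u;   /* read a few cycles later       */
               uint64_t loops    = 1000;

               /* u32 arithmetic wraps, giving the true elapsed count of 6. */
               printf("u32 delta = %u\n", now32 - bclock32);

               /* The same values in 64-bit variables give a huge delta, so the
                * "(now - bclock) >= loops" test fires almost immediately. */
               uint64_t delta64 = (uint64_t)now32 - (uint64_t)bclock32;
               printf("u64 delta >= loops? %s\n", delta64 >= loops ? "yes" : "no");
               return 0;
       }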
  10. 11 February 2012 (1 commit)
  11. 27 January 2012 (2 commits)
  12. 26 January 2012 (1 commit)
  13. 21 January 2012 (2 commits)
    • x86: atomic64 assembly improvements · cb8095bb
      Committed by Jan Beulich
      In the "xchg" implementation, %ebx and %ecx don't need to be copied
      into %eax and %edx respectively (this is only necessary when desiring
      to only read the stored value).
      
      In the "add_unless" implementation, swapping the use of %ecx and %esi
      for passing arguments allows %esi to become an input only (i.e.
      permitting the register to be re-used to address the same object
      without reload).
      
      In "{add,sub}_return", doing the initial read64 through the passed in
      %ecx decreases a register dependency.
      
      In "inc_not_zero", a branch can be eliminated by or-ing together the
      two halves of the current (64-bit) value, and code size can be further
      reduced by adjusting the arithmetic slightly.  (A plain-C illustration of
      the zero test follows this entry.)
      
      v2: Undo the folding of "xchg" and "set".
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/4F19A2BC020000780006E0DC@nat28.tlf.novell.com
      Cc: Luca Barbieri <luca@luca-barbieri.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
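      The branch elimination mentioned for "inc_not_zero" boils down to testing
      a 64-bit value for zero by or-ing its two 32-bit halves. A plain-C
      illustration (no atomicity implied, unlike the real atomic64 assembly):

       #include <stdbool.h>
       #include <stdint.h>

       /* One test instead of two compare-and-branch pairs: the 64-bit value is
        * zero exactly when the OR of its halves is zero. */
       static inline bool is_zero64(uint32_t lo, uint32_t hi)
       {
               return (lo | hi) == 0;
       }

       /* Sketch of the inc_not_zero semantics on top of it. */
       static bool inc_not_zero64(uint32_t *lo, uint32_t *hi)
       {
               if (is_zero64(*lo, *hi))
                       return false;
               if (++*lo == 0)         /* carry into the high half */
                       ++*hi;
               return true;
       }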
    • x86: Adjust asm constraints in atomic64 wrappers · 819165fb
      Committed by Jan Beulich
      Eric pointed out overly restrictive constraints in atomic64_set(), but
      there are issues throughout the file. In the cited case, %ebx and %ecx
      are inputs only (don't get changed by either of the two low level
      implementations). This was also the case elsewhere.
      
      Further in many cases early-clobber indicators were missing.
      
      Finally, the previous implementation rolled a custom alternative
      instruction macro from scratch, rather than using alternative_call()
      (which was introduced with the commit that the description of the
      change in question actually refers to). Adjusting has the benefit of
      not hiding referenced symbols from the compiler, which however requires
      them to be declared not just in the exporting source file (which, as a
      desirable side effect, in turn allows that exporting file to become a
      real 5-line stub).
      
      This patch does not eliminate the overly restrictive memory clobbers,
      however: Doing so would occasionally make the compiler set up a second
      register for accessing the memory object (to satisfy the added "m"
      constraint), and it's not clear which of the two non-optimal
      alternatives is better.
      
      v2: Re-do the declaration and exporting of the internal symbols.
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/4F19A2A5020000780006E0D9@nat28.tlf.novell.com
      Cc: Luca Barbieri <luca@luca-barbieri.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  14. 18 January 2012 (1 commit)
  15. 16 January 2012 (1 commit)
  16. 06 January 2012 (1 commit)
  17. 13 December 2011 (1 commit)
  18. 05 December 2011 (2 commits)
  19. 10 October 2011 (1 commit)
  20. 27 July 2011 (1 commit)
  21. 22 July 2011 (1 commit)
  22. 21 July 2011 (3 commits)
  23. 14 July 2011 (1 commit)
  24. 04 June 2011 (1 commit)
  25. 18 May 2011 (2 commits)
    • x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit · 26afb7c6
      Committed by Jiri Olsa
      As reported in BZ #30352:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=30352
      
      there's a kernel bug related to reading the last allowed page on x86_64.
      
      The _copy_to_user() and _copy_from_user() functions use the following
      check for address limit:
      
        if (buf + size >= limit)
      	fail();
      
      while it should be more permissive:
      
        if (buf + size > limit)
      	fail();
      
      That's because 'size' represents the number of bytes being
      read/written from/to the buf address, including the buf address itself.
      So the copy function will never actually touch the limit
      address, even if "buf + size == limit".
      
      Following program fails to use the last page as buffer
      due to the wrong limit check:
      
       #include <stdio.h>
       #include <sys/mman.h>
       #include <sys/socket.h>
       #include <assert.h>
      
       #define PAGE_SIZE       (4096)
       #define LAST_PAGE       ((void*)(0x7fffffffe000))
      
       int main()
       {
              int fds[2], err;
              void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
                                MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
              assert(ptr == LAST_PAGE);
              err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
              assert(err == 0);
              err = send(fds[0], ptr, PAGE_SIZE, 0);
              perror("send");
              assert(err == PAGE_SIZE);
              err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
              perror("recv");
              assert(err == PAGE_SIZE);
              return 0;
       }
      
      The other place checking the addr limit is the access_ok() function,
      which is working properly. There's just a misleading comment
      for the __range_not_ok() macro - which this patch fixes as well.
      
      The last page of the user-space address range is a guard page, and
      Brian Gerst observed that the guard page itself exists because of an erratum
      on K8 cpus (#121 Sequential Execution Across Non-Canonical Boundary Causes
      Processor Hang).
      
      However, the test code is using the last valid page before the guard page.
      The bug is that the last byte before the guard page can't be read
      because of the off-by-one error. The guard page is left in place.
      
      This bug would normally not show up because the last page is
      part of the process stack and never accessed via syscalls.  (An
      overflow-safe form of the corrected check follows this entry.)
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Brian Gerst <brgerst@gmail.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
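      The corrected limit check in isolation: a transfer of 'size' bytes
      starting at 'buf' touches the range [buf, buf + size), so "buf + size ==
      limit" is fine and only "buf + size > limit" must fail. The helper below
      is illustrative and written in an overflow-safe form; it is not the
      kernel's access_ok().

       #include <stdbool.h>
       #include <stddef.h>
       #include <stdint.h>

       /* True when the 'size' bytes starting at 'buf' stay below 'limit'. */
       static bool range_ok(uintptr_t buf, size_t size, uintptr_t limit)
       {
               /* Equivalent to "buf + size <= limit", without the risk of
                * the addition wrapping around. */
               return size <= limit && buf <= limit - size;
       }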
    • x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB · 2f19e06a
      Committed by Fenghua Yu
      Support memset() with enhanced rep stosb. On processors supporting enhanced
      REP MOVSB/STOSB, the alternative memset_c_e function using enhanced rep stosb
      overrides the fast-string alternative memset_c and the original function.
      (An inline-asm sketch follows this entry.)
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Link: http://lkml.kernel.org/r/1305671358-14478-10-git-send-email-fenghua.yu@intel.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
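      For illustration, a user-space inline-asm sketch of a "rep stosb" based
      memset in the spirit of the memset_c_e alternative described above. This
      is not the kernel's memset_64.S, and it ignores the alternatives patching
      machinery that selects between implementations at boot.

       #include <stddef.h>

       /* Fill n bytes at dst with the byte value c using "rep stosb"
        * (x86, GNU C inline assembly). On CPUs that advertise enhanced
        * REP MOVSB/STOSB this is competitive with unrolled word stores. */
       static void *memset_stosb(void *dst, int c, size_t n)
       {
               void *d = dst;

               asm volatile("rep stosb"
                            : "+D" (d), "+c" (n)   /* destination and count  */
                            : "a" (c)              /* AL holds the fill byte */
                            : "memory");
               return dst;
       }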