1. 21 Jan, 2012 1 commit
    • J
      x86: Adjust asm constraints in atomic64 wrappers · 819165fb
      Jan Beulich committed
      Eric pointed out overly restrictive constraints in atomic64_set(), but
      there are issues throughout the file. In the cited case, %ebx and %ecx
      are inputs only (don't get changed by either of the two low level
      implementations). This was also the case elsewhere.
      
      Further in many cases early-clobber indicators were missing.
      
      Finally, the previous implementation rolled a custom alternative
      instruction macro from scratch, rather than using alternative_call()
      (which was introduced with the commit that the description of the
      change in question actually refers to). Adjusting has the benefit of
      not hiding referenced symbols from the compiler, which however requires
      them to be declared not just in the exporting source file (which, as a
      desirable side effect, in turn allows that exporting file to become a
      real 5-line stub).
      
      This patch does not eliminate the overly restrictive memory clobbers,
      however: Doing so would occasionally make the compiler set up a second
      register for accessing the memory object (to satisfy the added "m"
      constraint), and it's not clear which of the two non-optimal
      alternatives is better.
      
      v2: Re-do the declaration and exporting of the internal symbols.
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/4F19A2A5020000780006E0D9@nat28.tlf.novell.com
      Cc: Luca Barbieri <luca@luca-barbieri.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
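The two constraint issues named in this commit (inputs declared as modified, and missing early-clobber markers) can be illustrated with a minimal sketch. It is not taken from the kernel sources: `sketch_add_twice()` is a hypothetical helper, and the non-x86 branch exists only so the example builds everywhere.

```c
#include <assert.h>

/* Hypothetical helper, not kernel code: returns a + b + b. */
static inline unsigned sketch_add_twice(unsigned a, unsigned b)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned out;
    /*
     * 'a' and 'b' are only read, so they are plain inputs ("r"),
     * not read-write operands ("+r").
     * 'out' is written by the first instruction while 'b' is still
     * needed by the later ones, so it is marked early-clobber ("=&r").
     * Without the '&' the compiler may assign 'out' and 'b' to the
     * same register, and the first movl would destroy 'b'.
     */
    __asm__("movl %1, %0\n\t"
            "addl %2, %0\n\t"
            "addl %2, %0"
            : "=&r" (out)
            : "r" (a), "r" (b)
            : "cc");
    return out;
#else
    /* Portable fallback so the sketch compiles on other architectures. */
    return a + b + b;
#endif
}

/* Usage example. */
static void sketch_demo(void)
{
    assert(sketch_add_twice(1, 2) == 5);
}
```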
  2. 27 Jul, 2011 1 commit
  3. 26 Feb, 2010 1 commit
    • L
      x86-32: Rewrite 32-bit atomic64 functions in assembly · a7e926ab
      Luca Barbieri committed
      This patch replaces atomic64_32.c with two assembly implementations,
      one for 386/486 machines using pushf/cli/popf and one for 586+ machines
      using cmpxchg8b.
      
      The cmpxchg8b implementation provides the following advantages over the
      current one:
      
      1. Implements atomic64_add_unless, atomic64_dec_if_positive and
         atomic64_inc_not_zero
      
      2. Uses the ZF flag changed by cmpxchg8b instead of doing a comparison
      
      3. Uses custom register calling conventions that reduce or eliminate
         register moves to suit cmpxchg8b
      
      4. Reads the initial value instead of using cmpxchg8b to do that.
         Currently we use lock xaddl and movl, which seems the fastest.
      
      5. Does not use the lock prefix for atomic64_set
         64-bit writes are already atomic, so we don't need that.
         We still need it for atomic64_read to avoid restoring a value
         changed in the meantime.
      
      6. Allocates registers as well or better than gcc
      
      The 386 implementation provides support for 386 and 486 machines.
      386/486 SMP is not supported (we dropped it), but such support can be
      added easily if desired.
      
      A pure assembly implementation is required due to the custom calling
      conventions, and desire to use %ebp in atomic64_add_return (we need
      7 registers...), as well as the ability to use pushf/popf in the 386
      code without an intermediate pop/push.
      
      The parameter names are changed to match the convention in atomic_64.h
      
      Changes in v3 (due to rebasing to tip/x86/asm):
      - Patches atomic64_32.h instead of atomic_32.h
      - Uses the CALL alternative mechanism from commit
        1b1d9258
      
      Changes in v2:
      - Merged 386 and cx8 support in the same patch
      - 386 support now done in assembly, C code no longer used at all
      - cmpxchg64 is used for atomic64_cmpxchg
      - stop using macros, use one-line inline functions instead
      - miscellaneous changes and improvements
      Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
      LKML-Reference: <1267005265-27958-5-git-send-email-luca@luca-barbieri.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
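The conditional operations listed in item 1 follow a compare-and-swap retry pattern. The sketch below shows that pattern in portable C11 rather than in the assembly this commit adds; the `sketch_` names and the struct are assumptions made for illustration, and unsigned arithmetic is used so that wraparound is well defined.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's atomic64_t. */
typedef struct { _Atomic uint64_t counter; } sketch_atomic64_t;

/*
 * Add 'a' to *v unless *v currently equals 'u'.
 * Returns true if the addition was performed.
 */
static bool sketch_atomic64_add_unless(sketch_atomic64_t *v,
                                       uint64_t a, uint64_t u)
{
    uint64_t old = atomic_load(&v->counter);  /* initial read, no CAS */

    do {
        if (old == u)
            return false;                     /* forbidden value: give up */
        /* A failed CAS refreshes 'old' with the value now in memory. */
    } while (!atomic_compare_exchange_strong(&v->counter, &old, old + a));

    return true;
}

/* Increment *v unless it is zero. */
static bool sketch_atomic64_inc_not_zero(sketch_atomic64_t *v)
{
    return sketch_atomic64_add_unless(v, 1, 0);
}

/* Usage example. */
static void sketch_demo(void)
{
    sketch_atomic64_t v;

    atomic_init(&v.counter, 5);
    assert(sketch_atomic64_inc_not_zero(&v));
    assert(atomic_load(&v.counter) == 6);
}
```

The success or failure result of the C11 call plays the role that the ZF flag plays after cmpxchg8b (item 2): no separate comparison of old and current values is needed.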
  4. 04 Jul, 2009 4 commits
    • E
      x86: atomic64: Inline atomic64_read() again · a79f0da8
      Eric Dumazet committed
      Now that atomic64_read() is lightweight (no register pressure and
      small icache footprint), we can inline it again.
      
      Also use the "=&A" constraint instead of "+A" to avoid a warning
      about an uninitialized 'res' variable. (gcc had to force 0 in eax/edx)
      
        $ size vmlinux.prev vmlinux.after
           text    data     bss     dec     hex filename
        4908667  451676 1684868 7045211  6b805b vmlinux.prev
        4908651  451676 1684868 7045195  6b804b vmlinux.after
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <4A4E1AA2.30002@gmail.com>
      [ Also fix typo in atomic64_set() export ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • I
      x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative() · ddf9a003
      Ingo Molnar committed
      Linus noticed that the variable 'old_val' is confusingly
      named in these functions - the correct name is
      'new_val'.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907030942260.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • I
      x86: atomic64: Improve atomic64_xchg() · 3a8d1788
      Ingo Molnar committed
      Remove the read-first logic from atomic64_xchg() and simplify
      the loop.
      
      This function was the last user of __atomic64_read() - remove it.
      
      Also, change the 'real_val' assumption from the somewhat quirky
      1ULL << 32 value to the (just as arbitrary, but simpler) value
      of 0.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • I
      x86: atomic64: Export APIs to modules · 1fde902d
      Ingo Molnar committed
      atomic64_t primitives are used by a handful of drivers,
      so export the APIs consistently. These were inlined
      before.
      
      Also mark atomic64_32.o a core object, so that the symbols
      are available even if not linked to core kernel pieces.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 03 Jul, 2009 7 commits
    • E
      x86: atomic64: Improve atomic64_read() · 67d7178f
      Eric Dumazet committed
      Optimize atomic64_read() as a special open-coded
      cmpxchg8b variant. This generates nicer code:
      
      arch/x86/lib/atomic64_32.o:
      
         text	   data	    bss	    dec	    hex	filename
          435	      0	      0	    435	    1b3	atomic64_32.o.before
          431	      0	      0	    431	    1af	atomic64_32.o.after
      
      md5:
         bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.before.asm
         2bdfd4bd1f6b7b61b7fc127aef90ce3b  atomic64_32.o.after.asm
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • I
      x86: atomic64: Fix unclean type use in atomic64_xchg() · 199e2378
      Ingo Molnar committed
      Linus noticed that atomic64_xchg() uses atomic_read(), which
      happens to work because atomic_read() is a macro so the
      .counter value gets u64-read on 32-bit too - but this is really
      bogus and serious bugs are waiting to happen.
      
      Fix atomic64_xchg() to use __atomic64_read() instead.
      
      No code changed:
      
      arch/x86/lib/atomic64_32.o:
      
         text	   data	    bss	    dec	    hex	filename
          435	      0	      0	    435	    1b3	atomic64_32.o.before
          435	      0	      0	    435	    1b3	atomic64_32.o.after
      
      md5:
         bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.before.asm
         bd8ab95e69c93518578bfaf0ea3be4d9  atomic64_32.o.after.asm
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
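Why the macro "happens to work" on the wrong type can be shown with a small sketch. The type and macro names below are hypothetical stand-ins, and the macro body is an assumption modelled on the description in this commit message, not a quote of the kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for atomic_t and atomic64_t. */
typedef struct { int counter; } sketch_atomic_t;
typedef struct { uint64_t counter; } sketch_atomic64_t;

/*
 * Macro form: pure textual substitution, so it accepts a pointer to
 * any struct that has a 'counter' member, whatever its width.
 */
#define sketch_atomic_read(v) ((v)->counter)

/*
 * Function form: the parameter type makes the compiler diagnose a
 * sketch_atomic64_t pointer as an incompatible pointer type.
 */
static inline int sketch_atomic_read_fn(const sketch_atomic_t *v)
{
    return v->counter;
}

/* Usage example. */
static void sketch_demo(void)
{
    sketch_atomic_t v32 = { 7 };
    sketch_atomic64_t v64 = { 0x100000002ULL };

    assert(sketch_atomic_read_fn(&v32) == 7);
    /* Compiles and yields all 64 bits, but only by accident of the macro. */
    assert(sketch_atomic_read(&v64) == 0x100000002ULL);
}
```

If the 32-bit reader were ever turned into a typed inline function, the 64-bit call site would stop compiling cleanly or, with a cast, silently read half the value. That is the kind of latent bug the commit removes by switching to a dedicated 64-bit read.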
    • I
      x86: atomic64: Reduce size of functions · 3ac805d2
      Ingo Molnar committed
      cmpxchg8b is a huge instruction in terms of register footprint,
      we almost never want to inline it, not even within the same
      code module.
      
      GCC 4.3 still messes up for two functions, under-judging the
      true cost of this instruction - so annotate two key functions
      to reduce the bloat:
      
      arch/x86/lib/atomic64_32.o:
      
         text	   data	    bss	    dec	    hex	filename
         1763	      0	      0	   1763	    6e3	atomic64_32.o.before
          435	      0	      0	    435	    1b3	atomic64_32.o.after
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • I
      x86: atomic64: Improve atomic64_add_return() · 824975ef
      Ingo Molnar committed
      Linus noted (based on Eric Dumazet's numbers) that we would
      probably be better off not trying an atomic_read() in
      atomic64_add_return() but instead intentionally let the first
      cmpxchg8b fail - to get a cache-friendly 'give me ownership
      of this cacheline' transaction. That can then be followed
      by the real cmpxchg8b which sets the value local to the CPU.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
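The "no initial read" approach described above can be sketched in portable C11. This is an illustration under assumptions, not the kernel implementation: the `sketch_` names are hypothetical, and the starting guess of 0 is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's atomic64_t. */
typedef struct { _Atomic uint64_t counter; } sketch_atomic64_t;

/* Add 'delta' to *v and return the new value. */
static uint64_t sketch_atomic64_add_return(uint64_t delta,
                                           sketch_atomic64_t *v)
{
    /*
     * No load up front: start from an arbitrary guess. Unless the
     * counter really is 0, the first CAS fails, which both acquires
     * the cacheline and writes the current value into 'old'.
     */
    uint64_t old = 0;
    uint64_t new_val;

    do {
        new_val = old + delta;
    } while (!atomic_compare_exchange_strong(&v->counter, &old, new_val));

    return new_val;
}

/* Usage example. */
static void sketch_demo(void)
{
    sketch_atomic64_t v;

    atomic_init(&v.counter, 40);
    assert(sketch_atomic64_add_return(2, &v) == 42);
    assert(atomic_load(&v.counter) == 42);
}
```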
    • E
      x86: atomic64: Improve cmpxchg8b() · 69237f94
      Eric Dumazet committed
      Rewrite cmpxchg8b() to use a generic "+m" constraint instead of
      the %edi register, to increase compiler freedom in code generation
      and possibly produce better code.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
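A wrapper using a "+m" operand might look like the sketch below. It is an assumption about the general shape, not the code from this commit: `sketch_cmpxchg8b()` is a hypothetical name, the halves of the old value are passed in separate "a" and "d" operands so the example also builds for x86-64, and the non-x86 branch uses a compiler builtin purely as a portability fallback.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compare *ptr with 'old'; if equal, store 'new_val'.
 * Returns the value that was in *ptr before the operation.
 */
static inline uint64_t sketch_cmpxchg8b(volatile uint64_t *ptr,
                                        uint64_t old, uint64_t new_val)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo = (uint32_t)old;
    uint32_t hi = (uint32_t)(old >> 32);

    /*
     * "+m" lets the compiler choose any addressing mode for *ptr
     * instead of pinning the pointer to one fixed register.
     * cmpxchg8b compares edx:eax with memory and stores ecx:ebx on
     * a match; otherwise it loads memory into edx:eax.
     */
    __asm__ __volatile__("lock cmpxchg8b %2"
                         : "+a" (lo), "+d" (hi), "+m" (*ptr)
                         : "b" ((uint32_t)new_val),
                           "c" ((uint32_t)(new_val >> 32))
                         : "cc");
    return ((uint64_t)hi << 32) | lo;
#else
    /* Portable fallback: on failure the builtin updates 'old'. */
    __atomic_compare_exchange_n(ptr, &old, new_val, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return old;
#endif
}

/* Usage example. */
static void sketch_demo(void)
{
    uint64_t x = 5;

    assert(sketch_cmpxchg8b(&x, 5, 9) == 5);  /* match: x becomes 9 */
    assert(x == 9);
}
```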
    • E
      x86: atomic64: Improve atomic64_read() · aacf682f
      Eric Dumazet committed
      Linus noticed that the 32-bit version of atomic64_read() was
      being overly complex with re-reading the value and doing a
      retry loop over that.
      
      Instead we can just rely on cmpxchg8b returning either the new
      value or returning the current value.
      
      We can use any 'old' value, which will be faster as it can be
      loaded via immediates. If that value is not equal to the real
      value in memory, the instruction gets faster.
      
      This also has the advantage that the CPU could avoid dirtying
      the cacheline.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
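The loop-free read described above can be sketched in portable C11. The names are hypothetical and the guess value of 0 is an arbitrary choice for the example; the actual commit uses cmpxchg8b directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's atomic64_t. */
typedef struct { _Atomic uint64_t counter; } sketch_atomic64_t;

/*
 * Read a 64-bit value with one compare-and-swap and no retry loop.
 *  - If memory != guess, the CAS fails and copies the current value
 *    into 'guess'.
 *  - If memory == guess, the CAS succeeds and stores the same value
 *    back, so memory is unchanged and 'guess' was already right.
 * Either way 'guess' ends up holding the value that was in memory.
 */
static uint64_t sketch_atomic64_read(sketch_atomic64_t *v)
{
    uint64_t guess = 0;  /* arbitrary, cheap to load as an immediate */

    atomic_compare_exchange_strong(&v->counter, &guess, guess);
    return guess;
}

/* Usage example. */
static void sketch_demo(void)
{
    sketch_atomic64_t v;

    atomic_init(&v.counter, 0x123456789abcdef0ULL);
    assert(sketch_atomic64_read(&v) == 0x123456789abcdef0ULL);
}
```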
    • I
      x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file · b7882b7c
      Ingo Molnar committed
      Linus noted that the atomic64_t primitives are all inlines
      currently which is crazy because these functions have a large
      register footprint anyway.
      
      Move them to a separate file: arch/x86/lib/atomic64_32.c
      
      Also, while at it, rename all uses of 'unsigned long long' to
      the much shorter u64.
      
      This makes the appearance of the prototypes a lot nicer - and
      it also uncovered a few bugs where (yet unused) API variants
      had 'long' as their return type instead of u64.
      
      [ More intrusive changes are not yet done in this patch. ]
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
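The return-type bug mentioned above matters because 'long' is 32 bits wide on 32-bit x86. The sketch below models only that width, using `uint32_t` as a stand-in (an assumption for portability; the real type is signed), and leaves atomicity out entirely. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy shape: a 32-bit return type silently drops the upper half. */
static uint32_t sketch_add_return_narrow(uint64_t delta, uint64_t *v)
{
    *v += delta;
    return (uint32_t)*v;
}

/* Fixed shape: the return type matches the 64-bit counter. */
static uint64_t sketch_add_return_wide(uint64_t delta, uint64_t *v)
{
    *v += delta;
    return *v;
}

/* Usage example. */
static void sketch_demo(void)
{
    uint64_t a = 0xffffffffULL;
    uint64_t b = 0xffffffffULL;

    assert(sketch_add_return_narrow(1, &a) == 0);           /* truncated */
    assert(sketch_add_return_wide(1, &b) == 0x100000000ULL); /* intact */
}
```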