1. 23 September 2017 (1 commit)
    • x86/asm: Fix inline asm call constraints for Clang · f5caf621
      Committed by Josh Poimboeuf
      For inline asm statements which have a CALL instruction, we list the
      stack pointer as a constraint to convince GCC to ensure the frame
      pointer is set up first:
      
        static inline void foo()
        {
      	register void *__sp asm(_ASM_SP);
      	asm("call bar" : "+r" (__sp))
        }
      
      Unfortunately, that pattern causes Clang to corrupt the stack pointer.
      
      The fix is easy: convert the stack pointer register variable to a global
      variable.
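      
      As a sketch, the converted pattern looks roughly like the following
      (the variable and macro names mirror the upstream fix, but treat this
      as an illustration rather than the exact diff):
      
        /* Global register variable tied to the stack pointer; any inline
         * asm containing a CALL lists it as an output, which forces the
         * compiler to set up the frame pointer first. */
        register unsigned long current_stack_pointer asm(_ASM_SP);
        #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
      
        static inline void foo(void)
        {
        	asm("call bar" : ASM_CALL_CONSTRAINT);
        }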
      
      It should be noted that the end result is different based on the GCC
      version.  With GCC 6.4, this patch has exactly the same result as
      before:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9820389		9491555		8816046		8516940
       after	9820389		9491555		8816046		8516940
      
      With GCC 7.2, however, GCC's behavior has changed.  It now changes its
      behavior based on the conversion of the register variable to a global.
      That somehow convinces it to *always* set up the frame pointer before
      inserting *any* inline asm.  (Therefore, listing the variable as an
      output constraint is a no-op and is no longer necessary.)  It's a bit
      overkill, but the performance impact should be negligible.  And in fact,
      there's a nice improvement with frame pointers disabled:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9796316		9468236		9076191		8790305
       after	9796957		9464267		9076381		8785949
      
      So in summary, while listing the stack pointer as an output constraint
      is no longer necessary for newer versions of GCC, it's still needed for
      older versions.
      Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f5caf621
  2. 18 October 2016 (1 commit)
  3. 20 September 2016 (1 commit)
    • locking/rwsem, x86: Drop a bogus cc clobber · c907420f
      Committed by Jan Beulich
      With the addition of uses of GCC's condition code outputs in commit:
      
        35ccfb71 ("x86, asm: Use CC_SET()/CC_OUT() in <asm/rwsem.h>")
      
      ... there's now an overlap of outputs and clobbers in __down_write_trylock().
      
      Such overlaps generally get flagged with an error (occasionally even
      with an ICE). I can't really tell why plain GCC 6.2 doesn't detect
      this (judging by its code, it is meant to), while the slightly
      modified one I use does. Since condition code clobbers are never necessary on x86
      (other than perhaps for documentation purposes, which doesn't really
      get done consistently), remove it altogether rather than inventing
      something like CC_CLOBBER (to accompany CC_SET/CC_OUT).
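      
      As a rough sketch of the overlap (hypothetical variables, not the
      actual __down_write_trylock() body): with flag-output support,
      CC_OUT() already claims the condition codes as an asm output, so a
      "cc" clobber on the same statement overlaps with it:
      
        int counter;
        bool success;
        asm volatile(LOCK_PREFIX "incl %0" CC_SET(g)
        	     : "+m" (counter), CC_OUT(g) (success)
        	     :
        	     : "cc");	/* <- the redundant/overlapping clobber */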
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/57E003CC0200007800110102@prv-mh.provo.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c907420f
  4. 09 June 2016 (2 commits)
  5. 08 June 2016 (1 commit)
    • locking/rwsem: Remove rwsem_atomic_add() and rwsem_atomic_update() · d157bd86
      Committed by Jason Low
      The rwsem-xadd count has been converted to an atomic variable and the
      rwsem code now directly uses atomic_long_add() and
      atomic_long_add_return(), so we can remove the arch implementations of
      rwsem_atomic_add() and rwsem_atomic_update().
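      
      As a before/after sketch of what this means for the generic code
      (illustrative only, not the exact diff):
      
        /* before: per-arch hook */
        rwsem_atomic_add(RWSEM_ACTIVE_READ_BIAS, sem);
      
        /* after: sem->count is an atomic_long_t, so use it directly */
        atomic_long_add(RWSEM_ACTIVE_READ_BIAS, &sem->count);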
      Signed-off-by: Jason Low <jason.low2@hpe.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Terry Rudd <terry.rudd@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Waiman Long <Waiman.Long@hpe.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d157bd86
  6. 22 April 2016 (1 commit)
    • locking/rwsem: Provide down_write_killable() · 916633a4
      Committed by Michal Hocko
      Now that all the architectures implement the necessary glue code
      we can introduce down_write_killable(). The only difference compared
      with regular down_write() is that the slow path waits in TASK_KILLABLE
      state and interruption by a fatal signal is reported as -EINTR to the caller.
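      
      A typical caller (hypothetical example) looks like this:
      
        if (down_write_killable(&mm->mmap_sem))
        	return -EINTR;	/* a fatal signal interrupted the wait */
        /* ... critical section under the write lock ... */
        up_write(&mm->mmap_sem);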
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Signed-off-by: Jason Low <jason.low2@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: sparclinux@vger.kernel.org
      Link: http://lkml.kernel.org/r/1460041951-22347-12-git-send-email-mhocko@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      916633a4
  7. 13 April 2016 (2 commits)
    • locking/rwsem, x86: Provide __down_write_killable() · 664b4e24
      Committed by Michal Hocko
      This provides __down_write_killable(), which uses the same fast path
      as __down_write() except that it falls back to the
      call_rwsem_down_write_failed_killable() slow path and returns -EINTR
      if killed. To prevent code duplication, the skeleton of __down_write()
      is extracted into a helper macro which just takes the semaphore and
      the slow path function to be called.
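      
      The resulting shape is roughly the following (simplified sketch; the
      real fast path is an inline-asm xadd and the slow paths are asm
      thunks, so details differ):
      
        static inline void __down_write(struct rw_semaphore *sem)
        {
        	____down_write(sem, "call_rwsem_down_write_failed");
        }
      
        static inline int __down_write_killable(struct rw_semaphore *sem)
        {
        	if (IS_ERR(____down_write(sem,
        				  "call_rwsem_down_write_failed_killable")))
        		return -EINTR;
        	return 0;
        }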
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Signed-off-by: Jason Low <jason.low2@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: sparclinux@vger.kernel.org
      Link: http://lkml.kernel.org/r/1460041951-22347-11-git-send-email-mhocko@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      664b4e24
    • locking/rwsem: Get rid of __down_write_nested() · f8e04d85
      Committed by Michal Hocko
      This is no longer used anywhere and all callers (__down_write()) use
      0 as a subclass. Ditch __down_write_nested() to make the code easier
      to follow.
      
      This shouldn't introduce any functional change.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Signed-off-by: Jason Low <jason.low2@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: sparclinux@vger.kernel.org
      Link: http://lkml.kernel.org/r/1460041951-22347-2-git-send-email-mhocko@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f8e04d85
  8. 24 February 2016 (1 commit)
  9. 07 May 2013 (1 commit)
  10. 30 August 2011 (1 commit)
  11. 27 January 2011 (6 commits)
  12. 21 July 2010 (2 commits)
  13. 14 February 2010 (1 commit)
    • x86-64, rwsem: Avoid store forwarding hazard in __downgrade_write · 0d1622d7
      Committed by Avi Kivity
      The Intel Architecture Optimization Reference Manual states that a short
      load that follows a long store to the same object will suffer a store
      forwarding penalty, particularly if the two accesses use different addresses.
      Trivially, a long load that follows a short store will also suffer a penalty.
      
      __downgrade_write() in rwsem incurs both penalties:  the increment operation
      will not be able to reuse a recently-loaded rwsem value, and its result will
      not be reused by any recently-following rwsem operation.
      
      A comment in the code states that this is because 64-bit immediates are
      special and expensive; but while they are slightly special (only a single
      instruction allows them), they aren't expensive: a test shows that two loops,
      one loading a 32-bit immediate and one loading a 64-bit immediate, both take
      1.5 cycles per iteration.
      
      Fix this by changing __downgrade_write to use the same add instruction on
      i386 and on x86_64, so that it uses the same operand size as all the other
      rwsem functions.
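      
      The fixed __downgrade_write() then looks roughly like this (sketch;
      comments and exact operand details trimmed):
      
        static inline void __downgrade_write(struct rw_semaphore *sem)
        {
        	/* same operand size as the other rwsem ops on both i386 and
        	 * x86-64; the compiler picks a register when the value does
        	 * not fit a sign-extended 32-bit immediate */
        	asm volatile(LOCK_PREFIX _ASM_ADD "%2,(%1)\n\t"
        		     "jns 1f\n\t"
        		     "call call_rwsem_downgrade_wake\n"
        		     "1:"
        		     : "+m" (sem->count)
        		     : "a" (sem), "er" (-RWSEM_WAITING_BIAS)
        		     : "memory");
        }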
      Signed-off-by: Avi Kivity <avi@redhat.com>
      LKML-Reference: <1266049992-17419-1-git-send-email-avi@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      0d1622d7
  14. 19 January 2010 (1 commit)
    • x86-64, rwsem: 64-bit xadd rwsem implementation · 1838ef1d
      Committed by H. Peter Anvin
      For x86-64, 32767 threads really is not enough.  Change rwsem_count_t
      to a signed long, so that it is 64 bits on x86-64.
      
      This required the following changes to the assembly code:
      
      a) %z0 doesn't work on all versions of gcc!  At least gcc 4.4.2 as
         shipped with Fedora 12 emits "ll" not "q" for 64 bits, even for
         integer operands.  Newer gccs apparently do this correctly, but
         avoid this problem by using the _ASM_ macros instead of %z.
      b) 64-bit immediates are only allowed in "movq $imm,%reg"
         constructs... no others.  Change some of the constraints to "e",
         and fix the one case where we would have had to use an invalid
         immediate -- in that case, we only care about the upper half
         anyway, so just access the upper half.
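      
      For illustration, a fast-path helper written this way might look like
      the following (sketch; _ASM_ADD picks "addl"/"addq" and "er" allows a
      register or a sign-extended 32-bit immediate):
      
        static inline void rwsem_atomic_add(long delta, struct rw_semaphore *sem)
        {
        	asm volatile(LOCK_PREFIX _ASM_ADD "%1,%0"
        		     : "+m" (sem->count)
        		     : "er" (delta));
        }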
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <tip-bafaecd1@git.kernel.org>
      1838ef1d
  15. 14 January 2010 (1 commit)
    • x86: clean up rwsem type system · 5d0b7235
      Committed by Linus Torvalds
      The fast version of the rwsems (the code that uses xadd) has
      traditionally only worked on x86-32, and as a result it mixes different
      kinds of types wildly - they just all happen to be 32-bit.  We have
      "long", we have "__s32", and we have "int".
      
      To make it work on x86-64, the types suddenly matter a lot more.  It can
      be either a 32-bit or 64-bit signed type, and both work (with the caveat
      that a 32-bit counter will only have 15 bits of effective write
      counters, so it's limited to 32767 users).  But whatever type you
      choose, it needs to be used consistently.
      
      This introduces a new 'rwsem_counter_t', which is a 32-bit signed type.  For a
      64-bit type, you'd need to also update the BIAS values.
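      
      As an illustration (using the rwsem_count_t spelling that the 64-bit
      follow-up above refers to; the values are the classic 32-bit ones):
      
        typedef signed int rwsem_count_t;
      
        #define RWSEM_UNLOCKED_VALUE	0x00000000
        #define RWSEM_ACTIVE_BIAS	0x00000001
        #define RWSEM_ACTIVE_MASK	0x0000ffff
        #define RWSEM_WAITING_BIAS	(-0x00010000)
        #define RWSEM_ACTIVE_READ_BIAS	RWSEM_ACTIVE_BIAS
        #define RWSEM_ACTIVE_WRITE_BIAS	(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)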
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <alpine.LFD.2.00.1001121755220.17145@localhost.localdomain>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      5d0b7235
  16. 13 January 2010 (1 commit)
    • x86-32: clean up rwsem inline asm statements · 59c33fa7
      Committed by Linus Torvalds
      This makes gcc use the right register names and instruction operand sizes
      automatically for the rwsem inline asm statements.
      
      So instead of using "(%%eax)" to specify the memory address that is the
      semaphore, we use "(%1)" or similar. And instead of forcing the operation
      to always be 32-bit, we use "%z0", taking the size from the actual
      semaphore data structure itself.
      
      This doesn't actually matter on x86-32, but if we want to use the same
      inline asm for x86-64, we'll need to have the compiler generate the proper
      64-bit names for the registers (%rax instead of %eax), and if we want to
      use a 64-bit counter too (in order to avoid the 15-bit limit on the
      write counter that limits concurrent users to 32767 threads), we'll need
      to be able to generate instructions with "q" accesses rather than "l".
      
      Since this header currently isn't enabled on x86-64, none of that matters,
      but we do want to use the xadd version of the semaphores rather than have
      to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended
      when you have lots of threads all taking page faults, and the fallback
      rwsem code that uses a spinlock performs abysmally badly in that case.
      
      [ hpa: modified the patch to skip size suffixes entirely when they are
        redundant due to register operands. ]
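      
      A sketch of the cleaned-up style (illustrative, not the exact hunks):
      the memory address is "(%1)" and the size suffix comes from "%z0",
      i.e. from the type of sem->count itself:
      
        asm volatile("# beginning down_read\n\t"
        	     LOCK_PREFIX " inc%z0 (%1)\n\t"
        	     "jns 1f\n"
        	     "call call_rwsem_down_read_failed\n"
        	     "1:\n\t"
        	     "# ending down_read\n"
        	     : "+m" (sem->count)
        	     : "a" (sem)
        	     : "memory");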
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      59c33fa7
  17. 23 October 2008 (2 commits)
  18. 23 July 2008 (1 commit)
    • x86: consolidate header guards · 77ef50a5
      Committed by Vegard Nossum
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers are turned into single
         underscores.
      Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
      77ef50a5
  19. 17 April 2008 (1 commit)
  20. 30 January 2008 (2 commits)
  21. 11 October 2007 (1 commit)
  22. 08 December 2006 (1 commit)
  23. 26 September 2006 (1 commit)
  24. 09 July 2006 (1 commit)
    • i386: improve and correct inline asm memory constraints · b862f3b0
      Committed by Linus Torvalds
      Use "+m" rather than a combination of "=m" and "m" for improved clarity
      and consistency.
      
      This also fixes some inlines that incorrectly didn't tell the compiler
      that they read the old value at all, potentially causing the compiler to
      generate bogus code.  It appears that all of those potential bugs were
      hidden by the use of extra "volatile" specifiers on the data structures
      in question, though.
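      
      Illustrative before/after of the constraint change (hypothetical
      snippet, with v a pointer to an atomic_t; not a specific hunk from
      the patch):
      
        /* before: separate output and input constraints on the same object */
        asm volatile(LOCK_PREFIX "incl %0"
        	     : "=m" (v->counter)
        	     : "m" (v->counter));
      
        /* after: "+m" tells the compiler the asm both reads and writes it */
        asm volatile(LOCK_PREFIX "incl %0"
        	     : "+m" (v->counter));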
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b862f3b0
  25. 04 July 2006 (2 commits)
  26. 30 October 2005 (1 commit)
  27. 17 April 2005 (1 commit)
    • Linux-2.6.12-rc2 · 1da177e4
      Committed by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4