1. 14 June 2016, 1 commit
    • locking/spinlock, arch: Update and fix spin_unlock_wait() implementations · 726328d9
      Authored by Peter Zijlstra
      This patch updates/fixes all spin_unlock_wait() implementations.
      
      The update is in semantics; where it previously was only a control
      dependency, we now upgrade to a full load-acquire to match the
      store-release from the spin_unlock() we waited on. This ensures that
      when spin_unlock_wait() returns, we're guaranteed to observe the full
      critical section we waited on.
      
      This fixes a number of spin_unlock_wait() users that (not
      unreasonably) rely on this.
      
      I also fixed a number of ticket lock versions to only wait on the
      current lock holder, instead of for a full unlock, as this is
      sufficient.
      
      Furthermore, again for the ticket locks, I added an smp_rmb() between
      the initial ticket load and the spin loop testing the current value,
      because I could not convince myself the address dependency is
      sufficient, especially if the loads are of different sizes.
      
      I'm more than happy to remove this smp_rmb() again if people are
      certain the address dependency does indeed work as expected.
      
      Note: PPC32 will be fixed independently
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: chris@zankel.net
      Cc: cmetcalf@mellanox.com
      Cc: davem@davemloft.net
      Cc: dhowells@redhat.com
      Cc: james.hogan@imgtec.com
      Cc: jejb@parisc-linux.org
      Cc: linux@armlinux.org.uk
      Cc: mpe@ellerman.id.au
      Cc: ralf@linux-mips.org
      Cc: realmz6@gmail.com
      Cc: rkuo@codeaurora.org
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Cc: vgupta@synopsys.com
      Cc: ysato@users.sourceforge.jp
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
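      A minimal sketch of the semantic change described in the commit above,
      using C11 atomics and an illustrative test-and-set style lock word;
      my_lock_t and the function names are assumptions for illustration, not
      the kernel's actual per-architecture implementations:

      #include <stdatomic.h>

      typedef struct { atomic_int locked; } my_lock_t;   /* illustrative */

      /*
       * Old behaviour: spin until the lock looks free.  The final load is
       * only a control dependency, so accesses after the wait may still be
       * reordered with the critical section we waited on.
       */
      static void my_unlock_wait_ctrl_dep(my_lock_t *lock)
      {
              while (atomic_load_explicit(&lock->locked, memory_order_relaxed))
                      ;       /* cpu_relax() in the kernel */
      }

      /*
       * New semantics: the load that observes the unlocked state is an
       * acquire, pairing with the release store in spin_unlock().  When this
       * returns, the full critical section we waited on is visible.
       */
      static void my_unlock_wait_acquire(my_lock_t *lock)
      {
              while (atomic_load_explicit(&lock->locked, memory_order_acquire))
                      ;
      }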
  2. 14 October 2015, 1 commit
    • s390/spinlock: remove unneeded serializations at unlock · fdbbe8e7
      Authored by Christian Borntraeger
      The kernel locks have acquire/release semantics. No operation done
      after the lock can be "moved" before the lock, and no operation before
      the unlock can be moved after the unlock. But it is perfectly fine
      that memory accesses which happen, code-wise, after the unlock are
      performed within the critical section.
      On s390x, reads are in-order with other reads (PoP section
      "Storage-Operand Fetch References") and writes are in-order with
      other writes (PoP section "Storage-Operand Store References"). Writes
      are also in-order with reads to the same memory location (PoP section
      "Storage-Operand Store References"). To other CPUs (and the channel
      subsystem), reads additionally appear to be performed prior to reads or
      writes that happen after them in the conceptual sequence (PoP section
      "Relation between Operand Accesses").
      So at least as observed by other CPUs and the channel subsystem, reads
      inside the critical sections will not happen after unlock (and writes
      are in-order anyway). That's exactly what we need for "RELEASE
      operations" (memory-barriers.txt): "It guarantees that all memory
      operations before the RELEASE operation will appear to happen before the
      RELEASE operation with respect to the other components of the system."
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
      [cross-reading and a lot of improvements for the patch description]
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
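      The RELEASE guarantee quoted above can be modelled with generic C11
      atomics; this is only an illustration of the required ordering under
      assumed names, not the s390 unlock code itself:

      #include <stdatomic.h>

      atomic_int lock_word = ATOMIC_VAR_INIT(1);   /* 1 = held, 0 = free */
      int protected_data;

      void unlocking_cpu(void)
      {
              protected_data = 42;    /* store inside the critical section */
              /* RELEASE: all earlier accesses appear before this store. */
              atomic_store_explicit(&lock_word, 0, memory_order_release);
      }

      void observing_cpu(void)
      {
              /* Whoever observes the lock as free with an acquire load ... */
              if (atomic_load_explicit(&lock_word, memory_order_acquire) == 0)
                      (void)protected_data;   /* ... also sees the 42 above */
      }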
  3. 03 November 2014, 1 commit
    • s390/cmpxchg: use compiler builtins · f318a122
      Authored by Martin Schwidefsky
      The kernel build for s390 fails for gcc compilers with version 3.x;
      set the minimum required version of gcc to 4.3.
      
      As the atomic builtins are available with all gcc 4.x compilers,
      use the __sync_val_compare_and_swap and __sync_bool_compare_and_swap
      functions to replace the complex macro and inline assembler magic
      in include/asm/cmpxchg.h. The compiler can just-do-it and generates
      better code with the builtins.
      
      While we are at it use __sync_bool_compare_and_swap for the
      _raw_compare_and_swap function in the spinlock code as well.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
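      A hedged sketch of replacing a hand-written compare-and-swap with the
      GCC builtins named in the commit above; the "my_" helper names and
      signatures are illustrative, not the kernel's exact ones:

      /* Returns non-zero if *ptr contained 'old' and was replaced by 'new'. */
      static inline int my_bool_cas(unsigned int *ptr,
                                    unsigned int old, unsigned int new)
      {
              return __sync_bool_compare_and_swap(ptr, old, new);
      }

      /* Returns the previous contents of *ptr, whether or not it was 'old'. */
      static inline unsigned int my_val_cas(unsigned int *ptr,
                                            unsigned int old, unsigned int new)
      {
              return __sync_val_compare_and_swap(ptr, old, new);
      }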
  4. 25 September 2014, 3 commits
  5. 09 September 2014, 1 commit
    • s390/spinlock: optimize spin_unlock code · 44230282
      Authored by Heiko Carstens
      Use a memory barrier + store sequence instead of a load + compare-and-swap
      sequence to unlock a spinlock and an rw lock.
      For the spinlock case this saves us two memory reads and an unneeded CPU
      serialization after the compare-and-swap instruction stored the new value.
      
      The kernel size (performance_defconfig) gets reduced by ~14k.
      
      Average execution time of a tight inlined spin_unlock loop drops from
      5.8ns to 0.7ns on a zEC12 machine.
      
      An artificial stress test, in which several counters are protected by a
      single spinlock and are only incremented while holding that spinlock,
      shows a ~30% improvement on a 4-CPU machine.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
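      A portable sketch of the before/after shape of this optimization; the
      real change is s390 inline assembly, so the types and names here are
      assumed for illustration only:

      #include <stdatomic.h>

      typedef struct { atomic_uint lock; } my_spinlock_t;

      /* Before: unlock via compare-and-swap (extra read plus serialization). */
      static void my_unlock_cs(my_spinlock_t *lp, unsigned int owner)
      {
              unsigned int expected = owner;

              atomic_compare_exchange_strong(&lp->lock, &expected, 0);
      }

      /* After: a release barrier followed by a plain store of zero suffices. */
      static void my_unlock_store(my_spinlock_t *lp)
      {
              atomic_store_explicit(&lp->lock, 0, memory_order_release);
      }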
  6. 20 May 2014, 3 commits
  7. 28 September 2013, 1 commit
  8. 20 July 2012, 1 commit
    • s390/comments: unify copyright messages and remove file names · a53c8fab
      Authored by Heiko Carstens
      Remove the file name from the comment at the top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of slightly
      different statements and wanted to change them one after another
      whenever a file was touched. However, that never happened. Instead,
      people started to take the old/"wrong" statements as a template
      for new files.
      So unify all of them in one go.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  9. 30 October 2011, 1 commit
  10. 27 February 2010, 1 commit
  11. 15 December 2009, 4 commits
  12. 14 November 2009, 1 commit
    • locking: Make inlining decision Kconfig based · 6beb0009
      Authored by Thomas Gleixner
      commit 892a7c67 (locking: Allow arch-inlined spinlocks) implements the
      selection of which lock functions are inlined based on defines in
      arch/.../spinlock.h: #define __always_inline__LOCK_FUNCTION
      
      Despite the name __always_inline__*, the lock functions can be built
      out of line depending on config options. Also, if the arch does not set
      some inline defines, the generic code might set them, again depending
      on config options.
      
      This makes it unnecessarily hard to figure out when and which lock
      functions are inlined. Aside from that, it makes it much harder and
      messier for -rt to manipulate the lock functions.
      
      Convert the inlining decision to CONFIG switches. Each lock function
      is inlined depending on CONFIG_INLINE_*. The configs implement the
      existing dependencies. The architecture code can select ARCH_INLINE_*
      to signal that it wants the corresponding lock function inlined.
      ARCH_INLINE_* is necessary as Kconfig ignores "depends on"
      restrictions when a config element is selected.
      
      No functional change.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <20091109151428.504477141@linutronix.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Reviewed-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
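      A compressed sketch of the switch this commit describes; only the
      CONFIG_INLINE_*/ARCH_INLINE_* naming scheme comes from the commit
      message, while my_spin_lock() and the surrounding scaffolding are
      assumptions for illustration:

      struct my_spinlock { volatile unsigned int lock; };

      int my_trylock_once(struct my_spinlock *lp);      /* arch fast path */
      void my_lock_slowpath(struct my_spinlock *lp);    /* out of line */

      #ifdef CONFIG_INLINE_SPIN_LOCK
      /* Arch selected ARCH_INLINE_SPIN_LOCK: the API becomes an inline. */
      static inline void my_spin_lock(struct my_spinlock *lp)
      {
              if (!my_trylock_once(lp))
                      my_lock_slowpath(lp);
      }
      #else
      /* Otherwise the same function is emitted out of line in a .c file. */
      void my_spin_lock(struct my_spinlock *lp);
      #endif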
  13. 01 September 2009, 1 commit
    • locking: Inline spinlock code for all locking variants on s390 · b62e180c
      Authored by Heiko Carstens
      Speeds up several benchmarks in a measurable way, so inline
      all spin-lock variants by default.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Horst Hartmann <horsth@linux.vnet.ibm.com>
      Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <20090831124419.319518405@de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 12 June 2009, 1 commit
  15. 03 April 2009, 1 commit
  16. 02 August 2008, 1 commit
  17. 26 January 2008, 2 commits
  18. 05 October 2006, 1 commit
  19. 01 October 2006, 2 commits
    • [PATCH] Directed yield: direct yield of spinlocks for s390. · 3c1fcfe2
      Authored by Martin Schwidefsky
      Use the new diagnose 0x9c in the spinlock implementation for s390.  It
      yields the remaining timeslice of the virtual cpu that tries to acquire a
      lock to the virtual cpu that is the current holder of the lock.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Directed yield: cpu_relax variants for spinlocks and rw-locks · ef6edc97
      Authored by Martin Schwidefsky
      On systems running with virtual cpus there is optimization potential in
      regard to spinlocks and rw-locks.  If the virtual cpu that has taken a lock
      is known to a cpu that wants to acquire the same lock it is beneficial to
      yield the timeslice of the virtual cpu in favour of the cpu that has the
      lock (directed yield).
      
      With CONFIG_PREEMPT="n" this can be implemented by the architecture without
      common code changes.  Powerpc already does this.
      
      With CONFIG_PREEMPT="y" the lock loops are coded with _raw_spin_trylock,
      _raw_read_trylock and _raw_write_trylock in kernel/spinlock.c.  If the lock
      could not be taken, cpu_relax is called.  A directed yield is not possible
      because cpu_relax doesn't know anything about the lock.  To be able to
      yield the timeslice in favour of the current lock holder, variants of
      cpu_relax for spinlocks and rw-locks are needed.  The new _raw_spin_relax,
      _raw_read_relax and _raw_write_relax primitives differ from cpu_relax
      insofar as they have an argument: a pointer to the lock structure.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
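      A hedged sketch of the lock-aware relax primitive described in the two
      commits above: unlike cpu_relax(), it receives the lock pointer, so a
      virtualized architecture can yield to the current lock holder (diagnose
      0x9c on s390). The my_* names and the owner encoding are assumptions
      made purely for illustration:

      typedef struct {
              volatile unsigned int owner_cpu;   /* 0 = unlocked, cpu+1 otherwise */
      } my_raw_spinlock_t;

      void my_yield_to(unsigned int cpu);   /* stand-in for diagnose 0x9c */
      void my_cpu_relax(void);              /* plain busy-wait pause */

      /*
       * _raw_spin_relax()-style helper: with the lock pointer available, the
       * spinning cpu can hand its remaining timeslice to the lock holder
       * instead of burning it in cpu_relax().
       */
      static inline void my_raw_spin_relax(my_raw_spinlock_t *lock)
      {
              unsigned int owner = lock->owner_cpu;

              if (owner)
                      my_yield_to(owner - 1);
              else
                      my_cpu_relax();
      }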
  20. 28 September 2006, 1 commit
    • [S390] Inline assembly cleanup. · 94c12cc7
      Authored by Martin Schwidefsky
      Major cleanup of all s390 inline assemblies. They now have a common
      coding style. Quite a few have been shortened, mainly by using register
      asm variables. Use of the EX_TABLE macro helps as well. The atomic ops,
      bit ops and locking inlines now use the Q-constraint if a newer gcc
      is used.  That results in slightly better code.
      
      Thanks to Christian Borntraeger for proofreading the changes.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  21. 11 September 2005, 1 commit
    • [PATCH] spinlock consolidation · fb1c8f93
      Authored by Ingo Molnar
      This patch (written by me and also containing many suggestions of Arjan van
      de Ven) does a major cleanup of the spinlock code.  It does the following
      things:
      
       - consolidates and enhances the spinlock/rwlock debugging code
      
       - simplifies the asm/spinlock.h files
      
       - encapsulates the raw spinlock type and moves generic spinlock
         features (such as ->break_lock) into the generic code.
      
       - cleans up the spinlock code hierarchy to get rid of the spaghetti.
      
      Most notably there's now only a single variant of the debugging code,
      located in lib/spinlock_debug.c.  (previously we had one SMP debugging
      variant per architecture, plus a separate generic one for UP builds)
      
      Also, I've enhanced the rwlock debugging facility; it will now track
      write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
      All locks have lockup detection now, which will work for both soft and hard
      spin/rwlock lockups.
      
      The arch-level include files now only contain the minimally necessary
      subset of the spinlock code - all the rest that can be generalized now
      lives in the generic headers:
      
       include/asm-i386/spinlock_types.h       |   16
       include/asm-x86_64/spinlock_types.h     |   16
      
      I have also split up the various spinlock variants into separate files,
      making it easier to see which does what. The new layout is:
      
         SMP                         |  UP
         ----------------------------|-----------------------------------
         asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
         linux/spinlock_types.h      |  linux/spinlock_types.h
         asm/spinlock_smp.h          |  linux/spinlock_up.h
         linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
         linux/spinlock.h            |  linux/spinlock.h
      
      /*
       * here's the role of the various spinlock/rwlock related include files:
       *
       * on SMP builds:
       *
       *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
       *                        initializers
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
       *                        implementations, mostly inline assembly code
       *
       *   (also included on UP-debug builds:)
       *
       *  linux/spinlock_api_smp.h:
       *                        contains the prototypes for the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       *
       * on UP builds:
       *
       *  linux/spinlock_type_up.h:
       *                        contains the generic, simplified UP spinlock type.
       *                        (which is an empty structure on non-debug builds)
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  linux/spinlock_up.h:
       *                        contains the __raw_spin_*()/etc. version of UP
       *                        builds. (which are NOPs on non-debug, non-preempt
       *                        builds)
       *
       *   (included on UP-non-debug builds:)
       *
       *  linux/spinlock_api_up.h:
       *                        builds the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       */
      
      All SMP and UP architectures are converted by this patch.
      
      arm, i386, ia64, ppc, ppc64, s390/s390x, and x64 were build-tested via
      cross-compilers.  m32r, mips, sh, and sparc have not been tested yet, but
      should be mostly fine.
      
      From: Grant Grundler <grundler@parisc-linux.org>
      
        Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
        Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
        non-SMP kernels.  That should be trivial to fix up later if necessary.
      
        I converted the bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
        some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
        are well tested and contained entirely inside arch-specific code.  I do NOT
        expect any new issues to arise with them.
      
        If someone ever does need to use debug/metrics with them, then they will
        need to unravel this hairball between spinlocks, atomic ops, and bit ops
        that exists only because parisc has exactly one atomic instruction: LDCW
        (load and clear word).
      
      From: "Luck, Tony" <tony.luck@intel.com>
      
         ia64 fix
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
      Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
      Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  22. 05 September 2005, 1 commit
  23. 28 July 2005, 1 commit
    • [PATCH] s390: spin lock retry · 951f22d5
      Authored by Martin Schwidefsky
      Split spin lock and r/w lock implementation into a single try which is done
      inline and an out of line function that repeatedly tries to get the lock
      before doing the cpu_relax().  Add a system control to set the number of
      retries before a cpu is yielded.
      
      The reason for the spin lock retry is that the diagnose 0x44 that is used to
      give up the virtual cpu is quite expensive.  For spin locks that are held only
      for a short period of time, the cost of the diagnose outweighs the savings
      it brings for spin locks that are held for a longer time.  The default retry
      count is 1000.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
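      A sketch of the inline-try / out-of-line-retry split described above.
      The spin_retry default of 1000 comes from the commit message; the my_*
      helpers and the lock layout are assumptions for illustration only:

      typedef struct { volatile unsigned int lock; } my_spinlock_t;

      int spin_retry = 1000;   /* tunable via a system control */

      int my_try_once(my_spinlock_t *lp);   /* one compare-and-swap attempt */
      void my_cpu_relax(void);
      void my_yield_cpu(void);              /* expensive hypervisor yield */
      void my_lock_retry(my_spinlock_t *lp);

      /* Inline fast path: a single cheap attempt, then go out of line. */
      static inline void my_spin_lock(my_spinlock_t *lp)
      {
              if (!my_try_once(lp))
                      my_lock_retry(lp);
      }

      /* Out of line: retry spin_retry times before giving up the virtual cpu. */
      void my_lock_retry(my_spinlock_t *lp)
      {
              int count;

              for (;;) {
                      for (count = spin_retry; count > 0; count--) {
                              if (my_try_once(lp))
                                      return;
                              my_cpu_relax();
                      }
                      /* Held long enough that yielding beats more spinning. */
                      my_yield_cpu();
              }
      }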
  24. 17 April 2005, 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Authored by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!