1. 28 Sep 2017, 2 commits
    • s390/spinlock: introduce spinlock wait queueing · b96f7d88
      Authored by Martin Schwidefsky
      The queued spinlock code for s390 follows the principles of the common
      code qspinlock implementation but with a few notable differences.
      
      The format of the spinlock_t locking word differs: s390 needs to store
      the logical CPU number of the lock holder in the spinlock_t to be able
      to use the diagnose 9c directed-yield hypervisor call.
      
      The inline code sequences for spin_lock and spin_unlock are nice and
      short. The inline portion of a spin_lock now typically looks like this:
      
      	lhi	%r0,0			# 0 indicates an empty lock
      	l	%r1,0x3a0		# CPU number + 1 from lowcore
      	cs	%r0,%r1,<some_lock>	# lock operation
      	jnz	call_wait		# on failure call wait function
      locked:
      	...
      call_wait:
      	la	%r2,<some_lock>
      	brasl	%r14,arch_spin_lock_wait
      	j	locked
      
      A spin_unlock is as simple as before:
      
      	lhi	%r0,0
      	sth	%r0,2(%r2)		# unlock operation
      
      After a CPU has queued itself it may not enable interrupts again, not even
      for the arch_spin_lock_flags() variant. The arch_spin_lock_wait_flags()
      wait function is therefore removed.
      
      To improve performance the code implements opportunistic lock stealing.
      If the wait function finds a spinlock_t that indicates that the lock is
      free but there are queued waiters, the CPU may steal the lock up to three
      times without queueing itself. Each steal updates a steal counter in the
      lock word so that no more than three steals can happen in a row. The
      counter is reset when the CPU next in the queue successfully takes the lock.
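
      A minimal C sketch of the steal bookkeeping described above. The lock-word
      layout (owner = cpu+1 in the low bits, a small steal counter in the high
      bits) and all names are illustrative assumptions, not the real s390 encoding:

      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stdint.h>

      #define OWNER_MASK   0x0000ffffu          /* owner field: cpu number + 1 */
      #define STEAL_SHIFT  30
      #define STEAL_ONE    (1u << STEAL_SHIFT)  /* one unit of the steal counter */
      #define MAX_STEALS   3u

      /* Opportunistic steal: the lock word shows no owner, but waiters are queued. */
      static bool try_steal(_Atomic uint32_t *lock, uint32_t lockval /* cpu + 1 */)
      {
      	uint32_t old = atomic_load(lock);

      	if (old & OWNER_MASK)
      		return false;                 /* somebody holds the lock */
      	if ((old >> STEAL_SHIFT) >= MAX_STEALS)
      		return false;                 /* steal budget used up, must queue */

      	/* Grab the lock and bump the steal counter in a single CAS. */
      	return atomic_compare_exchange_strong(lock, &old, old + STEAL_ONE + lockval);
      }

      /* The CPU at the head of the queue takes the lock with the counter reset
       * (queue bookkeeping in the lock word is omitted in this sketch). */
      static void take_from_queue_head(_Atomic uint32_t *lock, uint32_t lockval)
      {
      	atomic_store(lock, lockval);          /* owner = cpu + 1, steals = 0 */
      }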
      
      While the queued spinlocks improve performance in a system with dedicated
      CPUs, in a virtualized environment with continuously overcommitted CPUs
      the queued spinlocks can have a negative effect on performance. This
      is due to the fact that a queued CPU that is preempted by the hypervisor
      will block the queue at some point even without holding the lock. With
      the classic spinlock it does not matter if a CPU that waits for the lock
      is preempted. Therefore use the queued spinlock code only if the system
      runs with dedicated CPUs and fall back to classic spinlocks when running
      with shared CPUs.
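
      A tiny sketch of that policy decision; the helper name cpus_are_dedicated()
      is a placeholder for whatever topology information the machine and
      hypervisor expose, not a real kernel interface:

      #include <stdbool.h>

      static bool cpus_are_dedicated(void) { return false; }  /* placeholder stub */

      static bool use_queued_spinlocks;

      static void select_spinlock_flavour(void)
      {
      	/* Queueing only pays off when a queued CPU cannot be preempted
      	 * away by the hypervisor; with shared CPUs keep the classic code. */
      	use_queued_spinlocks = cpus_are_dedicated();
      }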
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/spinlock: use the cpu number +1 as spinlock value · 81533803
      Authored by Martin Schwidefsky
      The queued spinlock code will come out simpler if the encoding of
      the CPU that holds the spinlock is (cpu+1) instead of (~cpu).
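
      A small sketch of the two encodings; the function names are illustrative,
      only the (cpu+1) versus (~cpu) values come from the commit:

      #include <stdint.h>

      /* Old encoding: the lock value is the bitwise complement of the cpu number. */
      static inline uint32_t lockval_old(uint16_t cpu)
      {
      	return ~(uint32_t)cpu;
      }

      /* New encoding: the lock value is cpu + 1, so 0 always means "unlocked"
       * and the owner's cpu number is simply lockval - 1. */
      static inline uint32_t lockval_new(uint16_t cpu)
      {
      	return (uint32_t)cpu + 1;
      }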
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  2. 26 Jul 2017, 1 commit
    • s390/spinlock: add niai spinlock hints · 7f7e6e28
      Authored by Martin Schwidefsky
      The z14 machine introduces new modes of the next-instruction-access-intent
      (NIAI) instruction. With NIAI-8 it is possible to pin a cache line on a
      CPU for a short amount of time, and NIAI-7 releases the cache line again.
      Finally, NIAI-4 can be used to prevent the CPU from speculatively accessing
      memory beyond the compare-and-swap instruction that gets the lock.
      
      Use these instructions in the spinlock code.
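
      A sketch of where the hints go; niai() and cs() are placeholder stubs that
      stand in for the machine instruction (emitted via inline assembly in the
      real code) and for compare-and-swap:

      /* Placeholder: on real hardware this would expand to a NIAI instruction
       * with the given access-intent code. */
      static inline void niai(int intent) { (void)intent; }

      /* Placeholder compare-and-swap: nonzero if *lock was old and is now new_val. */
      static inline int cs(volatile int *lock, int old, int new_val)
      {
      	return __sync_bool_compare_and_swap((int *)lock, old, new_val);
      }

      static void spin_lock_with_hints(volatile int *lock, int lockval /* cpu + 1 */)
      {
      	for (;;) {
      		niai(4);                /* do not speculate beyond the CS below */
      		if (*lock == 0) {
      			niai(8);        /* pin the cache line for the CS */
      			if (cs(lock, 0, lockval))
      				return; /* lock taken */
      		}
      		/* back off / yield to the owner here */
      	}
      }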
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  3. 12 Apr 2017, 2 commits
  4. 17 Feb 2017, 2 commits
  5. 22 Nov 2016, 1 commit
  6. 16 Apr 2016, 1 commit
    • s390/spinlock: avoid yield to non existent cpu · 84976952
      Authored by Heiko Carstens
      arch_spin_lock_wait_flags() checks if a spinlock is not held before
      trying a compare and swap instruction. If the lock is unlocked it
      tries the compare and swap; however, if a different cpu grabbed the
      lock in the meantime the instruction will fail as expected.
      
      Subsequently arch_spin_lock_wait_flags() incorrectly tries to figure
      out whether the cpu that holds the lock is running. However, it uses
      the wrong cpu number for this (-1) and will then also yield the
      current cpu to the wrong cpu.
      
      Fix this by adding a missing continue statement.
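
      A sketch of the loop shape the fix restores; all names are illustrative.
      The point is that a failed compare-and-swap on a lock that looked free must
      restart the loop instead of falling through to the yield path with a bogus
      owner:

      static int cs(volatile int *lock, int old, int new_val)      /* placeholder CAS */
      {
      	return __sync_bool_compare_and_swap((int *)lock, old, new_val);
      }
      static int cpu_is_running(int cpu) { (void)cpu; return 1; }  /* placeholder stub */
      static void yield_to(int cpu) { (void)cpu; }                 /* placeholder: diag 0x9c */

      static void spin_lock_wait(volatile int *lock, int lockval /* cpu + 1 */)
      {
      	for (;;) {
      		int owner = *lock;
      		if (owner == 0) {
      			if (cs(lock, 0, lockval))
      				return;
      			/* Lost the race: restart the loop.  Without this
      			 * continue the code falls through with owner == 0
      			 * and checks/yields to cpu -1, i.e. nobody. */
      			continue;
      		}
      		if (!cpu_is_running(owner - 1))
      			yield_to(owner - 1);
      	}
      }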
      
      Fixes: 470ada6b ("s390/spinlock: refactor arch_spin_lock_wait[_flags]")
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  7. 27 Nov 2015, 2 commits
    • s390/spinlock: do not yield to a CPU in udelay/mdelay · 419123f9
      Authored by Martin Schwidefsky
      It does not make sense to try to relinquish the time slice with diag 0x9c
      to a CPU that is in a state in which it cannot be scheduled. The scenario
      where this can happen is a CPU waiting in udelay/mdelay while holding a
      spinlock.
      
      Add a CIF bit to tag a CPU in enabled wait, use it to detect that a
      yield to such a CPU will not be successful, and skip the diagnose call.
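
      A sketch of the check this adds; the helper names are illustrative stand-ins
      for the per-cpu CIF bit test and for the directed-yield diagnose:

      #include <stdbool.h>

      /* Placeholder stubs: per-cpu flag query and the directed yield (diag 0x9c). */
      static bool cpu_in_enabled_wait(int cpu) { (void)cpu; return false; }
      static void smp_yield_cpu(int cpu) { (void)cpu; }

      static void maybe_yield_to_owner(int owner_cpu)
      {
      	/* A cpu sitting in udelay/mdelay (tagged via the new CIF bit) cannot
      	 * be scheduled right now, so a directed yield to it would be wasted;
      	 * skip the diagnose in that case. */
      	if (cpu_in_enabled_wait(owner_cpu))
      		return;
      	smp_yield_cpu(owner_cpu);
      }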
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/spinlock: avoid diagnose loop · db1c4515
      Authored by Martin Schwidefsky
      The spinlock implementation calls diagnose 0x9c / 0x44 immediately
      if SIGP sense running reports the target CPU as not running.
      
      The diagnose 0x9c is a hint to the hypervisor to schedule the target
      CPU in preference to the source CPU that issued the diagnose. It can
      happen that on return from the diagnose the target CPU has not been
      scheduled yet, e.g. if the target logical CPU is on another physical
      CPU and the hypervisor did not want to migrate the logical CPU.
      
      Avoid the immediate repeat of the diagnose instruction; instead, do
      the retry loop before the next invocation of diagnose 0x9c.
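
      A sketch of the reordered wait loop; cs(), cpu_is_running() and
      smp_yield_cpu() are placeholder stubs, and spin_retry is just an
      illustrative bound on the cheap retries:

      static int cs(volatile int *lock, int old, int new_val)      /* placeholder CAS */
      {
      	return __sync_bool_compare_and_swap((int *)lock, old, new_val);
      }
      static int cpu_is_running(int cpu) { (void)cpu; return 1; }  /* SIGP sense running stub */
      static void smp_yield_cpu(int cpu) { (void)cpu; }            /* diag 0x9c stub */

      static void spin_lock_wait(volatile int *lock, int lockval, int spin_retry)
      {
      	for (;;) {
      		/* Cheap retries come first, so two diagnoses are never issued
      		 * back to back when the previous yield did not actually get
      		 * the target cpu scheduled. */
      		for (int i = 0; i < spin_retry; i++) {
      			if (*lock == 0 && cs(lock, 0, lockval))
      				return;
      		}
      		int owner = *lock;
      		if (owner && !cpu_is_running(owner - 1))
      			smp_yield_cpu(owner - 1);
      	}
      }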
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  8. 14 Oct 2015, 1 commit
  9. 23 Jan 2015, 1 commit
    • s390/spinlock: add compare-and-delay to lock wait loops · 2c72a44e
      Authored by Martin Schwidefsky
      Add the compare-and-delay instruction to the spin-lock and rw-lock
      retry loops. A CPU executing the compare-and-delay instruction stops
      until the lock value has changed. This is done to make the locking
      code for contended locks behave better with regard to the multi-
      threading facility. A thread of a core that executes a compare-and-delay
      allows the other threads of the core to get a larger share of the
      core resources.
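
      A sketch of how a retry loop changes; compare_and_delay() below is a plain C
      stand-in that merely models "stall while *lock still equals old" (the real
      code emits the instruction itself):

      static int cs(volatile int *lock, int old, int new_val)      /* placeholder CAS */
      {
      	return __sync_bool_compare_and_swap((int *)lock, old, new_val);
      }

      /* Stand-in for compare-and-delay: the hardware thread stalls while *lock
       * still contains old, giving the sibling threads of the core a larger
       * share of the core resources. */
      static void compare_and_delay(volatile int *lock, int old)
      {
      	while (*lock == old)
      		;
      }

      static void spin_lock_retry(volatile int *lock, int lockval)
      {
      	for (;;) {
      		int old = *lock;
      		if (old == 0 && cs(lock, 0, lockval))
      			return;
      		compare_and_delay(lock, old);   /* wait for the value to change */
      	}
      }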
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  10. 25 Sep 2014, 4 commits
  11. 20 May 2014, 6 commits
  12. 20 Jul 2012, 1 commit
    • s390/comments: unify copyright messages and remove file names · a53c8fab
      Authored by Heiko Carstens
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of slightly
      different statements and wanted to change them one after another
      whenever a file got touched. However that never happened. Instead
      people started to take the old/"wrong" statements as templates
      for new files.
      So unify all of them in one go.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  13. 11 Mar 2012, 1 commit
    • [S390] rework smp code · 8b646bd7
      Authored by Martin Schwidefsky
      Define struct pcpu and merge some of the NR_CPUS arrays into it, including
      __cpu_logical_map, current_set and smp_cpu_state. Split smp related
      functions into those operating on physical cpus and those operating
      on a logical cpu number. Make the functions for physical cpus use a
      pointer to a struct pcpu. This confines the knowledge about cpu addresses
      to smp.c, entry[64].S and swsusp_asm64.S, and the sigp.h header can thus
      be removed.
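
      A simplified sketch of the idea; the field names and types are illustrative,
      not the exact kernel layout:

      #define NR_CPUS 64                     /* illustrative value */

      /* Everything smp.c needs to know about one cpu, indexed by logical cpu number. */
      struct pcpu {
      	unsigned short address;        /* physical cpu address (was __cpu_logical_map[]) */
      	int state;                     /* configured/standby state (was smp_cpu_state[]) */
      	void *lowcore;                 /* prefix page of this cpu */
      };

      static struct pcpu pcpu_devices[NR_CPUS];

      /* Functions that act on a physical cpu take a struct pcpu *, so cpu
       * addresses never have to leak outside smp.c. */
      static void pcpu_example_op(struct pcpu *pcpu) { (void)pcpu; }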
      
      The PSW restart mechanism is used to start secondary cpus, to call a
      function on an online cpu, to call a function on the ipl cpu, and for
      the nmi signal. Replace the different assembler functions with a
      single function, restart_int_handler. The new entry point calls a function
      whose pointer is stored in the lowcore of the target cpu, and it can wait
      for the source cpu to stop. This covers all existing use cases.
      
      Overall the code is now simpler and there are ~380 lines less code.
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  14. 27 Feb 2010, 1 commit
  15. 14 Jan 2010, 1 commit
  16. 15 Dec 2009, 4 commits
  17. 12 Jun 2009, 1 commit
  18. 26 Jan 2008, 2 commits
  19. 01 Oct 2006, 1 commit
  20. 10 Mar 2006, 1 commit
  21. 15 Jan 2006, 1 commit
  22. 07 Jan 2006, 1 commit
  23. 11 Sep 2005, 1 commit
    • [PATCH] spinlock consolidation · fb1c8f93
      Authored by Ingo Molnar
      This patch (written by me and also containing many suggestions of Arjan van
      de Ven) does a major cleanup of the spinlock code.  It does the following
      things:
      
       - consolidates and enhances the spinlock/rwlock debugging code
      
       - simplifies the asm/spinlock.h files
      
       - encapsulates the raw spinlock type and moves generic spinlock
         features (such as ->break_lock) into the generic code.
      
       - cleans up the spinlock code hierarchy to get rid of the spaghetti.
      
      Most notably there's now only a single variant of the debugging code,
      located in lib/spinlock_debug.c.  (previously we had one SMP debugging
      variant per architecture, plus a separate generic one for UP builds)
      
      Also, I've enhanced the rwlock debugging facility; it will now track
      write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
      All locks have lockup detection now, which will work for both soft and hard
      spin/rwlock lockups.
      
      The arch-level include files now only contain the minimally necessary
      subset of the spinlock code - all the rest that can be generalized now
      lives in the generic headers:
      
       include/asm-i386/spinlock_types.h       |   16
       include/asm-x86_64/spinlock_types.h     |   16
      
      I have also split up the various spinlock variants into separate files,
      making it easier to see which does what. The new layout is:
      
         SMP                         |  UP
         ----------------------------|-----------------------------------
         asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
         linux/spinlock_types.h      |  linux/spinlock_types.h
         asm/spinlock_smp.h          |  linux/spinlock_up.h
         linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
         linux/spinlock.h            |  linux/spinlock.h
      
      /*
       * here's the role of the various spinlock/rwlock related include files:
       *
       * on SMP builds:
       *
       *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
       *                        initializers
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
       *                        implementations, mostly inline assembly code
       *
       *   (also included on UP-debug builds:)
       *
       *  linux/spinlock_api_smp.h:
       *                        contains the prototypes for the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       *
       * on UP builds:
       *
       *  linux/spinlock_type_up.h:
       *                        contains the generic, simplified UP spinlock type.
       *                        (which is an empty structure on non-debug builds)
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  linux/spinlock_up.h:
       *                        contains the __raw_spin_*()/etc. version of UP
       *                        builds. (which are NOPs on non-debug, non-preempt
       *                        builds)
       *
       *   (included on UP-non-debug builds:)
       *
       *  linux/spinlock_api_up.h:
       *                        builds the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       */
      
      All SMP and UP architectures are converted by this patch.
      
      arm, i386, ia64, ppc, ppc64, s390/s390x and x64 were build-tested via
      crosscompilers.  m32r, mips, sh and sparc have not been tested yet, but should
      be mostly fine.
      
      From: Grant Grundler <grundler@parisc-linux.org>
      
        Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
        Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
        non-SMP kernels.  That should be trivial to fix up later if necessary.
      
        I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
        some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
        are well tested and contained entirely inside arch specific code.  I do NOT
        expect any new issues to arise with them.
      
        If someone ever does need to use debug/metrics with them, then they will
        need to unravel this hairball between spinlocks, atomic ops, and bit ops
        that exist only because parisc has exactly one atomic instruction: LDCW
        (load and clear word).
      
      From: "Luck, Tony" <tony.luck@intel.com>
      
         ia64 fix
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
      Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
      Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  24. 28 Jul 2005, 1 commit
    • [PATCH] s390: spin lock retry · 951f22d5
      Authored by Martin Schwidefsky
      Split the spin lock and r/w lock implementations into a single try, which is
      done inline, and an out-of-line function that repeatedly tries to get the
      lock before doing cpu_relax().  Add a system control to set the number of
      retries before a cpu is yielded.
      
      The reason for the spin lock retry is that the diagnose 0x44 that is used to
      give up the virtual cpu is quite expensive.  For spin locks that are held only
      for a short period of time the cost of the diagnose outweighs the savings
      it brings for spin locks that are held for a longer time.  The default retry
      count is 1000.
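
      A sketch of that split; spin_retry models the sysctl-controlled count
      (default 1000), everything else is an illustrative stand-in for the real
      inline/out-of-line pair:

      static int spin_retry = 1000;           /* tunable via a system control */

      static int cs(volatile int *lock, int old, int new_val)   /* placeholder CAS */
      {
      	return __sync_bool_compare_and_swap((int *)lock, old, new_val);
      }
      static void cpu_relax_stub(void) { }    /* placeholder for cpu_relax() */
      static void diag44(void) { }            /* placeholder: give up the virtual cpu */

      static void spin_lock_retry(volatile int *lock, int lockval)
      {
      	for (;;) {
      		/* Retry cheaply for a while before the expensive diagnose. */
      		for (int i = 0; i < spin_retry; i++) {
      			if (*lock == 0 && cs(lock, 0, lockval))
      				return;
      			cpu_relax_stub();
      		}
      		diag44();
      	}
      }

      /* Inline fast path: one compare-and-swap, then the out-of-line slow path. */
      static inline void spin_lock(volatile int *lock, int lockval)
      {
      	if (cs(lock, 0, lockval))
      		return;
      	spin_lock_retry(lock, lockval);
      }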
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>