1. 12 Dec 2014, 1 commit
    • arch: Cleanup read_barrier_depends() and comments · 8a449718
      Committed by Alexander Duyck
      This patch is meant to cleanup the handling of read_barrier_depends and
      smp_read_barrier_depends.  In multiple spots in the kernel headers
      read_barrier_depends is defined as "do {} while (0)", however we then go
      into the SMP vs non-SMP sections and have the SMP version reference
      read_barrier_depends, and the non-SMP define it as yet another empty
      do/while.
      
      With this commit I went through and cleaned out the duplicate
      definitions, reducing them to two per header.  In addition, I moved
      the 50-line comments for the macro out of the x86 and MIPS headers,
      which defined it as an empty do/while, and into the headers that
      actually define it meaningfully: alpha and blackfin.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 18 Apr 2014, 1 commit
  3. 12 Jan 2014, 1 commit
    • arch: Introduce smp_load_acquire(), smp_store_release() · 47933ad4
      Committed by Peter Zijlstra
      A number of situations currently require the heavyweight smp_mb(),
      even though there is no need to order prior stores against later
      loads.  Many architectures have much cheaper ways to handle these
      situations, but the Linux kernel currently has no portable way
      to make use of them.
      
      This commit therefore supplies smp_load_acquire() and
      smp_store_release() to remedy this situation.  The new
      smp_load_acquire() primitive orders the specified load against
      any subsequent reads or writes, while the new smp_store_release()
      primitive orders the specified store against any prior reads or
      writes.  These primitives allow array-based circular FIFOs to be
      implemented without an smp_mb(), and also allow a theoretical
      hole in rcu_assign_pointer() to be closed at no additional
      expense on most architectures.
      
      In addition, the RCU experience of transitioning from explicit
      smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
      and rcu_assign_pointer(), respectively, resulted in substantial
      improvements in readability.  It therefore seems likely that
      replacing other explicit barriers with smp_load_acquire() and
      smp_store_release() will provide similar benefits.  It appears
      that roughly half of the explicit barriers in core kernel code
      might be so replaced.
      
      [Changelog by PaulMck]
      Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 05 Nov 2013, 1 commit
  5. 01 Feb 2013, 1 commit
  6. 29 Mar 2012, 1 commit
  7. 27 Feb 2010, 4 commits
    • MIPS: Optimize spinlocks. · 500c2e1f
      Committed by David Daney
      The current locking mechanism uses an ll/sc sequence to release a
      spinlock.  This is slower than a wmb() followed by a store to unlock.
      
      The branching forward to .subsection 2 on sc failure slows down the
      contended case.  So we get rid of that part too.
      
      Since we are now working on naturally aligned u16 values, we can get
      rid of a masking operation, as the LHU already does the right thing.
      The ANDI operations are reversed for better scheduling on multi-issue CPUs.
      
      On a 12 CPU 750MHz Octeon cn5750 this patch improves ipv4 UDP packet
      forwarding rates from 3.58*10^6 PPS to 3.99*10^6 PPS, or about 11%.
      Signed-off-by: David Daney <ddaney@caviumnetworks.com>
      To: linux-mips@linux-mips.org
      Patchwork: http://patchwork.linux-mips.org/patch/937/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: Octeon: Use optimized memory barrier primitives. · 6b07d38a
      Committed by David Daney
      In order to achieve correct synchronization semantics, the Octeon port
      had defined CONFIG_WEAK_REORDERING_BEYOND_LLSC.  This resulted in code
      that looks like:
      
         sync
         ll ...
         .
         .
         .
         sc ...
         .
         .
         sync
      
      The second SYNC was redundant, but harmless.
      
      Octeon has a SYNCW instruction that acts as a write-memory-barrier
      (due to an erratum in some parts, two SYNCW instructions are used).
      It is much faster than SYNC because it imposes ordering only on the
      writes and doesn't otherwise stall the execution pipeline.  On Octeon,
      SYNC stalls execution until all preceding writes are committed to the
      coherent memory system.
      
      Using:
      
          syncw;syncw
          ll
          .
          .
          .
          sc
          .
          .
      
      has identical semantics to the first sequence, but is much faster.
      The SYNCW orders the writes, and the SC will not complete successfully
      until the write is committed to the coherent memory system.  So at the
      end all preceding writes have been committed.  Since Octeon does not
      do speculative reads, this functions as a full barrier.
      
      The patch removes CONFIG_WEAK_REORDERING_BEYOND_LLSC, and substitutes
      SYNCW for SYNC in write-memory-barriers.
      Signed-off-by: David Daney <ddaney@caviumnetworks.com>
      To: linux-mips@linux-mips.org
      Patchwork: http://patchwork.linux-mips.org/patch/850/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: New macro smp_mb__before_llsc. · f252ffd5
      Committed by David Daney
      Replace some instances of smp_llsc_mb() with a new macro
      smp_mb__before_llsc().  It is used before ll/sc sequences that are
      documented as needing write barrier semantics.
      
      The default implementation of smp_mb__before_llsc() is just smp_llsc_mb(),
      so there are no changes in semantics.
      
      Also simplify definition of smp_mb(), smp_rmb(), and smp_wmb() to be just
      barrier() in the non-SMP case.
      Signed-off-by: David Daney <ddaney@caviumnetworks.com>
      To: linux-mips@linux-mips.org
      Patchwork: http://patchwork.linux-mips.org/patch/851/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: Remove unused macros from barrier.h · ec5380c7
      Committed by David Daney
      The smp_llsc_rmb() and smp_llsc_wmb() macros are not used anywhere in
      the tree; remove them.
      Signed-off-by: David Daney <ddaney@caviumnetworks.com>
      To: linux-mips@linux-mips.org
      Patchwork: http://patchwork.linux-mips.org/patch/848/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
  8. 11 Oct 2008, 1 commit
  9. 16 Jul 2008, 1 commit
  10. 21 Jul 2007, 1 commit
  11. 05 Dec 2006, 1 commit