1. 13 1月, 2016 1 次提交
    • M
      asm-generic: implement virt_xxx memory barriers · 6a65d263
      Michael S. Tsirkin 提交于
      Guests running within virtual machines might be affected by SMP effects even if
      the guest itself is compiled without SMP support.  This is an artifact of
      interfacing with an SMP host while running an UP kernel.  Using mandatory
      barriers for this use-case would be possible but is often suboptimal.
      
      In particular, virtio uses a bunch of confusing ifdefs to work around
      this, while xen just uses the mandatory barriers.
      
      To better handle this case, low-level virt_mb() etc macros are made available.
      These are implemented trivially using the low-level __smp_xxx macros,
      the purpose of these wrappers is to annotate those specific cases.
      
      These have the same effect as smp_mb() etc when SMP is enabled, but generate
      identical code for SMP and non-SMP systems. For example, virtual machine guests
      should use virt_mb() rather than smp_mb() when synchronizing against a
      (possibly SMP) host.
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      6a65d263
  2. 04 11月, 2015 1 次提交
    • L
      atomic: remove all traces of READ_ONCE_CTRL() and atomic*_read_ctrl() · 105ff3cb
      Linus Torvalds 提交于
      This seems to be a mis-reading of how alpha memory ordering works, and
      is not backed up by the alpha architecture manual.  The helper functions
      don't do anything special on any other architectures, and the arguments
      that support them being safe on other architectures also argue that they
      are safe on alpha.
      
      Basically, the "control dependency" is between a previous read and a
      subsequent write that is dependent on the value read.  Even if the
      subsequent write is actually done speculatively, there is no way that
      such a speculative write could be made visible to other cpu's until it
      has been committed, which requires validating the speculation.
      
      Note that most weakely ordered architectures (very much including alpha)
      do not guarantee any ordering relationship between two loads that depend
      on each other on a control dependency:
      
          read A
          if (val == 1)
              read B
      
      because the conditional may be predicted, and the "read B" may be
      speculatively moved up to before reading the value A.  So we require the
      user to insert a smp_rmb() between the two accesses to be correct:
      
          read A;
          if (A == 1)
              smp_rmb()
              read B
      
      Alpha is further special in that it can break that ordering even if the
      *address* of B depends on the read of A, because the cacheline that is
      read later may be stale unless you have a memory barrier in between the
      pointer read and the read of the value behind a pointer:
      
          read ptr
          read offset(ptr)
      
      whereas all other weakly ordered architectures guarantee that the data
      dependency (as opposed to just a control dependency) will order the two
      accesses.  As a result, alpha needs a "smp_read_barrier_depends()" in
      between those two reads for them to be ordered.
      
      The coontrol dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
      had was a control dependency to a subsequent *write*, however, and
      nobody can finalize such a subsequent write without having actually done
      the read.  And were you to write such a value to a "stale" cacheline
      (the way the unordered reads came to be), that would seem to lose the
      write entirely.
      
      So the things that make alpha able to re-order reads even more
      aggressively than other weak architectures do not seem to be relevant
      for a subsequent write.  Alpha memory ordering may be strange, but
      there's no real indication that it is *that* strange.
      
      Also, the alpha architecture reference manual very explicitly talks
      about the definition of "Dependence Constraints" in section 5.6.1.7,
      where a preceding read dominates a subsequent write.
      
      Such a dependence constraint admittedly does not impose a BEFORE (alpha
      architecture term for globally visible ordering), but it does guarantee
      that there can be no "causal loop".  I don't see how you could avoid
      such a loop if another cpu could see the stored value and then impact
      the value of the first read.  Put another way: the read and the write
      could not be seen as being out of order wrt other cpus.
      
      So I do not see how these "x_ctrl()" functions can currently be necessary.
      
      I may have to eat my words at some point, but in the absense of clear
      proof that alpha actually needs this, or indeed even an explanation of
      how alpha could _possibly_ need it, I do not believe these functions are
      called for.
      
      And if it turns out that alpha really _does_ need a barrier for this
      case, that barrier still should not be "smp_read_barrier_depends()".
      We'd have to make up some new speciality barrier just for alpha, along
      with the documentation for why it really is necessary.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul E McKenney <paulmck@us.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      105ff3cb
  3. 07 10月, 2015 2 次提交
  4. 23 9月, 2015 1 次提交
  5. 04 8月, 2015 1 次提交
  6. 03 8月, 2015 1 次提交
  7. 16 7月, 2015 3 次提交
  8. 28 5月, 2015 2 次提交
    • P
      smp: Make control dependencies work on Alpha, improve documentation · 5af4692a
      Paul E. McKenney 提交于
      The current formulation of control dependencies fails on DEC Alpha,
      which does not respect dependencies of any kind unless an explicit
      memory barrier is provided.  This means that the current fomulation of
      control dependencies fails on Alpha.  This commit therefore creates a
      READ_ONCE_CTRL() that has the same overhead on non-Alpha systems, but
      causes Alpha to produce the needed ordering.  This commit also applies
      READ_ONCE_CTRL() to the one known use of control dependencies.
      
      Use of READ_ONCE_CTRL() also has the beneficial effect of adding a bit
      of self-documentation to control dependencies.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      5af4692a
    • W
      documentation: memory-barriers: Fix smp_mb__before_spinlock() semantics · d956028e
      Will Deacon 提交于
      Our current documentation claims that, when followed by an ACQUIRE,
      smp_mb__before_spinlock() orders prior loads against subsequent loads
      and stores, which isn't the intent.  This commit therefore fixes the
      documentation to state that this sequence orders only prior stores
      against subsequent loads and stores.
      
      In addition, the original intent of smp_mb__before_spinlock() was to only
      order prior loads against subsequent stores, however, people have started
      using it as if it ordered prior loads against subsequent loads and stores.
      This commit therefore also updates smp_mb__before_spinlock()'s header
      comment to reflect this new reality.
      
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      d956028e
  9. 19 5月, 2015 1 次提交
  10. 09 4月, 2015 1 次提交
  11. 27 2月, 2015 1 次提交
  12. 08 1月, 2015 2 次提交
  13. 12 12月, 2014 1 次提交
    • A
      arch: Add lightweight memory barriers dma_rmb() and dma_wmb() · 1077fa36
      Alexander Duyck 提交于
      There are a number of situations where the mandatory barriers rmb() and
      wmb() are used to order memory/memory operations in the device drivers
      and those barriers are much heavier than they actually need to be.  For
      example in the case of PowerPC wmb() calls the heavy-weight sync
      instruction when for coherent memory operations all that is really needed
      is an lsync or eieio instruction.
      
      This commit adds a coherent only version of the mandatory memory barriers
      rmb() and wmb().  In most cases this should result in the barrier being the
      same as the SMP barriers for the SMP case, however in some cases we use a
      barrier that is somewhere in between rmb() and smp_rmb().  For example on
      ARM the rmb barriers break down as follows:
      
        Barrier   Call     Explanation
        --------- -------- ----------------------------------
        rmb()     dsb()    Data synchronization barrier - system
        dma_rmb() dmb(osh) data memory barrier - outer sharable
        smp_rmb() dmb(ish) data memory barrier - inner sharable
      
      These new barriers are not as safe as the standard rmb() and wmb().
      Specifically they do not guarantee ordering between coherent and incoherent
      memories.  The primary use case for these would be to enforce ordering of
      reads and writes when accessing coherent memory that is shared between the
      CPU and a device.
      
      It may also be noted that there is no dma_mb().  Most architectures don't
      provide a good mechanism for performing a coherent only full barrier without
      resorting to the same mechanism used in mb().  As such there isn't much to
      be gained in trying to define such a function.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1077fa36
  14. 14 11月, 2014 2 次提交
  15. 21 10月, 2014 1 次提交
    • W
      documentation: memory-barriers: clarify relaxed io accessor semantics · a8e0aead
      Will Deacon 提交于
      This patch extends the paragraph describing the relaxed read io accessors
      so that the relaxed accessors are defined to be:
      
       - Ordered with respect to each other if accessing the same peripheral
      
       - Unordered with respect to normal memory accesses
      
       - Unordered with respect to LOCK/UNLOCK operations
      
      Whilst many architectures will provide stricter semantics, ARM, Alpha and
      PPC can achieve significant performance gains by taking advantage of some
      or all of the above relaxations.
      
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      a8e0aead
  16. 08 9月, 2014 3 次提交
  17. 08 7月, 2014 2 次提交
  18. 07 6月, 2014 1 次提交
  19. 18 4月, 2014 1 次提交
  20. 21 3月, 2014 1 次提交
  21. 25 2月, 2014 1 次提交
  22. 18 2月, 2014 3 次提交
  23. 12 1月, 2014 1 次提交
  24. 16 12月, 2013 5 次提交
  25. 22 11月, 2013 1 次提交