1. 29 June 2016, 2 commits
  2. 24 June 2016, 2 commits
  3. 21 June 2016, 7 commits
  4. 16 June 2016, 2 commits
  5. 14 June 2016, 4 commits
    • powerpc/spinlock: Fix spin_unlock_wait() · 6262db7c
      Committed by Boqun Feng
      There is an ordering issue with spin_unlock_wait() on powerpc, because
      the spin_lock primitive is an ACQUIRE and an ACQUIRE is only ordering
      the load part of the operation with memory operations following it.
      Therefore the following event sequence can happen:
      
      CPU 1			CPU 2			CPU 3
      
      ==================	====================	==============
      						spin_unlock(&lock);
      			spin_lock(&lock):
      			  r1 = *lock; // r1 == 0;
      o = object;		o = READ_ONCE(object); // reordered here
      object = NULL;
      smp_mb();
      spin_unlock_wait(&lock);
      			  *lock = 1;
      smp_mb();
      o->dead = true;         < o = READ_ONCE(object); > // reordered upwards
      			if (o) // true
      				BUG_ON(o->dead); // true!!
      
      To fix this, we add a "nop" ll/sc loop in arch_spin_unlock_wait() on
      ppc. The "nop" ll/sc loop reads the lock value and writes it back
      atomically; in this way it synchronizes the view of the lock on CPU1
      with that on CPU2. Therefore, in the scenario above, either CPU2 will
      fail to get the lock at first or CPU1 will see the lock acquired by
      CPU2, and both cases eliminate this bug. This is a similar idea to
      what Will Deacon did for ARM64 in:
      
        d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
      
      Furthermore, if the "nop" ll/sc finds the lock already locked, we don't
      need to do the "nop" ll/sc trick again; we can just do a normal
      load+check loop waiting for the lock to be released. In that case
      spin_unlock_wait() is called while someone is holding the lock, and the
      store part of the "nop" ll/sc happens before the lock release of the
      current lock holder:
      
      	"nop" ll/sc -> spin_unlock()
      
      and the lock release happens before the next lock acquisition:
      
      	spin_unlock() -> spin_lock() <next holder>
      
      which means the "nop" ll/sc happens before the next lock acquisition:
      
      	"nop" ll/sc -> spin_unlock() -> spin_lock() <next holder>
      
      With a smp_mb() preceding spin_unlock_wait(), the store of object is
      guaranteed to be observed by the next lock holder:
      
      	STORE -> smp_mb() -> "nop" ll/sc
      	-> spin_unlock() -> spin_lock() <next holder>
      
      This patch therefore fixes the issue and also cleans up
      arch_spin_unlock_wait() a little by removing superfluous memory
      barriers in its loops and consolidating the PPC32 and PPC64
      implementations into one (the shape of the result is sketched after
      this entry).
      Suggested-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NBoqun Feng <boqun.feng@gmail.com>
      Reviewed-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      [mpe: Inline the "nop" ll/sc loop and set EH=0, munge change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6262db7c
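      A minimal sketch of the shape of this fix, not the verbatim upstream
      code: it assumes the usual powerpc arch_spinlock_t with a single
      "slock" word and the kernel's smp_mb()/cpu_relax()/READ_ONCE()
      helpers, and it simplifies the real code's HMT/yield handling to a
      plain cpu_relax() loop.

        static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
        {
                unsigned int tmp;

                smp_mb();

                /*
                 * "nop" ll/sc: load the lock word and store it back
                 * unchanged, which orders this CPU's view of the lock
                 * against any CPU whose store-conditional acquired it.
                 */
                __asm__ __volatile__(
        "1:     lwarx   %0,0,%2\n"
        "       stwcx.  %0,0,%2\n"
        "       bne-    1b\n"
                : "=&r" (tmp), "+m" (lock->slock)
                : "r" (&lock->slock)
                : "cr0", "memory");

                /*
                 * If the ll/sc observed the lock held, the holder's release
                 * is ordered after our store above, so a plain load+check
                 * loop is enough from here on.
                 */
                if (tmp)
                        while (READ_ONCE(lock->slock))
                                cpu_relax();

                smp_mb();
        }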
    • powerpc: Define and use PPC64_ELF_ABI_v2/v1 · f55d9665
      Committed by Michael Ellerman
      We're approaching 20 locations where we need to check for ELF ABI v2.
      That's fine, except the logic is a bit awkward, because we have to check
      that _CALL_ELF is defined and then what its value is.
      
      So check it once in asm/types.h and define PPC64_ELF_ABI_v2 when ELF ABI
      v2 is detected.
      
      We also have a few places where what we're really trying to check is
      that we are using the 64-bit v1 ABI, i.e. function descriptors. So also
      add a #define for that, which simplifies several checks (the resulting
      pattern is sketched after this entry).
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      f55d9665
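      Roughly the shape of the centralized check this commit describes (a
      sketch, not necessarily the exact upstream lines): asm/types.h decides
      once which ABI is in use, and callers test a single symbol instead of
      repeating the _CALL_ELF dance.

        /* in arch/powerpc/include/asm/types.h (sketch) */
        #ifdef __powerpc64__
        #if defined(_CALL_ELF) && _CALL_ELF == 2
        #define PPC64_ELF_ABI_v2
        #else
        #define PPC64_ELF_ABI_v1
        #endif
        #endif /* __powerpc64__ */

        /* a caller then simply does: */
        #ifdef PPC64_ELF_ABI_v2
                /* ELFv2: no function descriptors, local/global entry points */
        #else
                /* ELFv1: function pointers are function descriptors */
        #endif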
    • powerpc: Various typo fixes · 027dfac6
      Committed by Michael Ellerman
      Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      027dfac6
    • powerpc: Remove assembly versions of strcpy, strcat, strlen and strcmp · 3ece1663
      Committed by Anton Blanchard
      A number of our assembly implementations of string functions do not
      align their hot loops. I was going to align them manually, but I
      realised that they are almost instruction-for-instruction identical to
      what gcc produces, with the advantage that gcc does align them.
      
      In light of that, let's just remove the assembly versions.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      3ece1663
  6. 10 June 2016, 1 commit
  7. 31 May 2016, 2 commits
  8. 20 May 2016, 1 commit
    • arch: fix has_transparent_hugepage() · fd8cfd30
      Committed by Hugh Dickins
      I've just discovered that the useful-sounding has_transparent_hugepage()
      is actually an architecture-dependent minefield: on some arches it only
      builds if CONFIG_TRANSPARENT_HUGEPAGE=y, on others it's also there when
      not, but on some of those (arm and arm64) it then gives the wrong
      answer; and on mips alone it's marked __init, which would crash if
      called later (but so far it has not been called later).
      
      Straighten this out: make it available to all configs, with a sensible
      default in asm-generic/pgtable.h. Remove its definitions from those
      arches (arc, arm, arm64, sparc, tile) which are served by the default,
      add "#define has_transparent_hugepage has_transparent_hugepage" to
      those (mips, powerpc, s390, x86) which need to override the default at
      runtime, and remove the __init from mips (but maybe that kind of code
      should be avoided after init: set a static variable the first time it's
      called). The default-plus-override pattern is sketched after this
      entry.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Cc: Ning Qu <quning@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>		[arch/arc]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[arch/s390]
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fd8cfd30
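      A sketch of the default-plus-override pattern described above. The
      asm-generic default mirrors the commit's description; the runtime
      check in the arch override is a hypothetical placeholder, not a real
      kernel helper.

        /* asm-generic/pgtable.h: sensible default for every config */
        #ifndef has_transparent_hugepage
        #ifdef CONFIG_TRANSPARENT_HUGEPAGE
        #define has_transparent_hugepage() 1
        #else
        #define has_transparent_hugepage() 0
        #endif
        #endif

        /* in an arch's pgtable.h (seen before the generic check above):
         * the arch supplies its own runtime decision plus the marker
         * #define so the generic fallback is skipped. */
        static inline int has_transparent_hugepage(void)
        {
                return cpu_supports_huge_pages();  /* hypothetical helper */
        }
        #define has_transparent_hugepage has_transparent_hugepage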
  9. 13 May 2016, 1 commit
    • KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2
      Committed by Christian Borntraeger
      Some wakeups should not be considered a successful poll. For example,
      on s390 I/O interrupts are usually floating, which means that _ALL_
      CPUs would be considered runnable - letting all vCPUs poll all the
      time for transactional-like workloads, even if one vCPU would be
      enough. This can result in huge CPU usage for large guests.
      This patch lets architectures qualify wakeups, i.e. decide whether a
      wakeup should be considered good or bad with regard to polling (the
      hook pattern is sketched after this entry).
      
      For s390 the implementation will fence off halt polling for anything
      but known good, single-vCPU events. The s390 implementation for
      floating interrupts does a wakeup for one vCPU, but the interrupt will
      be delivered by whatever CPU checks first for a pending interrupt. We
      favour the woken-up CPU by marking its poll as a "good" poll.
      This code will also mark several other wakeup reasons, like IPIs or
      expired timers, as "good". This will of course also mark some events
      as not successful. Since KVM on z always runs as a second-level
      hypervisor, we prefer not to poll unless we are really sure, though.
      
      This patch successfully limits the CPU usage for cases like uperf 1byte
      transactional ping pong workload or wakeup heavy workload like OLTP
      while still providing a proper speedup.
      
      This also introduces a new vcpu stat, "halt_poll_no_tuning", that
      counts wakeups that are considered not good for polling.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
      Cc: David Matlack <dmatlack@google.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      [Rename config symbol. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      3491caf2
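      A hypothetical sketch of the hook pattern described above; the names
      kvm_arch_wakeup_is_good(), grow_halt_poll_window() and the placement
      of the halt_poll_no_tuning counter in vcpu->stat are placeholders, not
      the exact symbols this commit adds. The idea is that the generic
      halt-polling code asks the architecture whether a wakeup was really
      meant for this vCPU before treating the poll as successful.

        /* Arch hook: was this wakeup really meant for this vCPU?
         * (Placeholder name; a default version would just return true.) */
        bool kvm_arch_wakeup_is_good(struct kvm_vcpu *vcpu);

        /* Generic halt-polling side (sketch): only a qualified wakeup is
         * treated as a successful poll and allowed to grow the window. */
        static void account_poll_wakeup(struct kvm_vcpu *vcpu)
        {
                if (kvm_arch_wakeup_is_good(vcpu)) {
                        grow_halt_poll_window(vcpu);    /* hypothetical */
                } else {
                        vcpu->stat.halt_poll_no_tuning++;
                        /* leave the polling window untouched */
                }
        }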
  10. 12 May 2016, 1 commit
    • kvm: introduce KVM_MAX_VCPU_ID · 0b1b1dfd
      Committed by Greg Kurz
      The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest,
      and also the upper limit for vCPU ids. This is okay for all archs except
      PowerPC, which can have higher ids depending on the cpu/core/thread
      topology. In the worst case (single-threaded guest, host with 8 threads
      per core), it limits the maximum number of vCPUs to KVM_MAX_VCPUS / 8.
      
      This patch separates the vCPU numbering from the total number of vCPUs
      by introducing KVM_MAX_VCPU_ID, defined as the maximal valid vCPU id
      plus one.

      The corresponding KVM_CAP_MAX_VCPU_ID capability allows userspace to
      validate vCPU ids before passing them to KVM_CREATE_VCPU (see the
      sketch after this entry).
      
      This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
      Other archs continue to return KVM_MAX_VCPUS instead.
      Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      0b1b1dfd
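      A userspace sketch of the check described above, assuming a Linux host
      whose <linux/kvm.h> defines KVM_CAP_MAX_VCPU_ID and an already-created
      VM file descriptor; on architectures that do not implement the new
      capability the kernel reports KVM_MAX_VCPUS instead, as the commit
      notes.

        #include <stdio.h>
        #include <sys/ioctl.h>
        #include <linux/kvm.h>

        /* Validate a vCPU id against KVM_CAP_MAX_VCPU_ID before creating
         * the vCPU. Valid ids are 0 .. max_id - 1; the total number of
         * vCPUs is still bounded separately by KVM_CAP_MAX_VCPUS. */
        static int create_vcpu_checked(int vm_fd, int vcpu_id)
        {
                int max_id = ioctl(vm_fd, KVM_CHECK_EXTENSION,
                                   KVM_CAP_MAX_VCPU_ID);

                if (max_id > 0 && vcpu_id >= max_id) {
                        fprintf(stderr, "vcpu id %d out of range (limit %d)\n",
                                vcpu_id, max_id);
                        return -1;
                }

                return ioctl(vm_fd, KVM_CREATE_VCPU, vcpu_id);
        }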
  11. 11 May 2016, 17 commits