1. 12 7月, 2018 1 次提交
  2. 31 3月, 2018 1 次提交
  3. 23 3月, 2018 2 次提交
    • P
      KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9 · 4bb3c7a0
      Paul Mackerras 提交于
      POWER9 has hardware bugs relating to transactional memory and thread
      reconfiguration (changes to hardware SMT mode).  Specifically, the core
      does not have enough storage to store a complete checkpoint of all the
      architected state for all four threads.  The DD2.2 version of POWER9
      includes hardware modifications designed to allow hypervisor software
      to implement workarounds for these problems.  This patch implements
      those workarounds in KVM code so that KVM guests see a full, working
      transactional memory implementation.
      
      The problems center around the use of TM suspended state, where the
      CPU has a checkpointed state but execution is not transactional.  The
      workaround is to implement a "fake suspend" state, which looks to the
      guest like suspended state but the CPU does not store a checkpoint.
      In this state, any instruction that would cause a transition to
      transactional state (rfid, rfebb, mtmsrd, tresume) or would use the
      checkpointed state (treclaim) causes a "soft patch" interrupt (vector
      0x1500) to the hypervisor so that it can be emulated.  The trechkpt
      instruction also causes a soft patch interrupt.
      
      On POWER9 DD2.2, we avoid returning to the guest in any state which
      would require a checkpoint to be present.  The trechkpt in the guest
      entry path which would normally create that checkpoint is replaced by
      either a transition to fake suspend state, if the guest is in suspend
      state, or a rollback to the pre-transactional state if the guest is in
      transactional state.  Fake suspend state is indicated by a flag in the
      PACA plus a new bit in the PSSCR.  The new PSSCR bit is write-only and
      reads back as 0.
      
      On exit from the guest, if the guest is in fake suspend state, we still
      do the treclaim instruction as we would in real suspend state, in order
      to get into non-transactional state, but we do not save the resulting
      register state since there was no checkpoint.
      
      Emulation of the instructions that cause a softpatch interrupt is
      handled in two paths.  If the guest is in real suspend mode, we call
      kvmhv_p9_tm_emulation_early() to handle the cases where the guest is
      transitioning to transactional state.  This is called before we do the
      treclaim in the guest exit path; because we haven't done treclaim, we
      can get back to the guest with the transaction still active.  If the
      instruction is a case that kvmhv_p9_tm_emulation_early() doesn't
      handle, or if the guest is in fake suspend state, then we proceed to
      do the complete guest exit path and subsequently call
      kvmhv_p9_tm_emulation() in host context with the MMU on.  This handles
      all the cases including the cases that generate program interrupts
      (illegal instruction or TM Bad Thing) and facility unavailable
      interrupts.
      
      The emulation is reasonably straightforward and is mostly concerned
      with checking for exception conditions and updating the state of
      registers such as MSR and CR0.  The treclaim emulation takes care to
      ensure that the TEXASR register gets updated as if it were the guest
      treclaim instruction that had done failure recording, not the treclaim
      done in hypervisor state in the guest exit path.
      
      With this, the KVM_CAP_PPC_HTM capability returns true (1) even if
      transactional memory is not available to host userspace.
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4bb3c7a0
    • P
      powerpc: Add CPU feature bits for TM bug workarounds on POWER9 v2.2 · b5af4f27
      Paul Mackerras 提交于
      This adds a CPU feature bit which is set for POWER9 "Nimbus" DD2.2
      processors which will be used to enable the hypervisor to assist
      hardware with the handling of checkpointed register values while the
      CPU is in suspend state, in order to work around hardware bugs.  The
      hardware assistance for these workarounds introduced a new hardware
      bug relating to the XER[SO] bit.  We add a separate feature bit for
      this bug in case future chips fix it while still requiring the
      hypervisor assistance with suspend state.
      
      When the dt_cpu_ftrs subsystem is in use, the software assistance can
      be enabled using a "tm-suspend-hypervisor-assist" node in the device
      tree, and a "tm-suspend-xer-so-bug" node enables the workarounds for
      the XER[SO] bug.  In the absence of such nodes, a quirk enables both
      for POWER9 "Nimbus" DD2.2 processors.
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b5af4f27
  4. 17 1月, 2018 1 次提交
    • N
      powerpc/64s: Improve local TLB flush for boot and MCE on POWER9 · d4748276
      Nicholas Piggin 提交于
      There are several cases outside the normal address space management
      where a CPU's entire local TLB is to be flushed:
      
        1. Booting the kernel, in case something has left stale entries in
           the TLB (e.g., kexec).
      
        2. Machine check, to clean corrupted TLB entries.
      
      One other place where the TLB is flushed, is waking from deep idle
      states. The flush is a side-effect of calling ->cpu_restore with the
      intention of re-setting various SPRs. The flush itself is unnecessary
      because in the first case, the TLB should not acquire new corrupted
      TLB entries as part of sleep/wake (though they may be lost).
      
      This type of TLB flush is coded inflexibly, several times for each CPU
      type, and they have a number of problems with ISA v3.0B:
      
      - The current radix mode of the MMU is not taken into account, it is
        always done as a hash flushn For IS=2 (LPID-matching flush from host)
        and IS=3 with HV=0 (guest kernel flush), tlbie(l) is undefined if
        the R field does not match the current radix mode.
      
      - ISA v3.0B hash must flush the partition and process table caches as
        well.
      
      - ISA v3.0B radix must flush partition and process scoped translations,
        partition and process table caches, and also the page walk cache.
      
      So consolidate the flushing code and implement it in C and inline asm
      under the mm/ directory with the rest of the flush code. Add ISA v3.0B
      cases for radix and hash, and use the radix flush in radix environment.
      
      Provide a way for IS=2 (LPID flush) to specify the radix mode of the
      partition. Have KVM pass in the radix mode of the guest.
      
      Take out the flushes from early cputable/dt_cpu_ftrs detection hooks,
      and move it later in the boot process after, the MMU registers are set
      up and before relocation is first turned on.
      
      The TLB flush is no longer called when restoring from deep idle states.
      This was not be done as a separate step because booting secondaries
      uses the same cpu_restore as idle restore, which needs the TLB flush.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d4748276
  5. 15 11月, 2017 1 次提交
    • M
      powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature · 3ffa9d9e
      Michael Ellerman 提交于
      Recently we added a CPU feature for Power9 DD2.0, to capture the fact
      that some workarounds are required only on Power9 DD1 and DD2.0 but
      not DD2.1 or later.
      
      Then in commit 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and
      DD2.0 ERAT workaround on DD2.1") and commit e3646330
      "powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on
      DD2.1") we changed CPU_FTR_SECTIONs to check for DD1 or DD20, eg:
      
        BEGIN_FTR_SECTION
                PPC_INVALIDATE_ERAT
        END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1 | CPU_FTR_POWER9_DD20)
      
      Unfortunately although this reads as "if set DD1 or DD2.0", the or is
      a bitwise or and actually generates a mask of both bits. The code that
      does the feature patching then checks that the value of the CPU
      features masked with that mask are equal to the mask.
      
      So the end result is we're checking for DD1 and DD20 being set, which
      never happens. Yes the API is terrible.
      
      Removing the ERAT workaround on DD2.0 results in random SEGVs, the
      system tends to boot, but things randomly die including sometimes
      dhclient, udev etc.
      
      To fix the problem and hopefully avoid it in future, we remove the
      DD2.0 CPU feature and instead add a DD2.1 (or later) feature. This
      allows us to easily express that the workarounds are required if DD2.1
      is not set.
      
      At some point we will drop the DD1 workarounds entirely and some of
      this can be cleaned up.
      
      Fixes: 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1")
      Fixes: e3646330 ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      3ffa9d9e
  6. 06 11月, 2017 1 次提交
  7. 10 8月, 2017 3 次提交
  8. 25 5月, 2017 1 次提交
  9. 09 5月, 2017 1 次提交
  10. 10 3月, 2017 1 次提交
  11. 17 2月, 2017 1 次提交
  12. 25 9月, 2016 1 次提交
  13. 13 9月, 2016 1 次提交
  14. 01 8月, 2016 2 次提交
  15. 17 7月, 2016 1 次提交
  16. 11 5月, 2016 1 次提交
  17. 05 3月, 2016 1 次提交
  18. 22 2月, 2016 1 次提交
    • M
      powerpc: Add POWER9 cputable entry · c3ab300e
      Michael Neuling 提交于
      Add a cputable entry for POWER9.  More code is required to actually
      boot and run on a POWER9 but this gets the base piece in which we can
      start building on.
      
      Copies over from POWER8 except for:
      - Adds a new CPU_FTR_ARCH_300 bit to start hanging new architecture
         features from (in subsequent patches).
      - Advertises new user features bits PPC_FEATURE2_ARCH_3_00 &
        HAS_IEEE128 when on POWER9.
      - Drops CPU_FTR_SUBCORE.
      - Drops PMU code and machine check.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c3ab300e
  19. 19 6月, 2015 1 次提交
    • S
      powerpc/tm: Abort syscalls in active transactions · b4b56f9e
      Sam bobroff 提交于
      This patch changes the syscall handler to doom (tabort) active
      transactions when a syscall is made and return very early without
      performing the syscall and keeping side effects to a minimum (no CPU
      accounting or system call tracing is performed). Also included is a
      new HWCAP2 bit, PPC_FEATURE2_HTM_NOSC, to indicate this
      behaviour to userspace.
      
      Currently, the system call instruction automatically suspends an
      active transaction which causes side effects to persist when an active
      transaction fails.
      
      This does change the kernel's behaviour, but in a way that was
      documented as unsupported.  It doesn't reduce functionality as
      syscalls will still be performed after tsuspend; it just requires that
      the transaction be explicitly suspended.  It also provides a
      consistent interface and makes the behaviour of user code
      substantially the same across powerpc and platforms that do not
      support suspended transactions (e.g. x86 and s390).
      
      Performance measurements using
      http://ozlabs.org/~anton/junkcode/null_syscall.c indicate the cost of
      a normal (non-aborted) system call increases by about 0.25%.
      Signed-off-by: NSam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b4b56f9e
  20. 20 3月, 2015 1 次提交
  21. 17 3月, 2015 1 次提交
  22. 30 1月, 2015 1 次提交
  23. 22 9月, 2014 1 次提交
  24. 28 7月, 2014 2 次提交
  25. 22 7月, 2014 1 次提交
    • J
      powerpc: Disable doorbells on Power8 DD1.x · bd6ba351
      Joel Stanley 提交于
      These processors do not currently support doorbell IPIs, so remove them
      from the feature list if we are at DD 1.xx for the 0x004d part.
      
      This fixes a regression caused by d4e58e59 (powerpc/powernv: Enable
      POWER8 doorbell IPIs). With that patch the kernel would hang at boot
      when calling smp_call_function_many, as the doorbell would not be
      received by the target CPUs:
      
        .smp_call_function_many+0x2bc/0x3c0 (unreliable)
        .on_each_cpu_mask+0x30/0x100
        .cpuidle_register_driver+0x158/0x1a0
        .cpuidle_register+0x2c/0x110
        .powernv_processor_idle_init+0x23c/0x2c0
        .do_one_initcall+0xd4/0x260
        .kernel_init_freeable+0x25c/0x33c
        .kernel_init+0x1c/0x120
        .ret_from_kernel_thread+0x58/0x7c
      
      Fixes: d4e58e59 (powerpc/powernv: Enable POWER8 doorbell IPIs)
      Signed-off-by: NJoel Stanley <joel@jms.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      bd6ba351
  26. 11 6月, 2014 2 次提交
  27. 24 3月, 2014 1 次提交
  28. 05 12月, 2013 3 次提交
  29. 08 8月, 2013 1 次提交
  30. 24 7月, 2013 1 次提交
  31. 10 6月, 2013 1 次提交
  32. 01 6月, 2013 1 次提交