1. 18 11月, 2016 1 次提交
  2. 04 10月, 2016 1 次提交
    • N
      powerpc/bpf: Implement support for tail calls · ce076141
      Naveen N. Rao 提交于
      Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
      programs. This can be achieved either by:
      (1) retaining the stack setup by the first eBPF program and having all
      subsequent eBPF programs re-using it, or,
      (2) by unwinding/tearing down the stack and having each eBPF program
      deal with its own stack as it sees fit.
      
      To ensure that this does not create loops, there is a limit to how many
      tail calls can be done (currently 32). This requires the JIT'ed code to
      maintain a count of the number of tail calls done so far.
      
      Approach (1) is simple, but requires every eBPF program to have (almost)
      the same prologue/epilogue, regardless of whether they need it. This is
      inefficient for small eBPF programs which may not sometimes need a
      prologue at all. As such, to minimize impact of tail call
      implementation, we use approach (2) here which needs each eBPF program
      in the chain to use its own prologue/epilogue. This is not ideal when
      many tail calls are involved and when all the eBPF programs in the chain
      have similar prologue/epilogue. However, the impact is restricted to
      programs that do tail calls. Individual eBPF programs are not affected.
      
      We maintain the tail call count in a fixed location on the stack and
      updated tail call count values are passed in through this. The very
      first eBPF program in a chain sets this up to 0 (the first 2
      instructions). Subsequent tail calls skip the first two eBPF JIT
      instructions to maintain the count. For programs that don't do tail
      calls themselves, the first two instructions are NOPs.
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ce076141
  3. 25 9月, 2016 1 次提交
  4. 17 7月, 2016 2 次提交
  5. 15 7月, 2016 1 次提交
    • S
      powerpc/powernv: Add platform support for stop instruction · bcef83a0
      Shreyas B. Prabhu 提交于
      POWER ISA v3 defines a new idle processor core mechanism. In summary,
       a) new instruction named stop is added. This instruction replaces
      	instructions like nap, sleep, rvwinkle.
       b) new per thread SPR named Processor Stop Status and Control Register
      	(PSSCR) is added which controls the behavior of stop instruction.
      
      PSSCR layout:
      ----------------------------------------------------------
      | PLS | /// | SD | ESL | EC | PSLL | /// | TR | MTL | RL |
      ----------------------------------------------------------
      0      4     41   42    43   44     48    54   56    60
      
      PSSCR key fields:
      	Bits 0:3  - Power-Saving Level Status. This field indicates the lowest
      	power-saving state the thread entered since stop instruction was last
      	executed.
      
      	Bit 42 - Enable State Loss
      	0 - No state is lost irrespective of other fields
      	1 - Allows state loss
      
      	Bits 44:47 - Power-Saving Level Limit
      	This limits the power-saving level that can be entered into.
      
      	Bits 60:63 - Requested Level
      	Used to specify which power-saving level must be entered on executing
      	stop instruction
      
      This patch adds support for stop instruction and PSSCR handling.
      Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bcef83a0
  6. 05 7月, 2016 2 次提交
  7. 24 6月, 2016 2 次提交
  8. 27 4月, 2016 1 次提交
    • C
      powerpc: Add support for userspace P9 copy paste · 8a649045
      Chris Smart 提交于
      The copy paste facility introduced in POWER9 provides an optimised
      mechanism for a userspace application to copy a cacheline. This is
      provided by a pair of instructions, copy and paste, while a third,
      cp_abort (copy paste abort), provides a clean up of the state in case of
      a failure.
      
      The copy instruction will read a 128 byte cacheline and store it in an
      internal buffer. The subsequent paste instruction will store this
      internal buffer to memory and set a CR field if the paste succeeds.
      
      Since the state of the copy paste buffer is internal (and not
      architecturally visible), in the unlikely event of a context switch, the
      state cannot be stored and the paste should therefore fail.
      
      The cp_abort instruction exists to fail and clean up any such
      interrupted copy paste sequence and is to be called by the kernel as
      part of the context switch. Doing so prevents data from a preceding copy
      in one process leaking into the paste of another.
      
      This code enables use of the cp_abort instruction if a supported
      processor is detected.
      
      NOTE: this is for userspace only, not in kernel, and does not deal
      with KVM guests.
      
      Patch created with much assistance from Michael Neuling
      <mikey@neuling.org>
      Signed-off-by: NChris Smart <chris@distroguy.com>
      Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8a649045
  9. 21 10月, 2015 1 次提交
    • P
      powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8" · 23316316
      Paul Mackerras 提交于
      This reverts commit 9678cdaa ("Use the POWER8 Micro Partition
      Prefetch Engine in KVM HV on POWER8") because the original commit had
      multiple, partly self-cancelling bugs, that could cause occasional
      memory corruption.
      
      In fact the logmpp instruction was incorrectly using register r0 as the
      source of the buffer address and operation code, and depending on what
      was in r0, it would either do nothing or corrupt the 64k page pointed to
      by r0.
      
      The logmpp instruction encoding and the operation code definitions could
      be corrected, but then there is the problem that there is no clearly
      defined way to know when the hardware has finished writing to the
      buffer.
      
      The original commit attempted to work around this by aborting the
      write-out before starting the prefetch, but this is ineffective in the
      case where the virtual core is now executing on a different physical
      core from the one where the write-out was initiated.
      
      These problems plus advice from the hardware designers not to use the
      function (since the measured performance improvement from using the
      feature was actually mostly negative), mean that reverting the code is
      the best option.
      
      Fixes: 9678cdaa ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8")
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      23316316
  10. 22 8月, 2015 1 次提交
    • T
      KVM: PPC: Fix warnings from sparse · 5358a963
      Thomas Huth 提交于
      When compiling the KVM code for POWER with "make C=1", sparse
      complains about functions missing proper prototypes and a 64-bit
      constant missing the ULL prefix. Let's fix this by making the
      functions static or by including the proper header with the
      prototypes, and by appending a ULL prefix to the constant
      PPC_MPPE_ADDRESS_MASK.
      Signed-off-by: NThomas Huth <thuth@redhat.com>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      5358a963
  11. 11 5月, 2015 1 次提交
    • D
      powerpc: Add ICSWX instruction · edc424f8
      Dan Streetman 提交于
      Add the asm ICSWX and ICSWEPX opcodes.  Add definitions for the
      Coprocessor Request structures needed to use the icswx calls to
      coprocessors.  Add icswx() function to perform the ICSWX asm
      using the provided Coprocessor Command Word value and
      Coprocessor Request Block structure.
      
      This is required for communication with the NX-842 coprocessor on
      a PowerNV system.
      Signed-off-by: NDan Streetman <ddstreet@ieee.org>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      edc424f8
  12. 20 3月, 2015 1 次提交
    • P
      powerpc/powernv: Fixes for hypervisor doorbell handling · 755563bc
      Paul Mackerras 提交于
      Since we can now use hypervisor doorbells for host IPIs, this makes
      sure we clear the host IPI flag when taking a doorbell interrupt, and
      clears any pending doorbell IPI in pnv_smp_cpu_kill_self() (as we
      already do for IPIs sent via the XICS interrupt controller).  Otherwise
      if there did happen to be a leftover pending doorbell interrupt for
      an offline CPU thread for any reason, it would prevent that thread from
      going into a power-saving mode; it would instead keep waking up because
      of the interrupt.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      755563bc
  13. 21 2月, 2015 1 次提交
  14. 15 12月, 2014 1 次提交
    • S
      powernv/powerpc: Add winkle support for offline cpus · 77b54e9f
      Shreyas B. Prabhu 提交于
      Winkle is a deep idle state supported in power8 chips. A core enters
      winkle when all the threads of the core enter winkle. In this state
      power supply to the entire chiplet i.e core, private L2 and private L3
      is turned off. As a result it gives higher powersavings compared to
      sleep.
      
      But entering winkle results in a total hypervisor state loss. Hence the
      hypervisor context has to be preserved before entering winkle and
      restored upon wake up.
      
      Power-on Reset Engine (PORE) is a dedicated engine which is responsible
      for powering on the chiplet during wake up. It can be programmed to
      restore the register contests of a few specific registers. This patch
      uses PORE to restore register state wherever possible and uses stack to
      save and restore rest of the necessary registers.
      
      With hypervisor state restore things fall under three categories-
      per-core state, per-subcore state and per-thread state. To manage this,
      extend the infrastructure introduced for sleep. Mainly we add a paca
      variable subcore_sibling_mask. Using this and the core_idle_state we can
      distingush first thread in core and subcore.
      Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      77b54e9f
  15. 04 11月, 2014 1 次提交
  16. 30 7月, 2014 1 次提交
    • A
      powerpc/e6500: Add support for hardware threads · e16c8765
      Andy Fleming 提交于
      The general idea is that each core will release all of its
      threads into the secondary thread startup code, which will
      eventually wait in the secondary core holding area, for the
      appropriate bit in the PACA to be set. The kick_cpu function
      pointer will set that bit in the PACA, and thus "release"
      the core/thread to boot. We also need to do a few things that
      U-Boot normally does for CPUs (like enable branch prediction).
      Signed-off-by: NAndy Fleming <afleming@freescale.com>
      [scottwood@freescale.com: various changes, including only enabling
       threads if Linux wants to kick them]
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      e16c8765
  17. 28 7月, 2014 1 次提交
    • S
      Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 · 9678cdaa
      Stewart Smith 提交于
      The POWER8 processor has a Micro Partition Prefetch Engine, which is
      a fancy way of saying "has way to store and load contents of L2 or
      L2+MRU way of L3 cache". We initiate the storing of the log (list of
      addresses) using the logmpp instruction and start restore by writing
      to a SPR.
      
      The logmpp instruction takes parameters in a single 64bit register:
      - starting address of the table to store log of L2/L2+L3 cache contents
        - 32kb for L2
        - 128kb for L2+L3
        - Aligned relative to maximum size of the table (32kb or 128kb)
      - Log control (no-op, L2 only, L2 and L3, abort logout)
      
      We should abort any ongoing logging before initiating one.
      
      To initiate restore, we write to the MPPR SPR. The format of what to write
      to the SPR is similar to the logmpp instruction parameter:
      - starting address of the table to read from (same alignment requirements)
      - table size (no data, until end of table)
      - prefetch rate (from fastest possible to slower. about every 8, 16, 24 or
        32 cycles)
      
      The idea behind loading and storing the contents of L2/L3 cache is to
      reduce memory latency in a system that is frequently swapping vcores on
      a physical CPU.
      
      The best case scenario for doing this is when some vcores are doing very
      cache heavy workloads. The worst case is when they have about 0 cache hits,
      so we just generate needless memory operations.
      
      This implementation just does L2 store/load. In my benchmarks this proves
      to be useful.
      
      Benchmark 1:
       - 16 core POWER8
       - 3x Ubuntu 14.04LTS guests (LE) with 8 VCPUs each
       - No split core/SMT
       - two guests running sysbench memory test.
         sysbench --test=memory --num-threads=8 run
       - one guest running apache bench (of default HTML page)
         ab -n 490000 -c 400 http://localhost/
      
      This benchmark aims to measure performance of real world application (apache)
      where other guests are cache hot with their own workloads. The sysbench memory
      benchmark does pointer sized writes to a (small) memory buffer in a loop.
      
      In this benchmark with this patch I can see an improvement both in requests
      per second (~5%) and in mean and median response times (again, about 5%).
      The spread of minimum and maximum response times were largely unchanged.
      
      benchmark 2:
       - Same VM config as benchmark 1
       - all three guests running sysbench memory benchmark
      
      This benchmark aims to see if there is a positive or negative affect to this
      cache heavy benchmark. Although due to the nature of the benchmark (stores) we
      may not see a difference in performance, but rather hopefully an improvement
      in consistency of performance (when vcore switched in, don't have to wait
      many times for cachelines to be pulled in)
      
      The results of this benchmark are improvements in consistency of performance
      rather than performance itself. With this patch, the few outliers in duration
      go away and we get more consistent performance in each guest.
      
      benchmark 3:
       - same 3 guests and CPU configuration as benchmark 1 and 2.
       - two idle guests
       - 1 guest running STREAM benchmark
      
      This scenario also saw performance improvement with this patch. On Copy and
      Scale workloads from STREAM, I got 5-6% improvement with this patch. For
      Add and triad, it was around 10% (or more).
      
      benchmark 4:
       - same 3 guests as previous benchmarks
       - two guests running sysbench --memory, distinctly different cache heavy
         workload
       - one guest running STREAM benchmark.
      
      Similar improvements to benchmark 3.
      
      benchmark 5:
       - 1 guest, 8 VCPUs, Ubuntu 14.04
       - Host configured with split core (SMT8, subcores-per-core=4)
       - STREAM benchmark
      
      In this benchmark, we see a 10-20% performance improvement across the board
      of STREAM benchmark results with this patch.
      
      Based on preliminary investigation and microbenchmarks
      by Prerna Saxena <prerna@linux.vnet.ibm.com>
      Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      9678cdaa
  18. 31 10月, 2013 2 次提交
  19. 17 10月, 2013 1 次提交
  20. 11 10月, 2013 1 次提交
  21. 31 7月, 2013 1 次提交
  22. 06 5月, 2013 1 次提交
  23. 26 4月, 2013 1 次提交
    • A
      powerpc/perf: Add new BHRB related instructions for POWER8 · 95213959
      Anshuman Khandual 提交于
      This patch adds new POWER8 instruction encoding for reading
      and clearing Branch History Rolling Buffer entries. The new
      instruction 'mfbhrbe' (move from branch history rolling buffer
      entry) is used to read BHRB buffer entries and instruction
      'clrbhrb' (clear branch history rolling buffer) is used to
      clear the entire buffer. The instruction 'clrbhrb' has straight
      forward encoding. But the instruction encoding format for
      reading the BHRB entries is like 'mfbhrbe RT, BHRBE' where it
      takes two arguments, i.e the index for the BHRB buffer entry to
      read and a general purpose register to put the value which was
      read from the buffer entry.
      Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      95213959
  24. 15 2月, 2013 1 次提交
    • M
      powerpc: Add new instructions for transactional memory · 14c39a4c
      Michael Neuling 提交于
      Here we define the new instructions we need for transactional memory in the
      kernel.  This is so we can support compiling with binutils that don't support
      the new transactional memory instructions.
      
      Transactional memory results in two sets of architected state (GPRs/VSRs
      etc).
      
      treclaim allows us to read the checkpointed state (from the tbegin) so that we
      can store it away on a context switch.  It does this by overwriting the exiting
      architected state, so you have to save that away before you treclaim.  treclaim
      will also abort a transaction, so you can give a register value which contains
      an abort reason.
      
      trecheckpoint allows us to inject into the checkpointed state as if it were at
      the tbegin.  It does this by copying the current architected state into the
      checkpointed state.
      Signed-off-by: NMatt Evans <matt@ozlabs.org>
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      14c39a4c
  25. 10 1月, 2013 1 次提交
  26. 18 11月, 2012 1 次提交
  27. 15 11月, 2012 2 次提交
  28. 17 9月, 2012 1 次提交
  29. 10 7月, 2012 7 次提交