提交 · a2c01ed5d46f0686c52272e09f7d2f5be9f573fd · OpenHarmony / kernel_linux

21 10月, 2015 1 次提交

powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8" · 23316316

由 Paul Mackerras 提交于 10月 21, 2015

This reverts commit 9678cdaa ("Use the POWER8 Micro Partition
Prefetch Engine in KVM HV on POWER8") because the original commit had
multiple, partly self-cancelling bugs, that could cause occasional
memory corruption.

In fact the logmpp instruction was incorrectly using register r0 as the
source of the buffer address and operation code, and depending on what
was in r0, it would either do nothing or corrupt the 64k page pointed to
by r0.

The logmpp instruction encoding and the operation code definitions could
be corrected, but then there is the problem that there is no clearly
defined way to know when the hardware has finished writing to the
buffer.

The original commit attempted to work around this by aborting the
write-out before starting the prefetch, but this is ineffective in the
case where the virtual core is now executing on a different physical
core from the one where the write-out was initiated.

These problems plus advice from the hardware designers not to use the
function (since the measured performance improvement from using the
feature was actually mostly negative), mean that reverting the code is
the best option.

Fixes: 9678cdaa ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8")
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

23316316

22 8月, 2015 1 次提交

KVM: PPC: Fix warnings from sparse · 5358a963

由 Thomas Huth 提交于 5月 22, 2015

When compiling the KVM code for POWER with "make C=1", sparse
complains about functions missing proper prototypes and a 64-bit
constant missing the ULL prefix. Let's fix this by making the
functions static or by including the proper header with the
prototypes, and by appending a ULL prefix to the constant
PPC_MPPE_ADDRESS_MASK.
Signed-off-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NAlexander Graf <agraf@suse.de>

5358a963

11 5月, 2015 1 次提交

powerpc: Add ICSWX instruction · edc424f8

由 Dan Streetman 提交于 5月 07, 2015

Add the asm ICSWX and ICSWEPX opcodes.  Add definitions for the
Coprocessor Request structures needed to use the icswx calls to
coprocessors.  Add icswx() function to perform the ICSWX asm
using the provided Coprocessor Command Word value and
Coprocessor Request Block structure.

This is required for communication with the NX-842 coprocessor on
a PowerNV system.
Signed-off-by: NDan Streetman <ddstreet@ieee.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

edc424f8

20 3月, 2015 1 次提交

powerpc/powernv: Fixes for hypervisor doorbell handling · 755563bc

由 Paul Mackerras 提交于 3月 19, 2015

Since we can now use hypervisor doorbells for host IPIs, this makes
sure we clear the host IPI flag when taking a doorbell interrupt, and
clears any pending doorbell IPI in pnv_smp_cpu_kill_self() (as we
already do for IPIs sent via the XICS interrupt controller).  Otherwise
if there did happen to be a leftover pending doorbell interrupt for
an offline CPU thread for any reason, it would prevent that thread from
going into a power-saving mode; it would instead keep waking up because
of the interrupt.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

755563bc

21 2月, 2015 1 次提交

ppc: bpf: add reqired opcodes for ppc32 · 693930d6

由 Denis Kirjanov 提交于 2月 17, 2015

Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

693930d6

15 12月, 2014 1 次提交

powernv/powerpc: Add winkle support for offline cpus · 77b54e9f

由 Shreyas B. Prabhu 提交于 12月 10, 2014

Winkle is a deep idle state supported in power8 chips. A core enters
winkle when all the threads of the core enter winkle. In this state
power supply to the entire chiplet i.e core, private L2 and private L3
is turned off. As a result it gives higher powersavings compared to
sleep.

But entering winkle results in a total hypervisor state loss. Hence the
hypervisor context has to be preserved before entering winkle and
restored upon wake up.

Power-on Reset Engine (PORE) is a dedicated engine which is responsible
for powering on the chiplet during wake up. It can be programmed to
restore the register contests of a few specific registers. This patch
uses PORE to restore register state wherever possible and uses stack to
save and restore rest of the necessary registers.

With hypervisor state restore things fall under three categories-
per-core state, per-subcore state and per-thread state. To manage this,
extend the infrastructure introduced for sleep. Mainly we add a paca
variable subcore_sibling_mask. Using this and the core_idle_state we can
distingush first thread in core and subcore.
Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

77b54e9f

04 11月, 2014 1 次提交

PPC: bpf_jit_comp: add SKF_AD_PKTTYPE instruction · 4e235761

由 Denis Kirjanov 提交于 10月 30, 2014

Add BPF extension SKF_AD_PKTTYPE to ppc JIT to load
skb->pkt_type field.

Before:
[   88.262622] test_bpf: #11 LD_IND_NET 86 97 99 PASS
[   88.265740] test_bpf: #12 LD_PKTTYPE 109 107 PASS

After:
[   80.605964] test_bpf: #11 LD_IND_NET 44 40 39 PASS
[   80.607370] test_bpf: #12 LD_PKTTYPE 9 9 PASS

CC: Alexei Starovoitov<alexei.starovoitov@gmail.com>
CC: Michael Ellerman<mpe@ellerman.id.au>
Cc: Matt Evans <matt@ozlabs.org>
Signed-off-by: NDenis Kirjanov <kda@linux-powerpc.org>

v2: Added test rusults
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4e235761

30 7月, 2014 1 次提交

powerpc/e6500: Add support for hardware threads · e16c8765

由 Andy Fleming 提交于 12月 08, 2011

The general idea is that each core will release all of its
threads into the secondary thread startup code, which will
eventually wait in the secondary core holding area, for the
appropriate bit in the PACA to be set. The kick_cpu function
pointer will set that bit in the PACA, and thus "release"
the core/thread to boot. We also need to do a few things that
U-Boot normally does for CPUs (like enable branch prediction).
Signed-off-by: NAndy Fleming <afleming@freescale.com>
[scottwood@freescale.com: various changes, including only enabling
 threads if Linux wants to kick them]
Signed-off-by: NScott Wood <scottwood@freescale.com>

e16c8765

28 7月, 2014 1 次提交

Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 · 9678cdaa

由 Stewart Smith 提交于 7月 18, 2014

The POWER8 processor has a Micro Partition Prefetch Engine, which is
a fancy way of saying "has way to store and load contents of L2 or
L2+MRU way of L3 cache". We initiate the storing of the log (list of
addresses) using the logmpp instruction and start restore by writing
to a SPR.

The logmpp instruction takes parameters in a single 64bit register:
- starting address of the table to store log of L2/L2+L3 cache contents
  - 32kb for L2
  - 128kb for L2+L3
  - Aligned relative to maximum size of the table (32kb or 128kb)
- Log control (no-op, L2 only, L2 and L3, abort logout)

We should abort any ongoing logging before initiating one.

To initiate restore, we write to the MPPR SPR. The format of what to write
to the SPR is similar to the logmpp instruction parameter:
- starting address of the table to read from (same alignment requirements)
- table size (no data, until end of table)
- prefetch rate (from fastest possible to slower. about every 8, 16, 24 or
  32 cycles)

The idea behind loading and storing the contents of L2/L3 cache is to
reduce memory latency in a system that is frequently swapping vcores on
a physical CPU.

The best case scenario for doing this is when some vcores are doing very
cache heavy workloads. The worst case is when they have about 0 cache hits,
so we just generate needless memory operations.

This implementation just does L2 store/load. In my benchmarks this proves
to be useful.

Benchmark 1:
 - 16 core POWER8
 - 3x Ubuntu 14.04LTS guests (LE) with 8 VCPUs each
 - No split core/SMT
 - two guests running sysbench memory test.
   sysbench --test=memory --num-threads=8 run
 - one guest running apache bench (of default HTML page)
   ab -n 490000 -c 400 http://localhost/

This benchmark aims to measure performance of real world application (apache)
where other guests are cache hot with their own workloads. The sysbench memory
benchmark does pointer sized writes to a (small) memory buffer in a loop.

In this benchmark with this patch I can see an improvement both in requests
per second (~5%) and in mean and median response times (again, about 5%).
The spread of minimum and maximum response times were largely unchanged.

benchmark 2:
 - Same VM config as benchmark 1
 - all three guests running sysbench memory benchmark

This benchmark aims to see if there is a positive or negative affect to this
cache heavy benchmark. Although due to the nature of the benchmark (stores) we
may not see a difference in performance, but rather hopefully an improvement
in consistency of performance (when vcore switched in, don't have to wait
many times for cachelines to be pulled in)

The results of this benchmark are improvements in consistency of performance
rather than performance itself. With this patch, the few outliers in duration
go away and we get more consistent performance in each guest.

benchmark 3:
 - same 3 guests and CPU configuration as benchmark 1 and 2.
 - two idle guests
 - 1 guest running STREAM benchmark

This scenario also saw performance improvement with this patch. On Copy and
Scale workloads from STREAM, I got 5-6% improvement with this patch. For
Add and triad, it was around 10% (or more).

benchmark 4:
 - same 3 guests as previous benchmarks
 - two guests running sysbench --memory, distinctly different cache heavy
   workload
 - one guest running STREAM benchmark.

Similar improvements to benchmark 3.

benchmark 5:
 - 1 guest, 8 VCPUs, Ubuntu 14.04
 - Host configured with split core (SMT8, subcores-per-core=4)
 - STREAM benchmark

In this benchmark, we see a 10-20% performance improvement across the board
of STREAM benchmark results with this patch.

Based on preliminary investigation and microbenchmarks
by Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

9678cdaa

31 10月, 2013 2 次提交

powerpc/bpf: Fix DIVWU instruction opcode · a40a2b67

由 Vladimir Murzin 提交于 9月 28, 2013

Currently DIVWU stands for *signed* divw opcode:

7d 2a 4b 96 	divwu   r9,r10,r9
7d 2a 4b d6 	divw    r9,r10,r9

Use the *unsigned* divw opcode for DIVWU.
Suggested-by: NVassili Karpov <av1474@comtv.ru>
Reviewed-by: NVassili Karpov <av1474@comtv.ru>
Signed-off-by: NVladimir Murzin <murzin.v@gmail.com>
Acked-by: NMatt Evans <matt@ozlabs.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a40a2b67

powerpc/bpf: BPF JIT compiler for 64-bit Little Endian · 9c662cad

由 Philippe Bergheaud 提交于 9月 24, 2013

This enables the Berkeley Packet Filter JIT compiler
for the PowerPC running in 64bit Little Endian.
Signed-off-by: NPhilippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

9c662cad

17 10月, 2013 1 次提交

powerpc: Emulate sync instruction variants · 9863c28a

由 James Yang 提交于 7月 03, 2013

Reserved fields of the sync instruction have been used for other
instructions (e.g. lwsync).  On processors that do not support variants
of the sync instruction, emulate it by executing a sync to subsume the
effect of the intended instruction.
Signed-off-by: NJames Yang <James.Yang@freescale.com>
[scottwood@freescale.com: whitespace and subject line fix]
Signed-off-by: NScott Wood <scottwood@freescale.com>

9863c28a

11 10月, 2013 1 次提交

powerpc: Little endian builds double word swap VSX state during context save/restore · 926f160f

由 Anton Blanchard 提交于 9月 23, 2013

The elements within VSX loads and stores are big endian ordered
regardless of endianness. Our VSX context save/restore code uses
lxvd2x and stxvd2x which is a 2x doubleword operation. This means
the two doublewords will be swapped and we have to perform another
swap to undo it.

We need to do this on save and restore.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

926f160f

31 7月, 2013 1 次提交

powerpc: Move opcode definitions from kvm/emulate.c to asm/ppc-opcode.h · 9123c5ed

由 Hongtao Jia 提交于 4月 28, 2013

Opcode and xopcode are useful definitions not just for KVM. Move these
definitions to asm/ppc-opcode.h for public use.

Also add the opcodes for LHAUX and LWZUX.
Signed-off-by: NJia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: NLi Yang <leoli@freescale.com>
[scottwood@freesacle.com: update commit message and rebase]
Signed-off-by: NScott Wood <scottwood@freescale.com>

9123c5ed

06 5月, 2013 1 次提交

powerpc: Emulate non privileged DSCR read and write · 73d2fb75

由 Anton Blanchard 提交于 5月 01, 2013

POWER8 allows read and write of the DSCR in userspace. We added
kernel emulation so applications could always use the instructions
regardless of the CPU type.

Unfortunately there are two SPRs for the DSCR and we only added
emulation for the privileged one. Add code to match the non
privileged one.

A simple test was created to verify the fix:

http://ozlabs.org/~anton/junkcode/user_dscr_test.c

Without the patch we get a SIGILL and it passes with the patch.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

73d2fb75

26 4月, 2013 1 次提交

powerpc/perf: Add new BHRB related instructions for POWER8 · 95213959

由 Anshuman Khandual 提交于 4月 22, 2013

This patch adds new POWER8 instruction encoding for reading
and clearing Branch History Rolling Buffer entries. The new
instruction 'mfbhrbe' (move from branch history rolling buffer
entry) is used to read BHRB buffer entries and instruction
'clrbhrb' (clear branch history rolling buffer) is used to
clear the entire buffer. The instruction 'clrbhrb' has straight
forward encoding. But the instruction encoding format for
reading the BHRB entries is like 'mfbhrbe RT, BHRBE' where it
takes two arguments, i.e the index for the BHRB buffer entry to
read and a general purpose register to put the value which was
read from the buffer entry.
Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

95213959

15 2月, 2013 1 次提交

powerpc: Add new instructions for transactional memory · 14c39a4c

由 Michael Neuling 提交于 2月 13, 2013

Here we define the new instructions we need for transactional memory in the
kernel.  This is so we can support compiling with binutils that don't support
the new transactional memory instructions.

Transactional memory results in two sets of architected state (GPRs/VSRs
etc).

treclaim allows us to read the checkpointed state (from the tbegin) so that we
can store it away on a context switch.  It does this by overwriting the exiting
architected state, so you have to save that away before you treclaim.  treclaim
will also abort a transaction, so you can give a register value which contains
an abort reason.

trecheckpoint allows us to inject into the checkpointed state as if it were at
the tbegin.  It does this by copying the current architected state into the
checkpointed state.
Signed-off-by: NMatt Evans <matt@ozlabs.org>
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

14c39a4c

10 1月, 2013 1 次提交

powerpc: Define differences between doorbells on book3e and book3s · 42d02b81

由 Ian Munsie 提交于 11月 14, 2012

There are a few key differences between doorbells on server compared
with embedded that we care about on Linux, namely:

- We have a new msgsndp instruction for directed privileged doorbells.
  msgsnd is used for directed hypervisor doorbells.
- The tag we use in the instruction is the Thread Identification
  Register of the recipient thread (since server doorbells can only
  occur between threads within a single core), and is only 7 bits wide.
- A new message type is introduced for server doorbells (none of the
  existing book3e message types are currently supported on book3s).
Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
Tested-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

42d02b81

18 11月, 2012 1 次提交

PPC: net: bpf_jit_comp: add XOR instruction for BPF JIT · 02871903

由 Daniel Borkmann 提交于 11月 08, 2012

This patch is a follow-up for patch "filter: add XOR instruction for use
with X/K" that implements BPF PowerPC JIT parts for the BPF XOR operation.
Signed-off-by: NDaniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NMatt Evans <matt@ozlabs.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

02871903

15 11月, 2012 2 次提交

powerpc: Fix typos in Freescale copyright claims · 8a56e1ee

由 Yang Li 提交于 11月 01, 2012

There are many cases that Semiconductor is misspelled.  The patch
fix these typos.
Signed-off-by: NLi Yang <leoli@freescale.com>
Acked-by: NTimur Tabi <timur@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8a56e1ee

powerpc/47x: Use the new ppc-opcode infrastructure · 1afc149d

由 Tony Breeds 提交于 10月 02, 2012

Don't use 47x only #defines for TLBIVAX or ICBT, supply and use helpers
in ppc-opcode.h

This fixes a compile breakage.
Signed-off-by: NTony Breeds <tony@bakeyournoodle.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

1afc149d

17 9月, 2012 1 次提交

powerpc: Add denormalisation exception handling for POWER6/7 · b92a66a6

由 Michael Neuling 提交于 9月 10, 2012

On POWER6 and POWER7 if the input operand to an instruction is a
denormalised single precision binary floating point value we can take
a denormalisation exception where it's expected that the hypervisor
(HV=1) will fix up the inputs before the instruction is run.

This adds code to handle this denormalisation exception for POWER6 and
POWER7.

It also add a CONFIG_PPC_DENORMALISATION option and sets it in
pseries/ppc64_defconfig.

This is useful on bare metal systems only.  Based on patch from Milton
Miller.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b92a66a6

10 7月, 2012 9 次提交

powerpc: Enforce usage of RA 0-R31 where possible · 962cffbd

由 Michael Neuling 提交于 6月 25, 2012

Some macros use RA where when RA=R0 the values is 0, so make this
the enforced mnemonic in the macro.

Idea suggested by Andreas Schwab.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

962cffbd

powerpc: Add defines for RA 0-R31 · f4c01579

由 Michael Neuling 提交于 6月 25, 2012

R0 is special since it'll be 0.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f4c01579

powerpc: Enforce usage of R0-R31 where possible · 0b7673c3

由 Michael Neuling 提交于 6月 25, 2012

Enforce the use of R0-R31 in macros where possible now we have all the
fixes in.

R0-R31 macros are removed here so that can't be used anymore.  They
should not be defined anywhere.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

0b7673c3

powerpc: Introduce new __REG_R macros · 0972def4

由 Michael Neuling 提交于 6月 25, 2012

Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

0972def4

powerpc: Start using ___PPC_RA/B/S/T where necessary · cdaade71

由 Michael Neuling 提交于 6月 25, 2012

Now have ___PPC_RA/B/S/T we can use it in some places.  These are
places where we can't use the existing defines which will soon enforce
R0-R31 usage.

The macros being changed here are being used in inline asm, which
can't convert to enforce the R0-R31 usage.

bpf_jit uses a mix of both generated and non-generated with the same
code, so just convert all these to use the ___PPC_R versions which
won't enforce R usage later.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

cdaade71

powerpc: Introduce new ___PPC_RA/B/S/T macros · 55a5db18

由 Michael Neuling 提交于 6月 25, 2012

These are currently the same as __PPC_RA/B/S/T but we'll wrap them
soon.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

55a5db18

powerpc: Fix VSX macros so register names aren't wrapped · 178f2ae0

由 Michael Neuling 提交于 6月 25, 2012

We need to do this so we can enforce the name of a and b in called
macros PPC_RA/B later.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

178f2ae0

powerpc/pasemi: Move lbz/stbciz to ppc-opcode.h · 4404a9f9

由 Michael Neuling 提交于 6月 25, 2012

move lbz/stbciz to ppc-opcode.h.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

4404a9f9

powerpc: Add defines for R0-R31 · 82fff310

由 Michael Neuling 提交于 6月 25, 2012

We are going to use these later and convert r0 to %r0 etc.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

82fff310

05 3月, 2012 1 次提交

KVM: PPC: Implement MMIO emulation support for Book3S HV guests · 697d3899

由 Paul Mackerras 提交于 12月 12, 2011

This provides the low-level support for MMIO emulation in Book3S HV
guests.  When the guest tries to map a page which is not covered by
any memslot, that page is taken to be an MMIO emulation page.  Instead
of inserting a valid HPTE, we insert an HPTE that has the valid bit
clear but another hypervisor software-use bit set, which we call
HPTE_V_ABSENT, to indicate that this is an absent page.  An
absent page is treated much like a valid page as far as guest hcalls
(H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that
an absent HPTE doesn't need to be invalidated with tlbie since it
was never valid as far as the hardware is concerned.

When the guest accesses a page for which there is an absent HPTE, it
will take a hypervisor data storage interrupt (HDSI) since we now set
the VPM1 bit in the LPCR.  Our HDSI handler for HPTE-not-present faults
looks up the hash table and if it finds an absent HPTE mapping the
requested virtual address, will switch to kernel mode and handle the
fault in kvmppc_book3s_hv_page_fault(), which at present just calls
kvmppc_hv_emulate_mmio() to set up the MMIO emulation.

This is based on an earlier patch by Benjamin Herrenschmidt, but since
heavily reworked.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

697d3899

22 7月, 2011 1 次提交

net: filter: BPF 'JIT' compiler for PPC64 · 0ca87f05

由 Matt Evans 提交于 7月 20, 2011

An implementation of a code generator for BPF programs to speed up packet
filtering on PPC64, inspired by Eric Dumazet's x86-64 version.

Filter code is generated as an ABI-compliant function in module_alloc()'d mem
with stackframe & prologue/epilogue generated if required (simple filters don't
need anything more than an li/blr).  The filter's local variables, M[], live in
registers.  Supports all BPF opcodes, although "complicated" loads from negative
packet offsets (e.g. SKF_LL_OFF) are not yet supported.

There are a couple of further optimisations left for future work; many-pass
assembly with branch-reach reduction and a register allocator to push M[]
variables into volatile registers would improve the code quality further.

This currently supports big-endian 64-bit PowerPC only (but is fairly simple
to port to PPC32 or LE!).

Enabled in the same way as x86-64:

	echo 1 > /proc/sys/net/core/bpf_jit_enable

Or, enabled with extra debug output:

	echo 2 > /proc/sys/net/core/bpf_jit_enable
Signed-off-by: NMatt Evans <matt@ozlabs.org>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0ca87f05

27 4月, 2011 1 次提交

powerpc: Per process DSCR + some fixes (try#4) · efcac658

由 Alexey Kardashevskiy 提交于 3月 02, 2011

The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allow some control over the prefetch
of data streams.

This patch allows the value to be specified per thread by emulating
the corresponding mfspr and mtspr instructions. Children of such
threads inherit the value. Other threads use a default value that
can be specified in sysfs - /sys/devices/system/cpu/dscr_default.

If a thread starts with non default value in the sysfs entry,
all children threads inherit this non default value even if
the sysfs value is changed later.
Signed-off-by: NAlexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

efcac658

20 4月, 2011 2 次提交

powerpc/a2: Add some #defines for A2 specific instructions · 931e1241

由 Benjamin Herrenschmidt 提交于 4月 14, 2011

Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

931e1241

powerpc: Add NAP mode support on Power7 in HV mode · 948cf67c

由 Benjamin Herrenschmidt 提交于 1月 24, 2011

Wakeup comes from the system reset handler with a potential loss of
the non-hypervisor CPU state. We save the non-volatile state on the
stack and a pointer to it in the PACA, which the system reset handler
uses to restore things
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

948cf67c

09 12月, 2010 1 次提交

powerpc: Hardcode popcnt instructions for old assemblers · b5f9b666

由 Anton Blanchard 提交于 12月 07, 2010

The popcnt instructions went into binutils relatively recently. As with a
number of other instructions, create macros and hardcode them.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b5f9b666

22 6月, 2010 1 次提交

powerpc: Emulate most Book I instructions in emulate_step() · 0016a4cf

由 Paul Mackerras 提交于 6月 15, 2010

This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

0016a4cf

17 3月, 2010 1 次提交

powerpc/85xx: Make sure lwarx hint isn't set on ppc32 · d6ccb1f5

由 Kumar Gala 提交于 3月 10, 2010

e500v1/v2 based chips will treat any reserved field being set in an
opcode as illegal.  Thus always setting the hint in the opcode is
a bad idea.

Anton should be kept away from the powerpc opcode map.
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

d6ccb1f5

17 2月, 2010 1 次提交

powerpc: Use lwarx/ldarx hint in bit locks · 864b9e6f

由 Anton Blanchard 提交于 2月 10, 2010

This patch implements the lwarx/ldarx hint bit for bit locks.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

864b9e6f

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多