Commits · a41f62f592d9ecf97df4a12023760fe082b1ee68 · openeuler / qemu

20 Feb, 2013 5 commits

target-i386: Use movcond to implement shift flags. · a41f62f5

Committed by Richard Henderson on Jan 30, 2013

With this being all straight-line code, it can get deleted
when the cc variables die.
Signed-off-by: Richard Henderson <rth@twiddle.net>

a41f62f5

target-i386: Add CC_OP_CLR · 436ff2d2

Committed by Richard Henderson on Jan 29, 2013

Special case xor with self.  We need not even store the known
zero into cc_src.
Signed-off-by: Richard Henderson <rth@twiddle.net>

436ff2d2
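
The idea behind CC_OP_CLR can be sketched in plain C rather than TCG. This is only an illustration under stated assumptions: `lazy_flags`, `OP_LOGIC32`, `OP_CLR` and `compute_flags` are hypothetical names, not QEMU's. After `xor %eax,%eax` the result is known to be zero, so the flags are constants and no operand has to be saved.

```c
#include <stdint.h>

/* x86 EFLAGS bit values (architectural). */
#define FLAG_P 0x0004u
#define FLAG_Z 0x0040u
#define FLAG_S 0x0080u

/* Hypothetical lazy-flags state; QEMU's real layout differs. */
enum lazy_op { OP_LOGIC32, OP_CLR };

struct lazy_flags {
    enum lazy_op op;
    uint32_t dst;   /* result of the last flag-setting insn; ignored for OP_CLR */
};

/* PF is set when the low byte holds an even number of 1 bits. */
uint32_t parity_flag(uint32_t v)
{
    v &= 0xff;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return (v & 1) ? 0 : FLAG_P;
}

uint32_t compute_flags(const struct lazy_flags *f)
{
    switch (f->op) {
    case OP_CLR:
        /* Result is known to be zero: ZF and PF set, CF/SF/OF clear. */
        return FLAG_Z | FLAG_P;
    case OP_LOGIC32:
    default:
        /* Logic ops clear CF and OF; ZF/SF/PF come from the result. */
        return (f->dst == 0 ? FLAG_Z : 0)
             | ((f->dst >> 31) ? FLAG_S : 0)
             | parity_flag(f->dst);
    }
}
```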

target-i386: Implement tzcnt and fix lzcnt · 321c5351

Committed by Richard Henderson on Jan 21, 2013

We weren't computing flags for lzcnt at all.  At the same time,
adjust the implementation of bsf/bsr to avoid the local branch,
using movcond instead.
Signed-off-by: Richard Henderson <rth@twiddle.net>

321c5351
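
A rough plain-C picture of why a conditional move can replace the branch: the result is selected between the new bit index and the old destination. All function names here are hypothetical stand-ins, not TCG or QEMU APIs. The sketch assumes bsf leaves the destination unchanged for a zero source, which Intel documents as undefined.

```c
#include <stdint.h>

/* Stand-in for a movcond with condition "c1 != c2": pick v1 or v2. */
uint32_t movcond_ne(uint32_t c1, uint32_t c2, uint32_t v1, uint32_t v2)
{
    return c1 != c2 ? v1 : v2;
}

/* Count trailing zeros; defined as 32 for a zero input. */
uint32_t ctz32_sketch(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* bsf: index of the lowest set bit; old destination kept when src == 0. */
uint32_t bsf_sketch(uint32_t old_dest, uint32_t src)
{
    return movcond_ne(src, 0, ctz32_sketch(src), old_dest);
}

/* tzcnt: like bsf, but a zero source yields the operand width. */
uint32_t tzcnt_sketch(uint32_t src)
{
    return ctz32_sketch(src);
}
```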

target-i386: Use clz/ctz for bsf/bsr helpers · f1300734
Committed by Richard Henderson on Jan 21, 2013
```
And mark the helpers as NO_RWG_SE.
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
f1300734
target-i386: Implement ADX extension · cd7f97ca
Committed by Richard Henderson on Jan 23, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
cd7f97ca

19 Feb, 2013 35 commits

target-i386: Implement RORX · e2c3c2c5
Committed by Richard Henderson on Jan 16, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
e2c3c2c5
target-i386: Implement SHLX, SARX, SHRX · 4a554890
Committed by Richard Henderson on Jan 23, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
4a554890
target-i386: Implement PDEP, PEXT · 0592f74a
Committed by Richard Henderson on Jan 23, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
0592f74a
target-i386: Implement MULX · 5f1f4b17
Committed by Richard Henderson on Jan 23, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
5f1f4b17
target-i386: Implement BZHI · 02ea1e6b
Committed by Richard Henderson on Jan 23, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
02ea1e6b
target-i386: Implement BLSR, BLSMSK, BLSI · bc4b43dc
Committed by Richard Henderson on Jan 23, 2013
```
Do all of group 17 at one time for ease.
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
bc4b43dc
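
The commit message gives no detail, so for orientation here is the architectural meaning of the three group 17 instructions on a 32-bit operand. This is a sketch of the instruction semantics only, not of QEMU's translation code, and the function names are made up for the example.

```c
#include <stdint.h>

/* BLSR: reset (clear) the lowest set bit. */
uint32_t blsr32(uint32_t x)
{
    return x & (x - 1);
}

/* BLSMSK: mask covering the lowest set bit and everything below it. */
uint32_t blsmsk32(uint32_t x)
{
    return x ^ (x - 1);
}

/* BLSI: isolate the lowest set bit. */
uint32_t blsi32(uint32_t x)
{
    return x & (0u - x);
}
```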
target-i386: Implement BEXTR · c7ab7565
Committed by Richard Henderson on Jan 23, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
c7ab7565

target-i386: Implement ANDN · 7073fbad

Committed by Richard Henderson on Jan 23, 2013

As this is the first of the BMI insns to be implemented,
this carries quite a bit more baggage than normal.
Signed-off-by: Richard Henderson <rth@twiddle.net>

7073fbad

target-i386: Implement MOVBE · 111994ee
Committed by Richard Henderson on Jan 10, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
111994ee

target-i386: Decode the VEX prefixes · 701ed211

Committed by Richard Henderson on Jan 11, 2013

No actual required uses of these encodings yet.
Signed-off-by: Richard Henderson <rth@twiddle.net>

701ed211

target-i386: Tidy prefix parsing · 4a6fd938

Committed by Richard Henderson on Jan 10, 2013

Avoid duplicating switch statement between 32 and 64-bit modes.
Signed-off-by: Richard Henderson <rth@twiddle.net>

4a6fd938

target-i386: Use CC_SRC2 for ADC and SBB · 988c3eb0

Committed by Richard Henderson on Jan 23, 2013

Add another slot in ENV and store two of the three inputs.  This lets us
do less work when carry-out is not needed, and avoids the unpredictable
CC_OP after translating these insns.
Signed-off-by: Richard Henderson <rth@twiddle.net>

988c3eb0
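
To see why keeping two of the three ADC inputs is enough, here is a small 8-bit sketch in plain C. The struct and function names are hypothetical, and the choice of which inputs to keep (one addend plus the carry-in) is an assumption of the sketch. The carry-out is derived from the saved values only when something asks for it.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical state saved after an 8-bit ADC. */
struct adc_state {
    uint8_t dst;      /* (a + b + carry_in) truncated to 8 bits */
    uint8_t src1;     /* one addend; the other addend is not kept */
    bool carry_in;    /* third input, kept in the extra slot */
};

/* Perform the addition and record the lazy-flags state. */
struct adc_state do_adc8(uint8_t a, uint8_t b, bool carry_in)
{
    struct adc_state s = { (uint8_t)(a + b + carry_in), b, carry_in };
    return s;
}

/* Carry-out, computed on demand from the saved state.
 * Without carry-in the sum wrapped iff dst < src1;
 * with carry-in it wrapped iff dst <= src1. */
bool adc8_carry_out(const struct adc_state *s)
{
    return s->carry_in ? s->dst <= s->src1 : s->dst < s->src1;
}
```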

target-i386: Make helper_cc_compute_{all,c} const · db9f2597

Committed by Richard Henderson on Jan 23, 2013

Pass the data in explicitly, rather than indirectly via env.
This avoids all sorts of unnecessary register spillage.
Signed-off-by: Richard Henderson <rth@twiddle.net>

db9f2597

target-i386: Don't reference ENV through most of cc helpers · 8601c0b6

Committed by Richard Henderson on Jan 23, 2013

In preparation for making this a const helper.

By using the proper types in the parameters to the helper functions,
we get to avoid quite a lot of subsequent casting.
Signed-off-by: Richard Henderson <rth@twiddle.net>

8601c0b6

target-i386: optimize flags checking after sub using CC_SRCT · a3251186

Committed by Richard Henderson on Jan 23, 2013

After a comparison or subtraction, the original value of the LHS will
currently be reconstructed using an addition.  However, in most cases
it is already available: store it in a temp-local variable and save 1
or 2 TCG ops (2 if the result of the addition needs to be extended).

The temp-local can be declared dead as soon as the cc_op changes again,
or also before the translation block ends because gen_prepare_cc will
always make a copy before returning it.  All this magic, plus copy
propagation and dead-code elimination, ensures that the temp local will
(almost) never be spilled.

Example (cmp $0x21,%rax + jbe):

 Before                                     After
----------------------------------------------------------------------------
 movi_i64 tmp1,$0x21                        movi_i64 tmp1,$0x21
 movi_i64 cc_src,$0x21                      movi_i64 cc_src,$0x21
 sub_i64 cc_dst,rax,tmp1                    sub_i64 cc_dst,rax,tmp1
 add_i64 tmp7,cc_dst,cc_src
 movi_i32 cc_op,$0x11                       movi_i32 cc_op,$0x11
 brcond_i64 tmp7,cc_src,leu,$0x0            discard loc11
                                            brcond_i64 rax,cc_src,leu,$0x0

 Before                                     After
----------------------------------------------------------------------------
  mov    (%r14),%rbp                        mov    (%r14),%rbp
  mov    %rbp,%rbx                          mov    %rbp,%rbx
  sub    $0x21,%rbx                         sub    $0x21,%rbx
  lea    0x21(%rbx),%r12
  movl   $0x11,0xa0(%r14)                   movl   $0x11,0xa0(%r14)
  movq   $0x21,0x90(%r14)                   movq   $0x21,0x90(%r14)
  mov    %rbx,0x98(%r14)                    mov    %rbx,0x98(%r14)
  cmp    $0x21,%r12                     |   cmp    $0x21,%rbp
  jbe    ...                                jbe    ...
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

a3251186
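
The equivalence this commit relies on can be checked with a plain-C sketch. The names are hypothetical and the struct only mimics the roles of cc_dst, cc_src and the saved LHS. Rebuilding the LHS as dst + src and reading the saved copy give the same answer for an unsigned "below or equal" test, so the addition can be dropped.

```c
#include <stdint.h>
#include <stdbool.h>

/* State recorded by a lazy "cmp lhs, rhs" in this sketch. */
struct sub_state {
    uint64_t dst;   /* lhs - rhs */
    uint64_t src;   /* rhs */
    uint64_t srct;  /* original lhs, kept in a temporary */
};

struct sub_state do_cmp(uint64_t lhs, uint64_t rhs)
{
    struct sub_state s = { lhs - rhs, rhs, lhs };
    return s;
}

/* "Before": rebuild lhs with an addition, then compare. */
bool jbe_reconstruct(const struct sub_state *s)
{
    uint64_t lhs = s->dst + s->src;
    return lhs <= s->src;
}

/* "After": compare the saved lhs directly. */
bool jbe_saved(const struct sub_state *s)
{
    return s->srct <= s->src;
}
```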

target-i386: Update cc_op before TCG branches · 891a5133

Committed by Richard Henderson on Jan 18, 2013

Placing the CC_OP_DYNAMIC at the join is less effective than
before the branch, as the branch will have forced global registers
to their home locations.  This way we have a chance to discard
CC_SRC2 before it gets stored.
Signed-off-by: Richard Henderson <rth@twiddle.net>

891a5133

target-i386: introduce gen_jcc1_noeob · dc259201

Committed by Richard Henderson on Jan 23, 2013

A jump that ends a basic block or otherwise falls back to CC_OP_DYNAMIC
will always have to call gen_op_set_cc_op. However, not all jumps end
a basic block, so introduce a variant that does not do this.

This was partially undone earlier (i386: drop cc_op argument of gen_jcc1);
redo it now, also to prepare for the introduction of src2.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

dc259201

target-i386: use gen_op for cmps/scas · 63633fe6

Committed by Richard Henderson on Jan 23, 2013

Replace low-level ops with a higher-level "cmp %al, (A0)" in the case
of scas, and "cmp T0, (A0)" in the case of cmps.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

63633fe6

target-i386: kill cpu_T3 · 3b9d3cf1

Committed by Paolo Bonzini on Oct 12, 2012

It is almost unused, and it is simpler to pass a TCG value directly
to gen_shiftd_rm_T1_T3.  This value is then written to t2 without
going through a temporary register.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

3b9d3cf1

target-i386: expand cmov via movcond · 57eb0cc8
Committed by Richard Henderson on Jan 16, 2013
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
57eb0cc8
target-i386: introduce gen_cmovcc1 · f32d3781
Committed by Paolo Bonzini on Oct 07, 2012
```
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
f32d3781
target-i386: cleanup temporary macros for CCPrepare · cc8b6f5b
Committed by Paolo Bonzini on Oct 08, 2012
```
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
```
cc8b6f5b

target-i386: inline gen_prepare_cc_slow · 69d1aa31

Committed by Richard Henderson on Jan 23, 2013

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

69d1aa31

target-i386: use CCPrepare to generate conditional jumps · 943131ca

Committed by Paolo Bonzini on Oct 07, 2012

This simplifies all the jump generation code.  CCPrepare allows the
code to create an efficient brcond always, so there is no need to
duplicate the setcc and jcc code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

943131ca

target-i386: introduce gen_prepare_cc · 276e6b5f

Committed by Richard Henderson on Jan 23, 2013

This makes the i386 front-end able to create CCPrepare structs for all
conditions, not just those that come from a single flag.  In particular,
JCC_L and JCC_LE can be optimized because gen_prepare_cc is not forced
to return a result in bit 0 (unlike gen_setcc_slow).

However, for now the slow jcc operations will still go through CC
computation in a single-bit temporary, followed by a brcond if the
temporary is nonzero.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

276e6b5f

target-i386: introduce CCPrepare · bec93d72

Committed by Richard Henderson on Jan 23, 2013

Introduce a struct that describes how to build a *cond operation
that checks for a given x86 condition code.  For now, just change
gen_compute_eflags_* to return the new struct, generate code for
the CCPrepare struct, and go on as before.

[rth: Use ctz with the proper width rather than ffs.]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

bec93d72
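
As a rough picture of what "a struct that describes how to build a *cond operation" can look like, here is a simplified plain-C sketch. It is not QEMU's CCPrepare: the type `cc_prepare_sketch`, its fields and every function name are hypothetical, and the values are ordinary integers where a translator would hold TCG temporaries. A producer only describes the test, and a consumer decides how to realize it (as a setcond, a brcond or a movcond).

```c
#include <stdint.h>
#include <stdbool.h>

enum cond_kind { COND_EQ, COND_NE, COND_LTU, COND_LEU };

/* Hypothetical descriptor: test "(reg & mask) <cond> imm". */
struct cc_prepare_sketch {
    enum cond_kind cond;
    uint64_t reg;    /* value to test */
    uint64_t mask;   /* bits of reg that matter; all-ones means no mask */
    uint64_t imm;    /* value to compare against */
};

/* One possible consumer: evaluate the description to a boolean. */
bool cc_eval(const struct cc_prepare_sketch *cc)
{
    uint64_t v = cc->reg & cc->mask;

    switch (cc->cond) {
    case COND_EQ:  return v == cc->imm;
    case COND_NE:  return v != cc->imm;
    case COND_LTU: return v <  cc->imm;
    case COND_LEU: return v <= cc->imm;
    }
    return false;
}

/* ZF when the last result is known: result == 0. */
struct cc_prepare_sketch prepare_z(uint64_t cc_dst)
{
    struct cc_prepare_sketch cc = { COND_EQ, cc_dst, UINT64_MAX, 0 };
    return cc;
}

/* CF from a materialized EFLAGS word: bit 0 is non-zero. */
struct cc_prepare_sketch prepare_c_from_eflags(uint64_t eflags)
{
    struct cc_prepare_sketch cc = { COND_NE, eflags, 0x1, 0 };
    return cc;
}
```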

target-i386: optimize setcc instructions · c365395e

Committed by Paolo Bonzini on Oct 05, 2012

Reconstruct the arguments for complex conditions involving CC_OP_SUBx (BE,
L, LE).  In the others do it via setcond and gen_setcc_slow (which is
not that slow in many cases).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

c365395e

target-i386: optimize setle · be10b289

Committed by Richard Henderson on Jan 23, 2013

And allow gen_setcc_slow to operate on cpu_cc_src.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

be10b289

target-i386: optimize setbe · 2cb47645

Committed by Richard Henderson on Jan 23, 2013

This is looking at EFLAGS, but it can do so more efficiently with
setcond.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

2cb47645

target-i386: change gen_setcc_slow_T0 to gen_setcc_slow · 1a5c6359

Committed by Paolo Bonzini on Oct 05, 2012

Do not hard code the destination register.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

1a5c6359

target-i386: convert gen_compute_eflags_c to TCG · 06847f1f

Committed by Richard Henderson on Jan 23, 2013

Do the switch at translation time, converting the helper templates to
TCG opcodes.  In some cases CF can be computed with a single setcond,
though in others it may require a little more work.

In the CC_OP_DYNAMIC case, compute the whole EFLAGS, same as for ZF/SF/PF.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

06847f1f

target-i386: use inverted setcond when computing NS or NZ · 8115f117

Committed by Richard Henderson on Jan 23, 2013

Make gen_compute_eflags_z and gen_compute_eflags_s able to compute the
inverted condition, and use this in gen_setcc_slow_T0.  We cannot do it
yet in gen_compute_eflags_c, but prepare the code for it anyway.  It is
not worthwhile for PF, as usual.

shr+and+xor could be replaced by and+setcond.  I'm not doing it yet.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

8115f117

target-i386: do not call helper to compute ZF/SF · 086c4077

Committed by Richard Henderson on Jan 23, 2013

ZF, SF and PF can always be computed from CC_DST except in the
CC_OP_EFLAGS case (and CC_OP_DYNAMIC, which just resolves to CC_OP_EFLAGS
in gen_compute_eflags).  Use setcond to compute ZF and SF.

We could also use a table lookup to compute PF.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

086c4077
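
How the three flags fall out of the saved result alone can be shown in plain C for a 32-bit operand. The function names and the table are hypothetical, and the table-based PF is the alternative the message mentions rather than something the commit is known to implement. ZF and SF are single comparisons, which is what makes a setcond sufficient.

```c
#include <stdint.h>

/* x86 EFLAGS bit values (architectural). */
#define FLAG_P 0x04u
#define FLAG_Z 0x40u
#define FLAG_S 0x80u

/* PF for every possible low byte; filled by init_parity_table(). */
static uint8_t parity_table[256];

void init_parity_table(void)
{
    for (unsigned i = 0; i < 256; i++) {
        unsigned bits = 0;
        for (unsigned b = 0; b < 8; b++) {
            bits += (i >> b) & 1;
        }
        /* PF is set when the number of 1 bits is even. */
        parity_table[i] = (bits % 2 == 0) ? FLAG_P : 0;
    }
}

/* ZF: the result equals zero. */
uint32_t zf_from_dst32(uint32_t dst)
{
    return dst == 0 ? FLAG_Z : 0;
}

/* SF: the top bit of the result is set. */
uint32_t sf_from_dst32(uint32_t dst)
{
    return (dst >> 31) ? FLAG_S : 0;
}

/* PF: looked up from the low byte of the result. */
uint32_t pf_from_dst32(uint32_t dst)
{
    return parity_table[dst & 0xff];
}
```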

target-i386: Move CC discards to set_cc_op · b666265b

Committed by Richard Henderson on Jan 23, 2013

This gets us universal coverage, rather than scattering discards
around at various places.  As a bonus, we do not emit redundant
discards e.g. between sequential logic insns.
Signed-off-by: Richard Henderson <rth@twiddle.net>

b666265b

target-i386: no need to flush out cc_op before gen_eob · ccfcdd09

Committed by Richard Henderson on Jan 23, 2013

This makes code more similar to the other callers of gen_eob, especially
loopz/loopnz/jcxz.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

ccfcdd09