Commits · 9575234db19df260a3c00f5f947a9c1c823b0f5b · openeuler / qemu

16 Mar, 2015 1 commit

tcg/optimize: Handle or r,a,a with constant a · 2374c4b8

Committed by Richard Henderson on Mar 13, 2015

As seen with ubuntu-5.10-live-powerpc.iso.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>

2374c4b8

13 Feb, 2015 3 commits

tcg: Implement insert_op_before · a4ce099a

Committed by Richard Henderson on Mar 30, 2014

Rather than reserving space in the op stream for optimization,
let the optimizer add ops as necessary.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>

a4ce099a

tcg: Remove opcodes instead of noping them out · 0c627cdc

Committed by Richard Henderson on Mar 30, 2014

With the linked list scheme we need not leave nops in the stream
that we need to process later.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>

0c627cdc

tcg: Put opcodes in a linked list · c45cb8bb

Committed by Richard Henderson on Sep 19, 2014

The previous setup required ops and args to be completely sequential,
and was error prone when it came to both iteration and optimization.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>

c45cb8bb
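
The three commits above (c45cb8bb, 0c627cdc, a4ce099a) move the op stream from a strictly sequential buffer to a linked list, so ops can be unlinked or inserted in place. The following is a minimal sketch of that idea, not the QEMU implementation: the `SketchOp` and `SketchStream` types and every function name are hypothetical, and the real op structure carries more fields than shown here.

```c
enum { MAX_OPS = 16, NO_OP = -1 };

/* One op: links are indices into the backing array, not pointers. */
typedef struct {
    int opcode;
    int prev;   /* index of previous op, or NO_OP */
    int next;   /* index of next op, or NO_OP */
} SketchOp;

typedef struct {
    SketchOp ops[MAX_OPS];
    int first, last, used;
} SketchStream;

void stream_init(SketchStream *s)
{
    s->first = s->last = NO_OP;
    s->used = 0;
}

/* Append an op at the tail; the caller must not exceed MAX_OPS. */
int stream_append(SketchStream *s, int opcode)
{
    int i = s->used++;
    s->ops[i].opcode = opcode;
    s->ops[i].prev = s->last;
    s->ops[i].next = NO_OP;
    if (s->last != NO_OP) {
        s->ops[s->last].next = i;
    } else {
        s->first = i;
    }
    s->last = i;
    return i;
}

/* Unlink an op instead of overwriting it with a nop. */
void stream_remove(SketchStream *s, int i)
{
    int p = s->ops[i].prev, n = s->ops[i].next;
    if (p != NO_OP) s->ops[p].next = n; else s->first = n;
    if (n != NO_OP) s->ops[n].prev = p; else s->last = p;
}

/* Add a new op in front of an existing one; no space is reserved ahead of time. */
int stream_insert_before(SketchStream *s, int before, int opcode)
{
    int i = s->used++;
    int p = s->ops[before].prev;
    s->ops[i].opcode = opcode;
    s->ops[i].prev = p;
    s->ops[i].next = before;
    s->ops[before].prev = i;
    if (p != NO_OP) s->ops[p].next = i; else s->first = i;
    return i;
}

/* Copy the opcodes in list order into out[]; returns how many were visited. */
int stream_walk(const SketchStream *s, int *out)
{
    int n = 0;
    for (int i = s->first; i != NO_OP; i = s->ops[i].next) {
        out[n++] = s->ops[i].opcode;
    }
    return n;
}
```

Because later passes only follow the links, a removed op is never visited again and an inserted op needs no pre-reserved slot at a fixed position.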

19 Jun, 2014 1 commit

tcg/optimize: Don't special case TCG_OPF_CALL_CLOBBER · bc8d688f

Committed by Richard Henderson on Jun 08, 2014

With the "old" ldst ops we didn't know the real width of the
result of the load, but with the "new" ldst ops we do.
Signed-off-by: Richard Henderson <rth@twiddle.net>

bc8d688f

05 Jun, 2014 1 commit

tcg: Remove TCG_TARGET_HAS_new_ldst · 3d1b2ff6

Committed by Richard Henderson on May 29, 2014

Since all backends have been converted, remove the compatibility code.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

3d1b2ff6

29 May, 2014 3 commits

tcg/optimize: Remember garbage high bits for 32-bit ops · 24666baf

Committed by Richard Henderson on May 22, 2014

For a 64-bit host, the high bits of a register after a 32-bit operation
are undefined. Adjust the temps mask for all 32-bit ops to reflect that.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

24666baf
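
As a rough illustration of the adjustment described in 24666baf, assume the optimizer's per-temp mask has a set bit wherever the value may be non-zero. After a 32-bit op on a 64-bit host the upper half must then be marked as possibly non-zero. The function name below is hypothetical and the sketch covers only the mask adjustment itself.

```c
#include <stdint.h>

/*
 * mask: bits that may be non-zero in the result of a 32-bit op,
 * as computed from its operands. On a 64-bit host the upper 32 bits
 * of the register are undefined, so they are recorded as "unknown"
 * (possibly non-zero) rather than as known zero.
 */
uint64_t mask_after_32bit_op(uint64_t mask)
{
    return mask | ~(uint64_t)0xffffffffu;
}
```

With this adjustment a later 64-bit op that consumes the temp cannot wrongly conclude that bits 32..63 are zero.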

tcg/optimize: Move updating of gen_opc_buf into tcg_opt_gen_mov* · a62f6f56

Committed by Richard Henderson on May 22, 2014

No functional change, just reduce a bit of redundancy.
Signed-off-by: Richard Henderson <rth@twiddle.net>

a62f6f56

tcg: Optimize brcond2 and setcond2 ne/eq · a763551a

Committed by Richard Henderson on Apr 23, 2014

If either the high or low pair can be resolved, we can
simplify to either a constant or to a 32-bit comparison.
Signed-off-by: Richard Henderson <rth@twiddle.net>

a763551a
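
The reasoning in a763551a can be sketched for the `eq` case: a double-word equality holds only if both the low pair and the high pair are equal, so knowing the outcome for one pair either decides the result or leaves a single 32-bit comparison. The `ne` case mirrors it. All type and function names below are hypothetical, chosen only for this illustration.

```c
/* What the optimizer knows about one (low or high) operand pair. */
typedef enum { PAIR_UNKNOWN, PAIR_EQUAL, PAIR_DIFFERENT } PairState;

typedef enum {
    KEEP_DOUBLE_WORD,   /* nothing known, keep setcond2/brcond2 */
    CONST_FALSE,        /* comparison folds to 0 */
    CONST_TRUE,         /* comparison folds to 1 */
    COMPARE_LOW_32,     /* high pair resolved, compare low words only */
    COMPARE_HIGH_32     /* low pair resolved, compare high words only */
} EqSimplification;

EqSimplification simplify_eq2(PairState low, PairState high)
{
    if (low == PAIR_DIFFERENT || high == PAIR_DIFFERENT) {
        return CONST_FALSE;          /* one unequal half settles it */
    }
    if (low == PAIR_EQUAL && high == PAIR_EQUAL) {
        return CONST_TRUE;
    }
    if (low == PAIR_EQUAL) {
        return COMPARE_HIGH_32;      /* only the high words still matter */
    }
    if (high == PAIR_EQUAL) {
        return COMPARE_LOW_32;       /* only the low words still matter */
    }
    return KEEP_DOUBLE_WORD;
}
```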

13 May, 2014 1 commit

tcg: Make call address a constant parameter · cf066674

Committed by Richard Henderson on Mar 22, 2014

Avoid allocating a tcg temporary to hold the constant address,
and instead place it directly into the op_call arguments.

At the same time, convert to the newly introduced tcg_out_call
backend function, rather than invoking tcg_out_op for the call.
Signed-off-by: Richard Henderson <rth@twiddle.net>

cf066674

29 Apr, 2014 1 commit

tcg: Add INDEX_op_trunc_shr_i32 · 4bb7a41e

Committed by Richard Henderson on Sep 09, 2013

Let the backend do something special for truncation.
Signed-off-by: Richard Henderson <rth@twiddle.net>

4bb7a41e

19 Apr, 2014 2 commits

tcg: Fix out of range shift in deposit optimizations · d998e555

Committed by Richard Henderson on Mar 18, 2014

By inspection, for a deposit(x, y, 0, 64), we'd have a shift of (1<<64)
and everything else falls apart.  But we can reuse the existing deposit
logic to get this right.
Signed-off-by: Richard Henderson <rth@twiddle.net>

d998e555
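
To make the hazard in d998e555 concrete: building a field mask as `(1 << len) - 1` is undefined in C when `len` equals the word width. One well-defined alternative is to shift an all-ones value right instead. The helper below is a sketch under that assumption; its name is hypothetical and it is not claimed to be the code the commit reuses.

```c
#include <stdint.h>

/*
 * Insert the low `len` bits of `field` into `base` at bit `pos`.
 * Requires 1 <= len and pos + len <= 64.
 */
uint64_t deposit64_sketch(uint64_t base, unsigned pos, unsigned len,
                          uint64_t field)
{
    /* Shift count is 64 - len, i.e. 0..63, so len == 64 is safe here. */
    uint64_t mask = (~(uint64_t)0 >> (64 - len)) << pos;
    return (base & ~mask) | ((field << pos) & mask);
}
```

For `pos = 0, len = 64` the mask is all ones and the result is simply `field`, which is the case the commit message singles out.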

tcg: Mask shift quantities while folding · 50c5c4d1

Committed by Richard Henderson on Mar 18, 2014

The TCG result would be undefined, but we can at least produce one
plausible result and avoid triggering the wrath of analysis tools.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

50c5c4d1
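
A minimal sketch of what masking the shift quantity during constant folding might look like, assuming the count is reduced modulo the operand width. The function names are hypothetical; the point is only that the host C shift never sees an out-of-range count.

```c
#include <stdint.h>

/* Fold a 32-bit left shift of two constants; count is masked to 0..31. */
uint32_t fold_shl_i32(uint32_t x, uint32_t count)
{
    return x << (count & 31);
}

/* Fold a 64-bit logical right shift; count is masked to 0..63. */
uint64_t fold_shr_i64(uint64_t x, uint64_t count)
{
    return x >> (count & 63);
}
```

An out-of-range count such as 33 for a 32-bit shift still has no defined TCG meaning; the fold merely picks one plausible answer without invoking undefined behaviour in the host compiler.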

18 Feb, 2014 8 commits

tcg/optimize: Add more identity simplifications · 464a1441

Committed by Richard Henderson on Jan 31, 2014

Recognize 0 operand to andc, and -1 operands to and, orc, eqv.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

464a1441
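
The identities named in 464a1441 can be checked with a small evaluator. This is an illustrative sketch: the enum and function names are hypothetical, and only the second operand is considered as the constant.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { LOGIC_AND, LOGIC_ANDC, LOGIC_ORC, LOGIC_EQV } LogicOp;

/* Reference semantics of the four ops on 64-bit values. */
uint64_t eval_logic(LogicOp op, uint64_t x, uint64_t y)
{
    switch (op) {
    case LOGIC_AND:  return x & y;
    case LOGIC_ANDC: return x & ~y;
    case LOGIC_ORC:  return x | ~y;
    case LOGIC_EQV:  return ~(x ^ y);
    }
    return 0;
}

/* True when op(x, y) == x for every x, so the op can become a move. */
bool is_identity_operand(LogicOp op, uint64_t y)
{
    if (op == LOGIC_ANDC) {
        return y == 0;                 /* x & ~0 == x */
    }
    return y == ~(uint64_t)0;          /* and, orc, eqv with -1 */
}
```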

tcg/optimize: Optmize ANDC X,Y,Y to MOV X,0 · e64e958e

Committed by Richard Henderson on Jan 28, 2014

Like we already do for SUB and XOR.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

e64e958e

tcg/optimize: Simply some logical ops to NOT · e201b564

Committed by Richard Henderson on Jan 28, 2014

Given, of course, an appropriate constant.  These could be generated
from the "canonical" operation for inversion on the guest, or via
other optimizations.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

e201b564

tcg/optimize: Handle known-zeros masks for ANDC · 23ec69ed

Committed by Richard Henderson on Jan 28, 2014

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

23ec69ed

tcg/optimize: add known-zero bits compute for load ops · c8d70272

Committed by Aurelien Jarno on Sep 03, 2013

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

c8d70272

tcg/optimize: improve known-zero bits for 32-bit ops · f096dc96

Committed by Aurelien Jarno on Sep 03, 2013

The shl_i32 op might set some bits of the unused 32 high bits of the
mask. Fix that by clearing the unused 32 high bits for all 32-bit ops
except load/store which operate on tl values.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

f096dc96

tcg/optimize: fix known-zero bits optimization · 3031244b

Committed by Aurelien Jarno on Sep 03, 2013

Known-zero bits optimization is a great idea that helps to generate more
optimized code. However, the current implementation only works in very few
cases as the computed mask is not saved.

Fix this so that it really works.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

3031244b

tcg/optimize: fix known-zero bits for right shift ops · e46b225a

Committed by Aurelien Jarno on Sep 03, 2013

32-bit versions of sar and shr ops should not propagate known-zero bits
from the unused 32 high bits. For sar it could even lead to wrong code
being generated.

Cc: qemu-stable@nongnu.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

e46b225a
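
Why `sar` is the dangerous case in e46b225a can be shown with a small example, assuming the mask marks bits that may be non-zero and is stored in a 64-bit word. If the mask of a 32-bit temp is shifted as a 64-bit quantity, bit 31 is not treated as a sign bit and the top result bits are wrongly reported as zero. Both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Naive version: shift the mask as a wide value. With the upper 32 bits
 * of the mask clear, a 64-bit arithmetic shift behaves like this logical
 * shift, so bit 31 is never replicated.
 */
uint64_t sar_i32_mask_naive(uint64_t mask, unsigned count)
{
    return mask >> count;
}

/*
 * 32-bit aware version for count in 0..31: if bit 31 may be set, the
 * arithmetic shift may set every bit it replicates the sign into.
 */
uint64_t sar_i32_mask(uint64_t mask, unsigned count)
{
    uint32_t m = (uint32_t)mask;
    uint32_t shifted = m >> count;
    if (m & 0x80000000u) {
        shifted |= ~(0xffffffffu >> count);   /* top `count` bits */
    }
    return shifted;
}
```

For an input mask of `0x80000000` and a shift of 4, the naive version claims bits 28..31 of the result are zero, while a real `sar_i32` of a negative value sets them.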

26 Sep, 2013 1 commit

misc: Use new rotate functions · 3df2b8fd

Committed by Stefan Weil on Sep 12, 2013

Signed-off-by: Stefan Weil <sw@weilnetz.de>

3df2b8fd

03 Sep, 2013 2 commits

tcg: Constant fold div, rem · 01547f7f

Committed by Richard Henderson on Aug 14, 2013

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

01547f7f

tcg: Add muluh and mulsh opcodes · 03271524

Committed by Richard Henderson on Aug 14, 2013

Use them in places where mulu2 and muls2 are used.
Optimize mulx2 with dead low part to mulxh.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

03271524

09 May, 2013 1 commit

tcg/optimize: fix setcond2 optimization · 66e61b55

Committed by Aurelien Jarno on May 08, 2013

When setcond2 is rewritten into setcond, the state of the destination
temp should be reset, so that a copy of the previous value is not
used instead of the result.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

66e61b55

23 Mar, 2013 1 commit

tcg-optimize: Fold sub r,0,x to neg r,x · 2d497542

Committed by Richard Henderson on Mar 21, 2013

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

2d497542

24 Feb, 2013 2 commits

tcg: Add signed multiword multiplication operations · 4d3203fd

Committed by Richard Henderson on Feb 19, 2013

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

4d3203fd

tcg: Add 64-bit multiword arithmetic operations · d7156f7c

Committed by Richard Henderson on Feb 19, 2013

Matching the 32-bit multiword arithmetic that we already have.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

d7156f7c

19 Jan, 2013 3 commits

optimize: optimize using nonzero bits · 633f6502

Committed by Paolo Bonzini on Jan 11, 2013

This adds two optimizations using the non-zero bit mask.  In some cases
involving shifts or ANDs the value can become zero, and can thus be
optimized to a move of zero.  Second, useless zero-extension or an
AND with constant can be detected that would only zero bits that are
already zero.

The main advantage of this optimization is that it turns zero-extensions
into moves, thus enabling much better copy propagation (around 1% code
reduction).  Here, for example, is a "test $0xff0000,%ecx + je" before
optimization:

 mov_i64 tmp0,rcx
 movi_i64 tmp1,$0xff0000
 discard cc_src
 and_i64 cc_dst,tmp0,tmp1
 movi_i32 cc_op,$0x1c
 ext32u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,eq,$0x0

and after (without patch on the left, with on the right):

 movi_i64 tmp1,$0xff0000                 movi_i64 tmp1,$0xff0000
 discard cc_src                          discard cc_src
 and_i64 cc_dst,rcx,tmp1                 and_i64 cc_dst,rcx,tmp1
 movi_i32 cc_op,$0x1c                    movi_i32 cc_op,$0x1c
 ext32u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,eq,$0x0           brcond_i64 cc_dst,tmp12,eq,$0x0

Other similar cases: "test %eax, %eax + jne" where eax is already 32-bit
(after optimization, without patch on the left, with on the right):

 discard cc_src                          discard cc_src
 mov_i64 cc_dst,rax                      mov_i64 cc_dst,rax
 movi_i32 cc_op,$0x1c                    movi_i32 cc_op,$0x1c
 ext32u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,ne,$0x0           brcond_i64 rax,tmp12,ne,$0x0

"test $0x1, %dl + je":

 movi_i64 tmp1,$0x1                      movi_i64 tmp1,$0x1
 discard cc_src                          discard cc_src
 and_i64 cc_dst,rdx,tmp1                 and_i64 cc_dst,rdx,tmp1
 movi_i32 cc_op,$0x1a                    movi_i32 cc_op,$0x1a
 ext8u_i64 tmp0,cc_dst
 movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
 brcond_i64 tmp0,tmp12,eq,$0x0           brcond_i64 cc_dst,tmp12,eq,$0x0

In some cases TCG even outsmarts GCC. :)  Here the input code has
"and $0x2,%eax + movslq %eax,%rbx + test %rbx, %rbx" and the optimizer,
thanks to copy propagation, does the following:

 movi_i64 tmp12,$0x2                     movi_i64 tmp12,$0x2
 and_i64 rax,rax,tmp12                   and_i64 rax,rax,tmp12
 mov_i64 cc_dst,rax                      mov_i64 cc_dst,rax
 ext32s_i64 tmp0,rax                  -> nop
 mov_i64 rbx,tmp0                     -> mov_i64 rbx,cc_dst
 and_i64 cc_dst,rbx,rbx               -> nop
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

633f6502
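
The two optimizations described in 633f6502 can be sketched for an AND with a constant (a zero-extension is the same thing with a constant such as `0xffffffff` or `0xff`). The sketch assumes a mask whose set bits mark positions that may be non-zero; the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    KEEP_OP,                  /* the AND really changes something */
    REPLACE_WITH_MOVI_ZERO,   /* result is provably zero */
    REPLACE_WITH_MOV          /* AND only clears bits already zero */
} AndRewrite;

/*
 * src_mask: bits of the source temp that may be non-zero.
 * constant: the constant operand of the AND (or the implied constant
 *           of a zero-extension).
 */
AndRewrite simplify_and_const(uint64_t src_mask, uint64_t constant)
{
    uint64_t result_mask = src_mask & constant;   /* bits that can survive */
    uint64_t affected = src_mask & ~constant;     /* bits the op would clear */

    if (result_mask == 0) {
        return REPLACE_WITH_MOVI_ZERO;
    }
    if (affected == 0) {
        return REPLACE_WITH_MOV;
    }
    return KEEP_OP;
}
```

In the first example of the commit message, `cc_dst` comes from an AND with `$0xff0000`, so its mask is `0xff0000`; the following `ext32u_i64` corresponds to a constant of `0xffffffff`, affects no possibly-set bit, and therefore turns into a move that copy propagation can remove.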

optimize: track nonzero bits of registers · 3a9d8b17

Committed by Paolo Bonzini on Jan 11, 2013

Add a "mask" field to the tcg_temp_info struct.  A bit that is zero
in "mask" will always be zero in the corresponding temporary.
Zero bits in the mask can be produced from moves of immediates,
zero-extensions, ANDs with constants, shifts; they can then be
propagated by logical operations, shifts, sign-extensions,
negations, deposit operations, and conditional moves.  Other
operations will just reset the mask to all-ones, i.e. unknown.

[rth: s/target_ulong/tcg_target_ulong/]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

3a9d8b17
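
The propagation rules listed in 3a9d8b17 can be illustrated with a small subset of ops. This is a sketch only: the `MaskOp` enum and `result_mask` function are hypothetical, and the real optimizer handles many more opcodes and operand kinds.

```c
#include <stdint.h>

typedef enum {
    MASK_OP_MOVI,       /* a = immediate value */
    MASK_OP_AND,        /* a, b = masks of the two inputs */
    MASK_OP_OR,         /* a, b = masks of the two inputs */
    MASK_OP_EXT8U,      /* a = mask of the input */
    MASK_OP_SHL_CONST,  /* a = mask of the input, b = constant count 0..63 */
    MASK_OP_OTHER       /* anything not modelled */
} MaskOp;

/* Returns the bits that may be non-zero in the result. */
uint64_t result_mask(MaskOp op, uint64_t a, uint64_t b)
{
    switch (op) {
    case MASK_OP_MOVI:
        return a;               /* exactly the immediate's set bits */
    case MASK_OP_AND:
        return a & b;           /* a bit survives only if set in both */
    case MASK_OP_OR:
        return a | b;           /* a bit may be set if set in either */
    case MASK_OP_EXT8U:
        return a & 0xff;        /* zero-extension clears bits 8..63 */
    case MASK_OP_SHL_CONST:
        return a << (b & 63);   /* zeros shift along with the value */
    case MASK_OP_OTHER:
    default:
        return ~(uint64_t)0;    /* unknown: every bit may be set */
    }
}
```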

optimize: only write to state when clearing optimizer data · d193a14a

Committed by Paolo Bonzini on Jan 11, 2013

The next patch will add to the TCG optimizer a field that should be
non-zero in the default case.  Thus, replace the memset of the
temps array with a loop.  Only the state field has to be up-to-date,
because others are not used except if the state is TCG_TEMP_COPY
or TCG_TEMP_CONST.

[rth: Extracted the loop to a function.]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

d193a14a

17 Nov, 2012 1 commit

TCG: Use gen_opc_buf from context instead of global variable. · 92414b31

Committed by Evgeny Voevodin on Nov 12, 2012

Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

92414b31

28 Oct, 2012 1 commit

tcg: rework TCG helper flags · 78505279

Committed by Aurelien Jarno on Oct 09, 2012

The current helper flags, TCG_CALL_CONST and TCG_CALL_PURE, might be
confusing and don't provide enough granularity for some helpers (FP
helpers for example).

This patch changes them into the following helper flags:
- TCG_CALL_NO_READ_GLOBALS means that the helper does not read globals,
  either directly or via an exception. They will not be saved to their
  canonical location before calling the helper.
- TCG_CALL_NO_WRITE_GLOBALS means that the helper does not modify any
  globals. They will only be saved to their canonical locations before
  calling helpers, but they won't be reloaded afterwards.
- TCG_CALL_NO_SIDE_EFFECTS means that the call to the function is
  removed if the return value is not used.

It provides convenience flags, to avoid helper definitions longer than
80 characters. It also provides compatibility flags, and updates the
documentation.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

78505279
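
The three flags described in 78505279 each answer one question around a helper call. The sketch below models them as independent bits, which is a simplification; the `SKETCH_` names and their numeric values are hypothetical and are not the TCG definitions.

```c
#include <stdbool.h>

/* Illustrative bit values only, not the real TCG_CALL_* constants. */
enum {
    SKETCH_NO_READ_GLOBALS  = 1 << 0,
    SKETCH_NO_WRITE_GLOBALS = 1 << 1,
    SKETCH_NO_SIDE_EFFECTS  = 1 << 2
};

/* Globals must be written back before the call unless the helper never reads them. */
bool must_save_globals(unsigned flags)
{
    return (flags & SKETCH_NO_READ_GLOBALS) == 0;
}

/* Globals must be reloaded after the call unless the helper never writes them. */
bool must_reload_globals(unsigned flags)
{
    return (flags & SKETCH_NO_WRITE_GLOBALS) == 0;
}

/* A call with no side effects can be dropped when nobody uses its result. */
bool can_drop_call(unsigned flags, bool result_used)
{
    return (flags & SKETCH_NO_SIDE_EFFECTS) != 0 && !result_used;
}
```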

17 Oct, 2012 7 commits

tcg: Optimize mulu2 · 1414968a

Committed by Richard Henderson on Oct 02, 2012

Like add2, do operand ordering, constant folding, and dead operand
elimination.  The latter happens for about 15% of all mulu2 ops during an
x86_64 BIOS boot.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

1414968a
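
Constant folding for a double-word multiply, as mentioned in 1414968a, amounts to computing the full product and splitting it into two halves. The sketch assumes a 32-bit unsigned `mulu2` that yields a low and a high word; the struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t lo;   /* low 32 bits of the product */
    uint32_t hi;   /* high 32 bits of the product */
} Mulu2Result;

/* Fold mulu2 when both inputs are known constants. */
Mulu2Result fold_mulu2_i32(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;   /* widen first to keep all 64 bits */
    Mulu2Result r = { (uint32_t)product, (uint32_t)(product >> 32) };
    return r;
}
```

Dead operand elimination is the complementary case: when one of the two output words is never read, only the other half needs to be produced.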

tcg: Constant fold add2 and sub2 · 212c328d

Committed by Richard Henderson on Oct 02, 2012

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

212c328d

tcg: Do constant folding on double-word comparisons · 6c4382f8

Committed by Richard Henderson on Oct 02, 2012

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

6c4382f8

tcg: Split out subroutines from do_constant_folding_cond · 9519da7e

Committed by Richard Henderson on Oct 02, 2012

We can re-use these for implementing double-word folding.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

9519da7e

tcg: Optimize double-word comparisons against zero · bc1473ef

Committed by Richard Henderson on Oct 02, 2012

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

bc1473ef

tcg: Use common code when failing to optimize · 6e14e91b

Committed by Richard Henderson on Oct 02, 2012

This saves a whole lot of repetitive code sequences.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

6e14e91b

tcg: Swap commutative double-word comparisons · 0bfcb865

Committed by Richard Henderson on Oct 02, 2012

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

0bfcb865