- 20 2月, 2013 5 次提交
-
-
由 Richard Henderson 提交于
With this being all straight-line code, it can get deleted when the cc variables die. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Special case xor with self. We need not even store the known zero into cc_src. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
We weren't computing flags for lzcnt at all. At the same time, adjust the implementation of bsf/bsr to avoid the local branch, using movcond instead. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
And mark the helpers as NO_RWG_SE. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 19 2月, 2013 35 次提交
-
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Do all of group 17 at one time for ease. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
As this is the first of the BMI insns to be implemented, this carries quite a bit more baggage than normal. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
No actual required uses of these encodings yet. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Avoid duplicating switch statement between 32 and 64-bit modes. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Add another slot in ENV and store two of the three inputs. This lets us do less work when carry-out is not needed, and avoids the unpredictable CC_OP after translating these insns. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Pass the data in explicitly, rather than indirectly via env. This avoids all sorts of unnecessary register spillage. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
In preparation for making this a const helper. By using the proper types in the parameters to the helper functions, we get to avoid quite a lot of subsequent casting. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
After a comparison or subtraction, the original value of the LHS will currently be reconstructed using an addition. However, in most cases it is already available: store it in a temp-local variable and save 1 or 2 TCG ops (2 if the result of the addition needs to be extended). The temp-local can be declared dead as soon as the cc_op changes again, or also before the translation block ends because gen_prepare_cc will always make a copy before returning it. All this magic, plus copy propagation and dead-code elimination, ensures that the temp local will (almost) never be spilled. Example (cmp $0x21,%rax + jbe): Before After ---------------------------------------------------------------------------- movi_i64 tmp1,$0x21 movi_i64 tmp1,$0x21 movi_i64 cc_src,$0x21 movi_i64 cc_src,$0x21 sub_i64 cc_dst,rax,tmp1 sub_i64 cc_dst,rax,tmp1 add_i64 tmp7,cc_dst,cc_src movi_i32 cc_op,$0x11 movi_i32 cc_op,$0x11 brcond_i64 tmp7,cc_src,leu,$0x0 discard loc11 brcond_i64 rax,cc_src,leu,$0x0 Before After ---------------------------------------------------------------------------- mov (%r14),%rbp mov (%r14),%rbp mov %rbp,%rbx mov %rbp,%rbx sub $0x21,%rbx sub $0x21,%rbx lea 0x21(%rbx),%r12 movl $0x11,0xa0(%r14) movl $0x11,0xa0(%r14) movq $0x21,0x90(%r14) movq $0x21,0x90(%r14) mov %rbx,0x98(%r14) mov %rbx,0x98(%r14) cmp $0x21,%r12 | cmp $0x21,%rbp jbe ... jbe ... Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Placing the CC_OP_DYNAMIC at the join is less effective than before the branch, as the branch will have forced global registers to their home locations. This way we have a chance to discard CC_SRC2 before it gets stored. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
A jump that ends a basic block or otherwise falls back to CC_OP_DYNAMIC will always have to call gen_op_set_cc_op. However, not all jumps end a basic block, so introduce a variant that does not do this. This was partially undone earlier (i386: drop cc_op argument of gen_jcc1), redo it now also to prepare for the introduction of src2. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Replace low-level ops with a higher-level "cmp %al, (A0)" in the case of scas, and "cmp T0, (A0)" in the case of cmps. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Paolo Bonzini 提交于
It is almost unused, and it is simpler to pass a TCG value directly to gen_shiftd_rm_T1_T3. This value is then written to t2 without going through a temporary register. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Paolo Bonzini 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Paolo Bonzini 提交于
This simplifies all the jump generation code. CCPrepare allows the code to create an efficient brcond always, so there is no need to duplicate the setcc and jcc code. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This makes the i386 front-end able to create CCPrepare structs for all condition, not just those that come from a single flag. In particular, JCC_L and JCC_LE can be optimized because gen_prepare_cc is not forced to return a result in bit 0 (unlike gen_setcc_slow). However, for now the slow jcc operations will still go through CC computation in a single-bit temporary, followed by a brcond if the temporary is nonzero. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Introduce a struct that describes how to build a *cond operation that checks for a given x86 condition code. For now, just change gen_compute_eflags_* to return the new struct, generate code for the CCPrepare struct, and go on as before. [rth: Use ctz with the proper width rather than ffs.] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Paolo Bonzini 提交于
Reconstruct the arguments for complex conditions involving CC_OP_SUBx (BE, L, LE). In the others do it via setcond and gen_setcc_slow (which is not that slow in many cases). Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
And allow gen_setcc_slow to operate on cpu_cc_src. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This is looking at EFLAGS, but it can do so more efficiently with setcond. Reviewed-by: NBlue Swirl <blauwirbel@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Paolo Bonzini 提交于
Do not hard code the destination register. Reviewed-by: NBlue Swirl <blauwirbel@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Do the switch at translation time, converting the helper templates to TCG opcodes. In some cases CF can be computed with a single setcond, though others it may require a little more work. In the CC_OP_DYNAMIC case, compute the whole EFLAGS, same as for ZF/SF/PF. Reviewed-by: NBlue Swirl <blauwirbel@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Make gen_compute_eflags_z and gen_compute_eflags_s able to compute the inverted condition, and use this in gen_setcc_slow_T0. We cannot do it yet in gen_compute_eflags_c, but prepare the code for it anyway. It is not worthwhile for PF, as usual. shr+and+xor could be replaced by and+setcond. I'm not doing it yet. Reviewed-by: NBlue Swirl <blauwirbel@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
ZF, SF and PF can always be computed from CC_DST except in the CC_OP_EFLAGS case (and CC_OP_DYNAMIC, which just resolves to CC_OP_EFLAGS in gen_compute_eflags). Use setcond to compute ZF and SF. We could also use a table lookup to compute PF. Reviewed-by: NBlue Swirl <blauwirbel@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This gets us universal coverage, rather than scattering discards around at various places. As a bonus, we do not emit redundant discards e.g. between sequential logic insns. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This makes code more similar to the other callers of gen_eob, especially loopz/loopnz/jcxz. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-