1. 29 Apr, 2014 1 commit
  2. 19 Apr, 2014 2 commits
  3. 18 Feb, 2014 8 commits
  4. 26 Sep, 2013 1 commit
  5. 03 Sep, 2013 2 commits
  6. 09 May, 2013 1 commit
  7. 23 Mar, 2013 1 commit
  8. 24 Feb, 2013 2 commits
  9. 19 Jan, 2013 3 commits
    • optimize: optimize using nonzero bits · 633f6502
      Committed by Paolo Bonzini
      This adds two optimizations using the non-zero bit mask.  First, in
      some cases involving shifts or ANDs the value is known to become zero,
      and the operation can thus be optimized to a move of zero.  Second, a
      zero-extension or an AND with a constant is useless when it would only
      zero bits that are already zero, so it can be detected and replaced
      by a move.
      
      The main advantage of this optimization is that it turns zero-extensions
      into moves, thus enabling much better copy propagation (around 1% code
      reduction).  For example, here is "test $0xff0000,%ecx + je" before
      optimization:
      
       mov_i64 tmp0,rcx
       movi_i64 tmp1,$0xff0000
       discard cc_src
       and_i64 cc_dst,tmp0,tmp1
       movi_i32 cc_op,$0x1c
       ext32u_i64 tmp0,cc_dst
       movi_i64 tmp12,$0x0
       brcond_i64 tmp0,tmp12,eq,$0x0
      
      and after optimization (without the patch on the left, with it on the right):
      
       movi_i64 tmp1,$0xff0000                 movi_i64 tmp1,$0xff0000
       discard cc_src                          discard cc_src
       and_i64 cc_dst,rcx,tmp1                 and_i64 cc_dst,rcx,tmp1
       movi_i32 cc_op,$0x1c                    movi_i32 cc_op,$0x1c
       ext32u_i64 tmp0,cc_dst
       movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
       brcond_i64 tmp0,tmp12,eq,$0x0           brcond_i64 cc_dst,tmp12,eq,$0x0
      
      Other similar cases: "test %eax, %eax + jne" where eax is already 32-bit
      (after optimization, without patch on the left, with on the right):
      
       discard cc_src                          discard cc_src
       mov_i64 cc_dst,rax                      mov_i64 cc_dst,rax
       movi_i32 cc_op,$0x1c                    movi_i32 cc_op,$0x1c
       ext32u_i64 tmp0,cc_dst
       movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
       brcond_i64 tmp0,tmp12,ne,$0x0           brcond_i64 rax,tmp12,ne,$0x0
      
      "test $0x1, %dl + je":
      
       movi_i64 tmp1,$0x1                      movi_i64 tmp1,$0x1
       discard cc_src                          discard cc_src
       and_i64 cc_dst,rdx,tmp1                 and_i64 cc_dst,rdx,tmp1
       movi_i32 cc_op,$0x1a                    movi_i32 cc_op,$0x1a
       ext8u_i64 tmp0,cc_dst
       movi_i64 tmp12,$0x0                     movi_i64 tmp12,$0x0
       brcond_i64 tmp0,tmp12,eq,$0x0           brcond_i64 cc_dst,tmp12,eq,$0x0
      
      In some cases TCG even outsmarts GCC. :)  Here the input code has
      "and $0x2,%eax + movslq %eax,%rbx + test %rbx, %rbx" and the optimizer,
      thanks to copy propagation, does the following:
      
       movi_i64 tmp12,$0x2                     movi_i64 tmp12,$0x2
       and_i64 rax,rax,tmp12                   and_i64 rax,rax,tmp12
       mov_i64 cc_dst,rax                      mov_i64 cc_dst,rax
       ext32s_i64 tmp0,rax                  -> nop
       mov_i64 rbx,tmp0                     -> mov_i64 rbx,cc_dst
       and_i64 cc_dst,rbx,rbx               -> nop
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
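The two checks described in this commit message can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the code from the patch: the names `FoldKind` and `fold_and_with_mask` are hypothetical, and a zero-extension is modeled as an AND with a constant (0xffffffff for ext32u, 0xff for ext8u).

```c
#include <stdint.h>

/* Hypothetical outcome of trying to simplify "dst = src & and_mask". */
typedef enum {
    FOLD_NONE,     /* nothing is known: keep the operation as it is */
    FOLD_TO_ZERO,  /* the result is provably zero: emit a move of zero */
    FOLD_TO_MOVE   /* no new bits are cleared: emit a plain move */
} FoldKind;

/*
 * src_mask: a 0 bit means that bit of the source is known to be zero,
 *           a 1 bit means "unknown".
 * and_mask: the constant operand of the AND.
 */
static FoldKind fold_and_with_mask(uint64_t src_mask, uint64_t and_mask)
{
    /* Every bit that could be set in the source is cleared by the AND. */
    if ((src_mask & and_mask) == 0) {
        return FOLD_TO_ZERO;
    }
    /* The AND only clears bits that are already known to be zero. */
    if ((src_mask & ~and_mask) == 0) {
        return FOLD_TO_MOVE;
    }
    return FOLD_NONE;
}
```

In the first listing above, cc_dst is the result of an AND with 0xff0000, so its mask is 0xff0000; `fold_and_with_mask(0xff0000, 0xffffffff)` then reports that the ext32u clears nothing new, which matches the right-hand column where the ext32u disappears.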
    • optimize: track nonzero bits of registers · 3a9d8b17
      Committed by Paolo Bonzini
      Add a "mask" field to the tcg_temp_info struct.  A bit that is zero
      in "mask" will always be zero in the corresponding temporary.
      Zero bits in the mask can be produced by moves of immediates,
      zero-extensions, ANDs with constants, and shifts; they can then be
      propagated by logical operations, shifts, sign-extensions,
      negations, deposit operations, and conditional moves.  Other
      operations will just reset the mask to all-ones, i.e. unknown.
      
      [rth: s/target_ulong/tcg_target_ulong/]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
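As a rough sketch of how such a mask could be computed per operation, the function below handles a small, hypothetical opcode subset. The names `DemoOp`, `DEMO_MASK_UNKNOWN` and `demo_result_mask` are invented for illustration and are not the identifiers used in the patch; shift counts are assumed to be constants below 64.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones means "nothing is known about any bit". */
#define DEMO_MASK_UNKNOWN UINT64_MAX

/* Hypothetical subset of operations, for illustration only. */
typedef enum {
    DEMO_MOVI, DEMO_AND, DEMO_OR, DEMO_XOR,
    DEMO_SHL, DEMO_SHR, DEMO_EXT8U, DEMO_EXT32U, DEMO_ADD
} DemoOp;

/*
 * Returns the mask of the result.  A 0 bit in a mask means the bit is
 * known to be zero.  For DEMO_MOVI, a is the immediate; for the shifts,
 * b is the constant shift count; otherwise a and b are operand masks
 * (the mask of a constant operand is the constant itself).
 */
static uint64_t demo_result_mask(DemoOp op, uint64_t a, uint64_t b)
{
    switch (op) {
    case DEMO_MOVI:
        return a;                /* exactly the set bits of the immediate */
    case DEMO_AND:
        return a & b;            /* zero in either operand: zero in result */
    case DEMO_OR:
    case DEMO_XOR:
        return a | b;            /* zero only where both operands are zero */
    case DEMO_SHL:
        assert(b < 64);
        return a << b;           /* known zeros move with the value */
    case DEMO_SHR:
        assert(b < 64);
        return a >> b;           /* logical shift: high bits become zero */
    case DEMO_EXT8U:
        return a & 0xff;
    case DEMO_EXT32U:
        return a & 0xffffffffu;
    default:
        return DEMO_MASK_UNKNOWN; /* e.g. ADD: reset to unknown */
    }
}
```

The default branch mirrors the last sentence of the commit message: any operation without a dedicated rule simply resets the mask to all-ones.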
    • optimize: only write to state when clearing optimizer data · d193a14a
      Committed by Paolo Bonzini
      The next patch will add to the TCG optimizer a field that should be
      non-zero in the default case.  Thus, replace the memset of the
      temps array with a loop.  Only the state field has to be up-to-date,
      because the other fields are used only when the state is TCG_TEMP_COPY
      or TCG_TEMP_CONST.
      
      [rth: Extracted the loop to a function.]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
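The motivation for replacing memset with a loop can be illustrated with a small sketch. The struct layout and the names `DemoTempInfo` and `demo_reset_all_temps` are assumptions made for this example, not the definitions from the patch; the `mask` field stands in for the field with a non-zero default that the commit message says the next patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-temporary optimizer state. */
typedef enum {
    DEMO_TEMP_UNDEF = 0,
    DEMO_TEMP_CONST,
    DEMO_TEMP_COPY
} DemoTempState;

typedef struct {
    DemoTempState state;
    uint64_t val;      /* meaningful only when state is DEMO_TEMP_CONST */
    size_t copy_of;    /* meaningful only when state is DEMO_TEMP_COPY */
    uint64_t mask;     /* default is all-ones, so memset(0) would be wrong */
} DemoTempInfo;

/*
 * Reset every temporary.  Only the fields that matter in the default
 * state are written; val and copy_of are left stale on purpose because
 * nothing reads them while state is DEMO_TEMP_UNDEF.
 */
static void demo_reset_all_temps(DemoTempInfo *temps, size_t nb_temps)
{
    assert(temps != NULL || nb_temps == 0);
    for (size_t i = 0; i < nb_temps; i++) {
        temps[i].state = DEMO_TEMP_UNDEF;
        temps[i].mask = UINT64_MAX;
    }
}
```

A memset to zero would set `mask` to "all bits known zero", which is the opposite of "unknown"; the loop avoids that while touching fewer bytes per entry.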
  10. 17 Nov, 2012 1 commit
  11. 28 Oct, 2012 1 commit
    • tcg: rework TCG helper flags · 78505279
      Committed by Aurelien Jarno
      The current helper flags, TCG_CALL_CONST and TCG_CALL_PURE, can be
      confusing and do not provide enough granularity for some helpers (FP
      helpers for example).
      
      This patch changes them into the following helper flags:
      - TCG_CALL_NO_READ_GLOBALS means that the helper does not read globals,
        either directly or via an exception. They will not be saved to their
        canonical location before calling the helper.
      - TCG_CALL_NO_WRITE_GLOBALS means that the helper does not modify any
        globals. They will only be saved to their canonical locations before
        calling helpers, but they won't be reloaded afterwards.
      - TCG_CALL_NO_SIDE_EFFECTS means that the call to the function is
        removed if the return value is not used.
      
      It provides convenience flags, to avoid helper definitions longer than
      80 characters. It also provides compatibility flags, and updates the
      documentation.
      
      Reviewed-by: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
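The decisions these three flags drive can be modeled with a few predicates. This is a simplified sketch based only on the descriptions in the commit message: the `DEMO_` names and bit values are hypothetical and are not the values defined in QEMU, and the real code generator has more cases than this.

```c
#include <stdbool.h>

/* Hypothetical flag bits, chosen only for this example. */
enum {
    DEMO_CALL_NO_READ_GLOBALS  = 1 << 0,
    DEMO_CALL_NO_WRITE_GLOBALS = 1 << 1,
    DEMO_CALL_NO_SIDE_EFFECTS  = 1 << 2
};

/* Globals must be saved before the call unless the helper never reads them. */
static bool demo_must_save_globals(unsigned flags)
{
    return !(flags & DEMO_CALL_NO_READ_GLOBALS);
}

/* Globals must be reloaded after the call unless the helper never writes them. */
static bool demo_must_reload_globals(unsigned flags)
{
    return !(flags & DEMO_CALL_NO_WRITE_GLOBALS);
}

/* A call without side effects can be removed when its result is unused. */
static bool demo_can_drop_call(unsigned flags, bool result_used)
{
    return (flags & DEMO_CALL_NO_SIDE_EFFECTS) && !result_used;
}
```

Keeping the three properties as independent bits is what gives the extra granularity the commit message asks for: a helper can, for instance, read globals without writing them, which a single "pure" or "const" flag cannot express.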
  12. 17 Oct, 2012 9 commits
  13. 07 Oct, 2012 1 commit
  14. 22 Sep, 2012 7 commits