1. 25 Aug 2015, 14 commits
    • tcg/ppc: Improve unaligned load/store handling on 64-bit backend · 68d45bb6
      Benjamin Herrenschmidt committed
      Currently, we take the slow path for any unaligned access in the
      backend, because we effectively preserve the bottom address bits
      below the alignment requirement when comparing with the TLB entry,
      so any non-zero bit there causes the compare to fail.
      
      For the same number of instructions, we can instead add the access
      size - 1 to the address and keep clearing all the bottom bits.
      
      That means normal unaligned accesses no longer fall back (the HW
      handles them fine). Only when crossing a page boundary will we
      end up with a mismatch, because we then point into the next page,
      which cannot possibly be in the same TLB entry (see the sketch
      after this entry).
      Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Message-Id: <1437455978.5809.2.camel@kernel.crashing.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
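      A minimal sketch of the comparator trick, in C rather than the
      backend's generated code; PAGE_BITS and the function name are
      illustrative assumptions, not QEMU's actual macros:
      
          #include <stdint.h>
          
          #define PAGE_BITS 12
          #define PAGE_MASK (~(((uint64_t)1 << PAGE_BITS) - 1))
          
          /* Value compared against the page address in the TLB entry.
           * Adding size - 1 first means an access contained in one page
           * still yields that page's address, while a page-crossing
           * access carries into the page number and mismatches, so only
           * page-crossers take the slow path. */
          static uint64_t tlb_compare_value(uint64_t addr, unsigned size)
          {
              return (addr + size - 1) & PAGE_MASK;
          }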
    • tcg/i386: use softmmu fast path for unaligned accesses · 8cc580f6
      Aurelien Jarno committed
      Softmmu unaligned loads/stores currently go through the slow
      path for two reasons:
        - to support unaligned accesses on hosts with strict alignment
        - to correctly handle accesses crossing pages
      
      x86 is only concerned by the second reason. Unaligned accesses are
      avoided by compilers, but are not uncommon. We therefore would like
      to see them go through the fast path if they don't cross pages
      (see the sketch after this entry).
      
      For that we can use the fact that two adjacent TLB entries can't
      contain the same page. Therefore accessing the TLB entry
      corresponding to the first byte, but comparing its content to the
      page address of the last byte, ensures that we don't cross pages.
      We can do this check without adding more instructions in the TLB
      code (though increasing its length by one byte) by using the LEA
      instruction to combine the existing move with the size addition.
      
      On an x86-64 host, this gives a 3% boot time improvement for a powerpc
      guest and 4% for an x86-64 guest.
      
      [rth: Tidied calculation of the offset mask]
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Message-Id: <1436467197-2183-1-git-send-email-aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
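      A hedged sketch of the check in C (hypothetical names; the real
      code is emitted by the tcg/i386 backend, where the address copy
      and the size - 1 addition fuse into a single LEA):
      
          #include <stdint.h>
          
          #define PAGE_BITS 12
          #define TLB_BITS  8   /* illustrative TLB size */
          
          /* Index the TLB by the FIRST byte of the access, but compare
           * against the page address of the LAST byte.  Two adjacent TLB
           * entries can't hold the same page, so a page-crossing access
           * always mismatches and falls back to the slow path. */
          static int hits_fast_path(uint64_t addr, unsigned size,
                                    const uint64_t *tlb_page)
          {
              unsigned idx = (addr >> PAGE_BITS) & ((1u << TLB_BITS) - 1);
              uint64_t last = addr + size - 1;   /* one LEA on x86 */
              return (last & ~(((uint64_t)1 << PAGE_BITS) - 1))
                     == tlb_page[idx];
          }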
    • tcg: Remove tcg_gen_trunc_i64_i32 · ecc7b3aa
      Richard Henderson committed
      Replacing it with tcg_gen_extrl_i64_i32.
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32 · 609ad705
      Richard Henderson committed
      Rather than allow an arbitrary shift+trunc, only concern ourselves
      with the low and high parts.  This is all that was being used
      anyway (see the sketch after this entry).
      Signed-off-by: Richard Henderson <rth@twiddle.net>
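      The semantics of the two replacement ops, sketched in C (an
      illustration of the intent, not the TCG implementation):
      
          #include <stdint.h>
          
          /* extrl: extract the low 32 bits of a 64-bit value */
          static uint32_t extrl_i64_i32(uint64_t x) { return (uint32_t)x; }
          
          /* extrh: extract the high 32 bits of a 64-bit value */
          static uint32_t extrh_i64_i32(uint64_t x) { return (uint32_t)(x >> 32); }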
    • 870ad154
    • tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops · 8bcb5c8f
      Aurelien Jarno committed
      They behave the same as ext32s_i64 and ext32u_i64 from the constant
      folding and zero propagation point of view, except that they can't
      be replaced by a mov, so we don't compute the affected value
      (folding sketched after this entry).
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
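      How the constant-folding case plausibly looks, mirroring the
      existing ext32s/ext32u handling (illustrative C with hypothetical
      helper names, not the actual tcg/optimize.c code):
      
          #include <stdint.h>
          
          /* Folding a constant through the size-changing ext ops uses
           * the same arithmetic as the 64-bit ext32s/ext32u ops: */
          static uint64_t fold_ext_i32_i64(uint64_t x)   /* sign-extend */
          {
              return (uint64_t)(int64_t)(int32_t)x;
          }
          
          static uint64_t fold_extu_i32_i64(uint64_t x)  /* zero-extend */
          {
              return (uint32_t)x;
          }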
    • tcg: implement real ext_i32_i64 and extu_i32_i64 ops · 4f2331e5
      Aurelien Jarno committed
      Implement real ext_i32_i64 and extu_i32_i64 ops. They ensure that a
      32-bit value is always converted to a 64-bit value and not propagated
      through the register allocator or the optimizer.
      
      Cc: Andrzej Zaborowski <balrogg@gmail.com>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Blue Swirl <blauwirbel@gmail.com>
      Cc: Stefan Weil <sw@weilnetz.de>
      Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32 · 6acd2558
      Aurelien Jarno committed
      The tcg_gen_trunc_shr_i64_i32 function takes a 64-bit argument and
      returns a 32-bit value. Directly call tcg_gen_op3 with the correct
      types instead of calling tcg_gen_op3i_i32 and abusing the TCG types.
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • tcg: rename trunc_shr_i32 into trunc_shr_i64_i32 · 0632e555
      Aurelien Jarno committed
      The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32,
      and the name in the README doesn't match the name offered to the
      frontends.
      
      Always use the long name to make it clear it is a size-changing op.
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • tcg/optimize: allow constant to have copies · 299f8013
      Aurelien Jarno committed
      Now that copies and constants are tracked separately, we can allow
      constants to have copies, deferring the choice between using a
      register or a constant to the register allocation pass. This
      prevents this kind of repeated constant reloading:
      
      -OUT: [size=338]
      +OUT: [size=298]
         mov    -0x4(%r14),%ebp
         test   %ebp,%ebp
         jne    0x7ffbe9cb0ed6
         mov    $0x40002219f8,%rbp
         mov    %rbp,(%r14)
      -  mov    $0x40002219f8,%rbp
         mov    $0x4000221a20,%rbx
         mov    %rbp,(%rbx)
         mov    $0x4000000000,%rbp
         mov    %rbp,(%r14)
      -  mov    $0x4000000000,%rbp
         mov    $0x4000221d38,%rbx
         mov    %rbp,(%rbx)
         mov    $0x40002221a8,%rbp
         mov    %rbp,(%r14)
      -  mov    $0x40002221a8,%rbp
         mov    $0x4000221d40,%rbx
         mov    %rbp,(%rbx)
         mov    $0x4000019170,%rbp
         mov    %rbp,(%r14)
      -  mov    $0x4000019170,%rbp
         mov    $0x4000221d48,%rbx
         mov    %rbp,(%rbx)
         mov    $0x40000049ee,%rbp
         mov    %rbp,0x80(%r14)
         mov    %r14,%rdi
         callq  0x7ffbe99924d0
         mov    $0x4000001680,%rbp
         mov    %rbp,0x30(%r14)
         mov    0x10(%r14),%rbp
         mov    $0x4000001680,%rbp
         mov    %rbp,0x30(%r14)
         mov    0x10(%r14),%rbp
         shl    $0x20,%rbp
         mov    (%r14),%rbx
         mov    %ebx,%ebx
         mov    %rbx,(%r14)
         or     %rbx,%rbp
         mov    %rbp,0x10(%r14)
         mov    %rbp,0x90(%r14)
         mov    0x60(%r14),%rbx
         mov    %rbx,0x38(%r14)
         mov    0x28(%r14),%rbx
         mov    $0x4000220e60,%r12
         mov    %rbx,(%r12)
         mov    $0x40002219c8,%rbx
         mov    %rbp,(%rbx)
         mov    0x20(%r14),%rbp
         sub    $0x8,%rbp
         mov    $0x4000004a16,%rbx
         mov    %rbx,0x0(%rbp)
         mov    %rbp,0x20(%r14)
         mov    $0x19,%ebp
         mov    %ebp,0xa8(%r14)
         mov    $0x4000015110,%rbp
         mov    %rbp,0x80(%r14)
         xor    %eax,%eax
         jmpq   0x7ffbebcae426
         lea    -0x5f6d72a(%rip),%rax        # 0x7ffbe3d437b3
         jmpq   0x7ffbebcae426
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • tcg/optimize: track const/copy status separately · b41059dd
      Aurelien Jarno committed
      Instead of using an enum which could be either a copy or a const,
      track them separately. This will be used in the next patch.
      
      Constants are tracked through a bool. Copies are tracked by
      initializing a temp's next_copy and prev_copy to itself, which
      allows the code to be simplified a bit (see the sketch after this
      entry).
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
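      A hypothetical shape of the per-temp tracking state after this
      change (names mirror the commit message; the real definition lives
      in tcg/optimize.c):
      
          #include <stdbool.h>
          #include <stdint.h>
          
          struct tcg_temp_info {
              bool     is_const;              /* constness is just a bool */
              uint16_t next_copy, prev_copy;  /* circular copy list */
              uint64_t val;                   /* value when is_const */
          };
          
          /* Self-linking the copy list means "no copies" needs no
           * special case when walking or unlinking: */
          static void reset_temp(struct tcg_temp_info *ti, uint16_t idx)
          {
              ti->is_const = false;
              ti->next_copy = idx;
              ti->prev_copy = idx;
          }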
    • tcg/optimize: add temp_is_const and temp_is_copy functions · d9c769c6
      Aurelien Jarno committed
      Add two accessor functions, temp_is_const and temp_is_copy, to make
      the code more readable and future changes easier (see the sketch
      after this entry).
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
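      Given per-temp state like the struct sketched in the previous
      entry, the accessors are plausibly one-liners (illustrative, not
      the verbatim QEMU code):
      
          #include <stdbool.h>
          #include <stdint.h>
          
          struct tcg_temp_info { bool is_const; uint16_t next_copy, prev_copy; };
          static struct tcg_temp_info temps[512];
          
          static inline bool temp_is_const(unsigned arg)
          {
              return temps[arg].is_const;
          }
          
          static inline bool temp_is_copy(unsigned arg)
          {
              /* a temp with no copies links to itself */
              return temps[arg].next_copy != arg;
          }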
    • tcg/optimize: optimize temps tracking · 1208d7dd
      Aurelien Jarno committed
      The tcg_temp_info structure uses 24 bytes per temp. Now that we
      emulate vector registers on most guests, it's not uncommon to have
      more than 100 used temps. This means we have to initialize more
      than 2kB at least twice per TB, and often more when there are a
      few goto_tb ops.
      
      Instead, use a TCGTempSet bit array to track which temps are in use
      in the current basic block. This means there are only around 16
      bytes to initialize (see the sketch after this entry).
      
      This improves the boot time of a MIPS guest on an x86-64 host by
      around 7% and moves tcg_optimize out of the top of the profiler
      list.
      
      [rth: Handle TCG_CALL_DUMMY_ARG]
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
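      The lazy-init scheme, sketched with hypothetical names (QEMU's
      TCGTempSet is a similar bitmap type):
      
          #include <stdbool.h>
          #include <string.h>
          
          #define NB_TEMPS 128        /* illustrative */
          #define BITS_PER_LONG (8 * sizeof(unsigned long))
          
          static struct { bool is_const; /* ...24 bytes in total */ } temps[NB_TEMPS];
          static unsigned long temps_used[NB_TEMPS / BITS_PER_LONG];
          
          /* Per basic block: clear a few words of bitmap instead of
           * ~2kB of tcg_temp_info. */
          static void reset_all_temps(void)
          {
              memset(temps_used, 0, sizeof(temps_used));
          }
          
          /* Initialize a temp's info only the first time it is seen
           * in the current basic block. */
          static void init_temp_info(unsigned idx)
          {
              unsigned long bit = 1ul << (idx % BITS_PER_LONG);
              unsigned long *w = &temps_used[idx / BITS_PER_LONG];
              if (!(*w & bit)) {
                  *w |= bit;
                  temps[idx].is_const = false;
              }
          }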
    • tcg/optimize: fix constant signedness · 29f3ff8d
      Aurelien Jarno committed
      By convention, on a 64-bit host TCG internally stores 32-bit
      constants as sign-extended. This is not the case in the optimizer
      when a 32-bit constant is folded.
      
      This doesn't seem to have consequences beyond suboptimal code
      generation. For instance the x86 backend assumes sign-extended
      constants, and in some rare cases uses a 32-bit unsigned immediate
      0xffffffff instead of an 8-bit signed immediate 0xff for the
      constant -1. This is with a ppc guest (fix sketched after the
      dumps below):
      
      before
      ------
      
       ---- 0x9f29cc
       movi_i32 tmp1,$0xffffffff
       movi_i32 tmp2,$0x0
       add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
       add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
       mov_i32 r10,tmp0
      
      0x7fd8c7dfe90c:  xor    %ebp,%ebp
      0x7fd8c7dfe90e:  mov    %ebp,%r11d
      0x7fd8c7dfe911:  mov    0x18(%r14),%r9d
      0x7fd8c7dfe915:  add    %r9d,%r10d
      0x7fd8c7dfe918:  adc    %ebp,%r11d
      0x7fd8c7dfe91b:  add    $0xffffffff,%r10d
      0x7fd8c7dfe922:  adc    %ebp,%r11d
      0x7fd8c7dfe925:  mov    %r11d,0x134(%r14)
      0x7fd8c7dfe92c:  mov    %r10d,0x28(%r14)
      
      after
      -----
      
       ---- 0x9f29cc
       movi_i32 tmp1,$0xffffffffffffffff
       movi_i32 tmp2,$0x0
       add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
       add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
       mov_i32 r10,tmp0
      
      0x7f37010d490c:  xor    %ebp,%ebp
      0x7f37010d490e:  mov    %ebp,%r11d
      0x7f37010d4911:  mov    0x18(%r14),%r9d
      0x7f37010d4915:  add    %r9d,%r10d
      0x7f37010d4918:  adc    %ebp,%r11d
      0x7f37010d491b:  add    $0xffffffffffffffff,%r10d
      0x7f37010d491f:  adc    %ebp,%r11d
      0x7f37010d4922:  mov    %r11d,0x134(%r14)
      0x7f37010d4929:  mov    %r10d,0x28(%r14)
      Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
      Message-Id: <1436544211-2769-2-git-send-email-aurelien@aurel32.net>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
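      The fix, reduced to its arithmetic (an illustration, not the
      actual tcg/optimize.c change): after folding an op of 32-bit type,
      re-canonicalize the constant by sign-extending bit 31.
      
          #include <stdint.h>
          
          static uint64_t canonicalize_i32_const(uint64_t folded)
          {
              /* keep 32-bit constants sign-extended on a 64-bit host */
              return (uint64_t)(int64_t)(int32_t)folded;
          }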
  2. 20 Aug 2015, 1 commit
    • configure: Don't permit SDL or GTK on OSX · a30878e7
      Peter Maydell committed
      The cocoa GUI frontend assumes it is the only GUI (it redefines
      main() so it always gets control before the rest of QEMU), so
      it does not play well with other UIs like SDL or GTK. (For the
      most part, people building QEMU on OSX don't have the necessary
      dependencies available for configure to build those other front
      ends, so the problem usually goes unnoticed.)
      
      Make configure automatically disable the SDL and GTK front ends
      if the cocoa front end is enabled. (We were sort of attempting
      to do this for SDL before, but not in a way that worked very well.)
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
      Reviewed-by: John Arbuckle <programmingkidx@gmail.com>
      Message-id: 1439565052-3457-1-git-send-email-peter.maydell@linaro.org
  3. 19 Aug 2015, 14 commits
  4. 15 Aug 2015, 11 commits