1. 17 December 2018, 5 commits
  2. 27 September 2018, 1 commit
    • tcg/i386: fix vector operations on 32-bit hosts · 93bf9a42
      Authored by Roman Kapl
      The TCG backend uses LOWREGMASK to get the low 3 bits of register numbers.
      This was defined as a no-op for 32-bit x86, on the assumption that we only
      have eight registers anyway. That assumption no longer holds once we have xmm regs.
      
      Since LOWREGMASK was a no-op, xmm register indices were wrong in the opcodes
      and overflowed into other opcode fields, wreaking havoc.
      
      To trigger these problems, you can try running the "movi d8, #0x0" AArch64
      instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated,
      but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2".
      
      Fixes: 770c2fc7 ("Add vector operations")
      Signed-off-by: Roman Kapl <rka@sysgo.com>
      Message-Id: <20180824131734.18557-1-rka@sysgo.com>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
      93bf9a42
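      For illustration, a minimal sketch of the kind of fix this implies; the macro
      name LOWREGMASK comes from the message above, but the exact definitions below
      are an assumption about the i386 backend source, not a quote of the patch:

          /* Mask a TCG register number down to the 3 bits that fit in a
           * ModRM/opcode field; the high bit travels in the REX/VEX prefix. */

          /* Before: a no-op on 32-bit hosts.  Fine for the 8 GP registers, but
           * TCG's internal numbers for the xmm registers are larger than 7, so
           * the index spilled into neighbouring opcode bits. */
          #if TCG_TARGET_REG_BITS == 64
          # define LOWREGMASK(x)  ((x) & 7)
          #else
          # define LOWREGMASK(x)  (x)
          #endif

          /* After: always keep only the low 3 bits. */
          #define LOWREGMASK(x)   ((x) & 7)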
  3. 24 July 2018, 1 commit
  4. 16 June 2018, 2 commits
  5. 09 May 2018, 1 commit
    • tcg/i386: Fix dup_vec in non-AVX2 codepath · 7eb30ef0
      Authored by Peter Maydell
      The VPUNPCKLD* instructions are all "non-destructive source",
      indicated by "NDS" in the encoding string in the x86 ISA manual.
      This means that they take two source operands, one of which is
      encoded in the VEX.vvvv field. We were incorrectly treating them
      as if they were destructive-source and passing 0 as the 'v'
      argument of tcg_out_vex_modrm(). This meant we were always
      using %xmm0 as one of the source operands, causing incorrect
      results if the register allocator happened to want to use
      something else. For instance the input AArch64 insn:
       DUP v26.16b, w21
      which becomes TCG IR ops:
       dup_vec v128,e8,tmp2,x21
       st_vec v128,e8,tmp2,env,$0xa40
      was assembled to:
      0x607c568c:  c4 c1 7a 7e 86 e8 00 00  vmovq    0xe8(%r14), %xmm0
      0x607c5694:  00
      0x607c5695:  c5 f9 60 c8              vpunpcklbw %xmm0, %xmm0, %xmm1
      0x607c5699:  c5 f9 61 c9              vpunpcklwd %xmm1, %xmm0, %xmm1
      0x607c569d:  c5 f9 70 c9 00           vpshufd  $0, %xmm1, %xmm1
      0x607c56a2:  c4 c1 7a 7f 8e 40 0a 00  vmovdqu  %xmm1, 0xa40(%r14)
      0x607c56aa:  00
      
      when the vpunpcklwd insn should be "%xmm1, %xmm1, %xmm1".
      This resulted in our incorrectly setting the output vector to
      q26=0000320000003200:0000320000003200
      when given an input of x21 == 0000000002803200
      rather than the expected all-zeroes.
      
      Pass the correct source register number to tcg_out_vex_modrm()
      for these insns.
      
      Fixes: 770c2fc7
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-Id: <20180504153431.5169-1-peter.maydell@linaro.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
      7eb30ef0
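      A minimal sketch of the change described above, assuming the helper has the
      shape tcg_out_vex_modrm(s, opc, r, v, rm) with 'v' filling the VEX.vvvv
      field; the surrounding names (OPC_PUNPCKLBW, r, a) are illustrative, not a
      quote of the patch:

          /* dup_vec expansion, reconstructed for illustration.
           * r = destination vector reg, a = source reg holding the value. */

          /* Before: VEX.vvvv hard-coded to 0, silently making %xmm0 the second
           * source of the non-destructive-source VPUNPCKL* instructions. */
          tcg_out_vex_modrm(s, OPC_PUNPCKLBW, r, 0, a);

          /* After: pass the real source register in the 'v' slot. */
          tcg_out_vex_modrm(s, OPC_PUNPCKLBW, r, a, a);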
  6. 16 March 2018, 1 commit
  7. 08 February 2018, 1 commit
    • tcg/i386: Add vector operations · 770c2fc7
      Authored by Richard Henderson
      The x86 vector instruction set is extremely irregular.  With newer
      editions, Intel has filled in some of the blanks.  However, we don't
      get many 64-bit operations until SSE4.2, introduced in 2009.
      
      The subsequent edition was for AVX1, introduced in 2011, which added
      three-operand addressing and adjusted how all instructions are encoded.
      
      Given the relatively narrow two-year window between when support becomes
      possible and when it becomes desirable, and to vastly simplify code
      maintenance, I am only planning to support AVX1 and later CPUs.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
      770c2fc7
  8. 10 October 2017, 1 commit
  9. 17 September 2017, 2 commits
  10. 08 September 2017, 2 commits
  11. 24 July 2017, 1 commit
  12. 06 June 2017, 1 commit
  13. 18 January 2017, 2 commits
  14. 11 January 2017, 8 commits
  15. 10 January 2017, 1 commit
  16. 21 September 2016, 1 commit
  17. 16 September 2016, 2 commits
  18. 06 July 2016, 2 commits
    • tcg: Improve the alignment check infrastructure · 1f00b27f
      Authored by Sergey Sorokin
      Some architectures (e.g. ARMv8) require an address to be aligned to a size
      larger than the size of the memory access itself. QEMU's current zero-cost
      alignment check implementation is enough to perform such a check, but we
      need a way to specify the alignment size.
      Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
      Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      [rth: Assert in tcg_canonicalize_memop.  Leave get_alignment_bits
      available for, though unused by, user-mode.  Retain logging difference
      based on ALIGNED_ONLY.]
      1f00b27f
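      As a rough sketch of what "specifying the alignment size" can look like, the
      alignment can be carried as a small power-of-two exponent inside the memop
      flags. get_alignment_bits is named in the note above; the field layout,
      constant names, and simplified semantics below are assumptions for
      illustration only:

          /* A few otherwise unused memop bits hold log2(required alignment). */
          #define MO_ASHIFT   4
          #define MO_AMASK    (7 << MO_ASHIFT)   /* 3-bit alignment exponent  */
          #define MO_ALIGN_2  (1 << MO_ASHIFT)   /* require 2-byte alignment  */
          #define MO_ALIGN_16 (4 << MO_ASHIFT)   /* require 16-byte alignment */

          /* Simplified: 0 means no alignment requirement beyond the access. */
          static inline unsigned get_alignment_bits(unsigned memop)
          {
              return (memop & MO_AMASK) >> MO_ASHIFT;
          }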
    • tcg: Optimize spills of constants · 59d7c14e
      Authored by Richard Henderson
      While we can store constants via constraints on INDEX_op_st_i32 et al,
      we weren't able to spill constants to backing store.
      
      Add a new backend interface, tcg_out_sti, which may store the constant
      (and is allowed to fail).  Rearrange the temp_* helpers so that we only
      attempt to directly store a constant when the temp is becoming dead/free.
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      59d7c14e
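      A hedged sketch of that interface for the i386 backend; the name tcg_out_sti
      comes from the message above, while the signature, the use of the backend's
      existing emit helpers (tcg_out_modrm_offset, tcg_out32), and the opcode
      constants are a reconstruction, not a quote of the patch:

          /* Try to store constant 'val' directly to ofs(base).  Return false if
           * it cannot be encoded, so the caller falls back to loading the
           * constant into a register and spilling that register instead. */
          static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
                                  TCGReg base, intptr_t ofs)
          {
              int rexw = 0;

              if (type == TCG_TYPE_I64) {
                  if (val != (int32_t)val) {
                      return false;   /* doesn't fit a sign-extended imm32 */
                  }
                  rexw = P_REXW;      /* promote the store to 64 bits */
              }
              /* mov $imm32, ofs(base) */
              tcg_out_modrm_offset(s, OPC_MOVL_EvIz | rexw, 0, base, ofs);
              tcg_out32(s, val);
              return true;
          }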
  19. 13 May 2016, 2 commits
  20. 21 April 2016, 2 commits
  21. 24 February 2016, 1 commit