Commits · 1813e1758dcc60c96f8caf2d0c66c2193f1f86e0 · openeuler / qemu

13 May, 2014 · 2 commits

tcg: Define tcg_insn_unit for code pointers · 1813e175

Committed by Richard Henderson on Mar 28, 2014

To be defined by the tcg backend based on the elemental unit of the ISA.
During the transition, allow TCG_TARGET_INSN_UNIT_SIZE to be undefined,
which allows us to default tcg_insn_unit to the current uint8_t.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

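The selection described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual QEMU header: only `TCG_TARGET_INSN_UNIT_SIZE` and `tcg_insn_unit` come from the message, while the accepted sizes and the helper `insn_units_between` are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* A backend would define this to match its ISA; uncomment to model a
 * fixed-width 32-bit ISA (hypothetical setting for this sketch). */
/* #define TCG_TARGET_INSN_UNIT_SIZE 4 */

#ifndef TCG_TARGET_INSN_UNIT_SIZE
typedef uint8_t tcg_insn_unit;      /* transitional default: byte units */
#elif TCG_TARGET_INSN_UNIT_SIZE == 1
typedef uint8_t tcg_insn_unit;
#elif TCG_TARGET_INSN_UNIT_SIZE == 2
typedef uint16_t tcg_insn_unit;
#elif TCG_TARGET_INSN_UNIT_SIZE == 4
typedef uint32_t tcg_insn_unit;
#else
#error "unsupported TCG_TARGET_INSN_UNIT_SIZE"
#endif

/* Hypothetical helper: distance between two code pointers, counted in
 * instruction units rather than bytes. */
static ptrdiff_t insn_units_between(const tcg_insn_unit *from,
                                    const tcg_insn_unit *to)
{
    return to - from;
}
```

With the macro left undefined, code pointers keep their existing byte granularity, which is what lets backends convert one at a time.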

exec-all.h: Use stl_p to avoid undefined behaviour patching x86 jumps · 86360ad7

Committed by Peter Maydell on Mar 28, 2014

The code which patches x86 jump instructions assumes it can do an
unaligned write of a uint32_t. This is actually safe on x86, but it's
still undefined behaviour. We have infrastructure for doing efficient
unaligned accesses which doesn't engage in undefined behaviour, so
use it.

This is technically fractionally less efficient, at least with gcc 4.6;
instead of one instruction:
 7b2:   89 3e                   mov    %edi,(%rsi)
we get an extra spurious store to the stack slot:
 7b2:   89 7c 24 64             mov    %edi,0x64(%rsp)
 7b6:   89 3e                   mov    %edi,(%rsi)
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

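One common way to do an unaligned 32-bit store without undefined behaviour is to copy the bytes with `memcpy`, which compilers typically lower to a single move on x86. The sketch below illustrates that idea only. It is not QEMU's `stl_p` implementation, and the names `store_u32_unaligned`, `load_u32_unaligned` and `patch_rel32` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Alignment-safe 32-bit store in host byte order. */
static inline void store_u32_unaligned(void *ptr, uint32_t v)
{
    memcpy(ptr, &v, sizeof(v));
}

/* Matching alignment-safe load, used here only to inspect the result. */
static inline uint32_t load_u32_unaligned(const void *ptr)
{
    uint32_t v;
    memcpy(&v, ptr, sizeof(v));
    return v;
}

/* Hypothetical patch routine: rewrite the rel32 operand of an x86 jump.
 * The displacement is relative to the end of the 4-byte operand. */
static void patch_rel32(uint8_t *jmp_addr, uintptr_t target)
{
    uint32_t disp = (uint32_t)(target - ((uintptr_t)jmp_addr + 4));
    store_u32_unaligned(jmp_addr, disp);
}
```

The cast-and-dereference form `*(uint32_t *)jmp_addr = ...` performs the same store on x86, but the compiler is entitled to assume the pointer is aligned, which is the undefined behaviour the commit removes.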

18 Mar, 2014 · 1 commit

exec-all.h: Increase MAX_OP_PER_INSTR for ARM A64 decoder · 14dcdac8

Committed by Peter Maydell on Mar 17, 2014

The ARM A64 decoder's worst case number of TCG ops per instruction
is 266 (for insn 0x4c800000, a post-indexed ST4 multiple-structures
store). Raise the MAX_OP_PER_INSTR define accordingly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1394822294-14837-17-git-send-email-peter.maydell@linaro.org

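A per-instruction worst case matters because a translator has to stop filling its op buffer while there is still room for one more full instruction. The sketch below shows that headroom check under stated assumptions: the value 266 comes from the commit message, while the buffer size of 640 and the names `OPC_BUF_SIZE`, `OPC_MAX_SIZE` and `has_room_for_next_insn` are illustrative.

```c
#include <stdbool.h>

/* Worst-case TCG ops for one guest instruction, per the commit message. */
#define MAX_OP_PER_INSTR 266

/* Assumed op buffer capacity, chosen only for illustration. */
#define OPC_BUF_SIZE 640

/* Translation stops once fewer than MAX_OP_PER_INSTR slots remain. */
#define OPC_MAX_SIZE (OPC_BUF_SIZE - MAX_OP_PER_INSTR)

/* Hypothetical check a translation loop could make before each guest
 * instruction: is there room for a worst-case expansion? */
static bool has_room_for_next_insn(int ops_emitted)
{
    return ops_emitted < OPC_MAX_SIZE;
}
```

If the define were smaller than the true worst case, an instruction started near the threshold could overrun the buffer, which is why the value has to track the decoder's largest expansion.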

14 Mar, 2014 · 10 commits

cputlb: Change tlb_set_page() argument to CPUState · 0c591eb0

Committed by Andreas Färber on Sep 03, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

cputlb: Change tlb_flush() argument to CPUState · 00c8cb0a

Committed by Andreas Färber on Sep 04, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

cputlb: Change tlb_flush_page() argument to CPUState · 31b030d4

Committed by Andreas Färber on Sep 04, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

cpu-exec: Change cpu_resume_from_signal() argument to CPUState · 0ea8cb88

Committed by Andreas Färber on Sep 03, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

translate-all: Change tb_gen_code() argument to CPUState · 648f034c

Committed by Andreas Färber on Sep 01, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

translate-all: Change cpu_io_recompile() argument to CPUState · 90b40a69

Committed by Andreas Färber on Sep 01, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

translate-all: Change cpu_restore_state() argument to CPUState · 3f38f309

Committed by Andreas Färber on Sep 01, 2013

This lets us drop some local variables in tlb_fill() functions.
Signed-off-by: Andreas Färber <afaerber@suse.de>

cpu-exec: Change cpu_loop_exit() argument to CPUState · 5638d180

Committed by Andreas Färber on Aug 27, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

exec: Change tlb_fill() argument to CPUState · d5a11fef

Committed by Andreas Färber on Aug 27, 2013

Signed-off-by: Andreas Färber <afaerber@suse.de>

cpu: Move can_do_io field from CPU_COMMON to CPUState · 99df7dce

Committed by Andreas Färber on Aug 26, 2013

Rename can_do_io() to cpu_can_do_io() and change argument to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>

11 Feb, 2014 · 3 commits

cpu: Add per-cpu address space · 09daed84

Committed by Edgar E. Iglesias on Dec 17, 2013

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

exec: Make iotlb_to_region input an AS · 77717094

Committed by Edgar E. Iglesias on Nov 07, 2013

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

exec: Make tb_invalidate_phys_addr input an AS · 29d8ec7b

Committed by Edgar E. Iglesias on Nov 07, 2013

No functional change.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

18 Jan, 2014 · 1 commit

kvm: initialize qemu_host_page_size · 47c16ed5

Committed by Alexey Kardashevskiy on Jan 17, 2014

There is a HOST_PAGE_ALIGN macro which makes sense for the KVM accelerator,
but it uses qemu_host_page_size/qemu_host_page_mask, which are initialized
for TCG only.

This moves qemu_host_page_size/qemu_host_page_mask initialization from
TCG's page_init() and adds a call to it from kvm_init().
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

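The dependency described here can be sketched as a round-up-to-page helper that only works once the size and mask have been set. This is an illustration under assumptions, not the QEMU source: `host_page_size`, `host_page_mask`, `host_page_init` and `host_page_align` are stand-in names, and the page size is passed in rather than queried from the host.

```c
#include <assert.h>
#include <stdint.h>

static uintptr_t host_page_size;   /* stand-in for qemu_host_page_size */
static uintptr_t host_page_mask;   /* stand-in for qemu_host_page_mask */

/* Hypothetical shared init, callable from any accelerator's setup path. */
static void host_page_init(uintptr_t page_size)
{
    /* The mask trick below requires a power-of-two page size. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    host_page_size = page_size;
    host_page_mask = ~(page_size - 1);
}

/* Round addr up to the next host page boundary. */
static uintptr_t host_page_align(uintptr_t addr)
{
    return (addr + host_page_size - 1) & host_page_mask;
}
```

Before `host_page_init` runs, both variables are zero and `host_page_align` returns 0 for every input, which is the kind of silent misbehaviour an accelerator would see if the initialization stayed tied to TCG.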

14 Oct, 2013 · 1 commit

exec: Fix prototype of phys_mem_set_alloc and related functions · 575ddeb4

Committed by Stefan Weil on Sep 29, 2013

phys_mem_alloc and its assigned values qemu_anon_ram_alloc and
legacy_s390_alloc must have identical argument lists.

legacy_s390_alloc uses the size parameter to call mmap, so size_t is
good enough for all of them.

This patch fixes compiler errors on i686 Linux hosts:

  CC    alpha-softmmu/exec.o
exec.c:752:51: error:
 initialization from incompatible pointer type [-Werror]
exec.c: In function 'qemu_ram_alloc_from_ptr':
exec.c:1139:32: error:
 comparison of distinct pointer types lacks a cast [-Werror]
exec.c: In function 'qemu_ram_remap':
exec.c:1283:21: error:
 comparison of distinct pointer types lacks a cast [-Werror]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1380481005-32399-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>

11 Oct, 2013 · 2 commits

exec: Delete is_tcg_gen_code and GETRA_EXT · dbdbe0cd

Committed by Richard Henderson on Sep 03, 2013

All implementations now boil down to GETRA.
Signed-off-by: Richard Henderson <rth@twiddle.net>

tcg-aarch64: Update to helper_ret_*_mmu routines · 023261ef

Committed by Richard Henderson on Oct 01, 2013

A minimal update to use the new helpers with the return address argument.
Tested-by: Claudio Fontana <claudio.fontana@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

02 Oct, 2013 · 1 commit

tcg-arm: Rearrange slow-path qemu_ld/st · d9f4dde4

Committed by Richard Henderson on Jul 27, 2013

Use the new helper_ret_*_mmu routines.  Use a conditional call
to arrange for a tail-call from the store path, and to load the
return address for the helper for the load path.
Signed-off-by: Richard Henderson <rth@twiddle.net>

25 Sep, 2013 · 1 commit

tcg-ppc: use new return-argument ld/st helpers · 619f90ba

Committed by Paolo Bonzini on Sep 05, 2013

These use a 32-bit load-of-immediate to save a mflr+addi+mtlr sequence.
Tested with a Windows 98 guest (pretty much the most recent thing I
could run on my PPC machine) and kvm-unit-tests's sieve.flat.  The
speedup for sieve.flat is as high as 10% for qemu-system-i386, 25%
(no kidding) for qemu-system-x86_64 on my PowerBook G4.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

13 Sep, 2013 · 1 commit

exec: Simplify the guest physical memory allocation hook · 91138037

Committed by Markus Armbruster on Jul 31, 2013

Make it a generic hook rather than a KVM hook.  Less code and
ifdeffery.

Since the only user of the hook is old S390 KVM, there's hope we can
get rid of it some day.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>

03 Sep, 2013 · 2 commits

exec: Split softmmu_defs.h · e58eb534

Committed by Richard Henderson on Aug 27, 2013

The _cmmu helpers can be moved to exec-all.h.  The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.

This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

exec: Reorganize the GETRA/GETPC macros · 0f842f8a

Committed by Richard Henderson on Aug 27, 2013

Always define GETRA; use __builtin_extract_return_addr, rather than
having a special case for s390.  Split GETPC_ADJ out of GETPC; use 2
universally, rather than having a special case for arm.

Rename GETPC_LDST to GETRA_LDST to indicate that it does not
contain the GETPC_ADJ value.  Likewise with GETPC_EXT to GETRA_EXT.

Perform the GETPC_ADJ adjustment inside helper_ret_ld/st.  This will
allow backends to pass along the "true" return address rather than
the massaged GETPC value.  In the meantime, double application of
GETPC_ADJ does not hurt, since the call insn in all ISAs is at least
4 bytes long.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

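The macro layering described in this message might look roughly like the sketch below. It assumes a GCC or Clang host, since both builtins are compiler extensions, and it omits any interpreter-specific variant. `GETRA`, `GETPC` and `GETPC_ADJ` are named in the message; `sample_helper` is a hypothetical function added for demonstration.

```c
#include <stdint.h>

/* Return address of the current function, with any ABI-specific
 * encoding stripped by the compiler builtin (GCC/Clang only). */
#define GETRA() \
    ((uintptr_t)__builtin_extract_return_addr(__builtin_return_address(0)))

/* Step back from the return address into the call instruction itself. */
#define GETPC_ADJ 2
#define GETPC() (GETRA() - GETPC_ADJ)

/* Hypothetical helper that reports both values for its own call site. */
__attribute__((noinline))
static void sample_helper(uintptr_t *ra, uintptr_t *pc)
{
    *ra = GETRA();
    *pc = GETPC();
}
```

Applying the adjustment twice moves the address back 4 bytes, which still lands within a call instruction of at least 4 bytes. That is why the message calls the temporary double application harmless.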

30 Aug, 2013 · 1 commit

tcg-i386: Remove abort from GETPC_LDST · 584950fd

Committed by Richard Henderson on Aug 29, 2013

Indeed, remove it entirely and remove the is_tcg_gen_code check
from GETPC_EXT.

Fixes https://bugs.launchpad.net/qemu/+bug/1218098 wherein a call
to a "normal" helper function performed a sequence of tail calls
all the way into the memory helper functions, leading to a stack
frame in which the memory helper function appeared to be called
directly from tcg.
Signed-off-by: Richard Henderson <rth@twiddle.net>

27 Aug, 2013 · 1 commit

tcg-i386: Use new return-argument ld/st helpers · 401c227b

Committed by Richard Henderson on Jul 25, 2013

Discontinue the jump-around-jump-to-jump scheme, trading it for a single
immediate move instruction.  The two extra jumps always consume 7 bytes,
whereas the immediate move is either 5 or 7 bytes depending on where the
code_gen_buffer gets located.
Signed-off-by: Richard Henderson <rth@twiddle.net>

15 Jul, 2013 · 1 commit

tcg/aarch64: Implement tlb lookup fast path · c6d8ed24

Committed by Jani Kokkonen on Jul 10, 2013

Supports CONFIG_QEMU_LDST_OPTIMIZATION
Signed-off-by: Jani Kokkonen <jani.kokkonen@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>

12 Jun, 2013 · 1 commit

tcg/aarch64: implement new TCG target for aarch64 · 4a136e0a

Committed by Claudio Fontana on Jun 12, 2013

add preliminary support for TCG target aarch64.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 51A5C596.3090108@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

29 May, 2013 · 1 commit

memory: propagate errors on I/O dispatch · 791af8c8

Committed by Paolo Bonzini on May 24, 2013

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

27 Apr, 2013 · 1 commit

tcg-arm: Convert to CONFIG_QEMU_LDST_OPTIMIZATION · df5e0ef7

Committed by Richard Henderson on Mar 13, 2013

Move the slow path out of line, as the TODOs mention.
This allows the fast path to be unconditional, which can
speed up the fast path as well, depending on the core.
Signed-off-by: Richard Henderson <rth@twiddle.net>

16 Feb, 2013 · 2 commits

cpu: Move current_tb field to CPUState · d77953b9

Committed by Andreas Färber on Jan 16, 2013

Explicitly NULL it on CPU reset since it was located before breakpoints.

Change vapic_report_tpr_access() argument to CPUState. This also
resolves the use of void* for cpu.h independence.
Change vAPIC patch_instruction() argument to X86CPU.
Signed-off-by: Andreas Färber <afaerber@suse.de>

TCG: Move translation block variables to new context inside tcg_ctx: tb_ctx · 5e5f07e0

Committed by Evgeny Voevodin on Feb 01, 2013

It's worth cleaning up the translation block variables and moving them
into one context, as was suggested by Swirl.
Also, if we use this context directly inside tcg_ctx, then it
speeds up code generation a bit.
Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

20 Jan, 2013 · 1 commit

tci: Fix broken build (regression) · b54c2873

Committed by Stefan Weil on Jan 19, 2013

s390x-linux-user now also uses GETPC. Instead of adding it to the list of
targets which use GETPC, the macro is now defined unconditionally.

This avoids future build regressions like this one:

  CC    s390x-linux-user/target-s390x/int_helper.o
cc1: warnings being treated as errors
qemu/target-s390x/int_helper.c: In function ‘helper_divs32’:
qemu/target-s390x/int_helper.c:47: error: implicit declaration of function ‘GETPC’
qemu/target-s390x/int_helper.c:47: error: nested extern declaration of ‘GETPC’
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

19 Dec, 2012 · 2 commits

misc: move include files to include/qemu/ · 1de7afc9

Committed by Paolo Bonzini on Dec 17, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

exec: move include files to include/exec/ · 022c62cb

Committed by Paolo Bonzini on Dec 17, 2012

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

16 Dec, 2012 · 1 commit

exec: refactor cpu_restore_state · a8a826a3

Committed by Blue Swirl on Dec 04, 2012

Refactor common code around calls to cpu_restore_state().

tb_find_pc() now has no external users, so make it static.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

08 Dec, 2012 · 1 commit

TCG: Remove unused global gen_opc_ arrays. · 94788f54

Committed by Evgeny Voevodin on Nov 21, 2012

Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

19 Nov, 2012 · 1 commit

tci: fix build breakage for target MIPS · de91f537

Committed by Stefan Weil on Nov 18, 2012

commit 5f7319cd introduced GETPC() usage for MIPS, which is currently
not defined when building with --enable-tcg-interpreter. Add MIPS to
the list of targets we selectively define GETPC() for.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

06 Nov, 2012 · 1 commit

tcg/ppc32: Use trampolines to trim the code size for mmu slow path accessors · c878da3b

Committed by malc on Nov 05, 2012

mmu access looks something like:

<check tlb>
if miss goto slow_path
<fast path>
done:
...

; end of the TB
slow_path:
 <pre process>
 mr r3, r27         ; move areg0 to r3
                    ; (r3 holds the first argument for all the PPC32 ABIs)
 <call mmu_helper>
 b $+8
 .long done
 <post process>
 b done

On ppc32 <call mmu_helper> is:

(SysV and Darwin)

mmu_helper is most likely not within direct branching distance from
the call site, necessitating

a. moving 32 bit offset of mmu_helper into a GPR ; 8 bytes
b. moving GPR to CTR/LR                          ; 4 bytes
c. (finally) branching to CTR/LR                 ; 4 bytes

r3 setting              - 4 bytes
call                    - 16 bytes
dummy jump over retaddr - 4 bytes
embedded retaddr        - 4 bytes
         Total overhead - 28 bytes

(PowerOpen (AIX))
a. moving 32 bit offset of mmu_helper's TOC into a GPR1 ; 8 bytes
b. loading 32 bit function pointer into GPR2            ; 4 bytes
c. moving GPR2 to CTR/LR                                ; 4 bytes
d. loading 32 bit small area pointer into R2            ; 4 bytes
e. (finally) branching to CTR/LR                        ; 4 bytes

r3 setting              - 4 bytes
call                    - 24 bytes
dummy jump over retaddr - 4 bytes
embedded retaddr        - 4 bytes
         Total overhead - 36 bytes

Following is done to trim the code size of slow path sections:

In tcg_target_qemu_prologue trampolines are emitted that look like this:

trampoline:
mfspr r3, LR
addi  r3, 4
mtspr LR, r3      ; fixup LR to point over embedded retaddr
mr    r3, r27
<jump mmu_helper> ; tail call of sorts

And slow path becomes:

slow_path:
 <pre process>
 <call trampoline>
 .long done
 <post process>
 b done

call                    - 4 bytes (trampoline is within code gen buffer
                                   and most likely accessible via
                                   direct branch)
embedded retaddr        - 4 bytes
         Total overhead - 8 bytes

In the end the icache pressure is decreased by 20/28 bytes at the cost
of an extra jump to trampoline and adjusting LR (to skip over embedded
retaddr) once inside.
Signed-off-by: malc <av1474@comtv.ru>

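The byte accounting in this message can be checked with a few lines of arithmetic. The sketch below encodes the per-component sizes quoted in the message. The struct and the names `slow_path_cost`, `total_overhead`, `sysv_inline`, `aix_inline` and `via_trampoline` are hypothetical and exist only to make the sums explicit.

```c
/* Byte counts taken from the commit message; every PPC32 insn is 4 bytes. */
struct slow_path_cost {
    int set_r3;     /* mr r3, r27 */
    int call;       /* sequence that reaches mmu_helper or the trampoline */
    int jump_over;  /* b $+8 over the embedded return address */
    int retaddr;    /* .long done */
};

static int total_overhead(struct slow_path_cost c)
{
    return c.set_r3 + c.call + c.jump_over + c.retaddr;
}

/* SysV and Darwin: load address (8) + move to CTR/LR (4) + branch (4). */
static const struct slow_path_cost sysv_inline = { 4, 8 + 4 + 4, 4, 4 };

/* PowerOpen (AIX): adds a function pointer load and a TOC pointer load. */
static const struct slow_path_cost aix_inline = { 4, 8 + 4 + 4 + 4 + 4, 4, 4 };

/* With trampolines: one direct branch plus the embedded return address. */
static const struct slow_path_cost via_trampoline = { 0, 4, 0, 4 };
```

The totals come to 28, 36 and 8 bytes, so each slow path shrinks by 20 bytes on SysV and Darwin and by 28 bytes on AIX, matching the "20/28 bytes" figure in the message.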