- 13 May, 2014 2 commits
-
-
Committed by Richard Henderson
To be defined by the tcg backend based on the elemental unit of the ISA. During the transition, allow TCG_TARGET_INSN_UNIT_SIZE to be undefined, which allows us to default tcg_insn_unit to the current uint8_t.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
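The transitional default described in this commit message can be sketched with the preprocessor. This is only an illustration: the mapping from each size to a fixed-width type is an assumption, and only the uint8_t fallback is stated in the message.

```c
#include <stdint.h>

/* A backend would define TCG_TARGET_INSN_UNIT_SIZE; it is left undefined
   here to exercise the transitional default from the commit message. */
/* #define TCG_TARGET_INSN_UNIT_SIZE 4 */

#if !defined(TCG_TARGET_INSN_UNIT_SIZE)
typedef uint8_t tcg_insn_unit;      /* transitional default */
#elif TCG_TARGET_INSN_UNIT_SIZE == 1
typedef uint8_t tcg_insn_unit;      /* assumed mapping */
#elif TCG_TARGET_INSN_UNIT_SIZE == 2
typedef uint16_t tcg_insn_unit;     /* assumed mapping */
#elif TCG_TARGET_INSN_UNIT_SIZE == 4
typedef uint32_t tcg_insn_unit;     /* assumed mapping */
#elif TCG_TARGET_INSN_UNIT_SIZE == 8
typedef uint64_t tcg_insn_unit;     /* assumed mapping */
#else
#error "unsupported TCG_TARGET_INSN_UNIT_SIZE"
#endif
```

With the define left out, code that treats the instruction stream as bytes keeps compiling, which is presumably what makes a backend-by-backend transition possible.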
-
Committed by Peter Maydell
The code which patches x86 jump instructions assumes it can do an unaligned write of a uint32_t. This is actually safe on x86, but it's still undefined behaviour. We have infrastructure for doing efficient unaligned accesses which doesn't engage in undefined behaviour, so use it.

This is technically fractionally less efficient, at least with gcc 4.6; instead of one instruction:

    7b2: 89 3e          mov %edi,(%rsi)

we get an extra spurious store to the stack slot:

    7b2: 89 7c 24 64    mov %edi,0x64(%rsp)
    7b6: 89 3e          mov %edi,(%rsi)

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
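The commit message does not name the infrastructure it switches to. A common way to perform an unaligned 32-bit access without undefined behaviour is to copy through memcpy, sketched below. The helper names are hypothetical and this is not necessarily how the project's own accessors are written.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value at an arbitrary address.
   memcpy has no alignment requirement, so unlike *(uint32_t *)ptr = val
   this is well defined even when ptr is misaligned. */
static inline void store_u32_unaligned(void *ptr, uint32_t val)
{
    memcpy(ptr, &val, sizeof(val));
}

/* Hypothetical helper: the matching unaligned load. */
static inline uint32_t load_u32_unaligned(const void *ptr)
{
    uint32_t val;
    memcpy(&val, ptr, sizeof(val));
    return val;
}
```

Compilers usually lower a fixed-size memcpy like this to an ordinary move. The extra stack store quoted in the message is the kind of small cost that can remain.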
-
- 18 March, 2014 1 commit
-
-
Committed by Peter Maydell
The ARM A64 decoder's worst case number of TCG ops per instruction is 266 (for insn 0x4c800000, a post-indexed ST4 multiple-structures store). Raise the MAX_OP_PER_INSTR define accordingly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1394822294-14837-17-git-send-email-peter.maydell@linaro.org
-
- 14 March, 2014 10 commits
-
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
This lets us drop some local variables in tlb_fill() functions.
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Andreas Färber
Rename can_do_io() to cpu_can_do_io() and change argument to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
- 11 February, 2014 3 commits
-
-
Committed by Edgar E. Iglesias
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-
Committed by Edgar E. Iglesias
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-
Committed by Edgar E. Iglesias
No functional change.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
-
- 18 January, 2014 1 commit
-
-
Committed by Alexey Kardashevskiy
There is a HOST_PAGE_ALIGN macro which makes sense for the KVM accelerator, but it uses qemu_host_page_size/qemu_host_page_mask, which are initialized for TCG only. This moves the qemu_host_page_size/qemu_host_page_mask initialization out of TCG's page_init() and adds a call to it from kvm_init().
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
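To see why the initialization order matters, here is a minimal sketch of a page-align helper that depends on two globals being set first. All names are hypothetical stand-ins for the variables and macro mentioned in the message, and obtaining the page size from sysconf() is an assumption about a POSIX host.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-ins for qemu_host_page_size / qemu_host_page_mask. */
static uintptr_t host_page_size;
static uintptr_t host_page_mask;

/* Must run before any alignment is attempted, whichever accelerator is used. */
static void host_page_init(void)
{
    host_page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    host_page_mask = ~(host_page_size - 1);   /* page size is a power of two */
}

/* Round addr up to the next host page boundary. */
static uintptr_t host_page_align(uintptr_t addr)
{
    return (addr + host_page_size - 1) & host_page_mask;
}
```

In this sketch, calling host_page_align() while both globals are still zero rounds every address to 0. That is the kind of failure an accelerator would hit if only the other accelerator's setup path performed the initialization.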
-
- 14 October, 2013 1 commit
-
-
Committed by Stefan Weil
phys_mem_alloc and its assigned values qemu_anon_ram_alloc and legacy_s390_alloc must have identical argument lists. legacy_s390_alloc uses the size parameter to call mmap, so size_t is good enough for all of them.

This patch fixes compiler errors on i686 Linux hosts:

      CC    alpha-softmmu/exec.o
    exec.c:752:51: error: initialization from incompatible pointer type [-Werror]
    exec.c: In function 'qemu_ram_alloc_from_ptr':
    exec.c:1139:32: error: comparison of distinct pointer types lacks a cast [-Werror]
    exec.c: In function 'qemu_ram_remap':
    exec.c:1283:21: error: comparison of distinct pointer types lacks a cast [-Werror]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1380481005-32399-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
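The rule behind this fix is that a function pointer and every function assigned to it must agree on their parameter types. A minimal sketch with hypothetical names follows, using malloc and calloc as stand-ins for the real allocators.

```c
#include <stddef.h>
#include <stdlib.h>

/* Two hypothetical allocators; both take size_t, matching the hook below. */
static void *anon_alloc(size_t size)
{
    return malloc(size);
}

static void *legacy_alloc(size_t size)
{
    return calloc(1, size);
}

/* The hook: its parameter list is identical to both implementations, so
   initialization, assignment and comparison all type-check on every host. */
static void *(*mem_alloc)(size_t size) = anon_alloc;
```

If the hook declared a parameter type that only happens to have the same width as size_t on some hosts, the code would compile there and fail elsewhere. That is consistent with the errors appearing specifically on i686.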
-
- 11 October, 2013 2 commits
-
-
Committed by Richard Henderson
All implementations now boil down to GETRA.
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Richard Henderson
A minimal update to use the new helpers with the return address argument.
Tested-by: Claudio Fontana <claudio.fontana@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 02 October, 2013 1 commit
-
-
Committed by Richard Henderson
Use the new helper_ret_*_mmu routines. Use a conditional call to arrange for a tail-call from the store path, and to load the return address for the helper for the load path.
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 25 September, 2013 1 commit
-
-
Committed by Paolo Bonzini
These use a 32-bit load-of-immediate to save a mflr+addi+mtlr sequence. Tested with a Windows 98 guest (pretty much the most recent thing I could run on my PPC machine) and kvm-unit-tests's sieve.flat. The speed up for sieve.flat is as high as 10% for qemu-system-i386, 25% (no kidding) for qemu-system-x86_64 on my PowerBook G4.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 13 September, 2013 1 commit
-
-
Committed by Markus Armbruster
Make it a generic hook rather than a KVM hook. Less code and ifdeffery. Since the only user of the hook is old S390 KVM, there's hope we can get rid of it some day.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
-
- 03 September, 2013 2 commits
-
-
Committed by Richard Henderson
The _cmmu helpers can be moved to exec-all.h. The helpers that are used from TCG will shortly need access to tcg_target_long so move their declarations into tcg.h. This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Richard Henderson
Always define GETRA; use __builtin_extract_return_addr, rather than having a special case for s390. Split GETPC_ADJ out of GETPC; use 2 universally, rather than having a special case for arm.

Rename GETPC_LDST to GETRA_LDST to indicate that it does not contain the GETPC_ADJ value. Likewise with GETPC_EXT to GETRA_EXT.

Perform the GETPC_ADJ adjustment inside helper_ret_ld/st. This will allow backends to pass along the "true" return address rather than the massaged GETPC value. In the meantime, double application of GETPC_ADJ does not hurt, since the call insn in all ISAs is at least 4 bytes long.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
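The relationship between the return address and the adjusted PC described in this message can be sketched as follows. The macro bodies are an illustration built from the names and the value 2 given in the message, not a copy of the project's headers, and demo_helper is hypothetical. The builtins are GCC/Clang extensions.

```c
#include <stdint.h>

/* Adjustment from the message: 2 is used universally. */
#define GETPC_ADJ 2

/* The "true" return address of the current function. */
#define GETRA() \
    ((uintptr_t)__builtin_extract_return_addr(__builtin_return_address(0)))

/* An address that lies inside the calling instruction, not just after it. */
#define GETPC() (GETRA() - GETPC_ADJ)

/* Hypothetical helper: records both values for its own call site.
   noinline keeps a real call instruction in place. */
__attribute__((noinline))
static void demo_helper(uintptr_t *ra, uintptr_t *pc)
{
    *ra = GETRA();
    *pc = GETPC();
}
```

Applying the adjustment twice moves the address back by 4 bytes in total. That still falls within a call instruction that is at least 4 bytes long, which is why the message calls the double application harmless.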
-
- 30 August, 2013 1 commit
-
-
Committed by Richard Henderson
Indeed, remove it entirely and remove the is_tcg_gen_code check from GETPC_EXT.

Fixes https://bugs.launchpad.net/qemu/+bug/1218098 wherein a call to a "normal" helper function performed a sequence of tail calls all the way into the memory helper functions, leading to a stack frame in which the memory helper function appeared to be called directly from tcg.
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 27 August, 2013 1 commit
-
-
Committed by Richard Henderson
Discontinue the jump-around-jump-to-jump scheme, trading it for a single immediate move instruction. The two extra jumps always consume 7 bytes, whereas the immediate move is either 5 or 7 bytes depending on where the code_gen_buffer gets located.
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 15 July, 2013 1 commit
-
-
Committed by Jani Kokkonen
Supports CONFIG_QEMU_LDST_OPTIMIZATION
Signed-off-by: Jani Kokkonen <jani.kokkonen@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
-
- 12 June, 2013 1 commit
-
-
Committed by Claudio Fontana
Add preliminary support for TCG target aarch64.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 51A5C596.3090108@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
- 29 May, 2013 1 commit
-
-
Committed by Paolo Bonzini
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 27 April, 2013 1 commit
-
-
Committed by Richard Henderson
Move the slow path out of line, as the TODOs mention. This allows the fast path to be unconditional, which can speed up the fast path as well, depending on the core.
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 16 February, 2013 2 commits
-
-
Committed by Andreas Färber
Explicitly NULL it on CPU reset since it was located before breakpoints.

Change vapic_report_tpr_access() argument to CPUState. This also resolves the use of void* for cpu.h independence. Change vAPIC patch_instruction() argument to X86CPU.
Signed-off-by: Andreas Färber <afaerber@suse.de>
-
Committed by Evgeny Voevodin
It is worth cleaning up the translation block variables and moving them into one context, as suggested by Swirl. Also, if we use this context directly inside tcg_ctx, it speeds up code generation a bit.
Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
- 20 January, 2013 1 commit
-
-
Committed by Stefan Weil
s390x-linux-user now also uses GETPC. Instead of adding it to the list of targets which use GETPC, the macro is now defined unconditionally.

This avoids future build regressions like this one:

      CC    s390x-linux-user/target-s390x/int_helper.o
    cc1: warnings being treated as errors
    qemu/target-s390x/int_helper.c: In function ‘helper_divs32’:
    qemu/target-s390x/int_helper.c:47: error: implicit declaration of function ‘GETPC’
    qemu/target-s390x/int_helper.c:47: error: nested extern declaration of ‘GETPC’

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
- 19 December, 2012 2 commits
-
-
Committed by Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 16 December, 2012 1 commit
-
-
Committed by Blue Swirl
Refactor common code around calls to cpu_restore_state(). tb_find_pc() now has no external users, so make it static.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
- 08 December, 2012 1 commit
-
-
Committed by Evgeny Voevodin
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
- 19 November, 2012 1 commit
-
-
Committed by Stefan Weil
Commit 5f7319cd introduced GETPC() usage for MIPS, which is currently not defined when building with --enable-tcg-interpreter. Add MIPS to the list of targets we selectively define GETPC() for.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
- 06 November, 2012 1 commit
-
-
Committed by malc
mmu access looks something like:

    <check tlb>
    if miss goto slow_path
    <fast path>
    done:
    ...

    ; end of the TB
    slow_path:
     <pre process>
     mr r3, r27         ; move areg0 to r3
                        ; (r3 holds the first argument for all the PPC32 ABIs)
     <call mmu_helper>
     b $+8
     .long done
     <post process>
     b done

On ppc32 <call mmu_helper> is:

(SysV and Darwin)

mmu_helper is most likely not within direct branching distance from the call site, necessitating

    a. moving 32 bit offset of mmu_helper into a GPR ; 8 bytes
    b. moving GPR to CTR/LR                          ; 4 bytes
    c. (finally) branching to CTR/LR                 ; 4 bytes

    r3 setting              - 4 bytes
    call                    - 16 bytes
    dummy jump over retaddr - 4 bytes
    embedded retaddr        - 4 bytes
             Total overhead - 28 bytes

(PowerOpen (AIX))

    a. moving 32 bit offset of mmu_helper's TOC into a GPR1 ; 8 bytes
    b. loading 32 bit function pointer into GPR2            ; 4 bytes
    c. moving GPR2 to CTR/LR                                ; 4 bytes
    d. loading 32 bit small area pointer into R2            ; 4 bytes
    e. (finally) branching to CTR/LR                        ; 4 bytes

    r3 setting              - 4 bytes
    call                    - 24 bytes
    dummy jump over retaddr - 4 bytes
    embedded retaddr        - 4 bytes
             Total overhead - 36 bytes

Following is done to trim the code size of slow path sections:

In tcg_target_qemu_prologue trampolines are emitted that look like this:

    trampoline:
     mfspr r3, LR
     addi  r3, 4
     mtspr LR, r3       ; fixup LR to point over embedded retaddr
     mr    r3, r27
     <jump mmu_helper>  ; tail call of sorts

And slow path becomes:

    slow_path:
     <pre process>
     <call trampoline>
     .long done
     <post process>
     b done

    call                    - 4 bytes (trampoline is within code gen buffer
                                       and most likely accessible via
                                       direct branch)
    embedded retaddr        - 4 bytes
             Total overhead - 8 bytes

In the end the icache pressure is decreased by 20/28 bytes at the cost of an extra jump to trampoline and adjusting LR (to skip over embedded retaddr) once inside.
Signed-off-by: malc <av1474@comtv.ru>
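The byte counts quoted in this commit message can be checked with a few lines of arithmetic. The constant and function names below are hypothetical; the numbers come straight from the message.

```c
/* Per-call-site byte counts taken from the commit message. */
enum {
    R3_SETTING       = 4,   /* mr r3, r27 */
    DUMMY_JUMP       = 4,   /* b $+8 over the embedded retaddr */
    EMBEDDED_RETADDR = 4,   /* .long done */
    CALL_SYSV        = 16,  /* SysV and Darwin: 8 + 4 + 4 */
    CALL_AIX         = 24,  /* PowerOpen (AIX): 8 + 4 + 4 + 4 + 4 */
    CALL_TRAMPOLINE  = 4    /* direct branch into the code gen buffer */
};

/* Overhead when the helper call is emitted inline in every slow path. */
static int inline_overhead(int call_bytes)
{
    return R3_SETTING + call_bytes + DUMMY_JUMP + EMBEDDED_RETADDR;
}

/* Overhead once r3 setup and the LR fixup live in the shared trampoline. */
static int trampoline_overhead(void)
{
    return CALL_TRAMPOLINE + EMBEDDED_RETADDR;
}
```

Subtracting gives 28 - 8 = 20 bytes saved per slow path for SysV and Darwin, and 36 - 8 = 28 bytes for AIX, matching the "20/28 bytes" figure at the end of the message.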
-