- 25 8月, 2015 14 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Currently, we get to the slow path for any unaligned access in the backend, because we effectively preserve the bottom address bits below the alignment requirement when comparing with the TLB entry, so any non-0 bit there will cause the compare to fail. For the same number of instructions, we can instead add the access size - 1 to the address and stick to clearing all the bottom bits. That means that normal unaligned accesses will not fallback (the HW will handle them fine). Only when crossing a page boundary well we end up having a mismatch because we'll end up pointing to the next page which cannot possibly be in that same TLB entry. Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Message-Id: <1437455978.5809.2.camel@kernel.crashing.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
Softmmu unaligned load/stores currently goes through through the slow path for two reasons: - to support unaligned access on host with strict alignement - to correctly handle accesses crossing pages x86 is only concerned by the second reason. Unaligned accesses are avoided by compilers, but are not uncommon. We therefore would like to see them going through the fast path, if they don't cross pages. For that we can use the fact that two adjacent TLB entries can't contain the same page. Therefore accessing the TLB entry corresponding to the first byte, but comparing its content to page address of the last byte ensures that we don't cross pages. We can do this check without adding more instructions in the TLB code (but increasing its length by one byte) by using the LEA instruction to combine the existing move with the size addition. On an x86-64 host, this gives a 3% boot time improvement for a powerpc guest and 4% for an x86-64 guest. [rth: Tidied calculation of the offset mask] Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Message-Id: <1436467197-2183-1-git-send-email-aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Replacing it with tcg_gen_extrl_i64_i32. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Rather than allow arbitrary shift+trunc, only concern ourselves with low and high parts. This is all that was being used anyway. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
They behave the same as ext32s_i64 and ext32u_i64 from the constant folding and zero propagation point of view, except that they can't be replaced by a mov, so we don't compute the affected value. Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
Implement real ext_i32_i64 and extu_i32_i64 ops. They ensure that a 32-bit value is always converted to a 64-bit value and not propagated through the register allocator or the optimizer. Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Alexander Graf <agraf@suse.de> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Stefan Weil <sw@weilnetz.de> Acked-by: NClaudio Fontana <claudio.fontana@huawei.com> Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
The tcg_gen_trunc_shr_i64_i32 function takes a 64-bit argument and returns a 32-bit value. Directly call tcg_gen_op3 with the correct types instead of calling tcg_gen_op3i_i32 and abusing the TCG types. Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32, and the name in the README doesn't match the name offered to the frontends. Always use the long name to make it clear it is a size changing op. Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
Now that copies and constants are tracked separately, we can allow constant to have copies, deferring the choice to use a register or a constant to the register allocation pass. This prevent this kind of regular constant reloading: -OUT: [size=338] +OUT: [size=298] mov -0x4(%r14),%ebp test %ebp,%ebp jne 0x7ffbe9cb0ed6 mov $0x40002219f8,%rbp mov %rbp,(%r14) - mov $0x40002219f8,%rbp mov $0x4000221a20,%rbx mov %rbp,(%rbx) mov $0x4000000000,%rbp mov %rbp,(%r14) - mov $0x4000000000,%rbp mov $0x4000221d38,%rbx mov %rbp,(%rbx) mov $0x40002221a8,%rbp mov %rbp,(%r14) - mov $0x40002221a8,%rbp mov $0x4000221d40,%rbx mov %rbp,(%rbx) mov $0x4000019170,%rbp mov %rbp,(%r14) - mov $0x4000019170,%rbp mov $0x4000221d48,%rbx mov %rbp,(%rbx) mov $0x40000049ee,%rbp mov %rbp,0x80(%r14) mov %r14,%rdi callq 0x7ffbe99924d0 mov $0x4000001680,%rbp mov %rbp,0x30(%r14) mov 0x10(%r14),%rbp mov $0x4000001680,%rbp mov %rbp,0x30(%r14) mov 0x10(%r14),%rbp shl $0x20,%rbp mov (%r14),%rbx mov %ebx,%ebx mov %rbx,(%r14) or %rbx,%rbp mov %rbp,0x10(%r14) mov %rbp,0x90(%r14) mov 0x60(%r14),%rbx mov %rbx,0x38(%r14) mov 0x28(%r14),%rbx mov $0x4000220e60,%r12 mov %rbx,(%r12) mov $0x40002219c8,%rbx mov %rbp,(%rbx) mov 0x20(%r14),%rbp sub $0x8,%rbp mov $0x4000004a16,%rbx mov %rbx,0x0(%rbp) mov %rbp,0x20(%r14) mov $0x19,%ebp mov %ebp,0xa8(%r14) mov $0x4000015110,%rbp mov %rbp,0x80(%r14) xor %eax,%eax jmpq 0x7ffbebcae426 lea -0x5f6d72a(%rip),%rax # 0x7ffbe3d437b3 jmpq 0x7ffbebcae426 Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
Instead of using an enum which could be either a copy or a const, track them separately. This will be used in the next patch. Constants are tracked through a bool. Copies are tracked by initializing temp's next_copy and prev_copy to itself, allowing to simplify the code a bit. Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
Add two accessor functions temp_is_const and temp_is_copy, to make the code more readable and make code change easier. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
The tcg_temp_info structure uses 24 bytes per temp. Now that we emulate vector registers on most guests, it's not uncommon to have more than 100 used temps. This means we have initialize more than 2kB at least twice per TB, often more when there is a few goto_tb. Instead used a TCGTempSet bit array to track which temps are in used in the current basic block. This means there are only around 16 bytes to initialize. This improves the boot time of a MIPS guest on an x86-64 host by around 7% and moves out tcg_optimize from the the top of the profiler list. [rth: Handle TCG_CALL_DUMMY_ARG] Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Aurelien Jarno 提交于
By convention, on a 64-bit host TCG internally stores 32-bit constants as sign-extended. This is not the case in the optimizer when a 32-bit constant is folded. This doesn't seem to have more consequences than suboptimal code generation. For instance the x86 backend assumes sign-extended constants, and in some rare cases uses a 32-bit unsigned immediate 0xffffffff instead of a 8-bit signed immediate 0xff for the constant -1. This is with a ppc guest: before ------ ---- 0x9f29cc movi_i32 tmp1,$0xffffffff movi_i32 tmp2,$0x0 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2 mov_i32 r10,tmp0 0x7fd8c7dfe90c: xor %ebp,%ebp 0x7fd8c7dfe90e: mov %ebp,%r11d 0x7fd8c7dfe911: mov 0x18(%r14),%r9d 0x7fd8c7dfe915: add %r9d,%r10d 0x7fd8c7dfe918: adc %ebp,%r11d 0x7fd8c7dfe91b: add $0xffffffff,%r10d 0x7fd8c7dfe922: adc %ebp,%r11d 0x7fd8c7dfe925: mov %r11d,0x134(%r14) 0x7fd8c7dfe92c: mov %r10d,0x28(%r14) after ----- ---- 0x9f29cc movi_i32 tmp1,$0xffffffffffffffff movi_i32 tmp2,$0x0 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2 mov_i32 r10,tmp0 0x7f37010d490c: xor %ebp,%ebp 0x7f37010d490e: mov %ebp,%r11d 0x7f37010d4911: mov 0x18(%r14),%r9d 0x7f37010d4915: add %r9d,%r10d 0x7f37010d4918: adc %ebp,%r11d 0x7f37010d491b: add $0xffffffffffffffff,%r10d 0x7f37010d491f: adc %ebp,%r11d 0x7f37010d4922: mov %r11d,0x134(%r14) 0x7f37010d4929: mov %r10d,0x28(%r14) Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Message-Id: <1436544211-2769-2-git-send-email-aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 20 8月, 2015 1 次提交
-
-
由 Peter Maydell 提交于
The cocoa GUI frontend assumes it is the only GUI (it redefines main() so it always gets control before the rest of QEMU), so it does not play well with other UIs like SDL or GTK. (Mostly people building QEMU on OSX don't have the necessary dependencies available for configure to build those other front ends, so mostly this problem goes unnoticed.) Make configure automatically disable the SDL and GTK front ends if the cocoa front end is enabled. (We were sort of attempting to do this for SDL before, but not in a way that worked very well.) Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com> Reviewed-by: NJohn Arbuckle <programmingkidx@gmail.com> Message-id: 1439565052-3457-1-git-send-email-peter.maydell@linaro.org
-
- 19 8月, 2015 14 次提交
-
-
由 Peter Maydell 提交于
apic_internal.h relies on cpu.h having been included (for the X86CPU type); include it directly rather than relying on it being pulled in via one of the other includes like timer.h. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
Move the muldiv64() function from qemu-common.h to host-utils.h. This puts it together with all the other arithmetic functions where we provide a version with __int128_t and a fallback without, and allows headers which need muldiv64() to avoid including qemu-common.h. We don't include host-utils from qemu-common.h, to avoid dragging more things into qemu-common.h than it already has; in practice everywhere that needs muldiv64() can get it via qemu/timer.h. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
Add a header comment to osdep.h, explaining what the header is for and some rules to avoid circular-include difficulties. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
qemu-common.h has some system header includes and fixups for things that might be missing. This is really an OS dependency and belongs in osdep.h, so move it across. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
qemu-common.h includes some fixups for things the Win32 headers don't define or define weirdly. These really belong in os-win32.h, so move them there. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
Rather than rolling custom concatenate-strings macros for the QEMU_BUILD_BUG_ON macro to use, use the glue() macro we already have (since it's now available to us in this header). Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
osdep.h has a few things which are really compiler specific; move them to compiler.h, and include compiler.h from osdep.h. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
qemu_printf is an ancient remnant which has been a simple #define to printf for over a decade, and is used in only a few places. Expand it out in those places and remove the #define. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
qmp-event.c already includes qemu-common.h, so manually including os-win32.h/os-posix.h is unnecessary (and potentially fragile, since it's duplicating the #ifdef logic that chooses which of the two we need). Remove the unnecessary include logic. Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Reviewed-by: NDaniel P. Berrange <berrange@redhat.com>
-
由 Peter Maydell 提交于
Alpha shadow register optimization # gpg: Signature made Tue 18 Aug 2015 19:09:41 BST using RSA key ID 4DD0279B # gpg: Good signature from "Richard Henderson <rth7680@gmail.com>" # gpg: aka "Richard Henderson <rth@redhat.com>" # gpg: aka "Richard Henderson <rth@twiddle.net>" * remotes/rth/tags/pull-axp-201508018: target-alpha: Inline hw_ret target-alpha: Inline call_pal target-alpha: Use separate TCGv temporaries for the shadow registers Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
由 Richard Henderson 提交于
Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This avoids having to manually swap them around when swapping to and from PALmode. We simply encode the shadow registers into the translation. The VMStateDescription version changes, because the meaning of "shadow" changes in the save file when in PALmode. It would be possible to fix this, but I don't think it's worth the effort. Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Peter Maydell 提交于
* SCSI fixes from Stefan and Fam * vhost-scsi fix from Igor and Lu Lina * a build system fix from Daniel * two more multi-arch-related patches from Peter C. * TCG patches from myself and Sergey Fedorov * RCU improvement from Wen Congyang * a few more simple cleanups # gpg: Signature made Fri 14 Aug 2015 22:41:52 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: disas: Defeature print_target_address hw: fix mask for ColdFire UART command register scsi-generic: identify AIO callbacks more clearly scsi-disk: identify AIO callbacks more clearly scsi: create restart bottom half in the right AioContext configure: only add CONFIG_RDMA to config-host.h once qemu-nbd: remove unnecessary qemu_notify_event() vhost-scsi: Clarify vhost_virtqueue_mask argument exec: use macro ROUND_UP for alignment rcu: Allow calling rcu_(un)register_thread() during synchronize_rcu() exec: drop cpu_can_do_io, just read cpu->can_do_io cpu_defs: Simplify CPUTLB padding logic cpu-exec: Do not invalidate original TB in cpu_exec_nocache() vhost/scsi: call vhost_dev_cleanup() at unrealize() time virtio-scsi-test: Add test case for tail unaligned WRITE SAME scsi-disk: Fix assertion failure on WRITE SAME tests: virtio-scsi: clear unit attention after reset scsi-disk: fix cmd.mode field typo virtio-scsi: use virtqueue_map_sg() when loading requests Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
- 15 8月, 2015 11 次提交
-
-
由 Peter Crosthwaite 提交于
It does not work in multi-arch as it requires the CPU specific TARGET_VIRT_ADDR_SPACE_BITS global define. Just use the generic version that does no masking. Targets should be responsible for passing in a sane virtual address. Signed-off-by: NPeter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <1436129432-16617-1-git-send-email-crosthwaite.peter@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The "miscellaneous commands" part of the register is 3 bits wide. Spotted by Coverity and confirmed in the datasheet, downloadable from http://cache.freescale.com/files/32bit/doc/ref_manual/MCF5307BUM.pdf (figure 14-6). Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Functions that are not callbacks should assert that aiocb is NULL and have a SCSIGenericReq argument. AIO callbacks should assert that aiocb is not NULL. They also have an opaque argument. Reviewed-by: NFam Zheng <famz@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Functions that are not callbacks should assert that aiocb is NULL and have a non-opaque argument (usually a pointer to SCSIDiskReq). AIO callbacks should assert that aiocb is not NULL and take care of calling block_acct done. They also have an opaque argument. Reviewed-by: NFam Zheng <famz@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This matches commit 4407c1c5 (virtio-blk: Schedule BH in the right context, 2014-06-17), which did the same thing for virtio-blk. Reviewed-by: NFam Zheng <famz@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Daniel P. Berrange 提交于
For unknown reasons (probably a git rebase merge mistake) commit 2da776db Author: Michael R. Hines <mrhines@us.ibm.com> Date: Mon Jul 22 10:01:54 2013 -0400 rdma: core logic Adds CONFIG_RDMA to config-host.h twice, as can be seen in the generated file: $ grep CONFIG_RDMA config-host.h #define CONFIG_RDMA 1 #define CONFIG_RDMA 1 Signed-off-by: NDaniel P. Berrange <berrange@redhat.com> Message-Id: <1438345403-32467-1-git-send-email-berrange@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This was needed when qemu-nbd was using qemu_set_fd_handler2. It is not needed anymore now that nbd_update_server_fd_handler is called whenever nbd_can_accept() can change from false to true. nbd_update_server_fd_handler will call qemu_set_fd_handler(), which will call qemu_notify_event(). Reviewed-by: NMax Reitz <mreitz@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Lu Lina 提交于
vhost_virtqueue_mask takes an "absolute" virtqueue index, while the code looks like it's passing an index that is relative to s->dev.vq_index. In reality, s->dev.vq_index is always zero, so this patch does not make any difference, but the code is clearer. Signed-off-by: NLu Lina <lina.lulina@huawei.com> Signed-off-by: NGonglei <arei.gonglei@huawei.com> Message-Id: <1437978359-17960-1-git-send-email-arei.gonglei@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Chen Hanxiao 提交于
Use ROUND_UP instead. Signed-off-by: NChen Hanxiao <chenhanxiao@cn.fujitsu.com> Message-Id: <1437707523-4910-1-git-send-email-chenhanxiao@cn.fujitsu.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wen Congyang 提交于
If rcu_(un)register_thread() is called together with synchronize_rcu(), it will wait for the synchronize_rcu() to finish. But when synchronize_rcu() waits for some events, we can modify the list registry. We also use the lock rcu_gp_lock to assume that synchronize_rcu() isn't executed in more than one thread at the same time. Add a new mutex lock rcu_sync_lock to assume it and rename rcu_gp_lock to rcu_registry_lock. Release rcu_registry_lock when synchronize_rcu() waits for some events. Signed-off-by: NWen Congyang <wency@cn.fujitsu.com> Message-Id: <55B59652.4090503@cn.fujitsu.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
After commit 626cf8f4 (icount: set can_do_io outside TB execution, 2014-12-08), can_do_io is set to 1 if not executing code. It is no longer necessary to make this assumption in cpu_can_do_io. It is also possible to remove the use_icount test, simply by never setting cpu->can_do_io to 0 unless use_icount is true. With these changes cpu_can_do_io boils down to a read of cpu->can_do_io. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-