- 24 3月, 2016 14 次提交
-
-
由 Thomas Huth 提交于
tl;dr: This patch introduces an alternate way of handling the receive buffers of the spapr-vlan device, resulting in much better receive performance for the guest. Full story: One of our testers recently discovered that the performance of the spapr-vlan device is very poor compared to other NICs, and that a simple "ping -i 0.2 -s 65507 someip" in the guest can result in more than 50% lost ping packets (especially with older guest kernels < 3.17). After doing some analysis, it was clear that there is a problem with the way we handle the receive buffers in spapr_llan.c: The ibmveth driver of the guest Linux kernel tries to add a lot of buffers into several buffer pools (with 512, 2048 and 65536 byte sizes by default, but it can be changed via the entries in the /sys/devices/vio/1000/pool* directories of the guest). However, the spapr-vlan device of QEMU only tries to squeeze all receive buffer descriptors into one single page which has been supplied by the guest during the H_REGISTER_LOGICAL_LAN call, without taking care of different buffer sizes. This has two bad effects: First, only a very limited number of buffer descriptors is accepted at all. Second, we also hand 64k buffers to the guest even if the 2k buffers would fit better - and this results in dropped packets in the IP layer of the guest since too much skbuf memory is used. Though it seems at a first glance like PAPR says that we should store the receive buffer descriptors in the page that is supplied during the H_REGISTER_LOGICAL_LAN call, chapter 16.4.1.2 in the LoPAPR spec declares that "the contents of these descriptors are architecturally opaque, none of these descriptors are manipulated by code above the architected interfaces". That means we don't have to store the RX buffer descriptors in this page, but can also manage the receive buffers at the hypervisor level only. This is now what we are doing here: Introducing proper RX buffer pools which are also sorted by size of the buffers, so we can hand out a buffer with the best fitting size when a packet has been received. To avoid problems with migration from/to older version of QEMU, the old behavior is also retained and enabled by default. The new buffer management has to be enabled via a new "use-rx-buffer-pools" property. Now with the new buffer pool management enabled, the problem with "ping -s 65507" is fixed for me, and the throughput of a simple test with wget increases from creeping 3MB/s up to 20MB/s! Signed-off-by: NThomas Huth <thuth@redhat.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Thomas Huth 提交于
Refactor the code a little bit by extracting the code that reads and writes the receive buffer list page into separate functions. There should be no functional change in this patch, this is just a preparation for the upcoming extensions that introduce receive buffer pools. Signed-off-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Reviewed-by: NLaurent Vivier <lvivier@redhat.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [clg: squashed in patch 'ppc: Add dummy ACOP SPR' ] Signed-off-by: NCédric Le Goater <clg@fr.ibm.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
We should implement HW breakpoint/watchpoint, qemu supports them... Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
With appropriate AMR-like masks. Not actually used by the translation logic at that point Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [clg: changed spr_register_hv(SPR_IAMR) to spr_register_kvm_hv(SPR_IAMR) changed gen_spr_amr() prototype ] Signed-off-by: NCédric Le Goater <clg@fr.ibm.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
The masks weren't chosen nor applied properly. The architecture specifies that writes to AMR are masked by UAMOR for PR=1, otherwise AMOR for HV=0. The writes to UAMOR are masked by AMOR for HV=0 Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [clg: moved gen_spr_amr() prototype change to next patch ] Signed-off-by: NCédric Le Goater <clg@fr.ibm.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
Make sure we give the guest full authorization Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
It's supposed to be an instruction counter. For now make us not crash when accessing it. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
And move the code adjusting the MSR mask and calling kvmppc_set_papr() to it. This allows us to add a few more things such as disabling setting of MSR:HV and appropriate LPCR bits which will be used when fixing the exception model. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> [clg: removed LPCR setting ] Signed-off-by: NCédric Le Goater <clg@fr.ibm.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
We don't give them a KVM reg number to most of the registers yet as no current KVM version supports HV mode. For DAWR and DAWRX, the KVM reg number is needed since this register can be set by the guest via the H_SET_MODE hypercall. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [clg: squashed in patch 'ppc: Add KVM numbers to some P8 SPRs' changed the commit log with a proposal of Thomas Huth removed all hunks except those related to AMOR and DAWR* ] Signed-off-by: NCédric Le Goater <clg@fr.ibm.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
The current set of spr_register_* macros only take the user and supervisor function pointers. To make the transition easy, we don't change that but we add "_hv" variants that can be used to register all 3 sets. To simplify the transition, users of the "old" macro will set the hypervisor callback to be the same as the supervisor one. The new registration function only needs to be used for registers that are either hypervisor only or behave differently in HV mode. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> [clg: fixed else if condition in gen_op_mfspr() ] Signed-off-by: NCédric Le Goater <clg@fr.ibm.com> Reviewed-by: NThomas Huth <thuth@redhat.com> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Benjamin Herrenschmidt 提交于
Add definitions for additional SPR numbers and SPR bit definitions that will be relevant for subsequent improvements to POWER8 emulation Also fix the definition of LPIDR which was incorrect (and is different for server and embedded). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NThomas Huth <thuth@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Alexey Kardashevskiy 提交于
ePAPR defines "hcall-instructions" device-tree property which contains code to call hypercalls in ePAPR paravirtualized guests. In general pseries guests won't use this property, instead using the PAPR defined hypercall interface. However, this property has been re-used to implement a hack to allow PR KVM to run (slightly modified) guests in some situations where it otherwise wouldn't be able to (because the system's L0 hypervisor doesn't forward the PAPR hypercalls to the PR KVM kernel). Hence, this property is always present in the device tree for pseries guests. All KVM guests use it at least to read features via the KVM_HC_FEATURES hypercall. The property is populated by the code returned from the KVM's KVM_PPC_GET_PVINFO ioctl; if not implemented in the KVM, QEMU supplies code which will fail all hypercall attempts. If QEMU does not create the property, and the guest kernel is compiled with CONFIG_EPAPR_PARAVIRT (which is normally the case), there is exactly the same stub at @epapr_hypercall_start already. Rather than maintaining this fairly useless stub implementation, it makes more sense not to create the property in the device tree in the first place if the host kernel does not implement it. This changes kvmppc_get_hypercall() to return 1 if the host kernel does not implement KVM_CAP_PPC_GET_PVINFO. The caller can use it to decide on whether to create the property or not. This changes the pseries machine to not create the property if KVM does not implement KVM_PPC_GET_PVINFO. In practice this means that from now on the property will not be created if either HV KVM or TCG is used. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> [reworded commit message for clarity --dwg] Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
由 Laurent Vivier 提交于
When a qemu-system-ppc64 is started, the 64-bit mode bit is not set in MSR. Signed-off-by: NLaurent Vivier <lvivier@redhat.com> Reviewed-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
-
- 23 3月, 2016 7 次提交
-
-
由 Peter Maydell 提交于
ivshmem: Fixes, cleanups, device model split # gpg: Signature made Mon 21 Mar 2016 20:33:54 GMT using RSA key ID EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" * remotes/armbru/tags/pull-ivshmem-2016-03-18: (40 commits) contrib/ivshmem-server: Print "not for production" warning ivshmem: Require master to have ID zero ivshmem: Drop ivshmem property x-memdev ivshmem: Clean up after the previous commit ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem: Replace int role_val by OnOffAuto master qdev: New DEFINE_PROP_ON_OFF_AUTO ivshmem: Inline check_shm_size() into its only caller ivshmem: Simplify memory regions for BAR 2 (shared memory) ivshmem: Implement shm=... with a memory backend ivshmem: Tighten check of property "size" ivshmem: Simplify how we cope with short reads from server ivshmem: Drop the hackish test for UNIX domain chardev ivshmem: Rely on server sending the ID right after the version ivshmem: Propagate errors through ivshmem_recv_setup() ivshmem: Receive shared memory synchronously in realize() ivshmem: Plug leaks on unplug, fix peer disconnect ivshmem: Disentangle ivshmem_read() ivshmem: Simplify rejection of invalid peer ID from server ivshmem: Assert interrupts are set up once ... Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
由 Peter Maydell 提交于
wxx patch queue # gpg: Signature made Tue 22 Mar 2016 18:18:36 GMT using RSA key ID 677450AD # gpg: Good signature from "Stefan Weil <sw@weilnetz.de>" # gpg: aka "Stefan Weil <stefan.weil@weilnetz.de>" # gpg: aka "Stefan Weil <stefan.weil@bib.uni-mannheim.de>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 4923 6FEA 75C9 5D69 8EC2 B78A E08C 21D5 6774 50AD * remotes/weil/tags/pull-wxx-20160322: wxx: Add support for ncurses Remove unneeded include statements for setjmp.h Include setjmp.h in qemu/osdep.h (bug fix for w64) Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
由 Stefan Weil 提交于
We used to support only pdcurses for Windows, but recently Cygwin added mingw64-i686-ncurses and mingw64-x86_64-ncurses packages which are supported now, too. Signed-off-by: NStefan Weil <sw@weilnetz.de>
-
由 Stefan Weil 提交于
As soon as setjmp.h is included from qemu/osdep.h, those old include statements are no longer needed. Add also setjmp.h to the list in scripts/clean-includes. Signed-off-by: NStefan Weil <sw@weilnetz.de>
-
由 Stefan Weil 提交于
setjmp must be declared before sysemu/os-win32.h because it is redefined there for 64 bit Windows. Reviewed-by: NRichard Henderson <rth@twiddle.net> Tested-by: NAndrew Baumann <Andrew.Baumann@microsoft.com> Signed-off-by: NStefan Weil <sw@weilnetz.de>
-
由 Peter Maydell 提交于
qemu-ga patch queue for 2.6 * remove unused variable # gpg: Signature made Mon 21 Mar 2016 17:32:42 GMT using RSA key ID F108B584 # gpg: Good signature from "Michael Roth <flukshun@gmail.com>" # gpg: aka "Michael Roth <mdroth@utexas.edu>" # gpg: aka "Michael Roth <mdroth@linux.vnet.ibm.com>" * remotes/mdroth/tags/qga-pull-2016-03-21-tag: qemu-ga: drop unused local err variable Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
由 Peter Maydell 提交于
usb: bugfix collection. # gpg: Signature made Mon 21 Mar 2016 11:07:39 GMT using RSA key ID D3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" * remotes/kraxel/tags/pull-usb-20160321-1: usb: ehci: add capability mmio write function hw/usb/dev-mtp: Guard inotify usage with CONFIG_INOTIFY1 usb: fix unbound stack warning for inotify_watchfn usb: fix unbound stack usage for usb_mtp_add_str usb: fix unbounded stack warning for xhci_dma_write_u32s usb: Fix compilation for Windows Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
- 22 3月, 2016 19 次提交
-
-
由 Markus Armbruster 提交于
The code is okay for illustrating how things work and for testing, but its error handling make it unfit for production use. Print a warning to protect the innocent. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-41-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
Migration with ivshmem needs to be carefully orchestrated to work. Exactly one peer (the "master") migrates to the destination, all other peers need to unplug (and disconnect), migrate, plug back (and reconnect). This is sort of documented in qemu-doc. If peers connect on the destination before migration completes, the shared memory can get messed up. This isn't documented anywhere. Fix that in qemu-doc. To avoid messing up register IVPosition on migration, the server must assign the same ID on source and destination. ivshmem-spec.txt leaves ID assignment unspecified, however. Amend ivshmem-spec.txt to require the first client to receive ID zero. The example ivshmem-server complies: it always assigns the first unused ID. For a bit of additional safety, enforce ID zero for the master. This does nothing when we're not using a server, because the ID is zero for all peers then. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-40-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
Use ivshmem-plain instead. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-39-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
Move code to more sensible places. Use the opportunity to reorder and document IVShmemState members. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-38-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
In preparation of making it a qdev property. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-36-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-35-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
Improve the error messages while there. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Message-Id: <1458066895-20632-34-git-send-email-armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com>
-
由 Markus Armbruster 提交于
ivshmem_realize() puts the shared memory region in a container region. Used to be necessary to permit delayed mapping of the shared memory. However, we recently moved to synchronous mapping, in "ivshmem: Receive shared memory synchronously in realize()" and the commit following it. The container is redundant since then. Drop it. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <1458066895-20632-33-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
ivshmem has its very own code to create and map shared memory. Replace that with an implicitly created memory backend. Reduces the number of ways we create BAR 2 from three to two. The memory-backend-file is currently available only with CONFIG_LINUX, so this adds a second Linuxism to ivshmem (the other one is eventfd). Should we ever need to make it portable to systems where memory-backend-file can't be made to serve, we could create a memory-backend-shmem that allocates memory with shm_open(). Bonus fix: shared memory files are now created with permissions 0655 instead of 0777. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <1458066895-20632-32-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
If size_t is narrower than 64 bits, passing uint64_t ivshmem_size to mmap() truncates. Reject such sizes. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-31-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
Short reads from a UNIX domain sockets are exceedingly unlikely when the other side always sends eight bytes and we always read eight bytes. We cope with them anyway. However, the code doing that is rather convoluted. Dumb it down radically. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-30-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
The chardev must be capable of transmitting SCM_RIGHTS ancillary messages. We check it by comparing CharDriverState member filename to "unix:". That's almost as brittle as it is disgusting. When the actual transmission all happened asynchronously, this check was all we could do in realize(), and thus better than nothing. But now we receive at least one SCM_RIGHTS synchronously in realize(), it's not worth its keep anymore. Drop it. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-29-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
The protocol specification (ivshmem-spec.txt, formerly ivshmem_device_spec.txt) has always required the ID message to be sent right at the beginning, and ivshmem-server has always complied. The device, however, accepts it out of order. If an interrupt setup arrived before it, though, it would be misinterpreted as connect notification. Fix the latent bug by relying on the spec and ivshmem-server's actual behavior. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-28-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
This kills off the funny state described in the previous commit. Simplify ivshmem_io_read() accordingly, and update documentation. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Message-Id: <1458066895-20632-27-git-send-email-armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com>
-
由 Markus Armbruster 提交于
When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
close_peer_eventfds() cleans up three things: ioeventfd triggers if they exist, eventfds, and the array to store them. Commit 98609cd8 (v1.2.0) fixed it not to clean up ioeventfd triggers when they don't exist (property ioeventfd=off, which is the default). Unfortunately, the fix also made it skip cleanup of the eventfds and the array then. This is a memory and file descriptor leak on unplug. Additionally, the reset of nb_eventfds is skipped. Doesn't matter on unplug. On peer disconnect, however, this permanently wedges the interrupt vectors used for that peer's ID. The eventfds stay behind, but aren't connected to a peer anymore. When the ID gets recycled for a new peer, the new peer's eventfds get assigned to vectors after the old ones. Commonly, the device's number of vectors matches the server's, so the new ones get dropped with a "Too many eventfd received" message. Interrupts either don't work (common case) or go to the wrong vector. Fix by narrowing the conditional to just the ioeventfd trigger cleanup. While there, move the "invalid" peer check to the only caller where it can actually happen, and tighten it to reject own ID. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-25-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-24-git-send-email-armbru@redhat.com>
-
由 Markus Armbruster 提交于
ivshmem_read() processes server messages. These are 64 bit signed integers. -1 is shared memory setup, 16 bit unsigned is a peer ID, anything else is invalid. ivshmem_read() rejects invalid negative messages right away, silently. Invalid positive messages get rejected only in resize_peers(), and ivshmem_read() then prints the rather cryptic message "failed to resize peers array". Extend the first check to cover all invalid messages, make it report "server sent invalid message", and drop the second check. Now resize_peers() can't fail anymore; simplify. Signed-off-by: NMarkus Armbruster <armbru@redhat.com> Reviewed-by: NMarc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-23-git-send-email-armbru@redhat.com>
-