- 16 5月, 2018 32 次提交
-
-
由 Peter Xu 提交于
Though we may not need it, now we init both the src/dst migration objects in migration_object_init() so that even incoming migration object would be thread safe (it was not). Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-20-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Finish the last step to do the final handshake for the recovery. First source sends one MIG_CMD_RESUME to dst, telling that source is ready to resume. Then, dest replies with MIG_RP_MSG_RESUME_ACK to source, telling that dest is ready to resume (after switch to postcopy-active state). When source received the RESUME_ACK, it switches its state to postcopy-active, and finally the recovery is completed. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-19-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
After we updated the dirty bitmaps of ramblocks, we also need to update the critical fields in RAMState to make sure it is ready for a resume. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-18-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
This patch implements the first part of core RAM resume logic for postcopy. ram_resume_prepare() is provided for the work. When the migration is interrupted by network failure, the dirty bitmap on the source side will be meaningless, because even the dirty bit is cleared, it is still possible that the sent page was lost along the way to destination. Here instead of continue the migration with the old dirty bitmap on source, we ask the destination side to send back its received bitmap, then invert it to be our initial dirty bitmap. The source side send thread will issue the MIG_CMD_RECV_BITMAP requests, once per ramblock, to ask for the received bitmap. On destination side, MIG_RP_MSG_RECV_BITMAP will be issued, along with the requested bitmap. Data will be received on the return-path thread of source, and the main migration thread will be notified when all the ramblock bitmaps are synchronized. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-17-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
This is hook function to be called when a postcopy migration wants to resume from a failure. For each module, it should provide its own recovery logic before we switch to the postcopy-active state. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-16-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Creating new message to reply for MIG_CMD_POSTCOPY_RESUME. One uint32_t is used as payload to let the source know whether destination is ready to continue the migration. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-15-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Introducing this new command to be sent when the source VM is ready to resume the paused migration. What the destination does here is basically release the fault thread to continue service page faults. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-14-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Introducing new return path message MIG_RP_MSG_RECV_BITMAP to send received bitmap of ramblock back to source. This is the reply message of MIG_CMD_RECV_BITMAP, it contains not only the header (including the ramblock name), and it was appended with the whole ramblock received bitmap on the destination side. When the source receives such a reply message (MIG_RP_MSG_RECV_BITMAP), it parses it, convert it to the dirty bitmap by inverting the bits. One thing to mention is that, when we send the recv bitmap, we are doing these things in extra: - converting the bitmap to little endian, to support when hosts are using different endianess on src/dst. - do proper alignment for 8 bytes, to support when hosts are using different word size (32/64 bits) on src/dst. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-13-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Add a new vm command MIG_CMD_RECV_BITMAP to request received bitmap for one ramblock. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-12-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
On the destination side, we cannot wake up all the threads when we got reconnected. The first thing to do is to wake up the main load thread, so that we can continue to receive valid messages from source again and reply when needed. At this point, we switch the destination VM state from postcopy-paused back to postcopy-recover. Now we are finally ready to do the resume logic. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-11-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Introducing new migration state "postcopy-recover". If a migration procedure is paused and the connection is rebuilt afterward successfully, we'll switch the source VM state from "postcopy-paused" to the new state "postcopy-recover", then we'll do the resume logic in the migration thread (along with the return path thread). This patch only do the state switch on source side. Another following up patch will handle the state switching on destination side using the same status bit. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-10-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com> --- s/2.11/2.13/
-
由 Peter Xu 提交于
This patch detects the "resume" flag of migration command, rebuild the channels only if the flag is set. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-9-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
It will be used when we want to resume one paused migration. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-8-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com> --- s/2.12/2.13/
-
由 Peter Xu 提交于
Allows the fault thread to stop handling page faults temporarily. When network failure happened (and if we expect a recovery afterwards), we should not allow the fault thread to continue sending things to source, instead, it should halt for a while until the connection is rebuilt. When the dest main thread noticed the failure, it kicks the fault thread to switch to pause state. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-7-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Let the thread pause for network issues. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-6-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
When there is IO error on the incoming channel (e.g., network down), instead of bailing out immediately, we allow the dst vm to switch to the new POSTCOPY_PAUSE state. Currently it is still simple - it waits the new semaphore, until someone poke it for another attempt. One note is that here on ram loading thread we cannot detect the POSTCOPY_ACTIVE state, but we need to detect the more specific POSTCOPY_INCOMING_RUNNING state, to make sure we have already loaded all the device states. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-5-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Now when network down for postcopy, the source side will not fail the migration. Instead we convert the status into this new paused state, and we will try to wait for a rescue in the future. If a recovery is detected, migration_thread() will reset its local variables to prepare for that. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-4-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
Introducing a new state "postcopy-paused", which can be used when the postcopy migration is paused. It is targeted for postcopy network failure recovery. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: NJuan Quintela <quintela@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-3-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Xu 提交于
The old incoming migration is running in main thread and default gcontext. With the new qio_channel_add_watch_full() we can now let it run in the thread's own gcontext (if there is one). Currently this patch does nothing alone. But when any of the incoming migration is run in another iothread (e.g., the upcoming migrate-recover command), this patch will bind the incoming logic to the iothread instead of the main thread (which may already get page faulted and hanged). RDMA is not considered for now since it's not even using the QIO watch framework at all. Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: NJuan Quintela <quintela@redhat.com> Signed-off-by: NPeter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-2-peterx@redhat.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Juan Quintela 提交于
Once there, we don't need the struct names anywhere, just the typedefs. And now also document all fields. Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
-
由 Juan Quintela 提交于
Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDaniel P. Berrangé <berrange@redhat.com> -- Be network agnostic. Add error checking for all values.
-
由 Juan Quintela 提交于
We need to make sure that we have started all the multifd threads. Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDaniel P. Berrangé <berrange@redhat.com>
-
由 Juan Quintela 提交于
In both sides. We still don't transmit anything through them. Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDaniel P. Berrangé <berrange@redhat.com>
-
由 Juan Quintela 提交于
Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDaniel P. Berrangé <berrange@redhat.com>
-
由 Juan Quintela 提交于
We need them before we start migration. Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDaniel P. Berrangé <berrange@redhat.com>
-
由 Juan Quintela 提交于
Once there, make count field to always be accessed with atomic operations. To make blocking operations, we need to know that the thread is running, so create a bool to indicate that. Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDaniel P. Berrangé <berrange@redhat.com> -- Once here, s/terminate_multifd_*-threads/multifd_*_terminate_threads/ This is consistente with every other function
-
由 Juan Quintela 提交于
Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDaniel P. Berrangé <berrange@redhat.com>
-
由 Juan Quintela 提交于
Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Juan Quintela 提交于
No need to write it to a file. Just need a proper firmware O:-) Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NLaurent Vivier <lvivier@redhat.com> Reviewed-by: NThomas Huth <thuth@redhat.com>
-
由 Juan Quintela 提交于
Signed-off-by: NJuan Quintela <quintela@redhat.com> Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: NPeter Xu <peterx@redhat.com>
-
由 Xiao Guangrong 提交于
Fix the bug introduced by da3f56cb (migration: remove ram_save_compressed_page()), It should be 'return' rather than 'res' Sorry for this stupid mistake :( Signed-off-by: NXiao Guangrong <xiaoguangrong@tencent.com> Message-Id: <20180428081045.8878-1-xiaoguangrong@tencent.com> Signed-off-by: NJuan Quintela <quintela@redhat.com>
-
由 Peter Maydell 提交于
Block layer patches: - Switch AIO/callback based block drivers to a byte-based interface - Block jobs: Expose error string via query-block-jobs - Block job cleanups and fixes - hmp: Allow using a qdev id in block_set_io_throttle # gpg: Signature made Tue 15 May 2018 16:33:10 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (37 commits) iotests: Add test for -U/force-share conflicts qemu-img: Use only string options in img_open_opts qemu-io: Use purely string blockdev options block: Document BDRV_REQ_WRITE_UNCHANGED support qemu-img: Check post-truncation size iotests: Add test for COR across nodes iotests: Copy 197 for COR filter driver iotests: Clean up wrap image in 197 block: Support BDRV_REQ_WRITE_UNCHANGED in filters block/quorum: Support BDRV_REQ_WRITE_UNCHANGED block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes block: Add BDRV_REQ_WRITE_UNCHANGED flag block: BLK_PERM_WRITE includes ..._UNCHANGED block: Add COR filter driver iotests: Skip 181 and 201 without userfaultfd iotests: Add failure matching to common.qemu docs: Document the new default sizes of the qcow2 caches qcow2: Give the refcount cache the minimum possible size by default specs/qcow2: Clarify that compressed clusters have the COPIED bit reset Fix error message about compressed clusters with OFLAG_COPIED ... Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
- 15 5月, 2018 8 次提交
-
-
由 Kevin Wolf 提交于
- Copy-on-read block driver - The qcow2 default refcount cache size has been decreased - Various bug fixes # gpg: Signature made Tue May 15 16:18:25 2018 CEST # gpg: using RSA key F407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * mreitz/tags/pull-block-2018-05-15: (21 commits) iotests: Add test for -U/force-share conflicts qemu-img: Use only string options in img_open_opts qemu-io: Use purely string blockdev options block: Document BDRV_REQ_WRITE_UNCHANGED support qemu-img: Check post-truncation size iotests: Add test for COR across nodes iotests: Copy 197 for COR filter driver iotests: Clean up wrap image in 197 block: Support BDRV_REQ_WRITE_UNCHANGED in filters block/quorum: Support BDRV_REQ_WRITE_UNCHANGED block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes block: Add BDRV_REQ_WRITE_UNCHANGED flag block: BLK_PERM_WRITE includes ..._UNCHANGED block: Add COR filter driver iotests: Skip 181 and 201 without userfaultfd iotests: Add failure matching to common.qemu docs: Document the new default sizes of the qcow2 caches qcow2: Give the refcount cache the minimum possible size by default specs/qcow2: Clarify that compressed clusters have the COPIED bit reset Fix error message about compressed clusters with OFLAG_COPIED ... Signed-off-by: NKevin Wolf <kwolf@redhat.com>
-
由 Max Reitz 提交于
Signed-off-by: NMax Reitz <mreitz@redhat.com> Message-id: 20180502202051.15493-4-mreitz@redhat.com Reviewed-by: NEric Blake <eblake@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
由 Max Reitz 提交于
img_open_opts() takes a QemuOpts and converts them to a QDict, so all values therein are strings. Then it may try to call qdict_get_bool(), however, which will fail with a segmentation fault every time: $ ./qemu-img info -U --image-opts \ driver=file,filename=/dev/null,force-share=off [1] 27869 segmentation fault (core dumped) ./qemu-img info -U --image-opts driver=file,filename=/dev/null,force-share=off Fix this by using qdict_get_str() and comparing the value as a string. Also, when adding a force-share value to the QDict, add it as a string so it fits the rest of the dict. Cc: qemu-stable@nongnu.org Signed-off-by: NMax Reitz <mreitz@redhat.com> Message-id: 20180502202051.15493-3-mreitz@redhat.com Reviewed-by: NEric Blake <eblake@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
由 Max Reitz 提交于
Currently, qemu-io only uses string-valued blockdev options (as all are converted directly from QemuOpts) -- with one exception: -U adds the force-share option as a boolean. This in itself is already a bit questionable, but a real issue is that it also assumes the value already existing in the options QDict would be a boolean, which is wrong. That has the following effect: $ ./qemu-io -r -U --image-opts \ driver=file,filename=/dev/null,force-share=off [1] 15200 segmentation fault (core dumped) ./qemu-io -r -U --image-opts driver=file,filename=/dev/null,force-share=off Since @opts is converted from QemuOpts, the value must be a string, and we have to compare it as such. Consequently, it makes sense to also set it as a string instead of a boolean. Cc: qemu-stable@nongnu.org Signed-off-by: NMax Reitz <mreitz@redhat.com> Message-id: 20180502202051.15493-2-mreitz@redhat.com Reviewed-by: NEric Blake <eblake@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
由 Max Reitz 提交于
Add BDRV_REQ_WRITE_UNCHANGED to the list of flags honored during pwrite and pwrite_zeroes, and also add a note on when you absolutely need to support it. Signed-off-by: NMax Reitz <mreitz@redhat.com> Message-id: 20180502140359.18222-1-mreitz@redhat.com Reviewed-by: NEric Blake <eblake@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
由 Max Reitz 提交于
Some block drivers (iscsi and file-posix when dealing with device files) do not actually support truncation, even though they provide a .bdrv_truncate() method and will happily return success when providing a new size that does not exceed the current size. This is because these drivers expect the user to resize the image outside of qemu and then provide qemu with that information through the block_resize command (compare cb1b83e7). Of course, anyone using qemu-img resize will find that behavior useless. So we should check the actual size of the image after the supposedly successful truncation took place, emit an error if nothing changed and emit a warning if the target size was not met. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1523065Signed-off-by: NMax Reitz <mreitz@redhat.com> Message-id: 20180421163957.29872-1-mreitz@redhat.com Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
由 Max Reitz 提交于
COR across nodes (that is, you have some filter node between the actually COR target and the node that performs the COR) cannot reliably work together with the permission system when there is no explicit COR node that can request the WRITE_UNCHANGED permission for its child. This is because COR (currently) sneaks its requests by the usual permission checks, so it can work without a WRITE* permission; but if there is a filter node in between, that will re-issue the request, which then passes through the usual check -- and if nobody has requested a WRITE_UNCHANGED permission, that check will fail. There is no real direct fix apart from hoping that there is someone who has requested that permission; in case of just the qemu-io HMP command (and no guest device), however, that is not the case. The real real fix is to implement the copy-on-read flag through an implicitly added COR node. Such a node can request the necessary permissions as shown in this test. Signed-off-by: NMax Reitz <mreitz@redhat.com> Message-id: 20180421132929.21610-10-mreitz@redhat.com Reviewed-by: NKevin Wolf <kwolf@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-
由 Max Reitz 提交于
iotest 197 tests copy-on-read using the (now old) copy-on-read flag. Copy it to 215 and modify it to use the COR filter driver instead. Signed-off-by: NMax Reitz <mreitz@redhat.com> Message-id: 20180421132929.21610-9-mreitz@redhat.com Reviewed-by: NKevin Wolf <kwolf@redhat.com> Signed-off-by: NMax Reitz <mreitz@redhat.com>
-