Introduce yet another migration version in API.
Migration just seems to go from bad to worse. We already had to introduce a second migration protocol when adding the QEMU driver, since the one from Xen was insufficiently flexible to cope with passing the data the QEMU driver required. It turns out that this protocol still has some flaws that we need to address. The current sequence is * Src: DumpXML - Generate XML to pass to dst * Dst: Prepare - Get ready to accept incoming VM - Generate optional cookie to pass to src * Src: Perform - Start migration and wait for send completion - Kill off VM if successful, resume if failed * Dst: Finish - Wait for recv completion and check status - Kill off VM if unsuccessful The problems with this are: - Since the first step is a generic 'DumpXML' call, we can't add in other migration specific data. eg, we can't include any VM lease data from lock manager plugins - Since the first step is a generic 'DumpXML' call, we can't emit any 'migration begin' event on the source, or have any hook that runs right at the start of the process - Since there is no final step on the source, if the Finish method fails to receive all migration data & has to kill the VM, then there's no way to resume the original VM on the source This patch attempts to introduce a version 3 that uses the improved 5 step sequence * Src: Begin - Generate XML to pass to dst - Generate optional cookie to pass to dst * Dst: Prepare - Get ready to accept incoming VM - Generate optional cookie to pass to src * Src: Perform - Start migration and wait for send completion - Generate optional cookie to pass to dst * Dst: Finish - Wait for recv completion and check status - Kill off VM if failed, resume if success - Generate optional cookie to pass to src * Src: Confirm - Kill off VM if success, resume if failed The API is designed to allow both input and output cookies in all methods where applicable. This lets us pass around arbitrary extra driver specific data between src & dst during migration. Combined with the extra 'Begin' method this lets us pass lease information from source to dst at the start of migration Moving the killing of the source VM out of Perform and into Confirm, means we can now recover if the dst host can't successfully Finish receiving migration data.
Showing
想要评论请 注册 或 登录