Commit 400ac797, authored by Eric Blake

blockjob: make block pivot safer

Since libvirt drops locks between issuing a monitor command and
getting a response, it is possible for libvirtd to be restarted
before getting a response on a block-job-complete command; worse, it
is also possible for the guest to shut itself down during the window
while libvirtd is down, ending the qemu process.  A management app
needs to know if the pivot happened (and the destination file
contains guest contents not in the source) or failed (and the source
file contains guest contents not in the destination), but since
the job is finished, 'query-block-jobs' no longer tracks the
status of the job, and if the qemu process itself has disappeared,
even 'query-block' cannot be checked to ask qemu its current state.

At the time of this patch, the design for the persistent bitmap has
not been clarified, so a followup patch will be needed once qemu
actually figures out how to expose it, and we figure out how to use
it.  In the meantime, we have a solution that avoids the worst of
the problem.  [This problem was first analyzed with the RHEL 6.3
__com.redhat_drive-reopen command, which partly explains why
upstream qemu 1.3 ditched the drive-reopen idea and went with
block-job-complete plus persistent bitmap instead.]

If we surround 'drive-reopen' with a pause/resume pair, then we can
guarantee that the guest cannot modify either the source or the
destination file during the window of libvirtd uncertainty.  The
management app is then guaranteed one of two things: either libvirt
knows the outcome and reported it correctly; or, on libvirtd
restart, the guest is still paused and the qemu process cannot have
disappeared due to guest shutdown.  In the latter case, the paused
state is the clue that the management app must run its recovery
protocol, with the source and destination files still in sync and
with 'query-block' still available as part of that recovery.  My
testing shows that the pause window will typically be only a
fraction of a second.
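
The pause/resume bracketing described above can be sketched in
isolation.  This is a minimal model under stated assumptions, not
libvirt code: struct guest, guest_pause, guest_resume, guest_pivot
and pivot_with_pause are hypothetical stand-ins for the domain
object, the CPU stop/start helpers and the monitor pivot command.
It only illustrates the control flow: pause a running guest, attempt
the pivot, and resume on the way out only if this call did the
pausing and the guest still exists.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for a guest and its monitor; not libvirt types. */
enum guest_state { GUEST_RUNNING, GUEST_PAUSED, GUEST_GONE };

struct guest {
    enum guest_state state;
    bool pivot_will_fail;   /* knob: make the simulated pivot fail */
    bool on_mirror;         /* true once the disk points at the destination */
};

/* Simulated vCPU stop; fails if the process is already gone. */
static int guest_pause(struct guest *g)
{
    if (g->state == GUEST_GONE)
        return -1;
    g->state = GUEST_PAUSED;
    return 0;
}

/* Simulated vCPU start; fails if the process is already gone. */
static int guest_resume(struct guest *g)
{
    if (g->state == GUEST_GONE)
        return -1;
    g->state = GUEST_RUNNING;
    return 0;
}

/* Simulated pivot: switch to the mirror unless told to fail. */
static int guest_pivot(struct guest *g)
{
    if (g->pivot_will_fail)
        return -1;
    g->on_mirror = true;
    return 0;
}

/* Pause around the pivot; resume only if we paused the guest ourselves. */
static int pivot_with_pause(struct guest *g)
{
    int ret = -1;
    bool resume = false;

    if (g->state == GUEST_RUNNING) {
        if (guest_pause(g) < 0)
            goto cleanup;
        resume = true;
    }

    ret = guest_pivot(g);

cleanup:
    if (resume && g->state != GUEST_GONE && guest_resume(g) < 0)
        fprintf(stderr, "resuming after pivot failed\n");
    return ret;
}
```

In this model a guest that was already paused before the call stays
paused afterwards, so the caller's own pause is never undone.  If the
process running pivot_with_pause were killed between guest_pause and
guest_resume, an observer would find the guest still paused, which is
the signal the commit message relies on.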

* src/qemu/qemu_driver.c (qemuDomainBlockPivot): Pause around
drive-reopen.
(qemuDomainBlockJobImpl): Update caller.
Parent: eaba79d2
@@ -12552,13 +12552,15 @@ cleanup:
  * abort with pivot; this updates the VM definition as appropriate, on
  * either success or failure. */
 static int
-qemuDomainBlockPivot(struct qemud_driver *driver, virDomainObjPtr vm,
+qemuDomainBlockPivot(virConnectPtr conn,
+                     struct qemud_driver *driver, virDomainObjPtr vm,
                      const char *device, virDomainDiskDefPtr disk)
 {
     int ret = -1;
     qemuDomainObjPrivatePtr priv = vm->privateData;
     virDomainBlockJobInfo info;
     const char *format = virStorageFileFormatTypeToString(disk->mirrorFormat);
+    bool resume = false;
 
     /* Probe the status, if needed. */
     if (!disk->mirroring) {
@@ -12585,6 +12587,29 @@ qemuDomainBlockPivot(struct qemud_driver *driver, virDomainObjPtr vm,
         goto cleanup;
     }
 
+    /* If we are using the older 'drive-reopen', we want to make sure
+     * that management apps can tell whether the command succeeded,
+     * even if libvirtd is restarted at the wrong time. To accomplish
+     * that, we pause the guest before drive-reopen, and resume it
+     * only when we know the outcome; if libvirtd restarts, then
+     * management will see the guest still paused, and know that no
+     * guest I/O has caused the source and mirror to diverge. XXX
+     * With the newer 'block-job-complete', we need to use a
+     * persistent bitmap to make things safe; so for now, we just
+     * blindly pause the guest. */
+    if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
+        if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_SAVE,
+                                QEMU_ASYNC_JOB_NONE) < 0)
+            goto cleanup;
+        resume = true;
+        if (!virDomainObjIsActive(vm)) {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("guest unexpectedly quit"));
+            goto cleanup;
+        }
+    }
+
     /* Attempt the pivot. */
     qemuDomainObjEnterMonitorWithDriver(driver, vm);
     ret = qemuMonitorDrivePivot(priv->mon, device, disk->mirror, format);
@@ -12620,6 +12645,14 @@ qemuDomainBlockPivot(struct qemud_driver *driver, virDomainObjPtr vm,
     }
 
 cleanup:
+    if (resume && virDomainObjIsActive(vm) &&
+        qemuProcessStartCPUs(driver, vm, conn,
+                             VIR_DOMAIN_RUNNING_UNPAUSED,
+                             QEMU_ASYNC_JOB_NONE) < 0 &&
+        virGetLastError() == NULL) {
+        virReportError(VIR_ERR_OPERATION_FAILED, "%s",
+                       _("resuming after drive-reopen failed"));
+    }
     return ret;
 }
@@ -12703,7 +12736,7 @@ qemuDomainBlockJobImpl(virDomainPtr dom, const char *path, const char *base,
     if (disk->mirror && mode == BLOCK_JOB_ABORT &&
         (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)) {
-        ret = qemuDomainBlockPivot(driver, vm, device, disk);
+        ret = qemuDomainBlockPivot(dom->conn, driver, vm, device, disk);
         goto endjob;
     }