Commits · 2c9639ecabdb18c3281eba03c509f55efde80aa0 · openeuler / qemu

01 Mar, 2017 1 commit

Add a new qmp command to start/stop replication · 2c9639ec

Committed by Zhang Chen 8 years ago

This QMP command lets a caller outside of QEMU start or stop replication.
Xen COLO, for example, needs this function.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wencongyang@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
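
As an illustration, a request of this kind might be built as below. This is a minimal Python sketch, not the commit's code. It assumes the command is the one named `xen-set-replication` in upstream QEMU's QAPI schema, with `enable`, `primary` and an optional `failover` argument; none of these names appear in the commit message above, and the helper `build_replication_command` is hypothetical.

```python
import json

def build_replication_command(enable, primary, failover=None):
    """Build a QMP request that starts or stops replication.

    Assumption: the command and argument names follow upstream QEMU's
    'xen-set-replication'; the commit message itself does not give them.
    """
    arguments = {"enable": enable, "primary": primary}
    if failover is not None:
        arguments["failover"] = failover
    return json.dumps({"execute": "xen-set-replication", "arguments": arguments})

# Start replication on the secondary side, then stop it and fail over.
start_on_secondary = build_replication_command(enable=True, primary=False)
stop_with_failover = build_replication_command(enable=False, primary=False,
                                               failover=True)
```

The resulting strings would be written to the QMP monitor socket by an external tool such as the Xen toolstack.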

14 Feb, 2017 3 commits

COLO: Don't process failover request while loading VM's state · a8664ba5

Committed by zhanghailiang 8 years ago

We should not do failover work while the main thread is loading
the VM's state. Otherwise the consistency of the VM's memory and
device state will be broken.

We will restart the failover process once the loading stage is over.
The new failover status 'RELAUNCH' records whether we
need to restart the process.

Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1484657864-21708-4-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
   Added a missing '(Since 2.9)'
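
The deferral described in this commit can be pictured with a toy model. The Python sketch below is only an illustration under stated assumptions: the class `SecondaryFailover`, its fields, and every status name except 'relaunch' are hypothetical, and the real code runs across threads rather than in one call chain.

```python
class SecondaryFailover:
    """Toy model: defer a failover request that arrives during VM state load."""

    def __init__(self):
        self.status = "none"          # hypothetical status names
        self.vmstate_loading = False
        self.failovers_done = 0

    def request_failover(self):
        if self.vmstate_loading:
            # Loading must not be interrupted: only record the request.
            self.status = "relaunch"
            return
        self.status = "active"
        self.failovers_done += 1      # stands in for the real failover work
        self.status = "completed"

    def load_vmstate(self, during_load=None):
        self.vmstate_loading = True
        if during_load is not None:
            during_load()             # e.g. a failover request arriving now
        self.vmstate_loading = False
        if self.status == "relaunch":
            # The loading stage is over: restart the deferred failover.
            self.status = "none"
            self.request_failover()

svm = SecondaryFailover()
svm.load_vmstate(during_load=svm.request_failover)
```

In this model the failover work runs exactly once, and only after `vmstate_loading` has been cleared.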

COLO: Shutdown related socket fd while do failover · c937b9a6

Committed by zhanghailiang 8 years ago

If the network connection between the primary host and the secondary host
breaks while the COLO/COLO incoming threads are doing read() or write(),
the call will block until the connection times out, and the failover
process will be blocked because of it.

So it is necessary to shut down all the socket fds used by COLO
to avoid this situation. Besides, we should close the corresponding
file descriptors only after the failover BH has shut them down,
or there will be an error.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1484657864-21708-3-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
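
The shutdown-then-close ordering can be demonstrated outside QEMU. The following Python sketch is an analogy, not the commit's code: it uses a local `socket.socketpair()` with a silent peer in place of a broken network link, and it relies on the behaviour seen on Linux, where shutting a socket down wakes a thread blocked in `recv()`.

```python
import socket
import threading

def blocked_reader(sock, results):
    # recv() blocks until data arrives, the peer closes, or the socket is shut down.
    results.append(sock.recv(4096))

local, peer = socket.socketpair()   # 'peer' stays silent, like a broken link
results = []
reader = threading.Thread(target=blocked_reader, args=(local, results))
reader.start()

# Failover path: shut the socket down first so the blocked recv() returns ...
local.shutdown(socket.SHUT_RDWR)
reader.join(timeout=5)

# ... and close the descriptors only after the reader has let go of them.
local.close()
peer.close()
```

Closing the descriptor while another thread is still blocked on it is the situation the commit message warns about, which is why the close comes after the join.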

COLO: fix setting checkpoint-delay not working properly · 479125d5

Committed by zhanghailiang 8 years ago

If we set checkpoint-delay through the command 'migrate-set-parameters',
it will not take effect until the current checkpoint-delay sleep finishes.
That is especially annoying when we want to change its value
from an extremely big one to a proper value.

Fix it by using a timer to implement checkpoint-delay.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <1484657864-21708-2-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
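
The difference between a fixed sleep and a re-armable timer can be sketched as follows. This Python example is an assumption-laden illustration rather than QEMU's implementation: the class `CheckpointTimer` and its methods are hypothetical, and a condition variable stands in for QEMU's timer.

```python
import threading
import time

class CheckpointTimer:
    """Wait for the next checkpoint; a changed delay takes effect at once."""

    def __init__(self, delay_ms):
        self._cond = threading.Condition()
        self._delay_ms = delay_ms
        self._last_checkpoint = time.monotonic()

    def set_delay(self, delay_ms):
        with self._cond:
            self._delay_ms = delay_ms
            self._cond.notify_all()   # wake the waiter so it re-reads the delay

    def wait_for_next_checkpoint(self):
        with self._cond:
            while True:
                deadline = self._last_checkpoint + self._delay_ms / 1000.0
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                self._cond.wait(timeout=remaining)
            self._last_checkpoint = time.monotonic()

timer = CheckpointTimer(delay_ms=60_000)            # an extremely big delay
threading.Timer(0.1, timer.set_delay, args=(200,)).start()
started = time.monotonic()
timer.wait_for_next_checkpoint()                    # no longer waits the full 60 s
elapsed = time.monotonic() - started
```

With a plain `time.sleep(60)` the new value would be ignored until the sleep ended; here the waiter recomputes its deadline as soon as the delay changes.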

01 Nov, 2016 1 commit

migration: fix compiler warning on uninitialized variable · 02ba9265

Committed by Jeff Cody 8 years ago

Some older GCC versions (e.g. 4.4.7) report a warning on an
uninitialized variable for 'request', even though all possible code
paths that reference 'request' initialize it first. To appease
these versions, initialize the variable to 0.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-id: 259818682e41b95ae60f1423b87954a3fe377639.1477950393.git.jcody@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

30 Oct, 2016 13 commits

configure: Support enable/disable COLO feature · 180fb750

Committed by zhanghailiang 8 years ago

Use configure --enable-colo/--disable-colo to switch COLO
support on/off.

The COLO feature doesn't depend on any external libraries,
so it is reasonable to enable COLO by default, which saves users
from recompiling QEMU if they want to use this capability.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>

COLO: Implement failover work for secondary VM · 9d2db376

Committed by zhanghailiang 8 years ago

If users require the SVM to take over the work, the COLO incoming thread
should exit from its loop while the failover BH helps to switch back to the
migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>

COLO: Implement the process of failover for primary VM · b3f7f0c5

Committed by zhanghailiang 8 years ago

On the primary side, if COLO gets a failover request from users
(to be exact, the 'x_colo_lost_heartbeat' command), the
COLO thread will exit the loop while the failover BH does the
cleanup work and resumes the VM.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>

COLO: Introduce state to record failover process · aef06085

Committed by zhanghailiang 8 years ago

When handling failover, COLO behaves differently depending on
the stage of the failover process, so here we introduce a global
atomic variable to record the failover status.

We add four failover statuses to indicate the different stages of the
failover process. You should use the helpers to get and set the value.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
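
A status variable guarded by get/set helpers might look like the sketch below. It is written in Python for brevity and makes several assumptions: the four status names and the helper names `failover_set_state` and `failover_get_state` are not given in the commit message, and a lock emulates the atomic compare-and-exchange that C code would use.

```python
import threading
from enum import Enum

class FailoverStatus(Enum):
    # Assumed names for the four statuses; the commit message does not list them.
    NONE = 0
    REQUIRE = 1
    ACTIVE = 2
    COMPLETED = 3

_lock = threading.Lock()
_status = FailoverStatus.NONE

def failover_set_state(old, new):
    """Switch to 'new' only if the current status is 'old'; return the status seen."""
    global _status
    with _lock:
        seen = _status
        if seen is old:
            _status = new
        return seen

def failover_get_state():
    with _lock:
        return _status

# Two competing requests: only the first one finds the status still NONE.
first = failover_set_state(FailoverStatus.NONE, FailoverStatus.REQUIRE)
second = failover_set_state(FailoverStatus.NONE, FailoverStatus.REQUIRE)
```

A caller knows its transition won when the returned value equals the `old` status it passed in, which is how a second concurrent failover request can be rejected.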

COLO: Add 'x-colo-lost-heartbeat' command to trigger failover · d89e666e

Committed by zhanghailiang 8 years ago

We leave users free to choose whatever heartbeat solution they want.
If the heartbeat is lost, or they detect other errors, they can use the
experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover,
and COLO will act accordingly.

For example, if the command is sent to the Primary side,
the Primary side will exit COLO mode and do cleanup work,
and then the PVM will take over the service work. If sent to the Secondary side,
the Secondary side will run the failover work and then take over the PVM's service work.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
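
Since the heartbeat mechanism is left to the user, one possible shape for it is sketched below in Python. Everything about the monitor is hypothetical (the class `HeartbeatMonitor`, its timeout, the injected clock); only the QMP command name 'x-colo-lost-heartbeat' comes from the commit title.

```python
import json
import time

class HeartbeatMonitor:
    """Hypothetical user-side heartbeat check; COLO itself does not provide one."""

    def __init__(self, timeout_s, now=time.monotonic):
        self._timeout_s = timeout_s
        self._now = now
        self._last_seen = now()

    def beat(self):
        # Call this whenever a heartbeat from the peer arrives.
        self._last_seen = self._now()

    def poll(self):
        """Return the QMP request to send if the peer looks dead, else None."""
        if self._now() - self._last_seen <= self._timeout_s:
            return None
        return json.dumps({"execute": "x-colo-lost-heartbeat"})

# Drive the monitor with a fake clock so the example is deterministic.
clock = [0.0]
monitor = HeartbeatMonitor(timeout_s=2.0, now=lambda: clock[0])
clock[0] = 1.0
still_alive = monitor.poll()      # within the timeout: nothing to send
clock[0] = 5.0
request = monitor.poll()          # timeout exceeded: failover request
```

The same request can be sent to either side; which VM takes over depends on where it is delivered, as the commit message explains.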

COLO: Synchronize PVM's state to SVM periodically · 18cc23d7

Committed by zhanghailiang 8 years ago

Do a checkpoint periodically; the default interval is 200ms.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>

COLO: Load VMState into QIOChannelBuffer before restore it · 4291d372

Committed by zhanghailiang 8 years ago

We should not destroy the state of the SVM (Secondary VM) until we have
received the complete data of the PVM's state, in case the primary fails
while sending the state. So we cache the VM's state on the secondary side
before loading it into the SVM.

Besides, we should call qemu_system_reset() before loading the VM state,
which ensures the data is intact.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
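
The buffer-first rule can be illustrated with a toy model. In the Python sketch below, `SecondaryVM` and `load_checkpoint` are hypothetical names, an in-memory `io.BytesIO` stands in for the channel from the primary, and a byte string stands in for the VM state.

```python
import io

class SecondaryVM:
    """Toy stand-in for the SVM: 'state' is only replaced by a complete checkpoint."""

    def __init__(self, state):
        self.state = state

    def load_checkpoint(self, stream, size):
        buffered = stream.read(size)          # cache first, touch nothing yet
        if len(buffered) != size:
            raise ConnectionError("primary failed while sending the state")
        self.state = b""                      # stands in for qemu_system_reset()
        self.state = buffered                 # now load the cached state

svm = SecondaryVM(state=b"old-state")
try:
    svm.load_checkpoint(io.BytesIO(b"new-st"), size=9)    # truncated transfer
except ConnectionError:
    pass
state_after_failure = svm.state               # the old state is still usable

svm.load_checkpoint(io.BytesIO(b"new-state"), size=9)
state_after_success = svm.state
```

Because the reset and load happen only after the full buffer has arrived, a primary failure mid-transfer leaves the secondary with a consistent state to fail over to.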

COLO: Send PVM state to secondary side when do checkpoint · a91246c9

Committed by zhanghailiang 8 years ago

VM checkpointing synchronizes the state of the PVM to the SVM, just
like migration does, so we re-use the save helpers to migrate the
PVM's state to the Secondary side.

COLO needs to cache the data of the VM's state on the secondary side before
synchronizing it to the SVM, and it needs the size of the data to determine
how much data should be read on the secondary side.
So here we get the size of the data by saving it into an I/O channel
before sending it to the secondary side.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
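
Saving into a buffer first so that the size can be sent ahead of the data is a common framing pattern. The Python sketch below shows the idea only; the function names and the 8-byte big-endian size field are assumptions, not QEMU's wire format.

```python
import io
import struct

def send_vmstate(channel, save_state):
    """Save into a buffer first so the size is known, then send size and data."""
    buffer = io.BytesIO()
    save_state(buffer)                           # stands in for the save helpers
    data = buffer.getvalue()
    channel.write(struct.pack(">Q", len(data)))  # assumed 8-byte big-endian size
    channel.write(data)
    return len(data)

def receive_vmstate(channel):
    """Read the size, then exactly that many bytes of state."""
    (size,) = struct.unpack(">Q", channel.read(8))
    return channel.read(size)

wire = io.BytesIO()                              # in-memory stand-in for the link
sent = send_vmstate(wire, lambda buf: buf.write(b"ram+devices"))
wire.seek(0)
received = receive_vmstate(wire)
```

Knowing the size up front lets the receiver tell a complete checkpoint from a truncated one before it touches the running secondary.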

COLO: Introduce checkpointing protocol · 4f97558e

Committed by zhanghailiang 8 years ago

We need a user-defined communication protocol to control
the checkpointing process.

A new checkpointing request is started by the Primary VM,
and the interaction proceeds as below:

Checkpoint synchronizing points:

                   Primary               Secondary
                                            initial work
'checkpoint-ready'    <-------------------- @

'checkpoint-request'  @ -------------------->
                                            Suspend (Only in hybrid mode)
'checkpoint-reply'    <-------------------- @
                      Suspend&Save state
'vmstate-send'        @ -------------------->
                      Send state            Receive state
'vmstate-received'    <-------------------- @
                      Release packets       Load state
'vmstate-load'        <-------------------- @
                      Resume                Resume (Only in hybrid mode)

                      Start Comparing (Only in hybrid mode)
NOTE:
 1) '@' marks the side that sends the message.
 2) Every sync-point is synchronized by the two sides with only
    one handshake (single direction) for low latency.
    If stricter synchronization is required, an opposite-direction
    sync-point should be added.
 3) Since sync-points are single direction, the remote side may
    go forward a lot when this side just receives the sync-point.
 4) For now, we only support 'periodic' checkpoint, for which
    the Secondary VM is not running; later we will support 'hybrid' mode.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
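
The message order in the diagram can be written down as data and replayed. This Python sketch is illustrative only: the message names are taken from the diagram, while the table `CHECKPOINT_ROUND`, the function `run_round`, the queues, and the treatment of 'checkpoint-ready' as a one-off start-up message are assumptions.

```python
from collections import deque

# (message, sender) in the order shown in the diagram of this commit message.
CHECKPOINT_ROUND = [
    ("checkpoint-request", "primary"),
    ("checkpoint-reply", "secondary"),
    ("vmstate-send", "primary"),
    ("vmstate-received", "secondary"),
    ("vmstate-load", "secondary"),
]

def run_round(to_secondary, to_primary):
    """Replay one checkpoint round, pushing each message onto the right queue."""
    for message, sender in CHECKPOINT_ROUND:
        queue = to_secondary if sender == "primary" else to_primary
        queue.append(message)

to_secondary, to_primary = deque(), deque()
to_primary.append("checkpoint-ready")   # assumed to be sent once, before any round
run_round(to_secondary, to_primary)
```

Each entry is a single-direction sync-point, matching note 2: the sender does not wait for an acknowledgement of that particular message before moving on.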

COLO: Establish a new communicating path for COLO · 56ba83d2

Committed by zhanghailiang 8 years ago

This new communication path will be used for returning messages
from the Secondary side to the Primary side.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>

migration: Switch to COLO process after finishing loadvm · 25d0c16f

Committed by zhanghailiang 8 years ago

Switch from the normal migration loadvm process into the COLO checkpoint
process if COLO mode is enabled.

We add three new members to struct MigrationIncomingState:
'have_colo_incoming_thread' and 'colo_incoming_thread' record the
COLO-related thread for the secondary VM, and 'migration_incoming_co'
records the original migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>

migration: Enter into COLO mode after migration if COLO is enabled · 0b827d5e

Committed by zhanghailiang 8 years ago

Add a new migration state: MIGRATION_STATUS_COLO. The migration source side
enters this state after the first live migration has finished successfully,
if COLO is enabled by the command 'migrate_set_capability x-colo on'.

We reuse the migration thread, so the checkpointing process will be handled
in the migration thread.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
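
For callers that use QMP rather than HMP, the same switch might be expressed as below. This is a hedged Python sketch: it assumes the QMP command 'migrate-set-capabilities' as the counterpart of the HMP command quoted above, and it assumes that 'query-migrate' reports the new state as the status string 'colo'. Both helper functions are hypothetical.

```python
import json

def enable_colo_request():
    """QMP counterpart (assumed) of the HMP command 'migrate_set_capability x-colo on'."""
    return json.dumps({
        "execute": "migrate-set-capabilities",
        "arguments": {"capabilities": [{"capability": "x-colo", "state": True}]},
    })

def in_colo_state(query_migrate_reply):
    """Check a 'query-migrate' reply for the new state (assumed to read 'colo')."""
    return query_migrate_reply.get("return", {}).get("status") == "colo"

request = enable_colo_request()
# Hand-written sample replies, not captured from a real QEMU instance.
before = in_colo_state({"return": {"status": "active"}})
after = in_colo_state({"return": {"status": "colo"}})
```

The capability has to be set before the first migration starts, since the source enters the COLO state only when that migration completes.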

migration: Introduce capability 'x-colo' to migration · 35a6ed4f

Committed by zhanghailiang 8 years ago

We add the helper function colo_supported() to indicate whether
COLO is supported or not, and use it to control whether the
'x-colo' string is shown to users. They can use the QMP command
'query-migrate-capabilities' or the HMP command 'info migrate_capabilities'
to learn if COLO is supported.

COLO (COarse-Grain LOck Stepping) is disabled by default.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
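
A management tool could use the capability list to detect COLO support, roughly as follows. The Python sketch assumes the usual shape of a 'query-migrate-capabilities' reply (a list of capability/state pairs under "return"); the helper names and the sample reply are hypothetical.

```python
def colo_listed(reply):
    """Return True if a 'query-migrate-capabilities' reply lists 'x-colo'."""
    return any(cap.get("capability") == "x-colo" for cap in reply.get("return", []))

def colo_enabled(reply):
    """Return True only if 'x-colo' is listed and switched on."""
    return any(cap.get("capability") == "x-colo" and cap.get("state") is True
               for cap in reply.get("return", []))

# Hand-written sample reply, not captured from a real QEMU instance.
sample_reply = {"return": [
    {"capability": "xbzrle", "state": False},
    {"capability": "x-colo", "state": False},
]}
supported = colo_listed(sample_reply)
enabled = colo_enabled(sample_reply)
```

With a build where colo_supported() is false, 'x-colo' would be absent from the list and `colo_listed` would return False.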
