COarse-grained LOck-stepping Virtual Machines for Non-stop Service
----------------------------------------
Copyright (c) 2016 Intel Corporation
Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
Copyright (c) 2016 Fujitsu, Corp.

This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.

This document gives an overview of COLO's design and how to use it.

== Background ==
Virtual machine (VM) replication is a well-known technique for providing
application-agnostic, software-implemented hardware fault tolerance,
also known as "non-stop service".

COLO (COarse-grained LOck-stepping) is a high-availability solution.
The primary VM (PVM) and the secondary VM (SVM) run in parallel. They receive
the same requests from the client and generate responses in parallel.
If the response packets from the PVM and the SVM are identical, they are
released immediately. Otherwise, a VM checkpoint is performed on demand.

== Architecture ==

The architecture of COLO is shown in the diagram below.
It consists of a pair of networked physical nodes:
the primary node running the PVM, and the secondary node running the SVM
to maintain a valid replica of the PVM.
The PVM and SVM execute in parallel and generate response packets for
client requests according to the application semantics.

The incoming packets from the client or external network are received by the
primary node, and then forwarded to the secondary node, so that both the PVM
and the SVM are stimulated with the same requests.

COLO receives the outbound packets from both the PVM and SVM and compares them
before allowing the output to be sent to clients.

The SVM qualifies as a valid replica of the PVM as long as it generates
identical responses to all client requests. Once differences in the outputs
of the PVM and SVM are detected, COLO withholds transmission of the
outbound packets until it has successfully synchronized the PVM state to the SVM.
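
The compare-and-withhold rule described above can be sketched as follows.
This is a simplified illustration in Python, not COLO's actual comparison
code: packets are modelled as plain byte strings, and the names
compare_outputs and Decision are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Decision:
    """Result of comparing the outputs of the PVM and the SVM."""
    released: List[bytes] = field(default_factory=list)  # sent to the client
    checkpoint_needed: bool = False                       # set on first mismatch


def compare_outputs(pvm_packets: List[bytes], svm_packets: List[bytes]) -> Decision:
    """Release PVM packets while they match the SVM's; stop at the first difference."""
    decision = Decision()
    # zip() only pairs packets that have a counterpart; a packet without one
    # is neither released nor treated as a mismatch in this sketch.
    for pvm_pkt, svm_pkt in zip(pvm_packets, svm_packets):
        if pvm_pkt != svm_pkt:
            # Outputs diverged: withhold this and all later packets until the
            # PVM state has been synchronized to the SVM by a checkpoint.
            decision.checkpoint_needed = True
            break
        decision.released.append(pvm_pkt)
    return decision
```

With identical streams every paired packet is released; a single differing
packet stops the release and requests a checkpoint.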

  Primary Node                                                            Secondary Node
+------------+  +-----------------------+       +------------------------+  +------------+
|            |  |       HeartBeat       +<----->+       HeartBeat        |  |            |
| Primary VM |  +-----------+-----------+       +-----------+------------+  |Secondary VM|
|            |              |                               |               |            |
|            |  +-----------|-----------+       +-----------|------------+  |            |
|            |  |QEMU   +---v----+      |       |QEMU  +----v---+        |  |            |
|            |  |       |Failover|      |       |      |Failover|        |  |            |
|            |  |       +--------+      |       |      +--------+        |  |            |
|            |  |   +---------------+   |       |   +---------------+    |  |            |
|            |  |   | VM Checkpoint +-------------->+ VM Checkpoint |    |  |            |
|            |  |   +---------------+   |       |   +---------------+    |  |            |
|Requests<--------------------------\ /-----------------\ /--------------------->Requests|
|            |  |                   ^ ^ |       |       | |              |  |            |
|Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
|            |  |               | | | | |       |  | |  | |              |  |            |
|            |  | +-----------+ | | | | |       |  | |  | | +----------+ |  |            |
|            |  | | COLO disk | | | | | |       |  | |  | | | COLO disk| |  |            |
|            |  | |   Manager +---------------------------->| Manager  | |  |            |
|            |  | ++----------+ v v | | |       |  | v  v | +---------++ |  |            |
|            |  |  |+-----------+-+-+-++|       | ++-+--+-+---------+ |  |  |            |
|            |  |  ||   COLO Proxy     ||       | |   COLO Proxy    | |  |  |            |
|            |  |  || (compare packet  ||       | |(adjust sequence | |  |  |            |
|            |  |  ||and mirror packet)||       | |    and ACK)     | |  |  |            |
|            |  |  |+------------+---+-+|       | +-----------------+ |  |  |            |
+------------+  +-----------------------+       +------------------------+  +------------+
+------------+     |             |   |                                |     +------------+
| VM Monitor |     |             |   |                                |     | VM Monitor |
+------------+     |             |   |                                |     +------------+
+---------------------------------------+       +----------------------------------------+
|   Kernel         |             |   |  |       |   Kernel            |                  |
+---------------------------------------+       +----------------------------------------+
                   |             |   |                                |
    +--------------v+  +---------v---+--+       +------------------+ +v-------------+
    |   Storage     |  |External Network|       | External Network | |   Storage    |
    +---------------+  +----------------+       +------------------+ +--------------+


== Components introduction ==

The architecture diagram above contains several components.
Their functions are described below.

HeartBeat:
Runs on both the primary and secondary nodes to periodically check platform
availability. When the primary node suffers a hardware fail-stop failure,
its heartbeat stops responding, and the secondary node triggers a failover
as soon as it detects the absence.

COLO disk Manager:
When the primary VM writes data to its image, the COLO disk manager captures
this data and sends it to the secondary VM, which makes sure the content of the
secondary VM's image is consistent with the content of the primary VM's image.
For more details, please refer to docs/block-replication.txt.

Checkpoint/Failover Controller:
Modifies the save/restore flow to realize continuous migration, making sure
that the state of the VM on the secondary side is always consistent with the
VM on the primary side.

COLO Proxy:
Delivers packets to the primary and secondary, and then compares the responses
from both sides. It then decides whether to start a checkpoint according to
some rules. Please refer to docs/colo-proxy.txt for more information.

Note:
HeartBeat has not been implemented yet, so you need to trigger the failover
process by using the 'x-colo-lost-heartbeat' command.
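
Because HeartBeat is not implemented, an external watchdog has to decide
when to issue 'x-colo-lost-heartbeat'. The following Python sketch shows one
possible shape for such a watchdog. It is purely hypothetical: the class
HeartbeatMonitor, its timeout and the way liveness is reported are
assumptions for illustration and are not part of QEMU. Only the QMP command
name comes from this document.

```python
import json
from typing import Optional


class HeartbeatMonitor:
    """Hypothetical external watchdog; COLO itself does not provide this."""

    def __init__(self, timeout_s: float, now: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen = now

    def beat(self, now: float) -> None:
        """Record that the peer node answered at time `now`."""
        self.last_seen = now

    def poll(self, now: float) -> Optional[str]:
        """Return the QMP command to send once the peer has been silent too long."""
        if now - self.last_seen > self.timeout_s:
            return json.dumps({"execute": "x-colo-lost-heartbeat"})
        return None
```

Timestamps are passed in explicitly so that the logic can be exercised
without real clocks or network traffic.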

== COLO operation status ==

+-----------------+
|                 |
|    Start COLO   |
|                 |
+--------+--------+
         |
         |  Main qmp command:
         |  migrate-set-capabilities with x-colo
         |  migrate
         |
         v
+--------+--------+
|                 |
|  COLO running   |
|                 |
+--------+--------+
         |
         |  Main qmp command:
         |  x-colo-lost-heartbeat
         |  or
         |  some error happened
         v
+--------+--------+
|                 |  send qmp event:
|  COLO failover  |  COLO_EXIT
|                 |
+-----------------+

COLO uses qmp commands to switch and report its operation status.
The diagram shows only the main qmp commands; the details are given
in the test procedure below.
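
The transitions in the diagram can be tracked by a management tool that
observes the QMP traffic. The sketch below is a minimal illustration in
Python under stated assumptions: the function next_state and the state
constants are hypothetical, messages are assumed to be strict JSON (double
quotes), and only the command and event names shown in the diagram are
inspected.

```python
import json

# States taken from the diagram above.
START, RUNNING, FAILOVER = "Start COLO", "COLO running", "COLO failover"


def next_state(state: str, qmp_message: str) -> str:
    """Advance the tracked state from one QMP message (command or event)."""
    msg = json.loads(qmp_message)
    if state == START and msg.get("execute") == "migrate":
        return RUNNING
    if state == RUNNING and (
        msg.get("execute") == "x-colo-lost-heartbeat"
        or msg.get("event") == "COLO_EXIT"
    ):
        return FAILOVER
    # Any other message leaves the state unchanged.
    return state
```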

== Test procedure ==
Note: Here we are running both instances on the same host for testing;
change the IP addresses if you want to run it on two hosts. Initially
127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.

== Startup qemu ==
1. Primary:
Note: Initially, $imagefolder/primary.qcow2 needs to be copied to all hosts.
You don't need to change any IPs here, because 0.0.0.0 listens on any
interface. The chardevs with 127.0.0.1 IPs loop back to the local qemu
instance.

# imagefolder="/mnt/vms/colo-test-primary"

# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp 1 -qmp stdio \
   -device piix3-usb-uhci -device usb-tablet -name primary \
   -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server=on,wait=off \
   -chardev socket,id=compare1,host=0.0.0.0,port=9004,server=on,wait=on \
   -chardev socket,id=compare0,host=127.0.0.1,port=9001,server=on,wait=off \
   -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
   -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server=on,wait=off \
   -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
   -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
   -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
   -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
   -object iothread,id=iothread1 \
   -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
outdev=compare_out0,iothread=iothread1 \
   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S

2. Secondary:
Note: The active and hidden images need to be created only once, and their
size should be the same as that of primary.qcow2. Again, you don't need to
change any IPs here, except for the $primary_ip variable.

# imagefolder="/mnt/vms/colo-test-secondary"
# primary_ip=127.0.0.1

# qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G

# qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G

# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp 1 -qmp stdio \
   -device piix3-usb-uhci -device usb-tablet -name secondary \
   -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect=1 \
   -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect=1 \
   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
   -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
   -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
file.backing.backing=parent0 \
   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
children.0=childs0 \
   -incoming tcp:0.0.0.0:9998


3. On the Secondary VM's QEMU monitor, issue the following commands:
{'execute':'qmp_capabilities'}
{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': {'host': '0.0.0.0', 'port': '9999'} } } }
{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0', 'writable': true } }

Note:
  a. The qmp commands nbd-server-start and nbd-server-add must be run
     before running the qmp command migrate on the primary QEMU.
  b. The active disk, hidden disk and nbd target must all have the same
     length.
  c. It is better to put the active disk and hidden disk on a ramdisk. They
     will be merged into the parent disk on failover.
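
The equal-length requirement in note b can be checked before starting the
VMs. The sketch below, in Python, assumes that the text passed in was
produced by running `qemu-img info --output=json <image>` for each image
and reads the "virtual-size" field from that output. The helper names
virtual_size and sizes_match are hypothetical.

```python
import json
from typing import Dict


def virtual_size(qemu_img_info_json: str) -> int:
    """Extract the virtual disk size in bytes from `qemu-img info --output=json`."""
    return int(json.loads(qemu_img_info_json)["virtual-size"])


def sizes_match(info_by_image: Dict[str, str]) -> bool:
    """True if every listed image reports the same virtual size."""
    sizes = {virtual_size(text) for text in info_by_image.values()}
    return len(sizes) == 1
```

A caller would collect the JSON output for primary.qcow2,
secondary-active.qcow2 and secondary-hidden.qcow2 and refuse to continue if
sizes_match returns False.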

4. On the Primary VM's QEMU monitor, issue the following commands:
{'execute':'qmp_capabilities'}
{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node': 'replication0' } }
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments': {'uri': 'tcp:127.0.0.2:9998' } }

  Note:
  a. There should be only one NBD Client for each primary disk.
  b. These qmp commands must be run after the qmp commands on the
     secondary qemu (step 3) have been run.
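
When the two VMs run on different hosts, the Secondary's address appears in
two of the step 4 commands. The following Python sketch generates the same
command sequence for an arbitrary address. The function
primary_colo_commands and its parameters are hypothetical helpers; the
commands themselves are the ones listed in step 4, emitted as strict JSON.

```python
import json
from typing import List


def primary_colo_commands(secondary_ip: str, nbd_port: int = 9999,
                          migrate_port: int = 9998) -> List[str]:
    """Build the step 4 QMP commands for a given Secondary host address."""
    drive_add = (
        "drive_add -n buddy driver=replication,mode=primary,"
        f"file.driver=nbd,file.host={secondary_ip},file.port={nbd_port},"
        "file.export=parent0,node-name=replication0"
    )
    commands = [
        {"execute": "qmp_capabilities"},
        {"execute": "human-monitor-command",
         "arguments": {"command-line": drive_add}},
        {"execute": "x-blockdev-change",
         "arguments": {"parent": "colo-disk0", "node": "replication0"}},
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [{"capability": "x-colo", "state": True}]}},
        {"execute": "migrate",
         "arguments": {"uri": f"tcp:{secondary_ip}:{migrate_port}"}},
    ]
    # One JSON document per line, ready to be written to the QMP channel.
    return [json.dumps(cmd) for cmd in commands]
```

Calling primary_colo_commands("127.0.0.2") yields the commands from step 4
in the same order.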

5. After the above steps, whenever you make changes to the PVM, the SVM will be synced.
You can issue the command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }'
to change the idle checkpoint period.

6. Failover test
You can kill one of the VMs and fail over to the surviving VM:

If you killed the Secondary, then follow "Primary Failover". After that,
if you want to resume the replication, follow "Primary resume replication".

If you killed the Primary, then follow "Secondary Failover". After that,
if you want to resume the replication, follow "Secondary resume replication".

== Primary Failover ==
The Secondary died; resume on the Primary.

{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'child': 'children.1'} }
{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_del replication0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'comp0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'iothread1' } }
{'execute': 'object-del', 'arguments':{ 'id': 'm0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'redire0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'redire1' } }
{'execute': 'x-colo-lost-heartbeat' }

== Secondary Failover ==
The Primary died; resume on the Secondary and prepare to become the new Primary.

{'execute': 'nbd-server-stop'}
{'execute': 'x-colo-lost-heartbeat'}

{'execute': 'object-del', 'arguments':{ 'id': 'f2' } }
{'execute': 'object-del', 'arguments':{ 'id': 'f1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red0' } }

{'execute': 'chardev-add', 'arguments':{ 'id': 'mirror0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '0.0.0.0', 'port': '9003' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare1', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '0.0.0.0', 'port': '9004' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9001' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare0-0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9001' } }, 'server': false } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare_out', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9005' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare_out0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9005' } }, 'server': false } } } }
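
The six chardev-add commands above differ only in id, host, port and the
server flag, so they can be generated from a small table. The Python sketch
below does this; chardev_add_socket and NEW_PRIMARY_CHARDEVS are
hypothetical names, and the values are copied from the listing above.

```python
import json


def chardev_add_socket(chardev_id: str, host: str, port: int, server: bool) -> str:
    """Build one 'chardev-add' QMP command for an inet socket chardev."""
    command = {
        "execute": "chardev-add",
        "arguments": {
            "id": chardev_id,
            "backend": {
                "type": "socket",
                "data": {
                    # The port is passed as a string, as in the listing above.
                    "addr": {"type": "inet",
                             "data": {"host": host, "port": str(port)}},
                    "server": server,
                },
            },
        },
    }
    return json.dumps(command)


# The six chardevs from the listing above: (id, host, port, server).
NEW_PRIMARY_CHARDEVS = [
    ("mirror0", "0.0.0.0", 9003, True),
    ("compare1", "0.0.0.0", 9004, True),
    ("compare0", "127.0.0.1", 9001, True),
    ("compare0-0", "127.0.0.1", 9001, False),
    ("compare_out", "127.0.0.1", 9005, True),
    ("compare_out0", "127.0.0.1", 9005, False),
]

commands = [chardev_add_socket(*entry) for entry in NEW_PRIMARY_CHARDEVS]
```

The order of the table matters: each server-side chardev is created before
the client chardev that connects to the same port.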

== Primary resume replication ==
Resume replication after the new Secondary is up.

Start the new Secondary (Steps 2 and 3 above), then on the Primary:
{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0', 'job-id': 'resync', 'target': 'nbd://127.0.0.2:9999/parent0', 'mode': 'existing', 'format': 'raw', 'sync': 'full'} }

Wait until the disk is synced, then:
{'execute': 'stop'}
{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync'} }

{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 'replication0' } }

{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 'm0', 'props': { 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire0', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire1', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id': 'iothread1' } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 'outdev': 'compare_out0', 'iothread': 'iothread1' } } }

{'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.2:9998' } }

Note:
If this Primary was previously a Secondary, then the filters need to be
inserted before the filter-rewriter by using the
"'insert': 'before', 'position': 'id=rew0'" options. See below.
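
The procedure above says to wait until the disk is synced but does not say
how to tell. One way, sketched here in Python as an assumption rather than
as part of the original procedure, is to poll the QMP command
query-block-jobs and look at the 'ready' field of the job whose 'device'
field equals the job id 'resync'. The function mirror_is_synced is a
hypothetical helper that takes the reply as strict JSON text.

```python
import json


def mirror_is_synced(query_block_jobs_reply: str, job_id: str = "resync") -> bool:
    """True once the job named `job_id` reports ready in a query-block-jobs reply."""
    reply = json.loads(query_block_jobs_reply)
    for job in reply.get("return", []):
        if job.get("device") == job_id:
            return bool(job.get("ready", False))
    return False  # the job is not listed at all
```

A caller would repeat {'execute': 'query-block-jobs'} until this returns
True and only then issue 'stop' and 'block-job-cancel'.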

== Secondary resume replication ==
Become Primary and resume replication after the new Secondary is up. Note
that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.

Start the new Secondary (Steps 2 and 3 above, but with primary_ip=127.0.0.2),
then on the old Secondary:
{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0', 'job-id': 'resync', 'target': 'nbd://127.0.0.1:9999/parent0', 'mode': 'existing', 'format': 'raw', 'sync': 'full'} }

Wait until the disk is synced, then:
{'execute': 'stop'}
{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync' } }

{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 'replication0' } }

{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 'm0', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire0', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire1', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id': 'iothread1' } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 'outdev': 'compare_out0', 'iothread': 'iothread1' } } }

{'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.1:9998' } }
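
The only difference between the filter commands in the two resume
procedures is the pair of 'insert' and 'position' properties. The Python
sketch below builds either variant from one function. The name add_filter
and its parameters are hypothetical; the property names and the 'props'
layout are the ones used in the commands above.

```python
import json
from typing import Optional


def add_filter(qom_type: str, obj_id: str, queue: str,
               before: Optional[str] = None, **chardevs: str) -> str:
    """Build an 'object-add' command for a netdev filter attached to hn0.

    `before` is the id of an existing filter (for example 'rew0') in front
    of which the new filter must be inserted; leave it as None on a Primary
    that was never a Secondary.
    """
    props = {}
    if before is not None:
        props["insert"] = "before"
        props["position"] = f"id={before}"
    props.update({"netdev": "hn0", "queue": queue})
    props.update(chardevs)  # e.g. outdev='mirror0' or indev='compare_out'
    return json.dumps({
        "execute": "object-add",
        "arguments": {"qom-type": qom_type, "id": obj_id, "props": props},
    })
```

For example, add_filter("filter-mirror", "m0", "tx", before="rew0",
outdev="mirror0") corresponds to the first object-add command of this
section.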

== TODO ==
1. Support shared storage.
2. Develop the heartbeat part.
3. Reduce the VM's downtime while doing a checkpoint.