COarse-grained LOck-stepping Virtual Machines for Non-stop Service
----------------------------------------
Copyright (c) 2016 Intel Corporation
Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
Copyright (c) 2016 Fujitsu, Corp.

This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.

This document gives an overview of COLO's design and how to use it.

== Background ==
Virtual machine (VM) replication is a well known technique for providing
application-agnostic software-implemented hardware fault tolerance,
also known as "non-stop service".

COLO (COarse-grained LOck-stepping) is a high availability solution.
Both the primary VM (PVM) and the secondary VM (SVM) run in parallel. They
receive the same requests from the client and generate responses in parallel.
If the response packets from the PVM and SVM are identical, they are released
immediately. Otherwise, a VM checkpoint (on demand) is conducted.
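
The release-or-checkpoint decision described above can be sketched as
follows. This is a deliberately simplified Python illustration, not the
logic of QEMU's colo-compare: it assumes the packets arrive as two
already-paired, in-order lists of byte strings, and the function name
compare_and_release is hypothetical.

```python
def compare_and_release(primary_pkts, secondary_pkts):
    """Pair up response packets from the PVM and SVM.

    Returns (released, checkpoint_needed): the PVM packets that may be
    sent to the client, and whether a mismatch was found that would
    require an on-demand checkpoint.
    """
    released = []
    # zip() stops at the shorter list, so unpaired packets stay withheld.
    for pvm_pkt, svm_pkt in zip(primary_pkts, secondary_pkts):
        if pvm_pkt != svm_pkt:
            # Outputs diverged: stop releasing and ask for a checkpoint.
            return released, True
        released.append(pvm_pkt)
    return released, False
```

Identical streams are released in full; at the first differing pair the
remaining packets are withheld and a checkpoint is requested.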

== Architecture ==

The architecture of COLO is shown in the diagram below.
It consists of a pair of networked physical nodes:
the primary node running the PVM, and the secondary node running the SVM
to maintain a valid replica of the PVM.
The PVM and SVM execute in parallel and generate response packets for
client requests according to the application semantics.

The incoming packets from the client or external network are received by the
primary node, and then forwarded to the secondary node, so that both the PVM
and the SVM are stimulated with the same requests.

COLO receives the outbound packets from both the PVM and SVM and compares them
before allowing the output to be sent to clients.

The SVM is qualified as a valid replica of the PVM, as long as it generates
identical responses to all client requests. Once differences in the outputs
are detected between the PVM and SVM, COLO withholds transmission of the
outbound packets until it has successfully synchronized the PVM state to the SVM.

  Primary Node                                                            Secondary Node
+------------+  +-----------------------+       +------------------------+  +------------+
|            |  |       HeartBeat       +<----->+       HeartBeat        |  |            |
| Primary VM |  +-----------+-----------+       +-----------+------------+  |Secondary VM|
|            |              |                               |               |            |
|            |  +-----------|-----------+       +-----------|------------+  |            |
|            |  |QEMU   +---v----+      |       |QEMU  +----v---+        |  |            |
|            |  |       |Failover|      |       |      |Failover|        |  |            |
|            |  |       +--------+      |       |      +--------+        |  |            |
|            |  |   +---------------+   |       |   +---------------+    |  |            |
|            |  |   | VM Checkpoint +-------------->+ VM Checkpoint |    |  |            |
|            |  |   +---------------+   |       |   +---------------+    |  |            |
|Requests<--------------------------\ /-----------------\ /--------------------->Requests|
|            |  |                   ^ ^ |       |       | |              |  |            |
|Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
|            |  |               | | | | |       |  | |  | |              |  |            |
|            |  | +-----------+ | | | | |       |  | |  | | +----------+ |  |            |
|            |  | | COLO disk | | | | | |       |  | |  | | | COLO disk| |  |            |
|            |  | |   Manager +---------------------------->| Manager  | |  |            |
|            |  | ++----------+ v v | | |       |  | v  v | +---------++ |  |            |
|            |  |  |+-----------+-+-+-++|       | ++-+--+-+---------+ |  |  |            |
|            |  |  ||   COLO Proxy     ||       | |   COLO Proxy    | |  |  |            |
|            |  |  || (compare packet  ||       | |(adjust sequence | |  |  |            |
|            |  |  ||and mirror packet)||       | |    and ACK)     | |  |  |            |
|            |  |  |+------------+---+-+|       | +-----------------+ |  |  |            |
+------------+  +-----------------------+       +------------------------+  +------------+
+------------+     |             |   |                                |     +------------+
| VM Monitor |     |             |   |                                |     | VM Monitor |
+------------+     |             |   |                                |     +------------+
+---------------------------------------+       +----------------------------------------+
|   Kernel         |             |   |  |       |   Kernel            |                  |
+---------------------------------------+       +----------------------------------------+
                   |             |   |                                |
    +--------------v+  +---------v---+--+       +------------------+ +v-------------+
    |   Storage     |  |External Network|       | External Network | |   Storage    |
    +---------------+  +----------------+       +------------------+ +--------------+


== Components introduction ==

The COLO architecture diagram contains several components.
Their functions are described below.

HeartBeat:
Runs on both the primary and secondary nodes, to periodically check platform
availability. When the primary node suffers a hardware fail-stop failure,
the heartbeat stops responding, and the secondary node triggers a failover
as soon as it determines the absence.

COLO disk Manager:
When the primary VM writes data to its image, the COLO disk manager captures
this data and sends it to the secondary VM, which ensures that the content of
the secondary VM's image stays consistent with the content of the primary VM's
image.
For more details, please refer to docs/block-replication.txt.

Checkpoint/Failover Controller:
Modifications of the save/restore flow to realize continuous migration,
to make sure the state of the VM on the Secondary side is always consistent
with the VM on the Primary side.

COLO Proxy:
Delivers packets to the Primary and Secondary, and then compares the responses
from both sides. It then decides whether to start a checkpoint according to
some rules.
Please refer to docs/colo-proxy.txt for more information.

Note:
HeartBeat has not been implemented yet, so you need to trigger the failover
process by using the 'x-colo-lost-heartbeat' command.
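
Because the heartbeat is left to the user, an external watchdog has to
decide when to issue 'x-colo-lost-heartbeat'. The Python sketch below shows
one possible timeout rule. Everything in it is an assumption made for
illustration: the class HeartbeatMonitor, its timeout parameter and the way
beats are reported are hypothetical and not part of QEMU. Only the QMP
command name comes from this document.

```python
import time


class HeartbeatMonitor:
    """Hypothetical watchdog: the peer is lost after timeout_s of silence."""

    def __init__(self, timeout_s, now=time.monotonic):
        self.timeout_s = timeout_s
        self._now = now              # injectable clock, eases testing
        self.last_seen = now()

    def beat(self):
        """Record that a heartbeat from the peer node just arrived."""
        self.last_seen = self._now()

    def peer_lost(self):
        """True once no heartbeat has been seen for longer than timeout_s."""
        return self._now() - self.last_seen > self.timeout_s


def failover_command():
    """QMP command to send to the surviving QEMU once the peer is lost."""
    return {"execute": "x-colo-lost-heartbeat"}
```

A real deployment would also need fencing of the failed node, which this
sketch does not address.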

== COLO operation status ==

+-----------------+
|                 |
|    Start COLO   |
|                 |
+--------+--------+
         |
         |  Main qmp command:
         |  migrate-set-capabilities with x-colo
         |  migrate
         |
         v
+--------+--------+
|                 |
|  COLO running   |
|                 |
+--------+--------+
         |
         |  Main qmp command:
         |  x-colo-lost-heartbeat
         |  or
         |  some error happened
         v
+--------+--------+
|                 |  send qmp event:
|  COLO failover  |  COLO_EXIT
|                 |
+-----------------+

COLO uses QMP commands to switch and report its operation status.
The diagram shows only the main QMP commands; the details are given in
the test procedure.
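
The status diagram can also be read as a small state machine. The Python
sketch below encodes only the transitions drawn above. The names ColoState,
TRANSITIONS and next_state are hypothetical, and the "error" trigger stands
in for "some error happened". It is a reading aid, not QEMU's internal
state handling.

```python
from enum import Enum


class ColoState(Enum):
    START = "Start COLO"
    RUNNING = "COLO running"
    FAILOVER = "COLO failover"


# (current state, trigger) -> next state, as drawn in the diagram.
TRANSITIONS = {
    (ColoState.START, "migrate"): ColoState.RUNNING,
    (ColoState.RUNNING, "x-colo-lost-heartbeat"): ColoState.FAILOVER,
    (ColoState.RUNNING, "error"): ColoState.FAILOVER,
}

# QMP event emitted when a state is entered.
EVENT_ON_ENTER = {ColoState.FAILOVER: "COLO_EXIT"}


def next_state(state, trigger):
    """Return the state after `trigger`; unknown triggers change nothing."""
    return TRANSITIONS.get((state, trigger), state)
```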
1468e640892SZhang Chen
147e59887d8Szhanghailiang== Test procedure ==
148*90dfe59bSLukas StraubNote: Here we are running both instances on the same host for testing,
149*90dfe59bSLukas Straubchange the IP Addresses if you want to run it on two hosts. Initally
150*90dfe59bSLukas Straub127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.
151e59887d8Szhanghailiang
152*90dfe59bSLukas Straub== Startup qemu ==
153*90dfe59bSLukas Straub1. Primary:
154*90dfe59bSLukas StraubNote: Initally, $imagefolder/primary.qcow2 needs to be copied to all hosts.
155*90dfe59bSLukas StraubYou don't need to change any IP's here, because 0.0.0.0 listens on any
156*90dfe59bSLukas Straubinterface. The chardev's with 127.0.0.1 IP's loopback to the local qemu
157*90dfe59bSLukas Straubinstance.

# imagefolder="/mnt/vms/colo-test-primary"

# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp 1 -qmp stdio \
   -device piix3-usb-uhci -device usb-tablet -name primary \
   -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server,nowait \
   -chardev socket,id=compare1,host=0.0.0.0,port=9004,server,wait \
   -chardev socket,id=compare0,host=127.0.0.1,port=9001,server,nowait \
   -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
   -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server,nowait \
   -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
   -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
   -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
   -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
   -object iothread,id=iothread1 \
   -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
outdev=compare_out0,iothread=iothread1 \
   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
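
The six chardevs in the command above fall into two groups: listeners that
the Secondary connects to, and server/client pairs that loop back inside
the same qemu instance. The Python sketch below derives both groups from a
table that mirrors the -chardev options. The table layout and the function
names are illustrative assumptions. The ids, hosts and ports are copied
from the command line.

```python
# (id, host, port, role) for each -chardev option of the Primary.
PRIMARY_CHARDEVS = [
    ("mirror0", "0.0.0.0", 9003, "server"),
    ("compare1", "0.0.0.0", 9004, "server"),
    ("compare0", "127.0.0.1", 9001, "server"),
    ("compare0-0", "127.0.0.1", 9001, "client"),
    ("compare_out", "127.0.0.1", 9005, "server"),
    ("compare_out0", "127.0.0.1", 9005, "client"),
]


def loopback_pairs(chardevs):
    """Return (server id, client id) pairs that share a host and port."""
    servers = {(h, p): cid for cid, h, p, role in chardevs if role == "server"}
    return [(servers[(h, p)], cid)
            for cid, h, p, role in chardevs
            if role == "client" and (h, p) in servers]


def external_listeners(chardevs):
    """Return server ids with no local client, i.e. awaiting the Secondary."""
    clients = {(h, p) for _, h, p, role in chardevs if role == "client"}
    return [cid for cid, h, p, role in chardevs
            if role == "server" and (h, p) not in clients]
```

With the table above, the loopback pairs are compare0/compare0-0 and
compare_out/compare_out0, while mirror0 and compare1 wait for the
Secondary's red0 and red1 connections.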

2. Secondary:
Note: Active and hidden images need to be created only once and the
size should be the same as primary.qcow2. Again, you don't need to change
any IPs here, except for the $primary_ip variable.

# imagefolder="/mnt/vms/colo-test-secondary"
# primary_ip=127.0.0.1

# qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G

# qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G

# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp 1 -qmp stdio \
   -device piix3-usb-uhci -device usb-tablet -name secondary \
   -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect=1 \
   -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect=1 \
   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
   -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
   -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
file.backing.backing=parent0 \
   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
children.0=childs0 \
   -incoming tcp:0.0.0.0:9998


3. On Secondary VM's QEMU monitor, issue command:
{'execute':'qmp_capabilities'}
{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': {'host': '0.0.0.0', 'port': '9999'} } } }
{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0', 'writable': true } }

Note:
  a. The qmp commands nbd-server-start and nbd-server-add must be run
     before running the qmp command migrate on the primary QEMU.
  b. The active disk, hidden disk and nbd target must all have the
     same length.
  c. It is better to put the active disk and hidden disk on a ramdisk. They
     will be merged into the parent disk on failover.
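
Note b can be checked before starting the Secondary. The sketch below
compares the "virtual-size" field reported by `qemu-img info --output=json`
for each image. The helper names are hypothetical, and the snippet assumes
qemu-img is on PATH when image_info() is called. The comparison itself
works on the JSON text alone.

```python
import json
import subprocess


def image_info(path):
    """Return the JSON text printed by `qemu-img info --output=json`."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        capture_output=True, text=True, check=True)
    return result.stdout


def virtual_size(info_json):
    """Extract the virtual disk size in bytes from qemu-img's JSON output."""
    return json.loads(info_json)["virtual-size"]


def same_length(*info_jsons):
    """True if all given images report the same virtual size."""
    return len({virtual_size(text) for text in info_jsons}) == 1
```

For example, same_length(image_info(active), image_info(hidden),
image_info(parent)) should be true before replication is started.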

4. On Primary VM's QEMU monitor, issue command:
{'execute':'qmp_capabilities'}
{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node': 'replication0' } }
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments': {'uri': 'tcp:127.0.0.2:9998' } }

  Note:
  a. There should be only one NBD Client for each primary disk.
  b. These qmp commands must be run after the qmp commands on the
     secondary qemu.
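
When the two instances run on different hosts, the Secondary's address
appears in two of the commands above. The Python sketch below builds the
same sequence from a single address so that the two places cannot drift
apart. The function name and its parameters are hypothetical. The command
contents are taken from step 4.

```python
import json


def primary_start_commands(secondary_ip, nbd_port=9999, migrate_port=9998):
    """Return the step 4 QMP commands, in order, for the given Secondary."""
    drive_add = (
        "drive_add -n buddy driver=replication,mode=primary,"
        f"file.driver=nbd,file.host={secondary_ip},file.port={nbd_port},"
        "file.export=parent0,node-name=replication0"
    )
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "human-monitor-command",
         "arguments": {"command-line": drive_add}},
        {"execute": "x-blockdev-change",
         "arguments": {"parent": "colo-disk0", "node": "replication0"}},
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "x-colo", "state": True}]}},
        {"execute": "migrate",
         "arguments": {"uri": f"tcp:{secondary_ip}:{migrate_port}"}},
    ]


if __name__ == "__main__":
    # Print one JSON command per line, ready to paste into the monitor.
    for command in primary_start_commands("127.0.0.2"):
        print(json.dumps(command))
```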

5. After the above steps, whenever you make changes to the PVM, the SVM will be synced.
You can issue the command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }'
to change the idle checkpoint period (in milliseconds).
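
The commands in this document are typed into a monitor started with
'-qmp stdio'. If QMP is exposed on a UNIX socket instead (an assumption;
for example '-qmp unix:/tmp/colo-qmp.sock,server,nowait', where the path is
hypothetical), the same command can be sent from a script. The Python
sketch below is a minimal line-based client written for illustration. It
is not QEMU's own QMP library and all function names are made up.

```python
import json
import socket


def encode_command(execute, arguments=None):
    """Serialize one QMP command as a single JSON line."""
    command = {"execute": execute}
    if arguments is not None:
        command["arguments"] = arguments
    return json.dumps(command) + "\n"


def read_reply(lines):
    """Return the 'return' value of the next reply, skipping async events."""
    for line in lines:
        message = json.loads(line)
        if "event" in message:
            continue                      # e.g. RESUME, COLO_EXIT
        if "error" in message:
            raise RuntimeError(message["error"].get("desc", "QMP error"))
        return message["return"]
    raise ConnectionError("QMP connection closed before a reply arrived")


def set_checkpoint_delay(socket_path, delay_ms):
    """Connect, negotiate capabilities, then set x-checkpoint-delay."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        stream = sock.makefile("rw")
        json.loads(stream.readline())     # consume the QMP greeting
        for execute, arguments in (
                ("qmp_capabilities", None),
                ("migrate-set-parameters", {"x-checkpoint-delay": delay_ms})):
            stream.write(encode_command(execute, arguments))
            stream.flush()
            read_reply(stream)
```

Calling set_checkpoint_delay("/tmp/colo-qmp.sock", 2000) would then have
the same effect as the command shown in step 5.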

6. Failover test
You can kill one of the VMs and fail over to the surviving VM:

If you killed the Secondary, then follow "Primary Failover". After that,
if you want to resume the replication, follow "Primary resume replication".

If you killed the Primary, then follow "Secondary Failover". After that,
if you want to resume the replication, follow "Secondary resume replication".

== Primary Failover ==
The Secondary died, resume on the Primary

{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'child': 'children.1'} }
{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_del replication0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'comp0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'iothread1' } }
{'execute': 'object-del', 'arguments':{ 'id': 'm0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'redire0' } }
{'execute': 'object-del', 'arguments':{ 'id': 'redire1' } }
{'execute': 'x-colo-lost-heartbeat' }

== Secondary Failover ==
The Primary died, resume on the Secondary and prepare to become the new Primary

{'execute': 'nbd-server-stop'}
{'execute': 'x-colo-lost-heartbeat'}

{'execute': 'object-del', 'arguments':{ 'id': 'f2' } }
{'execute': 'object-del', 'arguments':{ 'id': 'f1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red1' } }
{'execute': 'chardev-remove', 'arguments':{ 'id': 'red0' } }

{'execute': 'chardev-add', 'arguments':{ 'id': 'mirror0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '0.0.0.0', 'port': '9003' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare1', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '0.0.0.0', 'port': '9004' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9001' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare0-0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9001' } }, 'server': false } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare_out', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9005' } }, 'server': true } } } }
{'execute': 'chardev-add', 'arguments':{ 'id': 'compare_out0', 'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host': '127.0.0.1', 'port': '9005' } }, 'server': false } } } }
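
The six chardev-add commands above differ only in id, host, port and the
server flag, so they can be generated from a small table. The Python
sketch below does that. The helper name chardev_add and the table layout
are illustrative assumptions. The resulting JSON matches the commands
listed above.

```python
import json


def chardev_add(chardev_id, host, port, server):
    """Build one chardev-add command for an inet socket chardev."""
    return {
        "execute": "chardev-add",
        "arguments": {
            "id": chardev_id,
            "backend": {
                "type": "socket",
                "data": {
                    # QMP expects the port as a string here.
                    "addr": {"type": "inet",
                             "data": {"host": host, "port": str(port)}},
                    "server": server,
                },
            },
        },
    }


# (id, host, port, server) for the chardevs the new Primary needs.
NEW_PRIMARY_CHARDEVS = [
    ("mirror0", "0.0.0.0", 9003, True),
    ("compare1", "0.0.0.0", 9004, True),
    ("compare0", "127.0.0.1", 9001, True),
    ("compare0-0", "127.0.0.1", 9001, False),
    ("compare_out", "127.0.0.1", 9005, True),
    ("compare_out0", "127.0.0.1", 9005, False),
]

COMMANDS = [chardev_add(*row) for row in NEW_PRIMARY_CHARDEVS]

if __name__ == "__main__":
    for command in COMMANDS:
        print(json.dumps(command))
```

The order of the table matters: each server chardev is created before the
client chardev that connects to it.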

== Primary resume replication ==
Resume replication after new Secondary is up.

Start the new Secondary (Steps 2 and 3 above), then on the Primary:
{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0', 'job-id': 'resync', 'target': 'nbd://127.0.0.2:9999/parent0', 'mode': 'existing', 'format': 'raw', 'sync': 'full'} }

Wait until disk is synced, then:
{'execute': 'stop'}
{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync'} }
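
One way to decide when the disk is synced is to poll 'query-block-jobs'
and wait until the 'resync' job reports 'ready': true. Waiting for the
BLOCK_JOB_READY event is an alternative. The Python sketch below checks a
parsed 'query-block-jobs' reply. The function name is hypothetical and the
offsets in the usage example are made-up illustration values.

```python
def mirror_ready(jobs, job_id="resync"):
    """Check a query-block-jobs 'return' list for a ready mirror job.

    `jobs` is the list of job descriptions. The job identifier is
    reported in each entry's 'device' field.
    """
    for job in jobs:
        if job.get("device") == job_id:
            return bool(job.get("ready", False))
    return False  # job not found: treat as not synced


if __name__ == "__main__":
    still_copying = [{"device": "resync", "type": "mirror",
                      "offset": 1024, "len": 4096, "ready": False}]
    print(mirror_ready(still_copying))
```

Only once this returns true should 'stop' and 'block-job-cancel' be issued.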

{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 'replication0' } }

{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 'm0', 'props': { 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire0', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire1', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id': 'iothread1' } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 'outdev': 'compare_out0', 'iothread': 'iothread1' } } }

{'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.2:9998' } }

Note:
If this Primary previously was a Secondary, then we need to insert the
filters before the filter-rewriter by using the
"'insert': 'before', 'position': 'id=rew0'" options. See below.

== Secondary resume replication ==
Become Primary and resume replication after new Secondary is up. Note
that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.

Start the new Secondary (Steps 2 and 3 above, but with primary_ip=127.0.0.2),
then on the old Secondary:
{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0', 'job-id': 'resync', 'target': 'nbd://127.0.0.1:9999/parent0', 'mode': 'existing', 'format': 'raw', 'sync': 'full'} }

Wait until disk is synced, then:
{'execute': 'stop'}
{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync' } }

{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0'}}
{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 'replication0' } }

{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 'm0', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire0', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 'redire1', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id': 'iothread1' } }
{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 'outdev': 'compare_out0', 'iothread': 'iothread1' } } }

{'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.1:9998' } }
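
The two resume procedures differ in the three net filter commands only by
the 'insert' and 'position' properties. The Python sketch below generates
either variant from one flag. The helper names filter_add and
primary_filters are hypothetical. The ids and properties are copied from
the commands in this document, and the iothread and colo-compare objects
are left out because they are identical in both cases.

```python
def filter_add(qom_type, object_id, props, before=None):
    """Build an object-add command, optionally placed before another filter."""
    all_props = {}
    if before is not None:
        all_props["insert"] = "before"
        all_props["position"] = f"id={before}"
    all_props.update(props)
    return {"execute": "object-add",
            "arguments": {"qom-type": qom_type, "id": object_id,
                          "props": all_props}}


def primary_filters(was_secondary):
    """Filters for a (new) Primary; placed before rew0 on a former Secondary."""
    before = "rew0" if was_secondary else None
    return [
        filter_add("filter-mirror", "m0",
                   {"netdev": "hn0", "queue": "tx", "outdev": "mirror0"},
                   before),
        filter_add("filter-redirector", "redire0",
                   {"netdev": "hn0", "queue": "rx", "indev": "compare_out"},
                   before),
        filter_add("filter-redirector", "redire1",
                   {"netdev": "hn0", "queue": "rx", "outdev": "compare0"},
                   before),
    ]
```

primary_filters(False) corresponds to "Primary resume replication" and
primary_filters(True) to "Secondary resume replication".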

== TODO ==
1. Support shared storage.
2. Develop the heartbeat part.
3. Reduce the VM's downtime while doing a checkpoint.