QEMU has code to load/save the state of the guest that it is running.
These are two complementary operations. Saving the state just does
that: it saves the state of each device that the guest is running.
Restoring a guest is just the opposite operation: we need to load the
state of each device.
For this to work, QEMU has to be launched with the same arguments both
times. I.e. it can only restore the state into a guest that has the
same devices as the one whose state was saved (this last requirement
can be relaxed a bit, but for now we can consider that the
configuration has to be exactly the same).
Next came the "live migration" functionality. This is important
because some guests run with a lot of state (especially RAM), and it
can take a while to move it all from one machine to another. Live
migration allows the guest to continue running while the state is
transferred; the guest only has to be stopped while the last part of
the state is transferred. Typically the time that the guest is
unresponsive during live migration is in the low hundreds of
milliseconds (note that this depends on a lot of things).
The migration stream is normally just a byte stream that can be passed
over any transport:

- tcp migration: do the migration using tcp sockets
- unix migration: do the migration using unix sockets
- exec migration: do the migration using the stdin/stdout of an external process
- fd migration: do the migration using a file descriptor that is
  passed to QEMU. QEMU doesn't care how this file descriptor is opened.
- file migration: do the migration using a file that is passed to QEMU
  by path. A file offset option is supported to allow a management
  application to add its own metadata to the start of the file without
  QEMU interference. Note that QEMU does not flush cached file
  data/metadata at the end of migration.
The file migration also supports using a file that has already been
opened. A set of file descriptors is passed to QEMU via an "fdset"
(see the add-fd QMP command documentation). This method allows a
management application to have control over the migration file
opening operation. There are, however, strict requirements to this
interface if the multifd capability is enabled:

- the fdset must contain two file descriptors that are not
  duplicates of each other;
- if the direct-io capability is to be used, exactly one of the
  file descriptors must have the O_DIRECT flag set;
- the file must be opened with WRONLY on the migration source side
  and RDONLY on the migration destination side.
There is also support for migration using RDMA, which transports the
page data using ``RDMA``: the hardware takes care of transporting the
pages, and the load on the CPU is much lower. While the internals of
RDMA migration are a bit different, this isn't really visible outside
the RAM migration code.
All these migration protocols use the same infrastructure to
save/restore device state. This infrastructure is shared with the
savevm/loadvm functionality.
The files, sockets or fd's that carry the migration stream are abstracted by
the ``QEMUFile`` type (see ``migration/qemu-file.h``). In most cases this
is connected to a subtype of ``QIOChannel`` (see ``io/``).
Saving the state of one device
==============================
For most devices, the state is saved in a single call to the migration
infrastructure; these are *non-iterative* devices. The data for these
devices is sent at the end of precopy migration, when the CPUs are paused.
There are also *iterative* devices, which contain a very large amount of
data (e.g. RAM or large tables). See the iterative device section below.
- The migration state saved should reflect the device being modelled rather
  than the way your implementation works. That way, if you change the
  implementation later, the migration stream will stay compatible. That model
  may include internal state that's not directly visible in any register
  interface.
- When saving a migration stream the device code may walk and check
  the state of the device. These checks might fail in various ways (e.g.
  discovering internal state is corrupt or that the guest has done something
  bad). Consider carefully before asserting/aborting at this point, since the
  normal response from users is that *migration broke their VM* since it had
  apparently been running fine until then. In these error cases, the device
  should log a message indicating the cause of the error, and should consider
  putting the device into an error state, allowing the rest of the VM to
  continue execution.
- The migration might happen at an inconvenient point,
  e.g. right in the middle of the guest reprogramming the device, during
  guest reboot or shutdown, or while the device is waiting for external IO.
  It's strongly preferred that migrations do not fail in this situation,
  since in the cloud environment migrations might happen automatically to
  VMs that the administrator doesn't directly control.
- The destination should treat an incoming migration stream as hostile
  (which we do to varying degrees in the existing code). Check that offsets
  into buffers and the like can't cause overruns. Fail the incoming migration
  in the case of a corrupted stream like this.
- Take care with internal device state or behaviour that might become
  migration version dependent. For example, the order of PCI capabilities
  is required to stay constant across migration.
- The state of the source should not be changed or destroyed by the
  outgoing migration. Migrations timing out or being failed by
  higher levels of management, or failures of the destination host, are
  not unusual, and in that case the VM is restarted on the source.
  Note that the management layer can validly revert the migration
  even though the QEMU level of migration has succeeded, as long as it
  does so before starting execution on the destination.
- Buses and devices should be able to explicitly specify addresses when
  instantiated, and management tools should use those. For example,
  when hot adding USB devices it's important to specify the ports
  and addresses, since implicit ordering based on the command line order
  may be different on the destination. This can result in the
  device state being loaded into the wrong device.
Most device data can be described using the ``VMSTATE`` macros (mostly defined
in ``include/migration/vmstate.h``).
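As an illustrative sketch, modelled on the keyboard controller example from
``hw/input/pckbd.c`` that the next paragraph refers to (treat the exact field
list as indicative rather than authoritative):

.. code:: c

    static const VMStateDescription vmstate_kbd = {
        .name = "pckbd",
        .version_id = 3,
        .minimum_version_id = 3,
        .fields = (const VMStateField[]) {
            /* four uint8_t fields of KBDState, saved/loaded automatically */
            VMSTATE_UINT8(write_cmd, KBDState),
            VMSTATE_UINT8(status, KBDState),
            VMSTATE_UINT8(mode, KBDState),
            VMSTATE_UINT8(pending, KBDState),
            VMSTATE_END_OF_LIST()
        }
    };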
We are declaring the state with name "pckbd". The ``version_id`` is
3, and there are 4 uint8_t fields in the KBDState structure. We
register this ``VMStateDescription`` with one of the following
functions. The first one will generate a device ``instance_id``
different for each registration. Use the second one if you already
have an id that is different for each instance of the device:
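A minimal sketch of the two registration calls (assuming the current
``vmstate_register_any()`` and ``vmstate_register()`` helpers; ``s`` is the
device state and ``instance_id`` a caller-chosen id):

.. code:: c

    /* Let the core pick a distinct instance_id for each registration. */
    vmstate_register_any(NULL, &vmstate_kbd, s);

    /* Or supply an instance_id that is already unique per device instance. */
    vmstate_register(NULL, instance_id, &vmstate_kbd, s);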
For devices that are ``qdev`` based, we can register the device in the class
init function:
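For example, a sketch of the hook-up (assuming ``dc`` is the ``DeviceClass``
being initialised and ``vmstate_kbd_isa`` is the description for that device):

.. code:: c

    dc->vmsd = &vmstate_kbd_isa;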
The VMState macros take care of ensuring that the device data section
is formatted portably (normally big endian) and perform some compile time
checks against the types of the fields in the structures.
Note that the format on the wire is still very raw; i.e. a VMSTATE_UINT32
ends up with a 4 byte big-endian representation on the wire; in the future
it might be possible to use a more structured format.
With the older, legacy interface (which is being replaced by VMState), each
device has to register two functions, one to save the state and another to
load the state back.
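Registration goes through ``register_savevm_live()``; a rough sketch of its
prototype follows (the exact signature has varied across QEMU versions, so
check ``include/migration/register.h`` for the current one):

.. code:: c

    int register_savevm_live(const char *idstr,
                             uint32_t instance_id,
                             int version_id,
                             const SaveVMHandlers *ops,
                             void *opaque);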
Two functions in the ``ops`` structure are the ``save_state``
and ``load_state`` functions. Notice that ``load_state`` receives a
``version_id`` parameter so it knows what state format it is receiving;
``save_state`` doesn't have a version_id parameter because it always uses the
latest version.
Note that because the VMState macros still save the data in a raw
format, in many cases it's possible to replace legacy code
with a carefully constructed VMState description that matches the
byte layout of the existing code.
When we migrate a device, we save/load its state as a series
of fields. Sometimes, due to bugs or new functionality, we need to
change the state to store more/different information. Changing the migration
state saved for a device can break migration compatibility unless
care is taken to use the appropriate techniques. In general QEMU tries
to maintain forward migration compatibility (i.e. migrating from
QEMU n to n+1), and there are users who benefit from backward compatibility
as well.
The most common structure change is adding new data, e.g. when adding
a newer form of device, or adding state that you previously
forgot to migrate. This is best solved using a subsection.

A subsection is "like" a device vmstate, but with a particularity: it
has a Boolean function that tells whether the values need to be sent
or not. If this function returns false, the subsection is not sent.
Subsections have a unique name, which is looked up on the receiving
side.
On the receiving side, if we find a subsection for a device that we
don't understand, we just fail the migration. If we understand all
the subsections, then we load the state successfully. There's no check
that a subsection was loaded, so a newer QEMU that knows about a subsection
can (with the appropriate code) load a stream from an older QEMU that
didn't send the subsection.
If the new data is only needed in a rare case, then the subsection
can be made conditional on that case and the migration will still
succeed to older versions of QEMU. In most cases the data is not
critical, but in some use cases it's preferred that the migration
should succeed even with the data missing. To support this, the
subsection can be connected to a device property and from there
to a versioned machine type.
The 'pre_load' and 'post_load' functions on subsections are only
called if the subsection is loaded.
One important note is that the outer post_load() function is called "after"
loading all subsections, because a newer subsection could change the same
value that it uses. A flag, and the combination of the outer pre_load and
post_load, can be used to detect this case and to fall back on default
behaviour when the subsection isn't present.
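For example, an abridged sketch modelled on the IDE ``pio_state`` subsection
discussed below (the field lists are trimmed and initializer details may
differ between QEMU versions; the full version lives in ``hw/ide/core.c``):

.. code:: c

    static bool ide_drive_pio_state_needed(void *opaque)
    {
        IDEState *s = opaque;

        /* Abridged: the real predicate checks a little more than DRQ_STAT. */
        return (s->status & DRQ_STAT) != 0;
    }

    static const VMStateDescription vmstate_ide_drive_pio_state = {
        .name = "ide_drive/pio_state",
        .version_id = 1,
        .minimum_version_id = 1,
        .needed = ide_drive_pio_state_needed,
        .fields = (const VMStateField[]) {
            /* ... the pio-related fields ... */
            VMSTATE_END_OF_LIST()
        }
    };

    const VMStateDescription vmstate_ide_drive = {
        .name = "ide_drive",
        .version_id = 3,
        .minimum_version_id = 0,
        .fields = (const VMStateField[]) {
            /* ... the fields that are always sent ... */
            VMSTATE_END_OF_LIST()
        },
        .subsections = (const VMStateDescription * const []) {
            &vmstate_ide_drive_pio_state,
            NULL
        }
    };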
Here we have a subsection for the pio state. We only need to
save/send this state when we are in the middle of a pio operation
(that is what ``ide_drive_pio_state_needed()`` checks). If DRQ_STAT is
not enabled, the values in those fields are garbage and don't need to
be sent.
To connect a subsection to a device property (and from there to a versioned
machine type):

a) Add a new property using ``DEFINE_PROP_BOOL`` - e.g. support-foo - and
   default it to true.
b) Add an entry to the ``hw_compat_`` for the previous version that sets
   the property to false.
c) Add a static bool support_foo function that tests the property.
d) Add a subsection with a .needed set to the support_foo function.
e) (potentially) Add an outer pre_load that sets up a default value
   for 'foo' to be used if the subsection isn't loaded.

Now that subsection will not be generated when using an older
machine type and the migration stream will be accepted by older
QEMU versions.
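A hedged sketch of steps (a)-(d); the device, property and function names
(FooState, "support-foo", foo_support_needed) are illustrative only, and
initializer details may differ between QEMU versions:

.. code:: c

    /* (a) in the device's property list, defaulting to true */
    DEFINE_PROP_BOOL("support-foo", FooState, support_foo, true),

    /* (b) in hw/core/machine.c, the hw_compat_ entry for the previous
     * version turns it off, e.g.:
     *     { "foo-device", "support-foo", "false" },
     */

    /* (c) the predicate tests the property */
    static bool foo_support_needed(void *opaque)
    {
        FooState *s = opaque;

        return s->support_foo;
    }

    /* (d) the subsection is only sent when the predicate returns true */
    static const VMStateDescription vmstate_foo_support = {
        .name = "foo/support",
        .version_id = 1,
        .minimum_version_id = 1,
        .needed = foo_support_needed,
        .fields = (const VMStateField[]) {
            VMSTATE_UINT32(foo_reg, FooState),
            VMSTATE_END_OF_LIST()
        }
    };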
Sometimes members of the VMState are no longer needed:

- removing them will break migration compatibility
- making them version dependent and bumping the version will break backward
  migration compatibility.

Adding a dummy field into the migration stream is normally the best way to
preserve compatibility.
If the field really does need to be removed then:

a) Add a new property/compatibility/function in the same way as for
   subsections above.
b) Replace the VMSTATE macro with the _TEST version of the macro, e.g.:
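For instance (``foo``, ``barstruct`` and the ``pre_version_baz`` predicate are
placeholder names):

.. code:: c

    /* before: the field is always sent */
    VMSTATE_UINT32(foo, barstruct),

    /* after: the field is only sent when pre_version_baz() returns true */
    VMSTATE_UINT32_TEST(foo, barstruct, pre_version_baz),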
Sometime in the future, when we no longer care about the ancient versions,
these can be killed off. Note that for backward compatibility it's important
to fill in the structure with data that the destination will understand.

Any difference in the predicates on the source and destination will end up
with different fields being enabled and data being loaded into the wrong
fields; for this reason conditional fields like this are very fragile.
Version numbers are intended for major incompatible changes to the
migration of a device, and using them breaks backward-migration
compatibility; in general most changes can be made by adding subsections
(see above) or _TEST macros (see above), which won't break compatibility.
Each version is associated with a series of fields saved. The ``save_state``
always saves the state according to the newest version. But ``load_state``
is sometimes able to load state from an older version.
There are two version fields:

- ``version_id``: the maximum version_id supported by VMState for that device.
- ``minimum_version_id``: the minimum version_id that VMState is able to
  understand for that device.

VMState is able to read versions from minimum_version_id to version_id.
Saving state will always create a section with the 'version_id' value
and thus can't be loaded by any older QEMU.
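As a small illustrative sketch (the names are hypothetical), a device whose
current format is version 3 but which can still load version 1 and 2 streams
would declare:

.. code:: c

    static const VMStateDescription vmstate_widget = {
        .name = "widget",
        .version_id = 3,          /* what save_state emits */
        .minimum_version_id = 1,  /* oldest stream load_state accepts */
        .fields = (const VMStateField[]) {
            VMSTATE_UINT32(ctrl, WidgetState),
            /* only present in streams of version 2 and newer */
            VMSTATE_UINT32_V(extra, WidgetState, 2),
            VMSTATE_END_OF_LIST()
        }
    };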
Sometimes it is not enough to be able to save the state directly
from one structure; we need to fill in the correct values there first. One
example is when we are using KVM. Before saving the CPU state, we
need to ask KVM to copy the state that it is using into QEMU. And the
opposite when we are loading the state: we need a way to tell KVM to
load the state for the CPU that we have just loaded from the QEMUFile.
The functions to do that are inside a vmstate definition, and are called:

- ``int (*pre_load)(void *opaque);``

  This function is called before we load the state of one device.

- ``int (*post_load)(void *opaque, int version_id);``

  This function is called after we load the state of one device.

- ``int (*pre_save)(void *opaque);``

  This function is called before we save the state of one device.

- ``int (*post_save)(void *opaque);``

  This function is called after we save the state of one device
  (even upon failure, unless the call to pre_save returned an error).
Example: You can look at hpet.c, which uses the first three functions
to massage the state that is transferred.
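As a minimal sketch of how such a hook is wired up (``BarState``,
``bar_post_load`` and ``compute_derived`` are illustrative; a real device
would recompute whatever derived state it keeps):

.. code:: c

    static int bar_post_load(void *opaque, int version_id)
    {
        BarState *s = opaque;

        /* Recompute state that is derived from the migrated fields. */
        s->derived = compute_derived(s->reg0, s->reg1);  /* hypothetical */
        return 0;  /* a non-zero return fails the incoming migration */
    }

    static const VMStateDescription vmstate_bar = {
        .name = "bar",
        .version_id = 1,
        .minimum_version_id = 1,
        .post_load = bar_post_load,
        .fields = (const VMStateField[]) {
            VMSTATE_UINT32(reg0, BarState),
            VMSTATE_UINT32(reg1, BarState),
            VMSTATE_END_OF_LIST()
        }
    };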
The ``VMSTATE_WITH_TMP`` macro may be useful when the migration
data doesn't match the stored device data well; it allows an
intermediate temporary structure to be populated with migration
data and then transferred to the main structure.
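A rough sketch of the pattern (all names here are illustrative; see the
``VMSTATE_WITH_TMP`` documentation in ``include/migration/vmstate.h`` for the
exact contract - in particular the temporary struct carries a ``parent``
pointer back to the real device state):

.. code:: c

    typedef struct ChipTmp {
        ChipState *parent;     /* set up by the core before the hooks run */
        uint32_t wire_format;  /* what actually goes on the wire */
    } ChipTmp;

    static int chip_tmp_pre_save(void *opaque)
    {
        ChipTmp *tmp = opaque;

        /* Convert the device data into the on-the-wire representation. */
        tmp->wire_format = chip_encode(tmp->parent);  /* hypothetical */
        return 0;
    }

    static const VMStateDescription vmstate_chip_tmp = {
        .name = "chip/tmp",
        .pre_save = chip_tmp_pre_save,
        .fields = (const VMStateField[]) {
            VMSTATE_UINT32(wire_format, ChipTmp),
            VMSTATE_END_OF_LIST()
        }
    };

    /* In the outer device description's field list: */
        VMSTATE_WITH_TMP(ChipState, ChipTmp, vmstate_chip_tmp),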
Since the order of device save/restore is not defined, you must
avoid accessing or changing any other device's state in one of these
callbacks.
Some devices, such as RAM or block storage, have large amounts of data
that would mean that the CPUs would be paused for too long if it were
all sent in one section. For these devices an *iterative* approach is
taken.
The iterative devices generally don't use VMState macros
(although it may be possible in some cases) and instead use
qemu_put_*/qemu_get_* macros to read/write data to the stream. Specialist
versions exist for high bandwidth IO.
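A hedged sketch of what such open-coded save/load helpers might look like
(``FooChunk`` and the helpers are illustrative; note the load side validating
the count, per the "hostile stream" advice above):

.. code:: c

    #define FOO_CHUNK_MAX 4096

    typedef struct FooChunk {
        uint32_t len;
        uint8_t data[FOO_CHUNK_MAX];
    } FooChunk;

    static void foo_put_chunk(QEMUFile *f, FooChunk *c)
    {
        qemu_put_be32(f, c->len);
        qemu_put_buffer(f, c->data, c->len);
    }

    static int foo_get_chunk(QEMUFile *f, FooChunk *c)
    {
        uint32_t len = qemu_get_be32(f);

        if (len > FOO_CHUNK_MAX) {   /* treat the incoming stream as hostile */
            return -EINVAL;
        }
        c->len = len;
        qemu_get_buffer(f, c->data, len);
        return 0;
    }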
An iterative device must provide:

- A ``save_setup`` function that initialises the data structures and
  transmits a first section containing information on the device. In the
  case of RAM this transmits a list of RAMBlocks and sizes.

- A ``load_setup`` function that initialises the data structures on the
  destination.

- A ``state_pending_exact`` function that indicates how much more
  data we must save. The core migration code will use this to
  determine when to pause the CPUs and complete the migration.

- A ``state_pending_estimate`` function that indicates how much more
  data we must save. When the estimated amount is smaller than the
  threshold, we call ``state_pending_exact``.

- A ``save_live_iterate`` function that should send a chunk of data until
  the point that stream bandwidth limits tell it to stop. Each call
  generates one section.

- A ``save_live_complete_precopy`` function that must transmit the
  last section for the device containing any remaining data.

- A ``load_state`` function used to load sections generated by
  any of the save functions that generate sections.

- ``cleanup`` functions for both save and load that are called
  at the end of migration.
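Collected into a hedged sketch (the ``foo_*`` callbacks and ``foo_state`` are
placeholders, and the exact ``SaveVMHandlers`` member prototypes have changed
over time - check ``include/migration/register.h`` for the current
signatures):

.. code:: c

    static SaveVMHandlers savevm_foo_handlers = {
        .save_setup = foo_save_setup,
        .load_setup = foo_load_setup,
        .state_pending_exact = foo_state_pending_exact,
        .state_pending_estimate = foo_state_pending_estimate,
        .save_live_iterate = foo_save_live_iterate,
        .save_live_complete_precopy = foo_save_complete_precopy,
        .load_state = foo_load_state,
        .save_cleanup = foo_save_cleanup,
        .load_cleanup = foo_load_cleanup,
    };

    /* Registered much like the legacy handlers; foo_state is the
     * device's opaque state pointer. */
    register_savevm_live("foo", 0, 1, &savevm_foo_handlers, foo_state);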
Note that the contents of the sections for iterative migration tend
to be open-coded by the devices; care should be taken in parsing
the results and structuring the stream to make them easy to validate.
There are cases in which the ordering of device loading matters; for
example, in some systems where a device may assert an interrupt during
loading, if the interrupt controller is loaded later then it might lose the
state.

Some ordering is implicitly provided by the order in which the machine
definition creates devices; however, this is somewhat fragile.

The ``MigrationPriority`` enum provides a means of explicitly enforcing
ordering. Numerically higher priorities are loaded earlier.
The priority is set by setting the ``priority`` field of the top level
``VMStateDescription`` for the device.
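For example, a hedged sketch (the device name is illustrative;
``MIG_PRI_IOMMU`` is one of the existing ``MigrationPriority`` values, used so
that IOMMUs are restored before the devices that sit behind them):

.. code:: c

    static const VMStateDescription vmstate_my_iommu = {
        .name = "my-iommu",
        .version_id = 1,
        .minimum_version_id = 1,
        .priority = MIG_PRI_IOMMU,  /* loaded before default-priority devices */
        .fields = (const VMStateField[]) {
            /* ... */
            VMSTATE_END_OF_LIST()
        }
    };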
The stream tries to be word and endian agnostic, allowing migration between hosts
of different characteristics running the same VM.
The stream ends with a VM description structure, consisting of a JSON
description of the contents, for analysis only.
The ``device data`` in each section consists of the data produced
by the code described above. For non-iterative devices they have a single
section; iterative devices have an initial and last section and a set of
parts in between.
Note that there is very little checking by the common code of the integrity
of the ``device data`` contents; that's up to the devices themselves.
The ``footer mark`` provides a little bit of protection for the case where
the receiving side reads more or less data than expected.
The ``ID string`` is normally unique, having been formed from a bus name
and device address; PCI devices and storage devices hung off PCI controllers
fit this pattern well. Some devices are fixed single instances (e.g. "pc-ram").
Others (especially either older devices or system devices which for
some reason don't have a bus concept) make use of the ``instance id``
for otherwise identically named devices.
Only a unidirectional stream is required for normal migration; however, a
``return path`` can be created when bidirectional communication is desired.
This is primarily used by postcopy, but is also used to return a success
flag to the source at the end of migration.

``qemu_file_get_return_path(QEMUFile* fwdpath)`` gives the QEMUFile* for the
return path.
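As a small usage sketch (``fwd`` stands for the forward-direction QEMUFile;
error handling is abbreviated):

.. code:: c

    QEMUFile *rp = qemu_file_get_return_path(fwd);

    if (!rp) {
        /* the underlying transport cannot provide a return path */
        return -1;
    }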