=============================
sPAPR Dynamic Reconfiguration
=============================

sPAPR or pSeries guests make use of a facility called dynamic reconfiguration
to handle hot plugging of dynamic "physical" resources like PCI cards, or
"logical"/para-virtual resources like memory, CPUs, and "physical"
host-bridges, which are generally managed by the host/hypervisor and provided
to guests as virtualized resources. The specifics of dynamic reconfiguration
are documented extensively in section 13 of the Linux on Power Architecture
Reference document ([LoPAR]_). This document provides a summary of that
information as it applies to the implementation within QEMU.

Dynamic-reconfiguration Connectors
==================================

To manage hot plug/unplug of these resources, a firmware abstraction known as
a Dynamic Reconfiguration Connector (DRC) is used to assign a particular
dynamic resource to the guest, and provide an interface for the guest to manage
configuration/removal of the resource associated with it.

Device tree description of DRCs
===============================

A set of four Open Firmware device tree array properties is used to describe
the name/index/power-domain/type of each DRC allocated to a guest at
boot time. There may be multiple sets of these arrays, rooted at different
paths in the device tree depending on the type of resource the DRCs manage.

In some cases, the DRCs themselves may be provided by a dynamic resource,
such as the DRCs managing PCI slots on a hot plugged PHB. In this case the
arrays would be fetched as part of the device tree retrieval interfaces
for hot plugged resources described under :ref:`guest-host-interface`.

The array properties are described below. Each entry/element in an array
describes the DRC identified by the element in the corresponding position
of ``ibm,drc-indexes``:

``ibm,drc-names``
-----------------

  First 4 bytes: big-endian (BE) encoded integer denoting the number of entries.

  Each entry: a NULL-terminated ``<name>`` string encoded as a byte array.

    ``<name>`` values for logical/virtual resources are defined in the Linux on
    Power Architecture Reference ([LoPAR]_) section 13.5.2.4, and basically
    consist of the type of the resource followed by a space and a numerical
    value that's unique across resources of that type.

    ``<name>`` values for "physical" resources such as PCI or VIO devices are
    defined as being "location codes", which are the "location labels" of each
    encapsulating device, starting from the chassis down to the individual slot
    for the device, concatenated by a hyphen. This provides a mapping of
    resources to a physical location in a chassis for debugging purposes. For
    QEMU, this mapping is less important, so we assign a location code that
    conforms to naming specifications, but is simply a location label for the
    slot by itself to simplify the implementation. The naming convention for
    location labels is documented in detail in the [LoPAR]_ section 12.3.1.5,
    and in our case amounts to using ``C<n>`` for PCI/VIO device slots, where
    ``<n>`` is unique across all PCI/VIO device slots.

``ibm,drc-indexes``
-------------------

  First 4 bytes: BE-encoded integer denoting the number of entries.

  Each 4-byte entry: BE-encoded ``<index>`` integer that is unique across all
  DRCs in the machine.

    ``<index>`` is arbitrary, but in the case of QEMU we try to maintain the
    convention used to assign them to pSeries guests on pHyp (the hypervisor
    portion of PowerVM):

      ``bit[31:28]``: integer encoding of ``<type>``, where ``<type>`` is:

        ``1`` for CPU resource.

        ``2`` for PHB resource.

        ``3`` for VIO resource.

        ``4`` for PCI resource.

        ``8`` for memory resource.

      ``bit[27:0]``: integer encoding of ``<id>``, where ``<id>`` is unique
      across all resources of specified type.
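
The index convention above can be illustrated with a small sketch. The enum,
macro, and function names below are hypothetical and chosen only for this
example; they are not taken from the QEMU sources.

.. code-block:: c

   #include <stdint.h>

   /* Resource type codes carried in bits 31:28 (values from the list above). */
   enum drc_type_code {
       DRC_TYPE_CODE_CPU = 1,
       DRC_TYPE_CODE_PHB = 2,
       DRC_TYPE_CODE_VIO = 3,
       DRC_TYPE_CODE_PCI = 4,
       DRC_TYPE_CODE_MEM = 8,
   };

   #define DRC_INDEX_TYPE_SHIFT 28
   #define DRC_INDEX_ID_MASK    0x0fffffffu

   /* Compose a DRC index from a type code and a per-type id. */
   uint32_t drc_index_make(enum drc_type_code type, uint32_t id)
   {
       return ((uint32_t)type << DRC_INDEX_TYPE_SHIFT) |
              (id & DRC_INDEX_ID_MASK);
   }

   /* Extract the type code (bits 31:28) from a DRC index. */
   uint32_t drc_index_type(uint32_t index)
   {
       return index >> DRC_INDEX_TYPE_SHIFT;
   }

   /* Extract the per-type id (bits 27:0) from a DRC index. */
   uint32_t drc_index_id(uint32_t index)
   {
       return index & DRC_INDEX_ID_MASK;
   }

Under this convention, a memory DRC with id 5 would carry the index
``0x80000005``.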

``ibm,drc-power-domains``
-------------------------

  First 4 bytes: BE-encoded integer denoting the number of entries.

  Each 4-byte entry: 32-bit, BE-encoded ``<index>`` integer that specifies the
  power domain the resource will be assigned to. In the case of QEMU we
  associate all resources with a "live insertion" domain, where the power is
  assumed to be managed automatically. The integer value for this domain is a
  special value of ``-1``.

``ibm,drc-types``
-----------------

  First 4 bytes: BE-encoded integer denoting the number of entries.

  Each entry: a NULL-terminated ``<type>`` string encoded as a byte array.
  ``<type>`` is assigned as follows:

    "CPU" for a CPU.

    "PHB" for a physical host-bridge.

    "SLOT" for a VIO slot.

    "28" for a PCI slot.

    "MEM" for memory resource.
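
The count-prefixed string arrays (``ibm,drc-names`` and ``ibm,drc-types``)
can be walked as in the following sketch. It assumes the raw property bytes
are already available in memory; ``be32_read`` and ``drc_string_at`` are
hypothetical helper names used only for illustration.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   /* Read a 32-bit big-endian integer from a byte buffer. */
   uint32_t be32_read(const uint8_t *p)
   {
       return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
              ((uint32_t)p[2] << 8) | (uint32_t)p[3];
   }

   /*
    * Return the n-th string of an ibm,drc-names or ibm,drc-types property,
    * or NULL if n is out of range or the property is truncated.
    */
   const char *drc_string_at(const uint8_t *prop, size_t len, uint32_t n)
   {
       size_t off = 4; /* strings start right after the entry count */

       if (len < 4 || n >= be32_read(prop)) {
           return NULL;
       }
       for (;;) {
           const uint8_t *end = memchr(prop + off, '\0', len - off);

           if (end == NULL) {
               return NULL; /* missing terminator: property is truncated */
           }
           if (n == 0) {
               return (const char *)(prop + off);
           }
           off = (size_t)(end - prop) + 1; /* skip past the terminator */
           n--;
       }
   }

Because the entries are variable-length, reaching the n-th string requires
scanning past the preceding ones; the matching DRC index sits at the same
position in ``ibm,drc-indexes``.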

.. _guest-host-interface:

Guest->Host interface to manage dynamic resources
=================================================

Each DRC is given a globally unique DRC index, and resources associated with a
particular DRC are configured/managed by the guest via a number of RTAS calls
which reference individual DRCs based on the DRC index. This can be considered
the guest->host interface.

``rtas-set-power-level``
------------------------

Set the power level for a specified power domain.

  ``arg[0]``: integer identifying power domain.

  ``arg[1]``: new power level for the domain, ``0-100``.

  ``output[0]``: status, ``0`` on success.

  ``output[1]``: power level after command.

``rtas-get-power-level``
------------------------

Get the power level for a specified power domain.

  ``arg[0]``: integer identifying power domain.

  ``output[0]``: status, ``0`` on success.

  ``output[1]``: current power level.

``rtas-set-indicator``
----------------------

Set the state of an indicator or sensor.

  ``arg[0]``: integer identifying sensor/indicator type.

  ``arg[1]``: index of sensor, for DR-related sensors this is generally the DRC
  index.

  ``arg[2]``: desired sensor value.

  ``output[0]``: status, ``0`` on success.

For the purpose of this document we focus on the indicator/sensor types
associated with a DRC. The types are:

* ``9001``: ``isolation-state``, controls/indicates whether a device has been
  made accessible to a guest. Supported sensor values:

    ``0``: ``isolate``, device is made inaccessible by guest OS.

    ``1``: ``unisolate``, device is made available to guest OS.

* ``9002``: ``dr-indicator``, controls "visual" indicator associated with
  device. Supported sensor values:

    ``0``: ``inactive``, resource may be safely removed.

    ``1``: ``active``, resource is in use and cannot be safely removed.

    ``2``: ``identify``, used to visually identify slot for interactive hot plug.

    ``3``: ``action``, in most cases, used in the same manner as identify.

* ``9003``: ``allocation-state``, generally only used for "logical" DR resources
  to request the allocation/deallocation of a resource prior to acquiring it via
  ``isolation-state->unisolate``, or after releasing it via
  ``isolation-state->isolate``, respectively. For "physical" DR (like PCI
  hot plug/unplug) the pre-allocation of the resource is implied and this sensor
  is unused. Supported sensor values:

    ``0``: ``unusable``, tell firmware/system the resource can be
    unallocated/reclaimed and added back to the system resource pool.

    ``1``: ``usable``, request the resource be allocated/reserved for use by
    guest OS.

    ``2``: ``exchange``, used to allocate a spare resource to use for fail-over
    in certain situations. Unused in QEMU.

    ``3``: ``recover``, used to reclaim a previously allocated resource that's
    not currently allocated to the guest OS. Unused in QEMU.

``rtas-get-sensor-state``
-------------------------

Used to read an indicator or sensor value.

  ``arg[0]``: integer identifying sensor/indicator type.

  ``arg[1]``: index of sensor, for DR-related sensors this is generally the DRC
  index.

  ``output[0]``: status, ``0`` on success.

For DR-related operations, the only noteworthy sensor is ``dr-entity-sense``,
which has a type value of ``9003``, as ``allocation-state`` does in the case of
``rtas-set-indicator``. The semantics/encodings of the sensor values are
distinct, however.

Supported sensor values for ``dr-entity-sense`` (``9003``) sensor:

  ``0``: empty.

    For physical resources: DRC/slot is empty.

    For logical resources: unused.

  ``1``: present.

    For physical resources: DRC/slot is populated with a device/resource.

    For logical resources: resource has been allocated to the DRC.

  ``2``: unusable.

    For physical resources: unused.

    For logical resources: DRC has no resource allocated to it.

  ``3``: exchange.

    For physical resources: unused.

    For logical resources: resource available for exchange (see
    ``allocation-state`` sensor semantics above).

  ``4``: recovery.

    For physical resources: unused.

    For logical resources: resource available for recovery (see
    ``allocation-state`` sensor semantics above).
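
The ``dr-entity-sense`` table above can be restated as a small lookup. The
following is only a sketch: the enum and function names are hypothetical and
simply encode which values are meaningful for physical versus logical DRCs.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Sensor type shared by dr-entity-sense and allocation-state. */
   #define RTAS_SENSOR_TYPE_ENTITY_SENSE 9003

   enum dr_entity_sense {
       DR_ENTITY_SENSE_EMPTY    = 0,
       DR_ENTITY_SENSE_PRESENT  = 1,
       DR_ENTITY_SENSE_UNUSABLE = 2,
       DR_ENTITY_SENSE_EXCHANGE = 3,
       DR_ENTITY_SENSE_RECOVERY = 4,
   };

   /*
    * Report whether a dr-entity-sense value is meaningful for the given
    * kind of DRC (logical or physical), following the table above.
    */
   bool dr_entity_sense_is_used(uint32_t state, bool logical)
   {
       switch (state) {
       case DR_ENTITY_SENSE_EMPTY:
           return !logical;   /* unused for logical resources */
       case DR_ENTITY_SENSE_PRESENT:
           return true;       /* defined for both kinds */
       case DR_ENTITY_SENSE_UNUSABLE:
       case DR_ENTITY_SENSE_EXCHANGE:
       case DR_ENTITY_SENSE_RECOVERY:
           return logical;    /* unused for physical resources */
       default:
           return false;      /* not a documented value */
       }
   }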

``rtas-ibm-configure-connector``
--------------------------------

Used to fetch an Open Firmware device tree description of the resource
associated with a particular DRC.

  ``arg[0]``: guest physical address of 4096-byte work area buffer.

  ``arg[1]``: 0, or address of additional 4096-byte work area buffer; only
  non-zero if a prior RTAS response indicated a need for additional memory.

  ``output[0]``: status:

    ``0``: completed transmittal of device tree node.

    ``1``: instruct guest to prepare for next device tree sibling node.

    ``2``: instruct guest to prepare for next device tree child node.

    ``3``: instruct guest to prepare for next device tree property.

    ``4``: instruct guest to ascend to parent device tree node.

    ``5``: instruct guest to provide additional work-area buffer via ``arg[1]``.

    ``990x``: instruct guest that operation took too long and to try again
    later.

The DRC index is encoded in the first 4 bytes of the first work area buffer.
Work area (``wa``) layout, using 4-byte offsets:

  ``wa[0]``: DRC index of the DRC to fetch device tree nodes from.

  ``wa[1]``: ``0`` (hard-coded).

  ``wa[2]``:

    For next-sibling/next-child response:

      ``wa`` offset of null-terminated string denoting the new node's name.

    For next-property response:

      ``wa`` offset of null-terminated string denoting new property's name.

  ``wa[3]``: for next-property response (unused otherwise):

      Byte-length of new property's value.

  ``wa[4]``: for next-property response (unused otherwise):

      ``wa`` offset of the new property's value, encoded as an OFDT-compatible
      byte array.
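
A guest-side reader of the work area might extract the name and property
length as sketched below. This assumes the work area words are big-endian,
like the other values in this document, and that the ``wa[2]`` offset is a
byte offset from the start of the work area. All helper names are
hypothetical.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   #define CC_WA_SIZE 4096 /* work area size stated above */

   /* Read a 32-bit big-endian integer from a byte buffer. */
   uint32_t be32_read(const uint8_t *p)
   {
       return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
              ((uint32_t)p[2] << 8) | (uint32_t)p[3];
   }

   /* Read the 4-byte word at word index idx of the work area. */
   uint32_t cc_wa_word(const uint8_t *wa, unsigned idx)
   {
       return be32_read(wa + 4 * idx);
   }

   /*
    * Return the node or property name referenced by wa[2], or NULL if the
    * offset is out of bounds or the string is not terminated in the buffer.
    */
   const char *cc_wa_name(const uint8_t *wa)
   {
       uint32_t off = cc_wa_word(wa, 2);

       if (off >= CC_WA_SIZE ||
           memchr(wa + off, '\0', CC_WA_SIZE - off) == NULL) {
           return NULL;
       }
       return (const char *)(wa + off);
   }

   /* Byte length of the new property's value (next-property response only). */
   uint32_t cc_wa_prop_len(const uint8_t *wa)
   {
       return cc_wa_word(wa, 3);
   }

Validating the offset before dereferencing it keeps a malformed response from
causing reads outside the 4096-byte buffer.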

Hot plug/unplug events
======================

For most DR operations, the hypervisor will issue host->guest add/remove events
using the EPOW/check-exception notification framework, where the host issues a
check-exception interrupt, then provides an RTAS event log via an
rtas-check-exception call issued by the guest in response. This framework is
documented by PAPR+ v2.7, and is already used by QEMU for generating powerdown
requests via EPOW events.

For DR, this framework has been extended to include hotplug events, which were
previously unneeded due to direct manipulation of DR-related guest userspace
tools by host-level management such as an HMC. This level of management is not
applicable to KVM on Power, hence the reason for extending the notification
framework to support hotplug events.

The format for these EPOW-signalled events is described below under
:ref:`hot-plug-unplug-event-structure`. Note that these events are not formally
part of the PAPR+ specification, and have been superseded by a newer format,
also described below under :ref:`hot-plug-unplug-event-structure`, and so are
now deemed a "legacy" format. The formats are similar, but the "modern" format
contains additional fields/flags, which are denoted for the purposes of this
documentation with ``#ifdef GUEST_SUPPORTS_MODERN`` guards.

QEMU should assume support only for "legacy" fields/flags unless the guest
advertises support for the "modern" format via
``ibm,client-architecture-support`` hcall by setting byte 5, bit 6 of its
``ibm,architecture-vec-5`` option vector structure (as described by [LoPAR]_,
section B.5.2.3). As with "legacy" format events, "modern" format events are
surfaced to the guest via check-exception RTAS calls, but use a dedicated event
source to signal the guest. This event source is advertised to the guest by the
addition of a ``hot-plug-events`` node under ``/event-sources`` node of the
guest's device tree using the standard format described in [LoPAR]_,
section B.5.12.2.

.. _hot-plug-unplug-event-structure:

Hot plug/unplug event structure
===============================

The hot plug specific payload in QEMU is implemented as follows (with all values
encoded in big-endian format):

.. code-block:: c

   struct rtas_event_log_v6_hp {
   #define SECTION_ID_HOTPLUG              0x4850 /* HP */
       struct section_header {
           uint16_t section_id;            /* set to SECTION_ID_HOTPLUG */
           uint16_t section_length;        /* sizeof(rtas_event_log_v6_hp),
                                            * plus the length of the DRC name
                                            * if a DRC name identifier is
                                            * specified for hotplug_identifier
                                            */
           uint8_t section_version;        /* version 1 */
           uint8_t section_subtype;        /* unused */
           uint16_t creator_component_id;  /* unused */
       } hdr;
   #define RTAS_LOG_V6_HP_TYPE_CPU         1
   #define RTAS_LOG_V6_HP_TYPE_MEMORY      2
   #define RTAS_LOG_V6_HP_TYPE_SLOT        3
   #define RTAS_LOG_V6_HP_TYPE_PHB         4
   #define RTAS_LOG_V6_HP_TYPE_PCI         5
       uint8_t hotplug_type;               /* type of resource/device */
   #define RTAS_LOG_V6_HP_ACTION_ADD       1
   #define RTAS_LOG_V6_HP_ACTION_REMOVE    2
       uint8_t hotplug_action;             /* action (add/remove) */
   #define RTAS_LOG_V6_HP_ID_DRC_NAME          1
   #define RTAS_LOG_V6_HP_ID_DRC_INDEX         2
   #define RTAS_LOG_V6_HP_ID_DRC_COUNT         3
   #ifdef GUEST_SUPPORTS_MODERN
   #define RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED 4
   #endif
       uint8_t hotplug_identifier;         /* type of the resource identifier,
                                            * which serves as the discriminator
                                            * for the 'drc' union field below
                                            */
   #ifdef GUEST_SUPPORTS_MODERN
       uint8_t capabilities;               /* capability flags, currently unused
                                            * by QEMU
                                            */
   #else
       uint8_t reserved;
   #endif
       union {
           uint32_t index;                 /* DRC index of resource to take action
                                            * on
                                            */
           uint32_t count;                 /* number of DR resources to take
                                            * action on (guest chooses which)
                                            */
   #ifdef GUEST_SUPPORTS_MODERN
           struct {
               uint32_t count;             /* number of DR resources to take
                                            * action on
                                            */
               uint32_t index;             /* DRC index of first resource to take
                                            * action on. guest will take action
                                            * on DRC index <index> through
                                            * DRC index <index + count - 1> in
                                            * sequential order
                                            */
           } count_indexed;
   #endif
           char name[1];                   /* string representing the name of the
                                            * DRC to take action on
                                            */
       } drc;
   } QEMU_PACKED;
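
As an illustration of the packed, big-endian layout, the sketch below
serializes a "legacy" format section that identifies a single DRC by index.
With the legacy fields only, the packed structure above occupies 16 bytes
(8-byte header, four 1-byte fields, 4-byte union). The ``be16_write``,
``be32_write``, and ``hp_section_by_index`` helpers are hypothetical and are
not QEMU functions.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   #define SECTION_ID_HOTPLUG          0x4850 /* HP */
   #define RTAS_LOG_V6_HP_ID_DRC_INDEX 2
   #define HP_LEGACY_SECTION_LEN       16     /* packed legacy layout */

   /* Store a 16-bit value in big-endian byte order. */
   void be16_write(uint8_t *p, uint16_t v)
   {
       p[0] = (uint8_t)(v >> 8);
       p[1] = (uint8_t)v;
   }

   /* Store a 32-bit value in big-endian byte order. */
   void be32_write(uint8_t *p, uint32_t v)
   {
       p[0] = (uint8_t)(v >> 24);
       p[1] = (uint8_t)(v >> 16);
       p[2] = (uint8_t)(v >> 8);
       p[3] = (uint8_t)v;
   }

   /* Serialize a legacy hotplug section identifying one DRC by its index. */
   size_t hp_section_by_index(uint8_t out[HP_LEGACY_SECTION_LEN],
                              uint8_t type, uint8_t action,
                              uint32_t drc_index)
   {
       be16_write(out + 0, SECTION_ID_HOTPLUG);    /* section_id */
       be16_write(out + 2, HP_LEGACY_SECTION_LEN); /* section_length */
       out[4] = 1;                                 /* section_version */
       out[5] = 0;                                 /* section_subtype */
       be16_write(out + 6, 0);                     /* creator_component_id */
       out[8] = type;                              /* hotplug_type */
       out[9] = action;                            /* hotplug_action */
       out[10] = RTAS_LOG_V6_HP_ID_DRC_INDEX;      /* hotplug_identifier */
       out[11] = 0;                                /* reserved */
       be32_write(out + 12, drc_index);            /* drc.index */
       return HP_LEGACY_SECTION_LEN;
   }

Writing the bytes explicitly avoids depending on host endianness or compiler
struct packing, at the cost of repeating the field offsets by hand.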

``ibm,lrdr-capacity``
=====================

``ibm,lrdr-capacity`` is a property in the ``/rtas`` device tree node that
identifies the dynamic reconfiguration capabilities of the guest. It is a
triple consisting of ``<phys>``, ``<size>`` and ``<maxcpus>``.

  ``<phys>``, encoded in BE format, represents the maximum address in bytes and
  hence the maximum memory that can be allocated to the guest.

  ``<size>``, encoded in BE format, represents the size increments in which
  memory can be hot-plugged to the guest.

  ``<maxcpus>``, a BE-encoded integer, represents the maximum number of
  processors that the guest can have.

``pseries`` guests use this property to note the maximum allowed CPUs for the
guest.

``ibm,dynamic-reconfiguration-memory``
======================================

``ibm,dynamic-reconfiguration-memory`` is a device tree node that represents
dynamically reconfigurable logical memory blocks (LMB). This node is generated
only when the guest advertises support for it via the
``ibm,client-architecture-support`` call. Memory that is not dynamically
reconfigurable is represented by ``/memory`` nodes. The properties of this node
that are of interest to the sPAPR memory hotplug implementation in QEMU are
described here.

``ibm,lmb-size``
----------------

This 64-bit integer defines the size of each dynamically reconfigurable LMB.

``ibm,associativity-lookup-arrays``
-----------------------------------

This property defines a lookup array in which the NUMA associativity
information for each LMB can be found. It is a property encoded array
that begins with an integer M, the number of associativity lists, followed
by an integer N, the number of entries per associativity list, and then
M associativity lists, each N integers long.

This property provides the same information as given by the
``ibm,associativity`` property in a ``/memory`` node. Each assigned LMB has an
index value between 0 and M-1 which is used as an index into this table to
select which associativity list to use for the LMB. This index value for each
LMB is defined in the ``ibm,dynamic-memory`` property.
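
Selecting an associativity list by index could look like the sketch below. It
assumes that every integer in the property is a 32-bit big-endian cell, which
is the usual device tree encoding but is not spelled out above. The helper
names are hypothetical.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Read a 32-bit big-endian integer from a byte buffer. */
   uint32_t be32_read(const uint8_t *p)
   {
       return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
              ((uint32_t)p[2] << 8) | (uint32_t)p[3];
   }

   /*
    * Return a pointer to associativity list idx inside the raw property
    * and store the list length (N cells) in *n_out. Returns NULL if idx
    * is out of range or the property is too short for M lists of N cells.
    */
   const uint8_t *assoc_list_at(const uint8_t *prop, size_t len,
                                uint32_t idx, uint32_t *n_out)
   {
       uint32_t m, n;

       if (len < 8) {
           return NULL;
       }
       m = be32_read(prop);     /* number of associativity lists */
       n = be32_read(prop + 4); /* cells per associativity list */
       if (idx >= m) {
           return NULL;
       }
       /* Division-based check avoids overflow when computing M * N. */
       if (n != 0 && (len - 8) / 4 / n < m) {
           return NULL;
       }
       *n_out = n;
       return prop + 8 + (size_t)idx * n * 4;
   }

Each cell of the returned list can then be decoded with ``be32_read`` at
4-byte strides.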

``ibm,dynamic-memory``
----------------------

This property describes the dynamically reconfigurable memory. It is a
property encoded array that has an integer N, the number of LMBs, followed
by N LMB list entries.

Each LMB list entry consists of the following elements:

- Logical address of the start of the LMB encoded as a 64-bit integer. This
  corresponds to ``reg`` property in ``/memory`` node.
- DRC index of the LMB that corresponds to ``ibm,my-drc-index`` property
  in a ``/memory`` node.
- Four bytes reserved for expansion.
- Associativity list index for the LMB that is used as an index into
  ``ibm,associativity-lookup-arrays`` property described earlier. This is used
  to retrieve the right associativity list to be used for this LMB.
- A 32-bit flags word. The bit at bit position ``0x00000008`` defines whether
  the LMB is assigned to the partition as of boot time.
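
The entry layout above can be decoded as in the following sketch. It assumes
that the leading count N, the DRC index, and the associativity list index are
each 32-bit big-endian cells, giving 24 bytes per entry. The struct, macro,
and function names are hypothetical.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   #define LMB_ENTRY_SIZE    24          /* 8 + 4 + 4 + 4 + 4 bytes */
   #define LMB_FLAG_ASSIGNED 0x00000008u /* assigned as of boot time */

   struct lmb_entry {
       uint64_t addr;
       uint32_t drc_index;
       uint32_t assoc_index;
       uint32_t flags;
   };

   /* Read a 32-bit big-endian integer from a byte buffer. */
   uint32_t be32_read(const uint8_t *p)
   {
       return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
              ((uint32_t)p[2] << 8) | (uint32_t)p[3];
   }

   /* Decode LMB list entry i; returns 0 on success, -1 if out of range. */
   int lmb_entry_read(const uint8_t *prop, size_t len, uint32_t i,
                      struct lmb_entry *out)
   {
       const uint8_t *e;

       if (len < 4 || i >= be32_read(prop)) {
           return -1;
       }
       if ((len - 4) / LMB_ENTRY_SIZE <= i) {
           return -1; /* property shorter than its count claims */
       }
       e = prop + 4 + (size_t)i * LMB_ENTRY_SIZE;
       out->addr = ((uint64_t)be32_read(e) << 32) | be32_read(e + 4);
       out->drc_index = be32_read(e + 8);
       /* e + 12: four bytes reserved for expansion */
       out->assoc_index = be32_read(e + 16);
       out->flags = be32_read(e + 20);
       return 0;
   }

A caller would test ``flags & LMB_FLAG_ASSIGNED`` to find the LMBs that are
assigned to the partition at boot.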

``ibm,dynamic-memory-v2``
-------------------------

This property describes the dynamically reconfigurable memory. This is
an alternate and newer way to describe dynamically reconfigurable memory.
It is a property encoded array that has an integer N (the number of
LMB set entries) followed by N LMB set entries. There is an LMB set entry
for each sequential group of LMBs that share common attributes.

Each LMB set entry consists of the following elements:

- Number of sequential LMBs in the entry represented by a 32-bit integer.
- Logical address of the first LMB in the set encoded as a 64-bit integer.
- DRC index of the first LMB in the set.
- Associativity list index that is used as an index into
  ``ibm,associativity-lookup-arrays`` property described earlier. This
  is used to retrieve the right associativity list to be used for all
  the LMBs in this set.
- A 32-bit flags word that applies to all the LMBs in the set.
510*8915106cSLeonardo Garcia- A 32-bit flags word that applies to all the LMBs in the set.
511