=============================
sPAPR Dynamic Reconfiguration
=============================

sPAPR or pSeries guests make use of a facility called dynamic reconfiguration
to handle hot plugging of dynamic "physical" resources like PCI cards, or
"logical"/para-virtual resources like memory, CPUs, and "physical"
host-bridges, which are generally managed by the host/hypervisor and provided
to guests as virtualized resources. The specifics of dynamic reconfiguration
are documented extensively in section 13 of the Linux on Power Architecture
Reference document ([LoPAR]_). This document provides a summary of that
information as it applies to the implementation within QEMU.

Dynamic-reconfiguration Connectors
==================================

To manage hot plug/unplug of these resources, a firmware abstraction known as
a Dynamic Resource Connector (DRC) is used to assign a particular dynamic
resource to the guest, and provide an interface for the guest to manage
configuration/removal of the resource associated with it.

Device tree description of DRCs
===============================

A set of four Open Firmware device tree array properties are used to describe
the name/index/power-domain/type of each DRC allocated to a guest at
boot time. There may be multiple sets of these arrays, rooted at different
paths in the device tree depending on the type of resource the DRCs manage.

In some cases, the DRCs themselves may be provided by a dynamic resource,
such as the DRCs managing PCI slots on a hot plugged PHB. In this case the
arrays would be fetched as part of the device tree retrieval interfaces
for hot plugged resources described under :ref:`guest-host-interface`.

The array properties are described below. Each entry/element in an array
describes the DRC identified by the element in the corresponding position
of ``ibm,drc-indexes``:

``ibm,drc-names``
-----------------

  First 4-bytes: big-endian (BE) encoded integer denoting the number of entries.

  Each entry: a NULL-terminated ``<name>`` string encoded as a byte array.

    ``<name>`` values for logical/virtual resources are defined in the Linux on
    Power Architecture Reference ([LoPAR]_) section 13.5.2.4, and basically
    consist of the type of the resource followed by a space and a numerical
    value that's unique across resources of that type.

    ``<name>`` values for "physical" resources such as PCI or VIO devices are
    defined as being "location codes", which are the "location labels" of each
    encapsulating device, starting from the chassis down to the individual slot
    for the device, joined by hyphens. This provides a mapping of
    resources to a physical location in a chassis for debugging purposes. For
    QEMU, this mapping is less important, so we assign a location code that
    conforms to naming specifications, but is simply a location label for the
    slot by itself to simplify the implementation. The naming convention for
    location labels is documented in detail in [LoPAR]_ section 12.3.1.5,
    and in our case amounts to using ``C<n>`` for PCI/VIO device slots, where
    ``<n>`` is unique across all PCI/VIO device slots.

``ibm,drc-indexes``
-------------------

  First 4-bytes: BE-encoded integer denoting the number of entries.

  Each 4-byte entry: BE-encoded ``<index>`` integer that is unique across all
  DRCs in the machine.

    ``<index>`` is arbitrary, but in the case of QEMU we try to maintain the
    convention used to assign them to pSeries guests on pHyp (the hypervisor
    portion of PowerVM):

      ``bit[31:28]``: integer encoding of ``<type>``, where ``<type>`` is:

        ``1`` for CPU resource.

        ``2`` for PHB resource.

        ``3`` for VIO resource.

        ``4`` for PCI resource.

        ``8`` for memory resource.

      ``bit[27:0]``: integer encoding of ``<id>``, where ``<id>`` is unique
      across all resources of specified type.

``ibm,drc-power-domains``
-------------------------

  First 4-bytes: BE-encoded integer denoting the number of entries.

  Each 4-byte entry: 32-bit, BE-encoded ``<index>`` integer that specifies the
  power domain the resource will be assigned to. In the case of QEMU we
  associate all resources with a "live insertion" domain, where the power is
  assumed to be managed automatically. The integer value for this domain is a
  special value of ``-1``.
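
As a rough illustration of the ``ibm,drc-indexes`` layout described above, the
following sketch reads one entry from a raw property value and splits it into
its ``<type>`` and ``<id>`` fields. All of the names used here
(``read_be32``, ``drc_index_make``, ``drc_index_get_type``,
``drc_index_get_id``, ``drc_indexes_get`` and the ``DRC_INDEX_TYPE_*``
constants) are hypothetical helpers invented for this example; they are not
QEMU or guest kernel APIs.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* <type> values stored in bits 31:28 of a DRC index (hypothetical names). */
    enum drc_index_type {
        DRC_INDEX_TYPE_CPU = 1,
        DRC_INDEX_TYPE_PHB = 2,
        DRC_INDEX_TYPE_VIO = 3,
        DRC_INDEX_TYPE_PCI = 4,
        DRC_INDEX_TYPE_MEM = 8,
    };

    /* Read a 32-bit big-endian integer from a byte buffer. */
    static uint32_t read_be32(const uint8_t *p)
    {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    }

    /* Compose a DRC index from <type> (bits 31:28) and <id> (bits 27:0). */
    static uint32_t drc_index_make(enum drc_index_type type, uint32_t id)
    {
        assert(id <= 0x0FFFFFFFu);
        return ((uint32_t)type << 28) | id;
    }

    /* Extract the <type> field (bits 31:28). */
    static uint32_t drc_index_get_type(uint32_t index)
    {
        return index >> 28;
    }

    /* Extract the <id> field (bits 27:0). */
    static uint32_t drc_index_get_id(uint32_t index)
    {
        return index & 0x0FFFFFFFu;
    }

    /*
     * Fetch entry n from an ibm,drc-indexes property value: a 4-byte BE
     * entry count followed by that many 4-byte BE indexes.
     * Returns 0 on success, -1 if n is out of range or the value is too short.
     */
    static int drc_indexes_get(const uint8_t *prop, size_t len, uint32_t n,
                               uint32_t *index)
    {
        uint32_t count;

        if (len < 4) {
            return -1;
        }
        count = read_be32(prop);
        if (n >= count || (len - 4) / 4 <= n) {
            return -1;
        }
        *index = read_be32(prop + 4 + 4 * (size_t)n);
        return 0;
    }

With this convention, an index of ``0x80000005`` would decode as a memory
resource (``<type>`` 8) with ``<id>`` 5.
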

``ibm,drc-types``
-----------------

  First 4-bytes: BE-encoded integer denoting the number of entries.

  Each entry: a NULL-terminated ``<type>`` string encoded as a byte array.
  ``<type>`` is assigned as follows:

    "CPU" for a CPU.

    "PHB" for a physical host-bridge.

    "SLOT" for a VIO slot.

    "28" for a PCI slot.

    "MEM" for memory resource.

.. _guest-host-interface:

Guest->Host interface to manage dynamic resources
=================================================

Each DRC is given a globally unique DRC index, and resources associated with a
particular DRC are configured/managed by the guest via a number of RTAS calls
which reference individual DRCs based on the DRC index. This can be considered
the guest->host interface.

``rtas-set-power-level``
------------------------

Set the power level for a specified power domain.

  ``arg[0]``: integer identifying power domain.

  ``arg[1]``: new power level for the domain, ``0-100``.

  ``output[0]``: status, ``0`` on success.

  ``output[1]``: power level after command.

``rtas-get-power-level``
------------------------

Get the power level for a specified power domain.

  ``arg[0]``: integer identifying power domain.

  ``output[0]``: status, ``0`` on success.

  ``output[1]``: current power level.

``rtas-set-indicator``
----------------------

Set the state of an indicator or sensor.

  ``arg[0]``: integer identifying sensor/indicator type.

  ``arg[1]``: index of sensor, for DR-related sensors this is generally the DRC
  index.

  ``arg[2]``: desired sensor value.

  ``output[0]``: status, ``0`` on success.

For the purpose of this document we focus on the indicator/sensor types
associated with a DRC. The types are:

* ``9001``: ``isolation-state``, controls/indicates whether a device has been
  made accessible to a guest. Supported sensor values:

    ``0``: ``isolate``, device is made inaccessible by guest OS.

    ``1``: ``unisolate``, device is made available to guest OS.

* ``9002``: ``dr-indicator``, controls "visual" indicator associated with
  device. Supported sensor values:

    ``0``: ``inactive``, resource may be safely removed.

    ``1``: ``active``, resource is in use and cannot be safely removed.

    ``2``: ``identify``, used to visually identify slot for interactive hot plug.

    ``3``: ``action``, in most cases, used in the same manner as identify.

* ``9003``: ``allocation-state``, generally only used for "logical" DR resources
  to request the allocation/deallocation of a resource prior to acquiring it via
  ``isolation-state->unisolate``, or after releasing it via
  ``isolation-state->isolate``, respectively. For "physical" DR (like PCI
  hot plug/unplug) the pre-allocation of the resource is implied and this sensor
  is unused. Supported sensor values:

    ``0``: ``unusable``, tell firmware/system the resource can be
    unallocated/reclaimed and added back to the system resource pool.

    ``1``: ``usable``, request the resource be allocated/reserved for use by
    guest OS.

    ``2``: ``exchange``, used to allocate a spare resource to use for fail-over
    in certain situations. Unused in QEMU.

    ``3``: ``recover``, used to reclaim a previously allocated resource that's
    not currently allocated to the guest OS. Unused in QEMU.

``rtas-get-sensor-state``
-------------------------

Used to read an indicator or sensor value.

  ``arg[0]``: integer identifying sensor/indicator type.

  ``arg[1]``: index of sensor, for DR-related sensors this is generally the DRC
  index.

  ``output[0]``: status, ``0`` on success.

For DR-related operations, the only noteworthy sensor is ``dr-entity-sense``,
which has a type value of ``9003``, as ``allocation-state`` does in the case of
``rtas-set-indicator``. The semantics/encodings of the sensor values are
distinct, however.

Supported sensor values for ``dr-entity-sense`` (``9003``) sensor:

  ``0``: empty.

    For physical resources: DRC/slot is empty.

    For logical resources: unused.

  ``1``: present.

    For physical resources: DRC/slot is populated with a device/resource.

    For logical resources: resource has been allocated to the DRC.

  ``2``: unusable.

    For physical resources: unused.

    For logical resources: DRC has no resource allocated to it.

  ``3``: exchange.

    For physical resources: unused.

    For logical resources: resource available for exchange (see
    ``allocation-state`` sensor semantics above).

  ``4``: recovery.

    For physical resources: unused.

    For logical resources: resource available for recovery (see
    ``allocation-state`` sensor semantics above).

``rtas-ibm-configure-connector``
--------------------------------

Used to fetch an OpenFirmware device tree description of the resource associated
with a particular DRC.

  ``arg[0]``: guest physical address of 4096-byte work area buffer.

  ``arg[1]``: 0, or address of additional 4096-byte work area buffer; only
  non-zero if a prior RTAS response indicated a need for additional memory.

  ``output[0]``: status:

    ``0``: completed transmittal of device tree node.

    ``1``: instruct guest to prepare for next device tree sibling node.

    ``2``: instruct guest to prepare for next device tree child node.

    ``3``: instruct guest to prepare for next device tree property.

    ``4``: instruct guest to ascend to parent device tree node.

    ``5``: instruct guest to provide additional work-area buffer via ``arg[1]``.

    ``990x``: instruct guest that operation took too long and to try again
    later.

The DRC index is encoded in the first 4-bytes of the first work area buffer.
Work area (``wa``) layout, using 4-byte offsets:

  ``wa[0]``: DRC index of the DRC to fetch device tree nodes from.

  ``wa[1]``: ``0`` (hard-coded).

  ``wa[2]``:

    For next-sibling/next-child response:

      ``wa`` offset of null-terminated string denoting the new node's name.

    For next-property response:

      ``wa`` offset of null-terminated string denoting new property's name.

  ``wa[3]``: for next-property response (unused otherwise):

    Byte-length of new property's value.

  ``wa[4]``: for next-property response (unused otherwise):

    New property's value, encoded as an OFDT-compatible byte array.

Hot plug/unplug events
======================

For most DR operations, the hypervisor will issue host->guest add/remove events
using the EPOW/check-exception notification framework, where the host issues a
check-exception interrupt, then provides an RTAS event log via an
rtas-check-exception call issued by the guest in response. This framework is
documented by PAPR+ v2.7, and is already in use by QEMU for generating powerdown
requests via EPOW events.

For DR, this framework has been extended to include hotplug events, which were
previously unneeded due to direct manipulation of DR-related guest userspace
tools by host-level management such as an HMC. This level of management is not
applicable to KVM on Power, hence the reason for extending the notification
framework to support hotplug events.

The format for these EPOW-signalled events is described below under
:ref:`hot-plug-unplug-event-structure`. Note that these events are not formally
part of the PAPR+ specification, and have been superseded by a newer format,
also described below under :ref:`hot-plug-unplug-event-structure`, and so are
now deemed a "legacy" format.
The formats are similar, but the "modern" format
contains additional fields/flags, which are denoted for the purposes of this
documentation with ``#ifdef GUEST_SUPPORTS_MODERN`` guards.

QEMU should assume support only for "legacy" fields/flags unless the guest
advertises support for the "modern" format via the
``ibm,client-architecture-support`` hcall by setting byte 5, bit 6 of its
``ibm,architecture-vec-5`` option vector structure (as described by [LoPAR]_,
section B.5.2.3). As with "legacy" format events, "modern" format events are
surfaced to the guest via check-exception RTAS calls, but use a dedicated event
source to signal the guest. This event source is advertised to the guest by the
addition of a ``hot-plug-events`` node under the ``/event-sources`` node of the
guest's device tree using the standard format described in [LoPAR]_,
section B.5.12.2.

.. _hot-plug-unplug-event-structure:

Hot plug/unplug event structure
===============================

The hot plug specific payload in QEMU is implemented as follows (with all values
encoded in big-endian format):

.. code-block:: c

  struct rtas_event_log_v6_hp {
  #define SECTION_ID_HOTPLUG              0x4850 /* HP */
      struct section_header {
          uint16_t section_id;            /* set to SECTION_ID_HOTPLUG */
          uint16_t section_length;        /* sizeof(rtas_event_log_v6_hp),
                                           * plus the length of the DRC name
                                           * if a DRC name identifier is
                                           * specified for hotplug_identifier
                                           */
          uint8_t section_version;        /* version 1 */
          uint8_t section_subtype;        /* unused */
          uint16_t creator_component_id;  /* unused */
      } hdr;
  #define RTAS_LOG_V6_HP_TYPE_CPU         1
  #define RTAS_LOG_V6_HP_TYPE_MEMORY      2
  #define RTAS_LOG_V6_HP_TYPE_SLOT        3
  #define RTAS_LOG_V6_HP_TYPE_PHB         4
  #define RTAS_LOG_V6_HP_TYPE_PCI         5
      uint8_t hotplug_type;               /* type of resource/device */
  #define RTAS_LOG_V6_HP_ACTION_ADD       1
  #define RTAS_LOG_V6_HP_ACTION_REMOVE    2
      uint8_t hotplug_action;             /* action (add/remove) */
  #define RTAS_LOG_V6_HP_ID_DRC_NAME          1
  #define RTAS_LOG_V6_HP_ID_DRC_INDEX         2
  #define RTAS_LOG_V6_HP_ID_DRC_COUNT         3
  #ifdef GUEST_SUPPORTS_MODERN
  #define RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED 4
  #endif
      uint8_t hotplug_identifier;         /* type of the resource identifier,
                                           * which serves as the discriminator
                                           * for the 'drc' union field below
                                           */
  #ifdef GUEST_SUPPORTS_MODERN
      uint8_t capabilities;               /* capability flags, currently unused
                                           * by QEMU
                                           */
  #else
      uint8_t reserved;
  #endif
      union {
          uint32_t index;                 /* DRC index of resource to take action
                                           * on
                                           */
          uint32_t count;                 /* number of DR resources to take
                                           * action on (guest chooses which)
                                           */
  #ifdef GUEST_SUPPORTS_MODERN
          struct {
              uint32_t count;             /* number of DR resources to take
                                           * action on
                                           */
              uint32_t index;             /* DRC index of first resource to take
                                           * action on. guest will take action
                                           * on DRC index <index> through
                                           * DRC index <index + count - 1> in
                                           * sequential order
                                           */
          } count_indexed;
  #endif
          char name[1];                   /* string representing the name of the
                                           * DRC to take action on
                                           */
      } drc;
  } QEMU_PACKED;

``ibm,lrdr-capacity``
=====================

``ibm,lrdr-capacity`` is a property in the ``/rtas`` device tree node that
identifies the dynamic reconfiguration capabilities of the guest. It is a
triple of ``<phys>``, ``<size>`` and ``<maxcpus>``.

  ``<phys>``, encoded in BE format, represents the maximum address in bytes and
  hence the maximum memory that can be allocated to the guest.

  ``<size>``, encoded in BE format, represents the size increments in which
  memory can be hot-plugged to the guest.

  ``<maxcpus>``, a BE-encoded integer, represents the maximum number of
  processors that the guest can have.

``pseries`` guests use this property to note the maximum allowed CPUs for the
guest.

``ibm,dynamic-reconfiguration-memory``
======================================

``ibm,dynamic-reconfiguration-memory`` is a device tree node that represents
dynamically reconfigurable logical memory blocks (LMB). This node is generated
only when the guest advertises support for it via the
``ibm,client-architecture-support`` call. Memory that is not dynamically
reconfigurable is represented by ``/memory`` nodes. The properties of this node
that are of interest to the sPAPR memory hotplug implementation in QEMU are
described here.

``ibm,lmb-size``
----------------

This 64-bit integer defines the size of each dynamically reconfigurable LMB.
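
A minimal sketch of decoding the ``ibm,lrdr-capacity`` triple described above
is given here. It assumes that ``<phys>`` and ``<size>`` are each encoded as
64-bit integers and ``<maxcpus>`` as a 32-bit integer (20 bytes in total);
those field widths are an assumption of this example and are not stated in
this document. ``struct lrdr_capacity``, ``read_be`` and
``lrdr_capacity_parse`` are hypothetical names introduced only for
illustration.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Decoded ibm,lrdr-capacity triple (hypothetical type). */
    struct lrdr_capacity {
        uint64_t phys;      /* maximum address in bytes */
        uint64_t size;      /* memory hot plug increment in bytes */
        uint32_t maxcpus;   /* maximum number of processors */
    };

    /* Read an nbytes-wide (1 to 8) big-endian integer from a byte buffer. */
    static uint64_t read_be(const uint8_t *p, size_t nbytes)
    {
        uint64_t v = 0;

        for (size_t i = 0; i < nbytes; i++) {
            v = (v << 8) | p[i];
        }
        return v;
    }

    /*
     * Decode a raw property value into *out.
     * ASSUMPTION: layout is <phys:8 bytes> <size:8 bytes> <maxcpus:4 bytes>.
     * Returns 0 on success, -1 if the value is shorter than 20 bytes.
     */
    static int lrdr_capacity_parse(const uint8_t *prop, size_t len,
                                   struct lrdr_capacity *out)
    {
        assert(prop != NULL && out != NULL);
        if (len < 20) {
            return -1;
        }
        out->phys = read_be(prop, 8);
        out->size = read_be(prop + 8, 8);
        out->maxcpus = (uint32_t)read_be(prop + 16, 4);
        return 0;
    }

The same ``read_be(p, 8)`` call could be used to read the 64-bit
``ibm,lmb-size`` value, provided it is stored big-endian like the other
device tree integers in this document.
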

``ibm,associativity-lookup-arrays``
-----------------------------------

This property defines a lookup array in which the NUMA associativity
information for each LMB can be found. It is a property encoded array
that begins with an integer M, the number of associativity lists, followed
by an integer N, the number of entries per associativity list, and then
by M associativity lists, each N integers long.

This property provides the same information as given by the ``ibm,associativity``
property in a ``/memory`` node. Each assigned LMB has an index value between
0 and M-1 which is used as an index into this table to select which
associativity list to use for the LMB. This index value for each LMB is defined
in the ``ibm,dynamic-memory`` property.

``ibm,dynamic-memory``
----------------------

This property describes the dynamically reconfigurable memory. It is a
property encoded array that has an integer N, the number of LMBs, followed
by N LMB list entries.

Each LMB list entry consists of the following elements:

- Logical address of the start of the LMB encoded as a 64-bit integer. This
  corresponds to the ``reg`` property in a ``/memory`` node.
- DRC index of the LMB that corresponds to the ``ibm,my-drc-index`` property
  in a ``/memory`` node.
- Four bytes reserved for expansion.
- Associativity list index for the LMB that is used as an index into the
  ``ibm,associativity-lookup-arrays`` property described earlier. This is used
  to retrieve the right associativity list to be used for this LMB.
- A 32-bit flags word. The bit at bit position ``0x00000008`` defines whether
  the LMB is assigned to the partition as of boot time.

``ibm,dynamic-memory-v2``
-------------------------

This property describes the dynamically reconfigurable memory. This is
an alternate and newer way to describe dynamically reconfigurable memory.
It is a property encoded array that has an integer N (the number of
LMB set entries) followed by N LMB set entries. There is an LMB set entry
for each sequential group of LMBs that share common attributes.

Each LMB set entry consists of the following elements:

- Number of sequential LMBs in the entry represented by a 32-bit integer.
- Logical address of the first LMB in the set encoded as a 64-bit integer.
- DRC index of the first LMB in the set.
- Associativity list index that is used as an index into the
  ``ibm,associativity-lookup-arrays`` property described earlier. This
  is used to retrieve the right associativity list to be used for all
  the LMBs in this set.
- A 32-bit flags word that applies to all the LMBs in the set.
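
The ``ibm,dynamic-memory-v2`` entry layout listed above could be decoded
roughly as follows. This is only a sketch under stated assumptions: the count
N, the DRC index and the associativity list index are taken to be 32-bit
big-endian integers, and the fields are taken to be packed back to back with
no padding (24 bytes per entry), neither of which is spelled out in this
document. ``struct lmb_set``, ``LMB_SET_ENTRY_SIZE``, ``LMB_FLAG_ASSIGNED``
and ``dynamic_memory_v2_get`` are hypothetical names; the flag value is the
"assigned" bit documented for ``ibm,dynamic-memory``, and reusing it for the
v2 flags word is likewise an assumption.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* One decoded LMB set entry (hypothetical type). */
    struct lmb_set {
        uint32_t seq_lmbs;   /* number of sequential LMBs in the set */
        uint64_t base_addr;  /* logical address of the first LMB */
        uint32_t drc_index;  /* DRC index of the first LMB */
        uint32_t aa_index;   /* index into ibm,associativity-lookup-arrays */
        uint32_t flags;      /* flags shared by all LMBs in the set */
    };

    /* ASSUMPTION: packed entry, 4 + 8 + 4 + 4 + 4 bytes. */
    #define LMB_SET_ENTRY_SIZE 24
    /* ASSUMPTION: same "assigned" bit as in ibm,dynamic-memory. */
    #define LMB_FLAG_ASSIGNED  0x00000008u

    /* Read a 32-bit big-endian integer from a byte buffer. */
    static uint32_t read_be32(const uint8_t *p)
    {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    }

    /* Read a 64-bit big-endian integer from a byte buffer. */
    static uint64_t read_be64(const uint8_t *p)
    {
        return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
    }

    /*
     * Decode LMB set entry n from a raw ibm,dynamic-memory-v2 value.
     * Returns 0 on success, -1 if n is out of range or the value is too short.
     */
    static int dynamic_memory_v2_get(const uint8_t *prop, size_t len,
                                     uint32_t n, struct lmb_set *out)
    {
        const uint8_t *e;

        assert(prop != NULL && out != NULL);
        if (len < 4 || n >= read_be32(prop)) {
            return -1;
        }
        if ((len - 4) / LMB_SET_ENTRY_SIZE <= n) {
            return -1;
        }
        e = prop + 4 + (size_t)n * LMB_SET_ENTRY_SIZE;
        out->seq_lmbs = read_be32(e);
        out->base_addr = read_be64(e + 4);
        out->drc_index = read_be32(e + 12);
        out->aa_index = read_be32(e + 16);
        out->flags = read_be32(e + 20);
        return 0;
    }

A consumer that needs per-LMB information would typically expand each set by
stepping the address by ``ibm,lmb-size`` and the DRC index by one for each of
the ``seq_lmbs`` blocks; that expansion rule is inferred from the "sequential
group" wording above rather than stated explicitly.
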