XIVE for sPAPR (pseries machines)
=================================

The POWER9 processor comes with a new interrupt controller
architecture called XIVE, for "eXternal Interrupt Virtualization
Engine". It supports a larger number of interrupt sources and offers
virtualization features which enable the HW to deliver interrupts
directly to virtual processors without hypervisor assistance.

A QEMU ``pseries`` machine (which is PAPR compliant) using POWER9
processors can run under two interrupt modes:

- *Legacy Compatibility Mode*

  the hypervisor provides identical interfaces and similar
  functionality to PAPR+ Version 2.7. This is the default mode.

  It is also referred to as *XICS* in QEMU.

- *XIVE native exploitation mode*

  the hypervisor provides new interfaces to manage the XIVE control
  structures, and provides direct control for interrupt management
  through MMIO pages.

Which interrupt modes can be used by the machine is negotiated with
the guest O/S during the Client Architecture Support negotiation
sequence. The two modes are mutually exclusive.

Both interrupt modes share the same IRQ number space. See below for the
layout.

CAS Negotiation
---------------

QEMU advertises the supported interrupt modes in byte 23 of the device
tree property ``ibm,arch-vec-5-platform-support``, and the OS selection
for XIVE is indicated in byte 23 of the ``ibm,architecture-vec-5``
property.

The interrupt modes supported by the machine depend on the CPU type
(POWER9 is required for XIVE) but also on the machine property
``ic-mode`` which can be set on the command line. It can take the
following values: ``xics``, ``xive``, and ``dual`` which is the
default mode. ``dual`` means that both modes XICS **and** XIVE are
supported and if the guest OS supports XIVE, this mode will be
selected.
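
The selection rule described above can be sketched in a few lines of
Python. This is an illustration only, not QEMU code: the function name
``select_interrupt_mode`` and its arguments are hypothetical, and the
CPU type check (POWER9 is required for XIVE) is left out.

```python
# Interrupt modes advertised to the guest for each ic-mode value.
ADVERTISED = {
    "xics": {"xics"},
    "xive": {"xive"},
    "dual": {"xics", "xive"},
}


def select_interrupt_mode(ic_mode="dual", guest_supports_xive=False):
    """Return the mode picked at CAS, or raise if none can be agreed on.

    Hypothetical helper: a XIVE-capable guest picks XIVE when it is
    advertised, otherwise the guest falls back to XICS.
    """
    if ic_mode not in ADVERTISED:
        raise ValueError(f"unknown ic-mode: {ic_mode!r}")
    advertised = ADVERTISED[ic_mode]
    if guest_supports_xive and "xive" in advertised:
        return "xive"
    if "xics" in advertised:
        return "xics"
    # ic-mode=xive with a legacy guest: nothing in common.
    raise RuntimeError("guest requested unavailable interrupt mode (XICS)")
```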

The chosen interrupt mode is activated after a reconfiguration done
in a machine reset.

KVM negotiation
---------------

When the guest starts under KVM, the capabilities of the host kernel
and QEMU are also negotiated. Depending on the version of the host
kernel, KVM will advertise the XIVE capability to QEMU or not.

Nevertheless, the available interrupt modes in the machine should not
depend on the XIVE KVM capability of the host. On older kernels
without XIVE KVM support, QEMU will use the emulated XIVE device as a
fallback and on newer kernels (>=5.2), the KVM XIVE device.

XIVE native exploitation mode is not supported for KVM nested guests,
i.e. VMs running under an L1 hypervisor (KVM on pSeries). In that case,
the hypervisor will not advertise the KVM capability and QEMU will use
the emulated XIVE device, same as for older versions of KVM.

As a final refinement, the user can also switch the use of the KVM
device with the machine option ``kernel_irqchip``.


XIVE support in KVM
~~~~~~~~~~~~~~~~~~~

For guest OSes supporting XIVE, the resulting interrupt modes on host
kernels with XIVE KVM support are the following:

==============  =============  =============  ================
ic-mode                            kernel_irqchip
--------------  ----------------------------------------------
/               allowed        off            on
                (default)
==============  =============  =============  ================
dual (default)  XIVE KVM       XIVE emul.     XIVE KVM
xive            XIVE KVM       XIVE emul.     XIVE KVM
xics            XICS KVM       XICS emul.     XICS KVM
==============  =============  =============  ================

For legacy guest OSes without XIVE support, the resulting interrupt
modes are the following:

==============  =============  =============  ================
ic-mode                            kernel_irqchip
--------------  ----------------------------------------------
/               allowed        off            on
                (default)
==============  =============  =============  ================
dual (default)  XICS KVM       XICS emul.     XICS KVM
xive            QEMU error(3)  QEMU error(3)  QEMU error(3)
xics            XICS KVM       XICS emul.     XICS KVM
==============  =============  =============  ================

(3) QEMU fails at CAS with ``Guest requested unavailable interrupt
    mode (XICS), either don't set the ic-mode machine property or try
    ic-mode=xics or ic-mode=dual``


No XIVE support in KVM
~~~~~~~~~~~~~~~~~~~~~~

For guest OSes supporting XIVE, the resulting interrupt modes on host
kernels without XIVE KVM support are the following:

==============  =============  =============  ================
ic-mode                            kernel_irqchip
--------------  ----------------------------------------------
/               allowed        off            on
                (default)
==============  =============  =============  ================
dual (default)  XIVE emul.(1)  XIVE emul.     QEMU error (2)
xive            XIVE emul.(1)  XIVE emul.     QEMU error (2)
xics            XICS KVM       XICS emul.     XICS KVM
==============  =============  =============  ================


(1) QEMU warns with ``warning: kernel_irqchip requested but unavailable:
    IRQ_XIVE capability must be present for KVM``
    In some cases (old host kernels or KVM nested guests), one may hit a
    QEMU/KVM incompatibility due to device destruction in reset. QEMU fails
    with ``KVM is incompatible with ic-mode=dual,kernel-irqchip=on``
(2) QEMU fails with ``kernel_irqchip requested but unavailable:
    IRQ_XIVE capability must be present for KVM``


For legacy guest OSes without XIVE support, the resulting interrupt
modes are the following:

==============  =============  =============  ================
ic-mode                            kernel_irqchip
--------------  ----------------------------------------------
/               allowed        off            on
                (default)
==============  =============  =============  ================
dual (default)  QEMU error(4)  XICS emul.     QEMU error(4)
xive            QEMU error(3)  QEMU error(3)  QEMU error(3)
xics            XICS KVM       XICS emul.     XICS KVM
==============  =============  =============  ================

(3) QEMU fails at CAS with ``Guest requested unavailable interrupt
    mode (XICS), either don't set the ic-mode machine property or try
    ic-mode=xics or ic-mode=dual``
(4) QEMU/KVM incompatibility due to device destruction in reset. QEMU fails
    with ``KVM is incompatible with ic-mode=dual,kernel-irqchip=on``
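
For quick reference, the four tables above can be transcribed into a
single lookup structure. The sketch below is only a restatement of the
tables, not QEMU code; the names ``TABLES`` and ``resulting_mode`` are
hypothetical, and the cell strings are copied verbatim, including the
footnote markers.

```python
# Column order follows the tables: kernel_irqchip = allowed (default), off, on.
COLUMNS = ("allowed", "off", "on")

XICS_ROW = ("XICS KVM", "XICS emul.", "XICS KVM")
CAS_ERROR_ROW = ("QEMU error(3)", "QEMU error(3)", "QEMU error(3)")

# Keys: (host kernel has XIVE KVM support, guest OS supports XIVE).
TABLES = {
    (True, True): {
        "dual": ("XIVE KVM", "XIVE emul.", "XIVE KVM"),
        "xive": ("XIVE KVM", "XIVE emul.", "XIVE KVM"),
        "xics": XICS_ROW,
    },
    (True, False): {
        "dual": XICS_ROW,
        "xive": CAS_ERROR_ROW,
        "xics": XICS_ROW,
    },
    (False, True): {
        "dual": ("XIVE emul.(1)", "XIVE emul.", "QEMU error (2)"),
        "xive": ("XIVE emul.(1)", "XIVE emul.", "QEMU error (2)"),
        "xics": XICS_ROW,
    },
    (False, False): {
        "dual": ("QEMU error(4)", "XICS emul.", "QEMU error(4)"),
        "xive": CAS_ERROR_ROW,
        "xics": XICS_ROW,
    },
}


def resulting_mode(kvm_has_xive, guest_has_xive,
                   ic_mode="dual", kernel_irqchip="allowed"):
    """Look up the table cell for the given host, guest and machine options."""
    row = TABLES[(kvm_has_xive, guest_has_xive)][ic_mode]
    return row[COLUMNS.index(kernel_irqchip)]
```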


XIVE Device tree properties
---------------------------

The properties for the PAPR interrupt controller node when the *XIVE
native exploitation mode* is selected should contain:

- ``device_type``

  value should be "power-ivpe".

- ``compatible``

  value should be "ibm,power-ivpe".

- ``reg``

  contains the base address and size of the thread interrupt
  management areas (TIMA), for the User level and for the Guest OS
  level. Only the Guest OS level is taken into account today.

- ``ibm,xive-eq-sizes``

  the size of the event queues. One cell per size supported, contains
  log2 of size, in ascending order.

- ``ibm,xive-lisn-ranges``

  the IRQ interrupt number ranges assigned to the guest for the IPIs.

The root node also exports:

- ``ibm,plat-res-int-priorities``

  contains a list of priorities that the hypervisor has reserved for
  its own use.
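
Since ``ibm,xive-eq-sizes`` stores log2 values, a consumer has to shift
to recover the actual sizes. The sketch below shows that conversion. The
cell values in the example are illustrative and not taken from this
document, and both the byte unit and the 4-byte event queue entry are
assumptions made for the illustration.

```python
EQ_ENTRY_BYTES = 4  # assumption: one event queue entry is a 32-bit word


def eq_sizes_in_bytes(cells):
    """Convert ``ibm,xive-eq-sizes`` cells (log2 of size) to sizes in bytes."""
    cells = list(cells)
    if cells != sorted(cells):
        raise ValueError("cells are expected in ascending order")
    return [1 << shift for shift in cells]


def eq_entries(shift):
    """Number of entries in a queue of 2**shift bytes (see assumption above)."""
    return (1 << shift) // EQ_ENTRY_BYTES


# Hypothetical property contents, for illustration only.
print(eq_sizes_in_bytes([12, 16, 21, 24]))  # [4096, 65536, 2097152, 16777216]
print(eq_entries(16))                       # 16384
```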

IRQ number space
----------------

The IRQ number space of the ``pseries`` machine is 8K wide and is the
same for both interrupt modes. The different ranges are defined as
follows:

- ``0x0000 .. 0x0FFF`` 4K CPU IPIs (only used under XIVE)
- ``0x1000 .. 0x1000`` 1 EPOW
- ``0x1001 .. 0x1001`` 1 HOTPLUG
- ``0x1002 .. 0x10FF`` unused
- ``0x1100 .. 0x11FF`` 256 VIO devices
- ``0x1200 .. 0x127F`` 32x4 LSIs for PHB devices
- ``0x1280 .. 0x12FF`` unused
- ``0x1300 .. 0x1FFF`` PHB MSIs (dynamically allocated)
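
As an illustration of the layout above, the ranges can be written as a
table and used to classify an IRQ number. This is a sketch for reading
the layout, not QEMU code; ``IRQ_RANGES`` and ``classify_irq`` are
hypothetical names.

```python
# (first, last, description), transcribed from the list above.
IRQ_RANGES = [
    (0x0000, 0x0FFF, "CPU IPI"),
    (0x1000, 0x1000, "EPOW"),
    (0x1001, 0x1001, "HOTPLUG"),
    (0x1002, 0x10FF, "unused"),
    (0x1100, 0x11FF, "VIO device"),
    (0x1200, 0x127F, "PHB LSI"),
    (0x1280, 0x12FF, "unused"),
    (0x1300, 0x1FFF, "PHB MSI"),
]


def classify_irq(irq):
    """Return the description of the range that contains ``irq``."""
    for first, last, name in IRQ_RANGES:
        if first <= irq <= last:
            return name
    raise ValueError(f"IRQ number {irq:#x} is outside the 8K number space")


# The ranges are contiguous and cover the whole 8K space.
assert sum(last - first + 1 for first, last, _ in IRQ_RANGES) == 0x2000
```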

Monitoring XIVE
---------------

The state of the XIVE interrupt controller can be queried through the
monitor command ``info pic``. The output comes in two parts.

First, the state of the thread interrupt context registers is dumped
for each CPU:

::

   (qemu) info pic
   CPU[0000]:   QW   NSR CPPR IPB LSMFB ACK# INC AGE PIPR  W2
   CPU[0000]: USER    00   00  00    00   00  00  00   00  00000000
   CPU[0000]:   OS    00   ff  00    00   ff  00  ff   ff  80000400
   CPU[0000]: POOL    00   00  00    00   00  00  00   00  00000000
   CPU[0000]: PHYS    00   00  00    00   00  00  00   ff  00000000
   ...

In the case of a ``pseries`` machine, QEMU acts as the hypervisor and only
the O/S and USER register rings make sense. ``W2`` contains the vCPU CAM
line which is set to the VP identifier.

Then comes the routing information which aggregates the EAS and the
END configuration:

::

   ...
   LISN         PQ    EISN     CPU/PRIO EQ
   00000000 MSI --    00000010   0/6    380/16384 @1fe3e0000 ^1 [ 80000010 ... ]
   00000001 MSI --    00000010   1/6    305/16384 @1fc230000 ^1 [ 80000010 ... ]
   00000002 MSI --    00000010   2/6    220/16384 @1fc2f0000 ^1 [ 80000010 ... ]
   00000003 MSI --    00000010   3/6    201/16384 @1fc390000 ^1 [ 80000010 ... ]
   00000004 MSI -Q  M 00000000
   00000005 MSI -Q  M 00000000
   00000006 MSI -Q  M 00000000
   00000007 MSI -Q  M 00000000
   00001000 MSI --    00000012   0/6    380/16384 @1fe3e0000 ^1 [ 80000010 ... ]
   00001001 MSI --    00000013   0/6    380/16384 @1fe3e0000 ^1 [ 80000010 ... ]
   00001100 MSI --    00000100   1/6    305/16384 @1fc230000 ^1 [ 80000010 ... ]
   00001101 MSI -Q  M 00000000
   00001200 LSI -Q  M 00000000
   00001201 LSI -Q  M 00000000
   00001202 LSI -Q  M 00000000
   00001203 LSI -Q  M 00000000
   00001300 MSI --    00000102   1/6    305/16384 @1fc230000 ^1 [ 80000010 ... ]
   00001301 MSI --    00000103   2/6    220/16384 @1fc2f0000 ^1 [ 80000010 ... ]
   00001302 MSI --    00000104   3/6    201/16384 @1fc390000 ^1 [ 80000010 ... ]

The source information and configuration:

- The ``LISN`` column outputs the interrupt number of the source in
  range ``[ 0x0 ... 0x1FFF ]`` and its type: ``MSI`` or ``LSI``
- The ``PQ`` column reflects the state of the PQ bits of the source:

  - ``--`` source is ready to take events
  - ``P-`` an event was sent and an EOI is PENDING
  - ``PQ`` an event was QUEUED
  - ``-Q`` source is OFF

  An ``M`` indicates that the source is *MASKED* at the EAS level.

The targeting configuration:

- The ``EISN`` column is the event data that will be queued in the event
  queue of the O/S.
- The ``CPU/PRIO`` column is the tuple defining the CPU number and
  priority queue serving the source.
- The ``EQ`` column outputs:

  - the current index of the event queue / the max number of entries
  - the O/S event queue address
  - the toggle bit
  - the last entries that were pushed in the event queue.
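
The column descriptions above can be turned into a small parser for the
routing lines. The sketch below is written against the sample output
shown in this document only; the regular expression, the helper name
``parse_routing_line`` and the assumption that the CPU number and
priority are printed in decimal are not taken from QEMU. The last
entries listed between brackets are ignored.

```python
import re

# Meaning of the PQ column, as described above.
PQ_STATES = {
    "--": "ready",
    "P-": "pending",
    "PQ": "queued",
    "-Q": "off",
}

LINE_RE = re.compile(
    r"^\s*(?P<lisn>[0-9a-f]{8})\s+(?P<type>MSI|LSI)\s+(?P<pq>[P-][Q-])\s+"
    r"(?P<masked>M)?\s*(?P<eisn>[0-9a-f]{8})"
    r"(?:\s+(?P<cpu>\d+)/(?P<prio>\d+)"
    r"\s+(?P<index>\d+)/(?P<entries>\d+)"
    r"\s+@(?P<addr>[0-9a-f]+)\s+\^(?P<toggle>[01]))?"
)


def parse_routing_line(line):
    """Split one routing line of ``info pic`` into named fields."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized routing line: {line!r}")
    entry = {
        "lisn": int(m["lisn"], 16),
        "type": m["type"],
        "state": PQ_STATES[m["pq"]],
        "masked": m["masked"] is not None,
        "eisn": int(m["eisn"], 16),
        "target": None,  # stays None when no CPU/PRIO and EQ columns are printed
    }
    if m["cpu"] is not None:
        entry["target"] = {
            "cpu": int(m["cpu"]),
            "prio": int(m["prio"]),
            "eq_index": int(m["index"]),
            "eq_entries": int(m["entries"]),
            "eq_addr": int(m["addr"], 16),
            "toggle": int(m["toggle"]),
        }
    return entry
```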