================================
POWER9 XIVE interrupt controller
================================

The POWER9 processor comes with a new interrupt controller
architecture, called XIVE, for "eXternal Interrupt Virtualization
Engine".

Compared to the previous architecture, the main characteristics of
XIVE are to support a larger number of interrupt sources and to
deliver interrupts directly to virtual processors without hypervisor
assistance. This removes the context switches required for the
delivery process.


XIVE architecture
=================

The XIVE IC is composed of three sub-engines, each taking care of a
processing layer of external interrupts:

- Interrupt Virtualization Source Engine (IVSE), or Source Controller
  (SC). These are found in PCI PHBs, in the PSI host bridge
  controller, but also inside the main controller for the core IPIs
  and other sub-chips (NX, CAP, NPU) of the chip/processor. They are
  configured to feed the IVRE with events.
- Interrupt Virtualization Routing Engine (IVRE) or Virtualization
  Controller (VC). It handles event coalescing and performs interrupt
  routing by matching an event source number with an Event
  Notification Descriptor (END).
- Interrupt Virtualization Presentation Engine (IVPE) or Presentation
  Controller (PC). It maintains the interrupt context state of each
  thread and handles the delivery of the external interrupt to the
  thread.

::

              XIVE Interrupt Controller
              +------------------------------------+      IPIs
              | +---------+ +---------+ +--------+ |    +-------+
              | |IVRE     | |Common Q | |IVPE    |----> | CORES |
              | |     esb | |         | |        |----> |       |
              | |     eas | |  Bridge | |   tctx |----> |       |
              | |SC   end | |         | |    nvt | |    |       |
  +------+    | +---------+ +----+----+ +--------+ |    +-+-+-+-+
  | RAM  |    +------------------|-----------------+      | | |
  |      |                       |                        | | |
  |      |                       |                        | | |
  |      |  +--------------------v------------------------v-v-v--+    other
  |      <--+                     Power Bus                      +--> chips
  | esb  |  +---------+-----------------------+------------------+
  | eas  |            |                       |
  | end  |         +--|------+                |
  | nvt  |       +----+----+ |           +----+----+
  +------+       |IVSE     | |           |IVSE     |
                 |         | |           |         |
                 | PQ-bits | |           | PQ-bits |
                 | local   |-+           |  in VC  |
                 +---------+             +---------+
                    PCIe                 NX,NPU,CAPI


  PQ-bits: 2-bit source state machine (P:pending Q:queued)
  esb: Event State Buffer (Array of PQ bits in an IVSE)
  eas: Event Assignment Structure
  end: Event Notification Descriptor
  nvt: Notification Virtual Target
  tctx: Thread interrupt Context registers



XIVE internal tables
--------------------

Each of the sub-engines uses a set of tables to redirect interrupts
from event sources to CPU threads.

::

                                               +-------+
  User or O/S                                  |  EQ   |
      or                               +------>|entries|
  Hypervisor                           |       |  ..   |
    Memory                             |       +-------+
                                       |           ^
                                       |           |
            +-------------------------------------------------+
                                       |           |
  Hypervisor           +------+    +---+--+    +---+--+   +------+
    Memory             | ESB  |    | EAT  |    | ENDT |   | NVTT |
   (skiboot)           +----+-+    +----+-+    +----+-+   +------+
                         ^  |        ^  |        ^  |       ^
                         |  |        |  |        |  |       |
            +-------------------------------------------------+
                         |  |        |  |        |  |       |
                         |  |        |  |        |  |       |
                    +----|--|--------|--|--------|--|-+   +-|-----+     +------+
                    |    |  |        |  |        |  | |   | | tctx|     |Thread|
        IPI or   ---+    +  v        +  v        +  v |---| +  .. |----->      |
       HW events    |                                 |   |       |     |      |
                    |              IVRE               |   | IVPE  |     +------+
                    +---------------------------------+   +-------+


Each IVSE has a 2-bit state machine per source, P for pending and Q
for queued, which controls whether events from that source are let
through. The PQ bits are stored in an Event State Buffer (ESB) array
and can be controlled by MMIOs.

If the event is let through, the IVRE looks up the Event Assignment
Structure (EAS) table to find the Event Notification Descriptor (END)
configured for the source. Each Event Notification Descriptor defines
a notification path to a CPU and an in-memory Event Queue (EQ), in
which an EQ data entry is enqueued for the O/S to pull.
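
The event path described above, from the PQ filter to the EQ enqueue,
can be sketched in Python as follows. This is a minimal illustration
under stated assumptions, not the hardware behavior or the QEMU code:
the PQ encodings and transitions are not spelled out in this document
and are assumed here, and ``Eas``, ``End``, ``esb_trigger``,
``esb_eoi`` and ``route`` are hypothetical names.

```python
from dataclasses import dataclass, field

# Assumed PQ encodings: P is the high bit, Q the low bit.
PQ_RESET, PQ_OFF, PQ_PENDING, PQ_QUEUED = 0b00, 0b01, 0b10, 0b11


def esb_trigger(pq):
    """Return (new_pq, forward) when a source with state `pq` triggers."""
    if pq == PQ_RESET:
        return PQ_PENDING, True    # first event: let it through
    if pq == PQ_OFF:
        return PQ_OFF, False       # source disabled: drop the event
    return PQ_QUEUED, False        # already pending: coalesce


def esb_eoi(pq):
    """Return (new_pq, forward) when the O/S performs an EOI."""
    if pq == PQ_QUEUED:
        return PQ_PENDING, True    # replay the coalesced event
    if pq == PQ_OFF:
        return PQ_OFF, False
    return PQ_RESET, False         # nothing queued: back to reset


@dataclass
class Eas:                         # hypothetical EAS entry
    valid: bool
    end_idx: int
    eq_data: int


@dataclass
class End:                         # hypothetical END entry
    priority: int
    queue: list = field(default_factory=list)


def route(esb, eat, endt, srcno):
    """Trigger source `srcno`; return the notified END priority or None."""
    esb[srcno], forward = esb_trigger(esb[srcno])
    if not forward:
        return None
    eas = eat[srcno]
    if not eas.valid:
        return None
    end = endt[eas.end_idx]
    end.queue.append(eas.eq_data)  # EQ data that the O/S will pull
    return end.priority
```

A second trigger on a source that is still pending only moves its PQ
bits to the queued state; under the assumed transitions, the event is
replayed when the O/S performs the EOI.
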

The IVPE determines if a Notification Virtual Target (NVT) can handle
the event by scanning the thread contexts of the VCPUs dispatched on
the processor HW threads. It maintains the interrupt context state of
each thread in an NVT table.

XIVE thread interrupt context
-----------------------------

The XIVE presenter can generate four different exceptions to its
HW threads:

- hypervisor exception
- O/S exception
- Event-Based Branch (user level)
- msgsnd (doorbell)

Each exception has a state independent from the others, called a
Thread Interrupt Management context. This context is a set of
registers which lets the thread handle priority management and
interrupt acknowledgment, among other things. The most important
ones are:

- Pending Interrupt Priority Register (PIPR)
- Interrupt Pending Buffer (IPB)
- Current Processor Priority (CPPR)
- Notification Source Register (NSR)

TIMA
~~~~

The Thread Interrupt Management registers are accessible through a
specific MMIO region, called the Thread Interrupt Management Area
(TIMA), made of four aligned pages, each exposing a different view of
the registers. The first page (page address ending in ``0b00``) gives
access to the entire context and is reserved for the ring 0 view of
the physical thread context. The second (page address ending in
``0b01``) is for the hypervisor, ring 1 view. The third (page address
ending in ``0b10``) is for the operating system, ring 2 view. The
fourth (page address ending in ``0b11``) is for user level, ring 3
view.

Interrupt flow from an O/S perspective
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After an event data has been enqueued in the O/S Event Queue, the IVPE
raises the bit corresponding to the priority of the pending interrupt
in the Interrupt Pending Buffer (IPB) register to indicate that an
event is pending in one of the 8 priority queues. The Pending
Interrupt Priority Register (PIPR) is also updated using the IPB. This
register represents the priority of the most favored pending
notification.

The PIPR is then compared to the Current Processor Priority
Register (CPPR). If it is more favored (numerically less than), the
CPU interrupt line is raised and the EO bit of the Notification Source
Register (NSR) is updated to notify the presence of an exception for
the O/S. The O/S acknowledges the interrupt with a special load in the
Thread Interrupt Management Area.

The O/S handles the interrupt and, when done, performs an EOI using an
MMIO operation on the ESB management page of the associated source.

Overview of the QEMU models for XIVE
====================================

The XiveSource models the IVSE in general, internal and external. It
handles the source ESBs and the MMIO interface to control them.

The XiveNotifier is a small helper interface interconnecting the
XiveSource to the XiveRouter.

The XiveRouter is an abstract model acting as a combined IVRE and
IVPE. It routes event notifications using the EAS and END tables to
the IVPE sub-engine which does a CAM scan to find a CPU to deliver the
exception. Storage should be provided by the inheriting classes.
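
The split between routing logic in an abstract model and table storage
in the inheriting classes can be illustrated with a small Python
sketch. It is not the QEMU code, which is written in C: ``Router``,
``TableRouter``, ``get_eas`` and ``get_end`` are hypothetical names,
END entries are plain dictionaries, and the CAM scan is reduced to an
equality test on an assumed ``nvt`` identifier.

```python
from abc import ABC, abstractmethod


class Router(ABC):
    """Routing logic only; subclasses decide where the tables live."""

    @abstractmethod
    def get_eas(self, srcno):
        """Return (end_idx, eq_data) for a source, or None if unassigned."""

    @abstractmethod
    def get_end(self, end_idx):
        """Return the END entry (a dict here), or None if not configured."""

    def notify(self, srcno, thread_contexts):
        """Route one event; return the matching thread context or None."""
        eas = self.get_eas(srcno)
        if eas is None:
            return None
        end_idx, eq_data = eas
        end = self.get_end(end_idx)
        if end is None:
            return None
        end["queue"].append(eq_data)
        # Simplified CAM scan: compare the END target with each context.
        for tctx in thread_contexts:
            if tctx["nvt"] == end["nvt"]:
                return tctx
        return None


class TableRouter(Router):
    """Concrete router keeping its tables in plain Python containers."""

    def __init__(self, eat, endt):
        self.eat = eat      # dict: source number -> (end_idx, eq_data)
        self.endt = endt    # list of END dicts

    def get_eas(self, srcno):
        return self.eat.get(srcno)

    def get_end(self, end_idx):
        if 0 <= end_idx < len(self.endt):
            return self.endt[end_idx]
        return None
```

With this layout, a machine-specific router only has to implement the
two accessors; the routing sequence itself stays in the base class.
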

XiveEnDSource is a special source object. It exposes the END ESB MMIOs
of the Event Queues which are used for coalescing event notifications
and for escalation. It is not used in the field, only to sync the EQ
cache in OPAL.

Finally, the XiveTCTX contains the interrupt state context of a thread,
four sets of registers, one for each exception that can be delivered
to a CPU. These contexts are scanned by the IVPE to find a matching VP
when a notification is triggered. It also models the Thread Interrupt
Management Area (TIMA), which exposes the thread context registers to
the CPU for interrupt management.
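
The register handling of one such context, as described in the O/S
interrupt flow section, can be sketched as follows. This is a hedged
illustration and not the QEMU model: the IPB bit layout (priority 0 in
the most significant bit), the ``NSR_EO`` bit position, the ``0xFF``
"nothing pending" value and the side effects of the acknowledge load
are assumptions, and ``ThreadContext`` is a hypothetical name.

```python
NSR_EO = 0x80          # assumed position of the O/S exception bit
NO_PRIORITY = 0xFF     # assumed value meaning "nothing pending"


def priority_to_ipb(prio):
    """Assumed layout: priority 0 maps to the most significant IPB bit."""
    return 0x80 >> prio


def ipb_to_pipr(ipb):
    """Most favored (numerically lowest) pending priority in the IPB."""
    for prio in range(8):
        if ipb & priority_to_ipb(prio):
            return prio
    return NO_PRIORITY


class ThreadContext:
    """One exception context: IPB, PIPR, CPPR and NSR registers."""

    def __init__(self, cppr=NO_PRIORITY):
        self.ipb = 0
        self.pipr = NO_PRIORITY
        self.cppr = cppr
        self.nsr = 0
        self.irq_line = False

    def notify(self, prio):
        """Presenter side: record a pending event of priority `prio`."""
        self.ipb |= priority_to_ipb(prio)
        self.pipr = ipb_to_pipr(self.ipb)
        if self.pipr < self.cppr:      # more favored than current level
            self.nsr |= NSR_EO
            self.irq_line = True

    def acknowledge(self):
        """O/S side: model the special load; return the priority or None."""
        if not self.nsr & NSR_EO:
            return None
        prio = self.pipr
        self.cppr = prio               # raise the current priority level
        self.ipb &= ~priority_to_ipb(prio)
        self.pipr = ipb_to_pipr(self.ipb)
        self.nsr &= ~NSR_EO
        self.irq_line = False
        return prio
```

In this sketch, once priority 2 has been acknowledged, a pending
priority 5 event stays recorded in the IPB but does not raise the
interrupt line until the CPPR is made less favored again.
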