APEI tables generating and CPER record
======================================

..
   Copyright (c) 2020 HUAWEI TECHNOLOGIES CO., LTD.

   This work is licensed under the terms of the GNU GPL, version 2 or later.
   See the COPYING file in the top-level directory.

Design Details
--------------

::

         etc/acpi/tables                           etc/hardware_errors
      ====================                   ===============================
  + +--------------------------+            +----------------------------+
  | | HEST                     | +--------->|    error_block_address1    |------+
  | +--------------------------+ |          +----------------------------+      |
  | | GHES1                    | | +------->|    error_block_address2    |------+-+
  | +--------------------------+ | |        +----------------------------+      | |
  | | .................        | | |        |      ..............        |      | |
  | | error_status_address-----+-+ |        -----------------------------+      | |
  | | .................        |   |   +--->|    error_block_addressN    |------+-+---+
  | | read_ack_register--------+-+ |   |    +----------------------------+      | |   |
  | | read_ack_preserve        | +-+---+--->|     read_ack_register1     |      | |   |
  | | read_ack_write           |   |   |    +----------------------------+      | |   |
  + +--------------------------+   | +-+--->|     read_ack_register2     |      | |   |
  | | GHES2                    |   | | |    +----------------------------+      | |   |
  + +--------------------------+   | | |    |       .............        |      | |   |
  | | .................        |   | | |    +----------------------------+      | |   |
  | | error_status_address-----+---+ | | +->|     read_ack_registerN     |      | |   |
  | | .................        |     | | |  +----------------------------+      | |   |
  | | read_ack_register--------+-----+ | |  |Generic Error Status Block 1|<-----+ |   |
  | | read_ack_preserve        |       | |  |-+------------------------+-+        |   |
  | | read_ack_write           |       | |  | |          CPER          | |        |   |
  + +--------------------------|       | |  | |          CPER          | |        |   |
  | | ...............          |       | |  | |          ....          | |        |   |
  + +--------------------------+       | |  | |          CPER          | |        |   |
  | | GHESN                    |       | |  |-+------------------------+-|        |   |
  + +--------------------------+       | |  |Generic Error Status Block 2|<-------+   |
  | | .................        |       | |  |-+------------------------+-+            |
  | | error_status_address-----+-------+ |  | |           CPER         | |            |
  | | .................        |         |  | |           CPER         | |            |
  | | read_ack_register--------+---------+  | |           ....         | |            |
  | | read_ack_preserve        |            | |           CPER         | |            |
  | | read_ack_write           |            +-+------------------------+-+            |
  + +--------------------------+            |         ..........         |            |
                                            |----------------------------+            |
                                            |Generic Error Status Block N |<----------+
                                            |-+-------------------------+-+
                                            | |          CPER           | |
                                            | |          CPER           | |
                                            | |          ....           | |
                                            | |          CPER           | |
                                            +-+-------------------------+-+


(1) QEMU generates the ACPI HEST table and places it in the existing
    "etc/acpi/tables" fw_cfg blob. Each error source may use a different
    notification type.

(2) QEMU introduces and populates a new fw_cfg blob called
    "etc/hardware_errors". This blob contains an address registers table
    and an Error Status Data Block table.

(3) The address registers table contains N Error Block Address entries
    and N Read Ack Register entries. Each entry is 8 bytes in size.
    The Error Status Data Block table contains N Error Status Data Block
    entries. The size of each entry is defined in the source code as
    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 1024 bytes). The total size
    of the "etc/hardware_errors" fw_cfg blob is
    (N * 8 * 2 + N * ACPI_GHES_MAX_RAW_DATA_LENGTH) bytes.
    N is the number of kinds of hardware error sources.

(4) QEMU generates the ACPI linker/loader script for the firmware. The
    firmware pre-allocates memory for the "etc/acpi/tables" and
    "etc/hardware_errors" blobs and copies their contents there.

(5) QEMU generates N ADD_POINTER commands, which patch the
    "error_status_address" fields of the HEST table with pointers to the
    corresponding "error_block_address" entries of the address registers
    table in the "etc/hardware_errors" blob.

(6) QEMU generates N ADD_POINTER commands, which patch the
    "read_ack_register" fields of the HEST table with pointers to the
    corresponding "read_ack_register" entries in the "etc/hardware_errors"
    blob.

(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
    the "error_block_address" fields with pointers to the respective
    "Error Status Data Block" entries in the "etc/hardware_errors" blob.

(8) QEMU defines a third, write-only fw_cfg blob called
    "etc/hardware_errors_addr". Through that blob, the firmware can report
    the guest-side allocation address back to QEMU. The
    "etc/hardware_errors_addr" blob contains an 8-byte entry. QEMU generates
    a single WRITE_POINTER command for the firmware, and the firmware writes
    the start address of the "etc/hardware_errors" blob to the fw_cfg file
    "etc/hardware_errors_addr".

(9) When QEMU receives a SIGBUS from the kernel, it writes a CPER record into
    the corresponding "Error Status Data Block" in guest memory and then
    injects a platform-specific interrupt (a Synchronous External Abort on the
    arm/virt machine) to notify the guest.

(10) This notification (in virtual hardware) is handled by the guest kernel.
     On receiving it, the guest APEI driver can read the CPER error record
     and take appropriate action.

(11) kvm_arch_on_sigbus_vcpu() uses source_id as an index into
     "etc/hardware_errors" to find the "Error Status Data Block" entry
     corresponding to the error source. Supported source_id values should
     therefore be assigned here and not changed afterwards, so that the error
     is written into the expected "Error Status Data Block" even if the guest
     was migrated to a newer QEMU.
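
The layout of the "etc/hardware_errors" blob described in items (2) and (3)
can be sketched in a few lines of Python. This is only an illustration, not
QEMU code: the names ``blob_size``, ``source_layout`` and ``SourceLayout`` are
hypothetical, the ordering of the three regions is an assumption read off the
diagram above, and the 1024-byte block size is the value quoted in item (3).

```python
from dataclasses import dataclass

ADDRESS_ENTRY_SIZE = 8                 # each address-table entry is 8 bytes
ACPI_GHES_MAX_RAW_DATA_LENGTH = 1024   # value quoted in item (3)


@dataclass(frozen=True)
class SourceLayout:
    """Blob-relative offsets for one error source (hypothetical helper)."""
    error_block_address: int  # offset of the Error Block Address entry
    read_ack_register: int    # offset of the Read Ack Register entry
    status_block: int         # offset of the Error Status Data Block


def blob_size(n):
    """Total blob size for n error sources, per the formula in item (3)."""
    return n * ADDRESS_ENTRY_SIZE * 2 + n * ACPI_GHES_MAX_RAW_DATA_LENGTH


def source_layout(index, n):
    """Offsets for source `index`, assuming the region order of the diagram:
    N error block addresses, then N read ack registers, then N status blocks.
    """
    if not 0 <= index < n:
        raise IndexError("error source index out of range")
    read_ack_start = n * ADDRESS_ENTRY_SIZE
    blocks_start = 2 * n * ADDRESS_ENTRY_SIZE
    return SourceLayout(
        error_block_address=index * ADDRESS_ENTRY_SIZE,
        read_ack_register=read_ack_start + index * ADDRESS_ENTRY_SIZE,
        status_block=blocks_start + index * ACPI_GHES_MAX_RAW_DATA_LENGTH,
    )
```

With N = 2, for example, the blob is 2 * 8 * 2 + 2 * 1024 = 2080 bytes, and
under the assumed ordering the second source's Error Status Data Block starts
at offset 1056.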
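
The pointer patching in item (7), the address report in item (8) and the
lookup in item (11) can be modelled with a small Python sketch. It is a
simplified model under stated assumptions, not the firmware or QEMU
implementation: ``build_blob``, ``add_pointer``, ``firmware_load`` and
``status_block_address`` are hypothetical helpers, fields are assumed to be
8-byte little-endian values, and ADD_POINTER is modelled as adding the load
address of the pointed-to blob to a blob-relative offset already stored in
the field.

```python
import struct

ENTRY_SIZE = 8      # size of one address-table entry, from item (3)
BLOCK_SIZE = 1024   # ACPI_GHES_MAX_RAW_DATA_LENGTH as quoted in item (3)


def build_blob(n):
    """Build an "etc/hardware_errors"-like blob for n sources.

    Assumption: each error_block_address entry initially holds the
    blob-relative offset of its Error Status Data Block.
    """
    blob = bytearray(n * ENTRY_SIZE * 2 + n * BLOCK_SIZE)
    blocks_start = n * ENTRY_SIZE * 2
    for i in range(n):
        struct.pack_into("<Q", blob, i * ENTRY_SIZE,
                         blocks_start + i * BLOCK_SIZE)
    return blob


def add_pointer(dest, offset, src_base):
    """Model of ADD_POINTER: add src_base to the 8-byte value at offset."""
    (value,) = struct.unpack_from("<Q", dest, offset)
    struct.pack_into("<Q", dest, offset, value + src_base)


def firmware_load(blob, base, n):
    """Copy the blob (item 4), patch the n error_block_address entries
    (item 7), and return the base address that WRITE_POINTER would report
    through "etc/hardware_errors_addr" (item 8)."""
    loaded = bytearray(blob)
    for i in range(n):
        add_pointer(loaded, i * ENTRY_SIZE, base)
    return loaded, base


def status_block_address(loaded, source_id):
    """Item (11): use source_id as an index into the address table."""
    (addr,) = struct.unpack_from("<Q", loaded, source_id * ENTRY_SIZE)
    return addr
```

Because the lookup is a plain index into the address table, renumbering
source_id values would silently select a different Error Status Data Block,
which is why item (11) asks for them to stay fixed across QEMU versions.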