What:		/sys/bus/edac/devices/<dev-name>/mem_repairX
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		The sysfs EDAC bus devices /<dev-name>/mem_repairX subdirectory
		pertains to the control of memory media repair features, such
		as PPR (Post Package Repair) and memory sparing, where the
		<dev-name> directory corresponds to a device registered with
		the EDAC device driver for the memory repair features.

		Post Package Repair is a maintenance operation that requests
		the memory device to perform a repair operation on its media.
		It is a memory self-healing feature that fixes a failing memory
		location by replacing it with a spare row in a DRAM device. For
		example, a CXL memory device with DRAM components that support
		PPR features may implement PPR maintenance operations. DRAM
		components may support two types of PPR functions: hard PPR,
		for a permanent row repair, and soft PPR, for a temporary row
		repair. Soft PPR may be much faster than hard PPR, but the
		repair is lost with a power cycle.

		The sysfs attribute nodes for a repair feature are only
		present if the parent driver has implemented the corresponding
		attr callback function and provided the necessary operations
		to the EDAC device driver during registration.

		In some states of system configuration (e.g. before address
		decoders have been configured), memory devices (e.g. CXL)
		may not have an active mapping in the main host physical
		address map. As such, the memory to repair must be
		identified by a device specific physical addressing scheme
		using a device physical address (DPA). The DPA and other
		control attributes to use will be presented in related error
		records.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair_type
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RO) Memory repair type, for example post package repair or
		memory sparing.
		Valid values are:

		- ppr - Post package repair.

		- cacheline-sparing

		- row-sparing

		- bank-sparing

		- rank-sparing

		- All other values are reserved.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Get/Set the current persist repair mode for a repair
		function. The persist repair mode supported by the device,
		which depends on the memory repair function, is either
		temporary (lost with a power cycle) or permanent. Valid
		values are:

		- 0 - Soft memory repair (temporary repair).

		- 1 - Hard memory repair (permanent repair).

		- All other values are reserved.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair_safe_when_in_use
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RO) True if memory media is accessible and data is retained
		during the memory repair operation.
		Otherwise, data may not be retained and memory requests may
		not be correctly processed during a repair operation. In that
		case the repair operation cannot be executed at runtime and
		the memory must be taken offline.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/hpa
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Host Physical Address (HPA) of the memory to repair.
		The HPA to use will be provided in related error records.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/dpa
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Device Physical Address (DPA) of the memory to repair.
		The specific DPA to use will be provided in related error
		records.

		In some states of system configuration (e.g. before address
		decoders have been configured), memory devices (e.g.
		CXL) may not have an active mapping in the main host physical
		address map. As such, the memory to repair must be identified
		by a device specific physical addressing scheme using a DPA.
		The DPA to use will be presented in related error records.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/nibble_mask
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Nibble mask of the memory to repair.
		The nibble mask identifies one or more nibbles in error on
		the memory bus that produced the error event. Nibble mask
		bit 0 shall be set if nibble 0 on the memory bus produced the
		event, etc. For example, for CXL PPR and sparing, a nibble
		mask bit set to 1 indicates a request to perform the repair
		operation in the specific device. All nibble mask bits set
		to 1 indicate a request to perform the operation in all
		devices. For CXL memory repair, the specific value of the
		nibble mask to use will be provided in related error records.
		For more details, see the nibble mask field in CXL spec ver
		3.1, section 8.2.9.7.1.2 Table 8-103 soft PPR, section
		8.2.9.7.1.3 Table 8-104 hard PPR, and section 8.2.9.7.1.4
		Table 8-105 memory sparing.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/min_hpa
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/max_hpa
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/min_dpa
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/max_dpa
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) The supported range of memory addresses for the memory
		to be repaired. The memory device may report the supported
		range of attributes to use, which depends on the memory
		device and the portion of memory to repair.
		Userspace may receive the specific values of the attributes
		to use for a repair operation from the memory device via
		related error records and trace events, for example CXL DRAM
		and CXL general media error records in CXL memory devices.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/bank_group
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/bank
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/rank
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/row
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/column
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/channel
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/sub_channel
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) The control attributes for the memory to be repaired.
		The specific values of the attributes to use depend on the
		portion of memory to repair and will be reported to the host
		in related error records and be available to userspace
		in trace events, such as CXL DRAM and CXL general media
		error records of CXL memory devices.

		When read back, these attributes return the current value
		of the memory requested to be repaired.

		bank_group - The bank group of the memory to repair.

		bank - The bank number of the memory to repair.

		rank - The rank of the memory to repair. Rank is defined as a
		set of memory devices on a channel that together execute a
		transaction.

		row - The row number of the memory to repair.

		column - The column number of the memory to repair.

		channel - The channel of the memory to repair. Channel is
		defined as an interface that can be independently accessed
		for a transaction.

		sub_channel - The subchannel of the memory to repair.

		The requirement to set these attributes varies based on the
		repair function.
		The attributes in sysfs are not present unless required for a
		repair function.

		For example, for the CXL spec ver 3.1, Section 8.2.9.7.1.2
		Table 8-103 soft PPR and Section 8.2.9.7.1.3 Table 8-104 hard
		PPR operations, these attributes do not need to be set. For
		CXL spec ver 3.1, Section 8.2.9.7.1.4 Table 8-105 memory
		sparing, these attributes must be set based on the memory
		sparing granularity.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(WO) Issue the memory repair operation for the specified
		memory repair attributes. The operation may fail if resources
		are insufficient based on the requirements of the memory
		device and repair function.

		- 1 - Issue the repair operation.

		- All other values are reserved.
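
		The overall flow described by these attributes (set the
		repair attributes taken from an error record, then write 1
		to repair) can be sketched from userspace as follows. This
		is a minimal illustration under stated assumptions, not part
		of the ABI: the helper name request_repair, the device name
		cxl_mem0 and the example values in the comment are
		hypothetical, and real values must come from the related
		error records.

```python
from pathlib import Path


def request_repair(repair_dir, attrs):
    """Write repair attributes, then trigger the repair operation.

    repair_dir: path of a mem_repairX directory (hypothetical example:
        /sys/bus/edac/devices/cxl_mem0/mem_repair0).
    attrs: mapping of attribute name to the string to write, e.g.
        {"dpa": "0x40000000", "nibble_mask": "0xf"}.
    Returns the list of attribute names that were written.
    """
    repair_dir = Path(repair_dir)
    written = []
    for name, value in attrs.items():
        node = repair_dir / name
        # Attribute nodes are only present when the repair function
        # requires them, so a missing node is reported to the caller.
        if not node.exists():
            raise FileNotFoundError(f"attribute not present: {node}")
        node.write_text(f"{value}\n")
        written.append(name)
    # Writing 1 issues the repair; all other values are reserved.
    (repair_dir / "repair").write_text("1\n")
    return written


# Example call (assumed device name and values, requires privileges):
# request_repair("/sys/bus/edac/devices/cxl_mem0/mem_repair0",
#                {"dpa": "0x40000000", "nibble_mask": "0xf"})
```

		A caller would typically also read repair_safe_when_in_use
		first and take the memory offline when a repair cannot be
		performed at runtime.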