1 .. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
4 EDAC Memory Repair Control
7 Copyright (c) 2024-2025 HiSilicon Limited.
11 Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
15 - Written for: 6.15
18 ------------
20 Some memory devices support repair operations to address issues in their
21 memory media. Post Package Repair (PPR) and memory sparing are examples of
27 Post Package Repair is a maintenance operation which requests the memory
28 device to perform a repair operation on its media. It is a memory self-healing
29 feature that fixes a failing memory location by replacing it with a spare row
32 For example, a CXL memory device with DRAM components that support PPR
36 - hard PPR, for a permanent row repair, and
37 - soft PPR, for a temporary row repair.
42 The data may not be retained and memory requests may not be correctly
46 For example, for CXL memory devices, see CXL spec rev 3.1 [1]_ sections
50 Memory Sparing
53 Memory sparing is a repair function that replaces a portion of memory with
54 a portion of functional memory at a particular granularity. Memory
56 rank memory-sparing mode, one memory rank serves as a spare for other ranks on
59 The spare rank is held in reserve and not used as active memory until
61 available memory in the system.
63 After an error threshold is surpassed in a system protected by memory sparing,
64 the content of a failing rank of DIMMs is copied to the spare rank. The
66 active memory in place of the failed rank.
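
The rank-sparing behaviour described above can be modelled with a small sketch. It is a toy illustration only, not how any controller or driver implements sparing: the `Rank` and `SparedChannel` classes, the `threshold` value and the method names are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rank:
    """Toy model of a memory rank: stored data plus a corrected-error count."""
    data: List[int]
    corrected_errors: int = 0


@dataclass
class SparedChannel:
    """Hypothetical channel with active ranks and one reserved spare rank."""
    active: List[Rank]        # ranks currently used as system memory
    spare: Optional[Rank]     # held in reserve, not counted as available memory
    threshold: int = 3        # illustrative error threshold, not a real default

    def report_corrected_error(self, index: int) -> bool:
        """Count an error on active[index]; return True if sparing happened."""
        rank = self.active[index]
        rank.corrected_errors += 1
        if rank.corrected_errors <= self.threshold or self.spare is None:
            return False
        # Threshold surpassed: copy the failing rank's content to the spare,
        # then use the spare as active memory in place of the failed rank.
        self.spare.data = list(rank.data)
        self.active[index] = self.spare
        self.spare = None  # the spare is consumed by the swap
        return True


channel = SparedChannel(active=[Rank([1, 2]), Rank([3, 4])], spare=Rank([0, 0]))
events = [channel.report_corrected_error(1) for _ in range(4)]
```

In this model the first three errors stay at or below the threshold, and the fourth triggers the copy and swap, after which no spare remains.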
68 For example, CXL memory devices can support various subclasses for sparing
71 Cacheline sparing subclass refers to a sparing action that can replace a full
72 cacheline. Row sparing is provided as an alternative to PPR sparing functions
74 to be replaced. Rank sparing is defined as an operation in which an entire DDR
77 See CXL spec rev 3.1 [1]_ section 8.2.9.7.1.4 Memory Sparing Maintenance
80 .. [1] https://computeexpresslink.org/cxl-specification/
82 Use cases of generic memory repair features control
85 1. The soft PPR, hard PPR and memory-sparing features share similar control
87 repair control that is exposed to userspace and used by administrators,
90 2. When a CXL device detects an error in a memory component, it informs the
93 specifies the device physical address (DPA) and attributes of the memory
95 media or DRAM trace event to userspace, and userspace tools (e.g.
96 rasdaemon) initiate a repair maintenance operation in response to the
99 3. Userspace tools, such as rasdaemon, request a repair operation on a memory
100 region when a maintenance-needed flag is set, when an uncorrected memory
101 error or an excess of corrected memory errors above a threshold value is
102 reported, or when a corrected-errors-threshold-exceeded flag is set for that memory.
104 4. Multiple PPR/sparing instances may be present per memory device.
106 5. Drivers should enforce that live repair is safe. In systems where memory
107 mapping functions can change between boots, one approach to this is to log
108 memory errors seen on this boot against which to check live memory repair
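
Use cases 3 and 5 can be sketched as a trigger condition plus a per-boot error log. This is a minimal userspace-style model under stated assumptions, not driver or rasdaemon code: `ErrorRecord`, `should_request_repair`, `BootErrorLog` and all field names are hypothetical, and the threshold is an arbitrary example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorRecord:
    """Hypothetical summary of an error report for one memory region."""
    dpa: int                    # device physical address of the error
    maintenance_needed: bool    # maintenance-needed flag is set
    uncorrected: bool           # an uncorrected memory error was reported
    corrected_count: int        # corrected errors counted for this region
    threshold_exceeded: bool    # threshold-exceeded flag is set


def should_request_repair(rec, corrected_threshold):
    """Use case 3: any one of the listed conditions triggers a request."""
    return (
        rec.maintenance_needed
        or rec.uncorrected
        or rec.corrected_count > corrected_threshold
        or rec.threshold_exceeded
    )


class BootErrorLog:
    """Use case 5: remember addresses that reported errors during this boot."""

    def __init__(self):
        self._seen = set()

    def record(self, rec):
        self._seen.add(rec.dpa)

    def allows_live_repair(self, dpa):
        # Only addresses with an error logged on this boot are accepted,
        # because the address mapping may differ from a previous boot.
        return dpa in self._seen


log = BootErrorLog()
rec = ErrorRecord(dpa=0x1000, maintenance_needed=False, uncorrected=False,
                  corrected_count=12, threshold_exceeded=False)
log.record(rec)
wants_repair = should_request_repair(rec, corrected_threshold=10)
repair_ok = log.allows_live_repair(0x1000)    # logged on this boot
stale_ok = log.allows_live_repair(0x2000)     # never seen on this boot
```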
112 ---------------
114 The control attributes of a registered memory repair instance can be
115 accessed in the /sys/bus/edac/devices/<dev-name>/mem_repairX/
118 -----
121 `Documentation/ABI/testing/sysfs-edac-memory-repair`.
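
As a rough illustration of the directory layout mentioned above, the following sketch enumerates the mem_repairX instances under /sys/bus/edac/devices/ and lists whatever attribute files each one exposes. The function name `list_mem_repair_instances` and its `root` parameter are assumptions made for this example. The sketch deliberately does not assume any attribute names, which are defined in the ABI document.

```python
from pathlib import Path


def list_mem_repair_instances(root="/sys/bus/edac/devices"):
    """Map each <dev-name>/mem_repairX directory to its attribute file names."""
    instances = {}
    for inst in sorted(Path(root).glob("*/mem_repair*")):
        if inst.is_dir():
            key = f"{inst.parent.name}/{inst.name}"
            # Collect only regular files, i.e. the control attributes.
            instances[key] = sorted(p.name for p in inst.iterdir() if p.is_file())
    return instances
```

Calling `list_mem_repair_instances()` on a system without registered memory repair instances returns an empty dictionary. The `root` parameter exists only so the function can be exercised against a test directory.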