.. SPDX-License-Identifier: GPL-2.0

===============================================================
Intel(R) Dynamic Platform and Thermal Framework Sysfs Interface
===============================================================

:Copyright: © 2022 Intel Corporation

:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

Introduction
------------

Intel(R) Dynamic Platform and Thermal Framework (DPTF) is a platform
level hardware/software solution for power and thermal management.

As a container for multiple power/thermal technologies, DPTF provides
a coordinated approach for different policies to affect the hardware
state of a system.

Because it is a platform level framework, it has several components.
Some parts of the technology are implemented in the firmware and use
ACPI and PCI devices to expose various features for monitoring and
control. Linux has a set of kernel drivers that expose the hardware
interface to user space. This allows user space thermal solutions like
"Linux Thermal Daemon" to read platform specific thermal and power
tables to deliver adequate performance while keeping the system under
thermal limits.

DPTF ACPI Drivers interface
----------------------------

:file:`/sys/bus/platform/devices/<N>/uuids`, where <N>
=INT3400|INTC1040|INTC1041|INTC10A0

``available_uuids`` (RO)
    A set of UUID strings representing the available policies,
    which should be notified to the firmware when
    user space can support those policies.

    UUID strings:

    "42A441D6-AE6A-462b-A84B-4A8CE79027D3" : Passive 1

    "3A95C389-E4B8-4629-A526-C52C88626BAE" : Active

    "97C68AE7-15FA-499c-B8C9-5DA81D606E0A" : Critical

    "63BE270F-1C11-48FD-A6F7-3AF253FF3E2D" : Adaptive performance

    "5349962F-71E6-431D-9AE8-0A635B710AEE" : Emergency call

    "9E04115A-AE87-4D1C-9500-0F3E340BFE75" : Passive 2

    "F5A35014-C209-46A4-993A-EB56DE7530A1" : Power Boss

    "6ED722A7-9240-48A5-B479-31EEF723D7CF" : Virtual Sensor

    "16CAF1B7-DD38-40ED-B1C1-1B8A1913D531" : Cooling mode

    "BE84BABF-C4D4-403D-B495-3128FD44dAC1" : HDC

``current_uuid`` (RW)
    User space can write strings from available UUIDs, one at a
    time.

:file:`/sys/bus/platform/devices/<N>/`, where <N>
=INT3400|INTC1040|INTC1041|INTC10A0

``imok`` (WO)
    A user space daemon writes 1 to respond to a firmware event
    that requests a keep alive notification. User space receives a
    THERMAL_EVENT_KEEP_ALIVE kobject uevent notification when the
    firmware calls for user space to respond with the imok ACPI
    method.

``odvp*`` (RO)
    Firmware thermal status variable values. Thermal tables
    call for different processing based on these variable
    values.

``data_vault`` (RO)
    Binary thermal table. Refer to
    https://github.com/intel/thermal_daemon for decoding
    the thermal table.

``production_mode`` (RO)
    When different from zero, the manufacturer has locked the thermal
    configuration against further changes.

ACPI Thermal Relationship table interface
------------------------------------------

:file:`/dev/acpi_thermal_rel`

    This device provides an IOCTL interface to read standard ACPI
    thermal relationship tables via ACPI methods _TRT and _ART.
    These IOCTLs are defined in
    drivers/thermal/intel/int340x_thermal/acpi_thermal_rel.h

    IOCTLs:

    ACPI_THERMAL_GET_TRT_LEN: Get length of TRT table

    ACPI_THERMAL_GET_ART_LEN: Get length of ART table

    ACPI_THERMAL_GET_TRT_COUNT: Number of records in TRT table

    ACPI_THERMAL_GET_ART_COUNT: Number of records in ART table

    ACPI_THERMAL_GET_TRT: Read binary TRT table, length to read is
    provided via argument to ioctl().

    ACPI_THERMAL_GET_ART: Read binary ART table, length to read is
    provided via argument to ioctl().

DPTF ACPI Sensor drivers
-------------------------

DPTF Sensor drivers are presented as standard thermal sysfs thermal_zone.


DPTF ACPI Cooling drivers
--------------------------

DPTF cooling drivers are presented as standard thermal sysfs cooling_device.


DPTF Processor thermal PCI Driver interface
--------------------------------------------

:file:`/sys/bus/pci/devices/0000\:00\:04.0/power_limits/`

Refer to Documentation/power/powercap/powercap.rst for powercap
ABI.

``power_limit_0_max_uw`` (RO)
    Maximum powercap sysfs constraint_0_power_limit_uw for Intel RAPL

``power_limit_0_step_uw`` (RO)
    Power limit increment/decrement step for Intel RAPL constraint 0 power limit

``power_limit_0_min_uw`` (RO)
    Minimum powercap sysfs constraint_0_power_limit_uw for Intel RAPL

``power_limit_0_tmin_us`` (RO)
    Minimum powercap sysfs constraint_0_time_window_us for Intel RAPL

``power_limit_0_tmax_us`` (RO)
    Maximum powercap sysfs constraint_0_time_window_us for Intel RAPL

``power_limit_1_max_uw`` (RO)
    Maximum powercap sysfs constraint_1_power_limit_uw for Intel RAPL

``power_limit_1_step_uw`` (RO)
    Power limit increment/decrement step for Intel RAPL constraint 1 power limit

``power_limit_1_min_uw`` (RO)
    Minimum powercap sysfs constraint_1_power_limit_uw for Intel RAPL

``power_limit_1_tmin_us`` (RO)
    Minimum powercap sysfs constraint_1_time_window_us for Intel RAPL

``power_limit_1_tmax_us`` (RO)
    Maximum powercap sysfs constraint_1_time_window_us for Intel RAPL

``power_floor_status`` (RO)
    When set to 1, the power floor of the system in the current
    configuration has been reached. The system needs to be reconfigured
    to allow power to be reduced any further.

``power_floor_enable`` (RW)
    When set to 1, enable reading and notification of the power floor
    status. Notifications are triggered when the value of the
    power_floor_status attribute changes.

:file:`/sys/bus/pci/devices/0000\:00\:04.0/`

``tcc_offset_degree_celsius`` (RW)
    TCC offset from the critical temperature where hardware will throttle
    the CPU.

:file:`/sys/bus/pci/devices/0000\:00\:04.0/workload_request`

``workload_available_types`` (RO)
    Available workload types. User space can specify one of the workload
    types it is currently executing via workload_type. For example: idle,
    bursty, sustained etc.

``workload_type`` (RW)
    User space can specify any one of the available workload types using
    this interface.

:file:`/sys/bus/pci/devices/0000\:00\:04.0/ptc_0_control`
:file:`/sys/bus/pci/devices/0000\:00\:04.0/ptc_1_control`
:file:`/sys/bus/pci/devices/0000\:00\:04.0/ptc_2_control`

All these controls need admin privilege to update.

``enable`` (RW)
    1 for enable, 0 for disable. Shows the current enable status of the
    platform temperature control feature. User space can enable/disable
    hardware controls.

``temperature_target`` (RW)
    Update a new temperature target in millidegrees Celsius for hardware to
    use for the temperature control.

``thermal_tolerance`` (RW)
    This attribute ranges from 0 to 7, where 0 represents
    the most aggressive control to avoid any temperature overshoots, and
    7 represents a more graceful approach, favoring performance even at
    the expense of temperature overshoots.
    Note: This level may not scale linearly. For example, a value of 3 does
    not necessarily imply a 50% improvement in performance compared to a
    value of 0.

Given that this is platform temperature control, it is expected that a
single user-level manager owns and manages the controls. If multiple
user-level software applications attempt to write different targets, it
can lead to unexpected behavior.


DPTF Processor thermal RFIM interface
--------------------------------------------

The RFIM interface allows adjustment of FIVR (Fully Integrated Voltage
Regulator), DDR (Double Data Rate) and DLVR (Digital Linear Voltage Regulator)
frequencies to avoid RF interference with WiFi and 5G.

Switching voltage regulators (VR) generate radiated EMI or RFI at the
fundamental frequency and its harmonics. Some harmonics may interfere
with very sensitive wireless receivers such as Wi-Fi and cellular that
are integrated into host systems like notebook PCs.
One of the mitigation methods is to request that the SOC integrated VR (IVR)
switching frequency be changed by a small percentage, which shifts the
switching noise harmonic interference away from radio channels. OEMs or
ODMs can use the driver to control SOC IVR operation within the range
where it does not impact IVR performance.

Some products use DLVR instead of FIVR as the switching voltage regulator.
In this case, the attributes of DLVR must be adjusted instead of those of
FIVR.

While shifting the frequencies, additional clock noise can be introduced,
which is compensated for by adjusting the spread spectrum percentage. This
helps to reduce the clock noise to meet regulatory compliance. This spreading
percentage increases the bandwidth of signal transmission and hence reduces
the effects of interference, noise and signal fading.

DRAM devices of the DDR IO interface and their power plane can generate EMI
at the data rates. Similar to the IVR control mechanism, Intel offers a
mechanism by which DDR data rates can be changed if several conditions
are met: there is strong RFI interference because of DDR; CPU power
management has no other restriction in changing DDR data rates;
PC ODMs enable this feature (real time DDR RFI Mitigation referred to as
DDR-RFIM) for Wi-Fi from BIOS.


FIVR attributes

:file:`/sys/bus/pci/devices/0000\:00\:04.0/fivr/`

``vco_ref_code_lo`` (RW)
    The VCO reference code is an 11-bit field and controls the FIVR
    switching frequency. This is the 3-bit LSB field.

``vco_ref_code_hi`` (RW)
    The VCO reference code is an 11-bit field and controls the FIVR
    switching frequency. This is the 8-bit MSB field.

``spread_spectrum_pct`` (RW)
    Set the FIVR spread spectrum clocking percentage

``spread_spectrum_clk_enable`` (RW)
    Enable/disable the FIVR spread spectrum clocking feature

``rfi_vco_ref_code`` (RO)
    This field is a read only status register which reflects the
    current FIVR switching frequency

``fivr_fffc_rev`` (RO)
    This field indicates the revision of the FIVR HW.


DVFS attributes

:file:`/sys/bus/pci/devices/0000\:00\:04.0/dvfs/`

``rfi_restriction_run_busy`` (RW)
    Request the restriction of a specific DDR data rate by setting this
    value to 1. Self resets to 0 after the operation.

``rfi_restriction_err_code`` (RW)
    0: Request is accepted, 1: Feature disabled,
    2: The request restricts more points than is allowed

``rfi_restriction_data_rate_Delta`` (RW)
    Restricted DDR data rate for RFI protection: Lower Limit

``rfi_restriction_data_rate_Base`` (RW)
    Restricted DDR data rate for RFI protection: Upper Limit

``ddr_data_rate_point_0`` (RO)
    DDR data rate selection 1st point

``ddr_data_rate_point_1`` (RO)
    DDR data rate selection 2nd point

``ddr_data_rate_point_2`` (RO)
    DDR data rate selection 3rd point

``ddr_data_rate_point_3`` (RO)
    DDR data rate selection 4th point

``rfi_disable`` (RW)
    Disable DDR rate change feature

DLVR attributes

:file:`/sys/bus/pci/devices/0000\:00\:04.0/dlvr/`

``dlvr_hardware_rev`` (RO)
    DLVR hardware revision.

``dlvr_freq_mhz`` (RO)
    Current DLVR PLL frequency in MHz.

``dlvr_freq_select`` (RW)
    Sets DLVR PLL clock frequency. Once set, and enabled via
    dlvr_rfim_enable, the dlvr_freq_mhz will show the current
    DLVR PLL frequency.

``dlvr_pll_busy`` (RO)
    PLL can't accept frequency change when set.

``dlvr_rfim_enable`` (RW)
    0: Disable RF frequency hopping, 1: Enable RF frequency hopping.

``dlvr_spread_spectrum_pct`` (RW)
    Sets DLVR spread spectrum percent value.

``dlvr_control_mode`` (RW)
    Specifies how frequencies are spread using spread spectrum.
    0: Down spread,
    1: Spread in the Center.

``dlvr_control_lock`` (RW)
    1: future writes are ignored.

DPTF Power supply and Battery Interface
----------------------------------------

Refer to Documentation/ABI/testing/sysfs-platform-dptf

DPTF Fan Control
----------------------------------------

Refer to Documentation/admin-guide/acpi/fan_performance_states.rst

Workload Type Hints
----------------------------------------

The firmware in the Meteor Lake processor generation is capable of identifying
workload type and passing hints regarding it to the OS. A special sysfs
interface is provided to allow user space to obtain workload type hints from
the firmware and control the rate at which they are provided.

User space can poll attribute "workload_type_index" for the current hint or
can receive a notification whenever the value of this attribute is updated.

:file:`/sys/bus/pci/devices/0000:00:04.0/workload_hint/`
Segment 0, bus 0, device 4, function 0 is reserved for the processor thermal
device on all Intel client processors. So, the above path doesn't change
based on the processor generation.

``workload_hint_enable`` (RW)
    Enable firmware to send workload type hints to user space.

``notification_delay_ms`` (RW)
    Minimum delay in milliseconds before firmware will notify OS. This is
    for the rate control of notifications. This delay is between changing
    the workload type prediction in the firmware and notifying the OS about
    the change. The default delay is 1024 ms. The delay of 0 is invalid.
    The delay is rounded up to the nearest power of 2 to simplify firmware
    programming of the delay value.
    Reading the notification_delay_ms attribute shows the effective
    value used.

``workload_type_index`` (RO)
    Predicted workload type index. User space can get notification of
    change via existing sysfs attribute change notification mechanism.

    The supported index values and their meaning for the Meteor Lake
    processor generation are as follows:

    0 - Idle: System performs no tasks, power and idle residency are
    consistently low for long periods of time.

    1 - Battery Life: Power is relatively low, but the processor may
    still be actively performing a task, such as video playback for
    a long period of time.

    2 - Sustained: Power level that is relatively high for a long period
    of time, with very few to no periods of idleness, which will
    eventually exhaust RAPL Power Limit 1 and 2.

    3 - Bursty: Consumes a relatively constant average amount of power, but
    periods of relative idleness are interrupted by bursts of
    activity. The bursts are relatively short and the periods of
    relative idleness between them typically prevent RAPL Power
    Limit 1 from being exhausted.

    4 - Unknown: Can't classify.
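
As an illustration of the workload hint attributes described above, the
following is a minimal user space sketch, not a reference implementation.
The sysfs path and the index meanings are taken from this document. The
function names ``effective_delay_ms``, ``decode_workload_type`` and
``read_workload_type`` are hypothetical, and the rounding helper only models
the documented "rounded up to the nearest power of 2" behavior. It is an
assumption about the result, not a copy of the driver code.

```python
from pathlib import Path

# Path taken from the text above; present only on supported Intel hardware.
WORKLOAD_HINT_DIR = Path("/sys/bus/pci/devices/0000:00:04.0/workload_hint")

# Index meanings listed above for the Meteor Lake processor generation.
WORKLOAD_TYPES = {
    0: "Idle",
    1: "Battery Life",
    2: "Sustained",
    3: "Bursty",
    4: "Unknown",
}


def effective_delay_ms(requested_ms: int) -> int:
    """Model the documented rounding: round up to a power of 2.

    Hypothetical helper. A delay of 0 is documented as invalid.
    """
    if requested_ms <= 0:
        raise ValueError("delay must be greater than 0")
    # (n - 1).bit_length() is the exponent of the smallest power of 2 >= n.
    return 1 << (requested_ms - 1).bit_length()


def decode_workload_type(raw: str) -> str:
    """Translate the text read from workload_type_index into a name."""
    index = int(raw.strip())
    return WORKLOAD_TYPES.get(index, f"Unrecognized index {index}")


def read_workload_type(hint_dir: Path = WORKLOAD_HINT_DIR) -> str:
    """Read workload_type_index from sysfs and decode it."""
    return decode_workload_type((hint_dir / "workload_type_index").read_text())
```

On a system that exposes this interface, calling ``read_workload_type()``
after enabling hints through ``workload_hint_enable`` would return one of the
names above. ``effective_delay_ms(1000)`` shows the value that a read of
``notification_delay_ms`` is expected to report after 1000 is written, under
the rounding assumption stated above.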