Lines Matching +full:kernel +full:- +full:policy

1 .. SPDX-License-Identifier: GPL-2.0
20 Operating Performance Points or P-states (in ACPI terminology). As a rule,
24 time (or the more power is drawn) by the CPU in the given P-state. Therefore
29 as possible and then there is no reason to use any P-states different from the
30 highest one (i.e. the highest-performance frequency/voltage configuration
38 put into different P-states.
41 capacity, so as to decide which P-states to put the CPUs into. Of course, since
51 The Linux kernel supports CPU performance scaling by means of the ``CPUFreq``
64 information on the available P-states (or P-state ranges in some cases) and
65 access platform-specific hardware interfaces to change CPU P-states as requested
70 performance scaling algorithms for P-state selection can be represented in a
71 platform-independent form in the majority of cases, so it should be possible
80 platform-independent way. For this reason, ``CPUFreq`` allows scaling drivers
85 ``CPUFreq`` Policy Objects
88 In some cases the hardware interface for P-state control is shared by multiple
90 control the P-state of multiple CPUs at the same time and writing to it affects
93 Sets of CPUs sharing hardware P-state control interfaces are represented by
100 CPUs share the same hardware P-state control interface, all of the pointers
104 of its user space interface is based on the policy concept.
123 logical CPU may be a physical single-core processor, or a single core in a
129 Once invoked, the ``CPUFreq`` core checks if the policy pointer is already set
130 for the given CPU and if so, it skips the policy object creation. Otherwise,
131 a new policy object is created and initialized, which involves the creation of
132 a new policy directory in ``sysfs``, and the policy pointer corresponding to
133 the given CPU is set to the new policy object's address in memory.
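
The lookup-or-create step described above can be pictured with a small model. This is only an illustrative sketch in Python, not kernel code. The ``Policy`` class, the ``policy_of`` table and the ``online_cpu()`` function are hypothetical names. The list of CPUs sharing the policy is passed in by the caller here, whereas in the kernel it comes from the scaling driver.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical stand-in for a CPUFreq policy object."""
    cpus: set = field(default_factory=set)


# Per-CPU policy pointers; None means "not set yet".
policy_of = {cpu: None for cpu in range(4)}


def online_cpu(cpu, shared_cpus):
    """Return the policy for `cpu`, creating it only if none is set."""
    if policy_of[cpu] is not None:
        return policy_of[cpu]        # pointer already set: skip creation
    policy = Policy(cpus=set(shared_cpus))
    for sibling in shared_cpus:      # every sharing CPU points to one object
        policy_of[sibling] = policy
    return policy


first = online_cpu(0, shared_cpus=[0, 1])
second = online_cpu(1, shared_cpus=[0, 1])
print(first is second)
```

In this model, the second call finds the pointer already populated and returns the existing object instead of creating a new one.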
135 Next, the scaling driver's ``->init()`` callback is invoked with the policy
139 to, represented by its policy object) and, if the policy object it has been
140 called for is new, to set parameters of the policy, like the minimum and maximum
142 the set of supported P-states is not a continuous range), and the mask of CPUs
143 that belong to the same policy (including both online and offline CPUs). That
144 mask is then used by the core to populate the policy pointers for all of the
147 The next major initialization step for a new policy object is to attach a
149 determined by the kernel command line or configuration, but it may be changed
150 later via ``sysfs``). First, a pointer to the new policy object is passed to
151 the governor's ``->init()`` callback which is expected to initialize all of the
152 data structures necessary to handle the given policy and, possibly, to add
154 invoking its ``->start()`` callback.
156 That callback is expected to register per-CPU utilization update callbacks for
157 all of the online CPUs belonging to the given policy with the CPU scheduler.
162 to determine the P-state to use for the given policy going forward and to
164 the P-state selection. The scaling driver may be invoked directly from
165 scheduler context or asynchronously, via a kernel thread or workqueue, depending
168 Similar steps are taken for policy objects that are not new, but were "inactive"
171 to use the scaling governor previously used with the policy that became
172 "inactive" (and is re-initialized now) instead of the default governor.
175 other CPUs sharing the policy object with it are online already, there is no
176 need to re-initialize the policy object at all. In that case, it is only
178 into account. That is achieved by invoking the governor's ``->stop()`` and
179 ``->start()`` callbacks, in this order, for the entire policy.
182 governor layer of ``CPUFreq`` and provides its own P-state selection algorithms.
184 new policy objects. Instead, the driver's ``->setpolicy()`` callback is invoked
185 to register per-CPU utilization update callbacks for each policy. These
187 governors, but in the |intel_pstate| case they both determine the P-state to
191 The policy objects created during CPU initialization and other data structures
193 (which happens when the kernel module containing it is unloaded, for example) or
194 when the last CPU belonging to the given policy is unregistered.
197 Policy Interface in ``sysfs``
200 During the initialization of the kernel, the ``CPUFreq`` core creates a
205 integer number) for every policy object maintained by the ``CPUFreq`` core.
209 associated with (or belonging to) the given policy. The ``policyX`` directories
210 in :file:`/sys/devices/system/cpu/cpufreq` each contain policy-specific
211 attributes (files) to control ``CPUFreq`` behavior for the corresponding policy
216 and what scaling governor is attached to the given policy. Some scaling drivers
217 also add driver-specific attributes to the policy directories in ``sysfs`` to
218 control policy-specific aspects of driver behavior.
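
As a rough illustration of this layout, the following Python sketch walks the ``policyX`` directories and reads a few attributes from each. The function name ``read_policies()`` and its parameters are hypothetical. The attribute names used by default are ones mentioned in this document, and the set of files actually present depends on the scaling driver in use.

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")


def read_policies(root=CPUFREQ_ROOT,
                  attrs=("scaling_governor",
                         "scaling_min_freq",
                         "scaling_max_freq")):
    """Map each policyX directory name to the requested attribute values."""
    result = {}
    if not root.is_dir():            # no CPUFreq support or sysfs not mounted
        return result
    for policy_dir in sorted(root.glob("policy[0-9]*")):
        values = {}
        for name in attrs:
            try:
                values[name] = (policy_dir / name).read_text().strip()
            except OSError:          # attribute absent or unreadable
                values[name] = None
        result[policy_dir.name] = values
    return result


if __name__ == "__main__":
    for name, values in read_policies().items():
        print(name, values)
```

Attributes that cannot be read are reported as ``None`` rather than raising an error, because not every driver exposes every file.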
224 List of online CPUs belonging to this policy (i.e. sharing the hardware
225 performance scaling interface represented by the ``policyX`` policy
235 BIOS/HW-based mechanisms.
244 Current frequency of the CPUs belonging to this policy as obtained from
252 An average frequency (in kHz) of all CPUs belonging to a given policy,
266 Maximum possible operating frequency the CPUs belonging to this policy
270 Minimum possible operating frequency the CPUs belonging to this policy
274 The time it takes to switch the CPUs belonging to this policy from one
275 P-state to another, in nanoseconds.
278 work with the `ondemand`_ governor, -1 (:c:macro:`CPUFREQ_ETERNAL`)
282 List of all (online and offline) CPUs belonging to this policy.
285 List of available frequencies of the CPUs belonging to this policy
289 List of ``CPUFreq`` scaling governors present in the kernel that can
290 be attached to this policy or (if the |intel_pstate| scaling driver is
292 applied to this policy.
295 kernel module for the governor held by it to become available and be
299 Current frequency of all of the CPUs belonging to this policy (in kHz).
301 In the majority of cases, this is the frequency of the last P-state
317 The scaling governor currently attached to this policy or (if the
319 provided by the driver that is currently applied to this policy.
321 This attribute is read-write and writing to it will cause a new scaling
322 governor to be attached to this policy or a new scaling algorithm
329 Maximum frequency the CPUs belonging to this policy are allowed to be
332 This attribute is read-write and writing a string representing an
337 Minimum frequency the CPUs belonging to this policy are allowed to be
340 This attribute is read-write and writing a string representing a
341 non-negative integer to it will cause a new limit to be set (it must not
346 is attached to the given policy.
349 be written to in order to set a new frequency for the policy.
359 Scaling governors are attached to policy objects and different policy objects
363 The scaling governor for a given policy object can be changed at any time with
364 the help of the ``scaling_governor`` policy attribute in ``sysfs``.
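
A minimal sketch of such a change, written in Python, is shown below. The helper ``set_governor()`` is a hypothetical name. It assumes that the policy directory also contains the ``scaling_available_governors`` attribute listing the governors that can be attached, and that the caller is allowed to write to ``sysfs`` (which typically requires root privileges).

```python
from pathlib import Path


def set_governor(policy_dir, governor):
    """Attach `governor` to a policy by writing to scaling_governor."""
    policy_dir = Path(policy_dir)
    # Assumed attribute: whitespace-separated list of usable governors.
    available = (policy_dir / "scaling_available_governors").read_text().split()
    if governor not in available:
        raise ValueError(f"{governor!r} is not one of {available}")
    (policy_dir / "scaling_governor").write_text(governor)
    # Read the attribute back so the caller sees what is actually in effect.
    return (policy_dir / "scaling_governor").read_text().strip()
```

For example, ``set_governor("/sys/devices/system/cpu/cpufreq/policy0", "schedutil")`` would request the ``schedutil`` governor for ``policy0``, provided that governor is available on the system.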
366 Some governors expose ``sysfs`` attributes to control or fine-tune the scaling
368 tunables, can be either global (system-wide) or per-policy, depending on the
370 per-policy, they are located in a subdirectory of each policy directory.
377 ---------------
379 When attached to a policy object, this governor causes the highest frequency,
380 within the ``scaling_max_freq`` policy limit, to be requested for that policy.
382 The request is made once, at the time the governor for the policy is set to
384 policy limits change after that.
387 -------------
389 When attached to a policy object, this governor causes the lowest frequency,
390 within the ``scaling_min_freq`` policy limit, to be requested for that policy.
392 The request is made once, at the time the governor for the policy is set to
394 policy limits change after that.
397 -------------
400 to set the CPU frequency for the policy it is attached to by writing to the
401 ``scaling_setspeed`` attribute of that policy.
404 -------------
412 should be changed for a given policy (that depends on whether or not the driver
418 the allowed maximum (that is, the ``scaling_max_freq`` policy limit). In turn,
420 Per-Entity Load Tracking (PELT) metric for the root control group of the
421 given CPU as the CPU utilization estimate (see the *Per-entity load tracking*
429 policy (if the PELT number is frequency-invariant), or the current CPU frequency
434 "IO-wait boosting". That happens when the :c:macro:`SCHED_CPUFREQ_IOWAIT` flag
457 ------------
463 time in which the given CPU was not idle. The ratio of the non-idle (active)
466 If this governor is attached to a policy shared by multiple CPUs, the load is
468 for the entire policy.
471 invoked asynchronously (via a workqueue) and CPU P-states are updated from
474 relatively often and the CPU P-state updates triggered by it can be relatively
480 the value of the ``cpuinfo_max_freq`` policy attribute corresponds to the load of
481 1 (or 100%), and the value of the ``cpuinfo_min_freq`` policy attribute
484 it is allowed to use (the ``scaling_max_freq`` policy limit).
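
The mapping from load to frequency can be sketched as follows. This is a simplified Python model and not the governor's actual code. It assumes a linear interpolation between ``cpuinfo_min_freq`` and ``cpuinfo_max_freq``, a threshold given in percent (named after the governor's ``up_threshold`` tunable), and a final clamp to ``scaling_max_freq``. The function name ``ondemand_target()`` is hypothetical.

```python
def ondemand_target(load, cpuinfo_min_freq, cpuinfo_max_freq,
                    scaling_max_freq, up_threshold):
    """Pick a frequency (in kHz) for `load`, a fraction between 0 and 1."""
    if load * 100 > up_threshold:
        # Above the threshold: go straight to the highest allowed frequency.
        return scaling_max_freq
    # Load 0 maps to cpuinfo_min_freq, load 1 maps to cpuinfo_max_freq.
    target = cpuinfo_min_freq + load * (cpuinfo_max_freq - cpuinfo_min_freq)
    # Assumption: never request more than the policy limit allows.
    return min(int(target), scaling_max_freq)


print(ondemand_target(0.5, 800_000, 3_200_000, 3_000_000, up_threshold=80))
```

With these illustrative numbers, a load of 0.5 lands halfway between the two hardware limits, while any load above 80% selects the ``scaling_max_freq`` value directly.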
494 to ``cpuinfo_transition_latency`` on each policy this governor is
498 If this tunable is per-policy, the following shell command sets the time
506 will set the frequency to the maximum value allowed for the policy.
543 f * (1 - ``powersave_bias`` / 1000)
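
As a worked example of the formula above, the following Python sketch applies the bias to a frequency given in kHz. The function name ``biased_frequency()`` is hypothetical, and the valid range of 0 to 1000 for ``powersave_bias`` is an assumption made for this sketch.

```python
def biased_frequency(f, powersave_bias):
    """Apply f * (1 - powersave_bias / 1000) using integer arithmetic."""
    if not 0 <= powersave_bias <= 1000:
        raise ValueError("powersave_bias is assumed to lie in [0, 1000]")
    return f * (1000 - powersave_bias) // 1000


print(biased_frequency(2_000_000, 100))
```

A bias of 100 therefore reduces a 2,000,000 kHz target by 10%, to 1,800,000 kHz, and a bias of 0 leaves the target unchanged.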
557 The performance of a workload with the sensitivity of 0 (memory-bound or
558 IO-bound) is not expected to increase at all as a result of increasing
560 (CPU-bound) are expected to perform much better if the CPU frequency is
566 target, so as to avoid over-provisioning workloads that will not benefit
570 ----------------
579 battery-powered). To achieve that, it changes the frequency in relatively
580 small steps, one step at a time, up or down - depending on whether or not a
587 allowed to set (the ``scaling_max_freq`` policy limit), between 0 and
594 ``scaling_max_freq`` policy limits.
617 ----------
626 "Turbo-Core" or (in technical documentation) "Core Performance Boost" and so on.
631 The frequency boost mechanism may be either hardware-based or software-based.
632 If it is hardware-based (e.g. on x86), the decision to trigger the boosting is
635 limits). If it is software-based (e.g. on ARM), the scaling driver decides
639 -------------------------------
644 but provides a driver-specific interface for controlling it, like
649 trigger boosting (in the hardware-based case), or the software is allowed to
650 trigger boosting (in the software-based case). It does not mean that boosting
661 --------------------------------
691 single-thread performance may vary because of it, which may lead to
697 -----------------------
699 The AMD powernow-k8 scaling driver supports a ``sysfs`` knob very similar to
703 If present, that knob is located in every ``CPUFreq`` policy directory in
706 implementation, however, works on a system-wide basis, and setting that knob
707 for one policy causes the same value of it to be set for all of the other
711 hardware feature, but it may be configured out of the kernel (via the
726 .. [1] Jonathan Corbet, *Per-entity load tracking*,