Lines Matching +full:kernel +full:- +full:policy

1 .. SPDX-License-Identifier: GPL-2.0
20 Operating Performance Points or P-states (in ACPI terminology). As a rule,
24 time (or the more power is drawn) by the CPU in the given P-state. Therefore
29 as possible and then there is no reason to use any P-states different from the
30 highest one (i.e. the highest-performance frequency/voltage configuration
38 put into different P-states.
41 capacity, so as to decide which P-states to put the CPUs into. Of course, since
51 The Linux kernel supports CPU performance scaling by means of the ``CPUFreq``
64 information on the available P-states (or P-state ranges in some cases) and
65 access platform-specific hardware interfaces to change CPU P-states as requested
70 performance scaling algorithms for P-state selection can be represented in a
71 platform-independent form in the majority of cases, so it should be possible
80 platform-independent way. For this reason, ``CPUFreq`` allows scaling drivers
85 ``CPUFreq`` Policy Objects
88 In some cases the hardware interface for P-state control is shared by multiple
90 control the P-state of multiple CPUs at the same time and writing to it affects
93 Sets of CPUs sharing hardware P-state control interfaces are represented by
100 CPUs share the same hardware P-state control interface, all of the pointers
104 of its user space interface is based on the policy concept.
123 logical CPU may be a physical single-core processor, or a single core in a
129 Once invoked, the ``CPUFreq`` core checks if the policy pointer is already set
130 for the given CPU and if so, it skips the policy object creation. Otherwise,
131 a new policy object is created and initialized, which involves the creation of
132 a new policy directory in ``sysfs``, and the policy pointer corresponding to
133 the given CPU is set to the new policy object's address in memory.
135 Next, the scaling driver's ``->init()`` callback is invoked with the policy
139 to, represented by its policy object) and, if the policy object it has been
140 called for is new, to set parameters of the policy, like the minimum and maximum
142 the set of supported P-states is not a continuous range), and the mask of CPUs
143 that belong to the same policy (including both online and offline CPUs). That
144 mask is then used by the core to populate the policy pointers for all of the
147 The next major initialization step for a new policy object is to attach a
149 determined by the kernel command line or configuration, but it may be changed
150 later via ``sysfs``). First, a pointer to the new policy object is passed to
151 the governor's ``->init()`` callback which is expected to initialize all of the
152 data structures necessary to handle the given policy and, possibly, to add
154 invoking its ``->start()`` callback.
156 That callback is expected to register per-CPU utilization update callbacks for
157 all of the online CPUs belonging to the given policy with the CPU scheduler.
162 to determine the P-state to use for the given policy going forward and to
164 the P-state selection. The scaling driver may be invoked directly from
165 scheduler context or asynchronously, via a kernel thread or workqueue, depending
168 Similar steps are taken for policy objects that are not new, but were "inactive"
171 to use the scaling governor previously used with the policy that became
172 "inactive" (and is re-initialized now) instead of the default governor.
175 other CPUs sharing the policy object with it are online already, there is no
176 need to re-initialize the policy object at all. In that case, it is only
178 into account. That is achieved by invoking the governor's ``->stop()`` and
179 ``->start()`` callbacks, in this order, for the entire policy.
182 governor layer of ``CPUFreq`` and provides its own P-state selection algorithms.
184 new policy objects. Instead, the driver's ``->setpolicy()`` callback is invoked
185 to register per-CPU utilization update callbacks for each policy. These
187 governors, but in the |intel_pstate| case they both determine the P-state to
191 The policy objects created during CPU initialization and other data structures
193 (which happens when the kernel module containing it is unloaded, for example) or
194 when the last CPU belonging to the given policy is unregistered.
197 Policy Interface in ``sysfs``
200 During the initialization of the kernel, the ``CPUFreq`` core creates a
205 integer number) for every policy object maintained by the ``CPUFreq`` core.
209 associated with (or belonging to) the given policy. The ``policyX`` directories
210 in :file:`/sys/devices/system/cpu/cpufreq` each contain policy-specific
211 attributes (files) to control ``CPUFreq`` behavior for the corresponding policy
216 and what scaling governor is attached to the given policy. Some scaling drivers
217 also add driver-specific attributes to the policy directories in ``sysfs`` to
218 control policy-specific aspects of driver behavior.
224 List of online CPUs belonging to this policy (i.e. sharing the hardware
225 performance scaling interface represented by the ``policyX`` policy
235 BIOS/HW-based mechanisms.
244 Current frequency of the CPUs belonging to this policy as obtained from
252 Maximum possible operating frequency the CPUs belonging to this policy
256 Minimum possible operating frequency the CPUs belonging to this policy
260 The time it takes to switch the CPUs belonging to this policy from one
261 P-state to another, in nanoseconds.
264 work with the `ondemand`_ governor, -1 (:c:macro:`CPUFREQ_ETERNAL`)
268 List of all (online and offline) CPUs belonging to this policy.
271 List of ``CPUFreq`` scaling governors present in the kernel that can
272 be attached to this policy or (if the |intel_pstate| scaling driver is
274 applied to this policy.
277 kernel module for the governor held by it to become available and be
281 Current frequency of all of the CPUs belonging to this policy (in kHz).
283 In the majority of cases, this is the frequency of the last P-state
298 The scaling governor currently attached to this policy or (if the
300 provided by the driver that is currently applied to this policy.
302 This attribute is read-write and writing to it will cause a new scaling
303 governor to be attached to this policy or a new scaling algorithm
310 Maximum frequency the CPUs belonging to this policy are allowed to be
313 This attribute is read-write and writing a string representing an
318 Minimum frequency the CPUs belonging to this policy are allowed to be
321 This attribute is read-write and writing a string representing a
322 non-negative integer to it will cause a new limit to be set (it must not
327 is attached to the given policy.
330 be written to in order to set a new frequency for the policy.
340 Scaling governors are attached to policy objects and different policy objects
344 The scaling governor for a given policy object can be changed at any time with
345 the help of the ``scaling_governor`` policy attribute in ``sysfs``.
347 Some governors expose ``sysfs`` attributes to control or fine-tune the scaling
349 tunables, can be either global (system-wide) or per-policy, depending on the
351 per-policy, they are located in a subdirectory of each policy directory.
358 ---------------
360 When attached to a policy object, this governor causes the highest frequency,
361 within the ``scaling_max_freq`` policy limit, to be requested for that policy.
363 The request is made once at the time the governor for the policy is set to
365 policy limits change after that.
368 -------------
370 When attached to a policy object, this governor causes the lowest frequency,
371 within the ``scaling_min_freq`` policy limit, to be requested for that policy.
373 The request is made once at the time the governor for the policy is set to
375 policy limits change after that.
378 -------------
381 to set the CPU frequency for the policy it is attached to by writing to the
382 ``scaling_setspeed`` attribute of that policy.
385 -------------
393 should be changed for a given policy (that depends on whether or not the driver
399 the allowed maximum (that is, the ``scaling_max_freq`` policy limit). In turn,
401 Per-Entity Load Tracking (PELT) metric for the root control group of the
402 given CPU as the CPU utilization estimate (see the *Per-entity load tracking*
410 policy (if the PELT number is frequency-invariant), or the current CPU frequency
415 "IO-wait boosting". That happens when the :c:macro:`SCHED_CPUFREQ_IOWAIT` flag
438 ------------
444 time in which the given CPU was not idle. The ratio of the non-idle (active)
447 If this governor is attached to a policy shared by multiple CPUs, the load is
449 for the entire policy.
452 invoked asynchronously (via a workqueue) and CPU P-states are updated from
455 relatively often and the CPU P-state updates triggered by it can be relatively
461 the value of the ``cpuinfo_max_freq`` policy attribute corresponds to the load of
462 1 (or 100%), and the value of the ``cpuinfo_min_freq`` policy attribute
465 it is allowed to use (the ``scaling_max_freq`` policy limit).
475 for each policy this governor is attached to (but since the unit here
480 If this tunable is per-policy, the following shell command sets the time
487 will set the frequency to the maximum value allowed for the policy.
524 f * (1 - ``powersave_bias`` / 1000)
538 The performance of a workload with the sensitivity of 0 (memory-bound or
539 IO-bound) is not expected to increase at all as a result of increasing
541 (CPU-bound) are expected to perform much better if the CPU frequency is
547 target, so as to avoid over-provisioning workloads that will not benefit
551 ----------------
560 battery-powered). To achieve that, it changes the frequency in relatively
561 small steps, one step at a time, up or down - depending on whether or not a
568 allowed to set (the ``scaling_max_freq`` policy limit), between 0 and
575 ``scaling_max_freq`` policy limits.
598 ----------
607 "Turbo-Core" or (in technical documentation) "Core Performance Boost" and so on.
612 The frequency boost mechanism may be either hardware-based or software-based.
613 If it is hardware-based (e.g. on x86), the decision to trigger the boosting is
616 limits). If it is software-based (e.g. on ARM), the scaling driver decides
620 -------------------------------
625 but provides a driver-specific interface for controlling it, like
630 trigger boosting (in the hardware-based case), or the software is allowed to
631 trigger boosting (in the software-based case). It does not mean that boosting
642 --------------------------------
672 single-thread performance may vary because of it which may lead to
678 -----------------------
680 The AMD powernow-k8 scaling driver supports a ``sysfs`` knob very similar to
684 If present, that knob is located in every ``CPUFreq`` policy directory in
687 implementation, however, works on the system-wide basis and setting that knob
688 for one policy causes the same value of it to be set for all of the other
692 hardware feature, but it may be configured out of the kernel (via the
707 .. [1] Jonathan Corbet, *Per-entity load tracking*,