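The cases break down along two axes: whether the processor supports HWP, and whether the driver runs in active or passive mode.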
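
| | Active | Passive |
|---|---|---|
| HWP supported | The CPU hardware selects the frequency automatically | Relies on the governor to select the frequency |
| HWP not supported | Intel's own governor selects the frequency, mainly from CPU utilization | Relies on the governor to select the frequency |
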
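In active mode with HWP support, the CPU hardware selects the frequency automatically.
In active mode without HWP support, Intel's own governor selects the frequency, scaling mainly on CPU utilization.
Passive mode needs little explanation: it relies on the governor's policy, for example the schedutil algorithm, which likewise updates the frequency based on the load of the CPU's run queue (rq).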
Operation Modes
intel_pstate can operate in two different modes, active or passive. In the active mode, it uses its own internal performance scaling governor algorithm or allows the hardware to do performance scaling by itself, while in the passive mode it responds to requests made by a generic CPUFreq governor implementing a certain performance scaling algorithm. Which of them will be in effect depends on what kernel command line options are used and on the capabilities of the processor.
Active Mode
This is the default operation mode of intel_pstate for processors with hardware-managed P-states (HWP) support. If it works in this mode, the scaling_driver policy attribute in sysfs for all CPUFreq policies contains the string “intel_pstate”.
In this mode the driver bypasses the scaling governors layer of CPUFreq and provides its own scaling algorithms for P-state selection. Those algorithms can be applied to CPUFreq policies in the same way as generic scaling governors (that is, through the scaling_governor policy attribute in sysfs). [Note that different P-state selection algorithms may be chosen for different policies, but that is not recommended.]
They are not generic scaling governors, but their names are the same as the names of some of those governors. Moreover, confusingly enough, they generally do not work in the same way as the generic governors they share the names with. For example, the powersave P-state selection algorithm provided by intel_pstate is not a counterpart of the generic powersave governor (roughly, it corresponds to the schedutil and ondemand governors).
There are two P-state selection algorithms provided by intel_pstate in the active mode: powersave and performance. The way they both operate depends on whether or not the hardware-managed P-states (HWP) feature has been enabled in the processor and possibly on the processor model.
Which of the P-state selection algorithms is used by default depends on the CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE kernel configuration option. Namely, if that option is set, the performance algorithm will be used by default, and the other one will be used by default if it is not set.
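The rule above can be sketched as a small helper. This is only an illustration: the function name `default_pstate_algorithm` and the idea of feeding it the text of a kernel `.config` file are assumptions made for the example, not something intel_pstate itself provides.

```python
def default_pstate_algorithm(kernel_config_text: str) -> str:
    """Return the default active-mode algorithm name for intel_pstate.

    Hypothetical helper: it scans kernel .config text for
    CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y.
    """
    for raw_line in kernel_config_text.splitlines():
        line = raw_line.strip()
        # Unset options appear as comments ("# CONFIG_FOO is not set").
        if line.startswith("#"):
            continue
        if line == "CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y":
            return "performance"
    # Option not set: the other algorithm is the default.
    return "powersave"
```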
Active Mode With HWP
If the processor supports the HWP feature, it will be enabled during the processor initialization and cannot be disabled after that. It is possible to avoid enabling it by passing the intel_pstate=no_hwp argument to the kernel in the command line.
If the HWP feature has been enabled, intel_pstate relies on the processor to select P-states by itself, but still it can give hints to the processor’s internal P-state selection logic. What those hints are depends on which P-state selection algorithm has been applied to the given policy (or to the CPU it corresponds to).
Even though the P-state selection is carried out by the processor automatically, intel_pstate registers utilization update callbacks with the CPU scheduler in this mode. However, they are not used for running a P-state selection algorithm, but for periodic updates of the current CPU frequency information to be made available from the scaling_cur_freq policy attribute in sysfs.
HWP + performance
In this configuration intel_pstate will write 0 to the processor’s Energy-Performance Preference (EPP) knob (if supported) or its Energy-Performance Bias (EPB) knob (otherwise), which means that the processor’s internal P-state selection logic is expected to focus entirely on performance.
This will override the EPP/EPB setting coming from the sysfs interface (see Energy vs Performance Hints below). Moreover, any attempts to change the EPP/EPB to a value different from 0 (“performance”) via sysfs in this configuration will be rejected.
Also, in this configuration the range of P-states available to the processor’s internal P-state selection logic is always restricted to the upper boundary (that is, the maximum P-state that the driver is allowed to use).
HWP + powersave
In this configuration intel_pstate will set the processor’s Energy-Performance Preference (EPP) knob (if supported) or its Energy-Performance Bias (EPB) knob (otherwise) to whatever value it was previously set to via sysfs (or whatever default value it was set to by the platform firmware). This usually causes the processor’s internal P-state selection logic to be less performance-focused.
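The two HWP configurations differ mainly in which energy hint ends up in the EPP/EPB knob. A minimal model of that behavior might look like the following. The function names and the integer representation of the hint are assumptions of this sketch, and it does not touch real hardware or sysfs.

```python
from typing import Optional

PERFORMANCE_HINT = 0  # value the text says is written for "performance"


def hwp_energy_hint(algorithm: str,
                    sysfs_value: Optional[int],
                    firmware_default: int) -> int:
    """Return the hint the driver would program, per the description above."""
    if algorithm == "performance":
        # Overrides whatever was set through sysfs.
        return PERFORMANCE_HINT
    if algorithm == "powersave":
        # Use the value previously set via sysfs, else the firmware default.
        return sysfs_value if sysfs_value is not None else firmware_default
    raise ValueError(f"unknown algorithm: {algorithm!r}")


def sysfs_write_allowed(algorithm: str, new_value: int) -> bool:
    """Model the rejection rule for HWP + performance.

    Assumption of this sketch: the text describes no rejection rule for
    powersave, so every value is accepted there.
    """
    if algorithm == "performance":
        return new_value == PERFORMANCE_HINT
    return True
```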
Active Mode Without HWP
This operation mode is optional for processors that do not support the HWP feature or when the intel_pstate=no_hwp argument is passed to the kernel in the command line. The active mode is used in those cases if the intel_pstate=active argument is passed to the kernel in the command line. In this mode intel_pstate may refuse to work with processors that are not recognized by it. [Note that intel_pstate will never refuse to work with any processor with the HWP feature enabled.]
In this mode intel_pstate registers utilization update callbacks with the CPU scheduler in order to run a P-state selection algorithm, either powersave or performance, depending on the scaling_governor policy setting in sysfs. The current CPU frequency information to be made available from the scaling_cur_freq policy attribute in sysfs is periodically updated by those utilization update callbacks too.
performance
Without HWP, this P-state selection algorithm is always the same regardless of the processor model and platform configuration.
It selects the maximum P-state it is allowed to use, subject to limits set via sysfs, every time the driver configuration for the given CPU is updated (e.g. via sysfs).
This is the default P-state selection algorithm if the CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE kernel configuration option is set.
powersave
Without HWP, this P-state selection algorithm is similar to the algorithm implemented by the generic schedutil scaling governor except that the utilization metric used by it is based on numbers coming from feedback registers of the CPU. It generally selects P-states proportional to the current CPU utilization.
This algorithm is run by the driver’s utilization update callback for the given CPU when it is invoked by the CPU scheduler, but not more often than every 10 ms. Like in the performance case, the hardware configuration is not touched if the new P-state turns out to be the same as the current one.
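The callback behavior described above (a 10 ms rate limit, a P-state roughly proportional to utilization, and no hardware write when nothing changes) can be sketched as follows. This is a simplified model, not the driver's actual code: the class name, the plain `max_pstate * utilization` formula, and the `hardware_writes` counter are all assumptions made for illustration.

```python
from typing import Optional

MIN_INTERVAL_NS = 10_000_000  # 10 ms, the rate limit stated in the text


class PowersaveSketch:
    """Toy model of the non-HWP powersave algorithm (hypothetical)."""

    def __init__(self, min_pstate: int, max_pstate: int) -> None:
        self.min_pstate = min_pstate
        self.max_pstate = max_pstate
        self.current_pstate = min_pstate
        self.last_run_ns: Optional[int] = None
        self.hardware_writes = 0  # counts simulated hardware updates

    def update(self, now_ns: int, utilization: float) -> int:
        """Scheduler-style callback; utilization is a fraction in [0, 1]."""
        # Rate limit: skip if the last run was less than 10 ms ago.
        if (self.last_run_ns is not None
                and now_ns - self.last_run_ns < MIN_INTERVAL_NS):
            return self.current_pstate
        self.last_run_ns = now_ns

        # Proportional target, clamped to the allowed range.
        target = round(self.max_pstate * utilization)
        target = max(self.min_pstate, min(self.max_pstate, target))

        # Leave the hardware alone when the P-state does not change.
        if target != self.current_pstate:
            self.current_pstate = target
            self.hardware_writes += 1
        return self.current_pstate
```

In this model the limits passed to the constructor stand in for the limits set via sysfs. The real driver derives utilization from CPU feedback registers, which the sketch replaces with a caller-supplied fraction.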
This is the default P-state selection algorithm if the CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE kernel configuration option is not set.
Passive Mode
This is the default operation mode of intel_pstate for processors without hardware-managed P-states (HWP) support. It is always used if the intel_pstate=passive argument is passed to the kernel in the command line regardless of whether or not the given processor supports HWP. [Note that the intel_pstate=no_hwp setting causes the driver to start in the passive mode if it is not combined with intel_pstate=active.] Like in the active mode without HWP support, in this mode intel_pstate may refuse to work with processors that are not recognized by it if HWP is prevented from being enabled through the kernel command line.
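Putting the defaults and the command-line options together, the choice of startup mode can be summarized in a short decision function. It is a sketch under stated assumptions: `choose_startup` and `StartupChoice` are hypothetical names, only the `active`, `passive`, and `no_hwp` options mentioned in the text are handled, and the case where both `active` and `passive` are given is not covered by the text (here `passive` arbitrarily wins).

```python
from typing import NamedTuple


class StartupChoice(NamedTuple):
    mode: str          # "active" or "passive"
    hwp_enabled: bool


def choose_startup(hwp_supported: bool, cmdline: str) -> StartupChoice:
    """Pick the operation mode from HWP support and kernel command line."""
    prefix = "intel_pstate="
    options = {tok[len(prefix):] for tok in cmdline.split()
               if tok.startswith(prefix)}

    # HWP is enabled when supported, unless intel_pstate=no_hwp is passed.
    hwp_enabled = hwp_supported and "no_hwp" not in options

    if "passive" in options:
        mode = "passive"      # always honored, with or without HWP
    elif "active" in options:
        mode = "active"       # explicit request, also valid without HWP
    elif hwp_enabled:
        mode = "active"       # default for processors with HWP
    else:
        mode = "passive"      # default without HWP, or with no_hwp alone
    return StartupChoice(mode, hwp_enabled)
```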
If the driver works in this mode, the scaling_driver policy attribute in sysfs for all CPUFreq policies contains the string “intel_cpufreq”. Then, the driver behaves like a regular CPUFreq scaling driver. That is, it is invoked by generic scaling governors when necessary to talk to the hardware in order to change the P-state of a CPU (in particular, the schedutil governor can invoke it directly from scheduler context).
While in this mode, intel_pstate can be used with all of the (generic) scaling governors listed by the scaling_available_governors policy attribute in sysfs (and the P-state selection algorithms described above are not used). Then, it is responsible for the configuration of policy objects corresponding to CPUs and provides the CPUFreq core (and the scaling governors attached to the policy objects) with accurate information on the maximum and minimum operating frequencies supported by the hardware (including the so-called “turbo” frequency ranges). In other words, in the passive mode the entire range of available P-states is exposed by intel_pstate to the CPUFreq core. However, in this mode the driver does not register utilization update callbacks with the CPU scheduler and the scaling_cur_freq information comes from the CPUFreq core (and is the last frequency selected by the current scaling governor for the given policy).
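To see which mode a running system is in, one option is to read the policy attributes mentioned above. The sketch below maps the `scaling_driver` string to a mode and lists the governors from `scaling_available_governors`. The helper names are hypothetical, and the policy directory is passed in by the caller rather than hard-coded.

```python
from pathlib import Path
from typing import Optional

# Driver strings named in the text for each operation mode.
DRIVER_TO_MODE = {"intel_pstate": "active", "intel_cpufreq": "passive"}


def read_policy_attr(policy_dir: Path, name: str) -> Optional[str]:
    """Read one policy attribute; return None if it cannot be read."""
    try:
        return (policy_dir / name).read_text().strip()
    except OSError:
        return None


def describe_policy(policy_dir: Path) -> dict:
    """Summarize driver, inferred mode, and available governors."""
    driver = read_policy_attr(policy_dir, "scaling_driver")
    governors = read_policy_attr(policy_dir, "scaling_available_governors")
    return {
        "driver": driver,
        # None when the driver is missing or is not intel_pstate at all.
        "mode": DRIVER_TO_MODE.get(driver),
        "available_governors": governors.split() if governors else [],
    }
```

On a typical Linux system the policy directory would be something like `/sys/devices/system/cpu/cpufreq/policy0`, so `describe_policy(Path("/sys/devices/system/cpu/cpufreq/policy0"))` is one way to call it.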