8b547cc2 | 26-Dec-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Wrap the written counter value with gp_counter_width
The check_emulated_instr() testcase fails when the KVM module parameter "force_emulation_prefix" is 1. The root cause is that the value written to the counter exceeds the maximum bit width of the GP counter.
Signed-off-by: Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/all/20221226075412.61167-3-likexu@tencent.com Signed-off-by: Sean Christopherson <seanjc@google.com>
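For illustration only, a minimal sketch of the idea behind the fix, assuming a gp_counter_width-style value is available; the helper name is made up and is not the test's actual code:

```c
#include <stdint.h>

/*
 * Sketch: clamp a value to the GP counter's bit width before writing it,
 * so the written value never exceeds what the hardware counter can hold.
 */
static uint64_t wrap_to_width(uint64_t value, unsigned int gp_counter_width)
{
	uint64_t mask = (gp_counter_width >= 64) ? ~0ull
						 : (1ull << gp_counter_width) - 1;

	return value & mask;
}
```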
006b089d | 26-Dec-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Add Intel Guest Transactional (commited) cycles testcase
On Intel platforms with the TSX feature, PMU users in the guest can collect the committed or total transactional cycles for a TSX-enabled workload. Add new test cases to cover them, as they are not strictly the same as normal hardware events from the KVM implementation's point of view.
Signed-off-by: Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20221226075412.61167-2-likexu@tencent.com Signed-off-by: Sean Christopherson <seanjc@google.com>
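A hedged sketch of how such events could be encoded: on TSX-capable Intel parts the event-select MSR has IN_TX (bit 32) and IN_TXCP (bit 33) qualifier bits per my reading of the SDM. The macro names and the exact entries used by the test are assumptions.

```c
#include <stdint.h>

/* Unhalted core cycles: architectural event 0x3c, umask 0x00. */
#define EVNTSEL_CYCLES    0x003cull
/* TSX qualifier bits in IA32_PERFEVTSELx (assumed per the SDM). */
#define EVNTSEL_IN_TX     (1ull << 32)   /* count only inside a transaction      */
#define EVNTSEL_IN_TXCP   (1ull << 33)   /* checkpointed: aborted cycles removed */

/* Total transactional cycles vs. committed transactional cycles. */
static const uint64_t tx_cycles_total     = EVNTSEL_CYCLES | EVNTSEL_IN_TX;
static const uint64_t tx_cycles_committed = EVNTSEL_CYCLES | EVNTSEL_IN_TX |
					    EVNTSEL_IN_TXCP;
```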
b883751a | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Update testcases to cover AMD PMU
The AMD core PMU before Zen4 has no version number and no fixed counters, and it uses a hard-coded number of generic counters and a hard-coded bit width. Only hardware events common across AMD generations (starting with K7) are added to the amd_gp_events[] table.
All of the above differences are instantiated at the detection step, which also covers the K7 PMU registers, consistent with bare metal.
Cc: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Like Xu <likexu@tencent.com> [sean: set bases to K7 values for !PERFCTR_CORE case (reported by Paolo)] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-27-seanjc@google.com
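A rough sketch of the detection step described above, assuming pmu_caps-style field names; the MSR values and counter counts reflect my understanding of AMD documentation rather than the test's literal code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative pmu_caps-style struct; field names are assumptions. */
struct pmu_caps {
	uint32_t msr_gp_counter_base;
	uint32_t msr_gp_event_select_base;
	unsigned int nr_gp_counters;
	unsigned int gp_counter_width;
	unsigned int version;
};

#define MSR_K7_EVNTSEL0       0xc0010000u
#define MSR_K7_PERFCTR0       0xc0010004u
#define MSR_F15H_PERF_CTL0    0xc0010200u
#define MSR_F15H_PERF_CTR0    0xc0010201u

/* Sketch: instantiate the AMD differences once, at detection time. */
static void amd_pmu_init(struct pmu_caps *pmu, bool has_perfctr_core)
{
	pmu->version = 0;             /* pre-Zen4 core PMU has no version */
	pmu->gp_counter_width = 48;   /* hard-coded counter width         */

	if (has_perfctr_core) {
		/* Core-extension MSRs interleave CTL/CTR, so real accesses
		 * would also use a stride of 2; omitted in this sketch. */
		pmu->nr_gp_counters = 6;
		pmu->msr_gp_counter_base = MSR_F15H_PERF_CTR0;
		pmu->msr_gp_event_select_base = MSR_F15H_PERF_CTL0;
	} else {
		/* Fall back to the legacy K7 registers. */
		pmu->nr_gp_counters = 4;
		pmu->msr_gp_counter_base = MSR_K7_PERFCTR0;
		pmu->msr_gp_event_select_base = MSR_K7_EVNTSEL0;
	}
}
```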
7c648ce2 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Add gp_events pointer to route different event tables
AMD and Intel do not share the same encoding rules for performance events. Code that tests the same performance event can be reused by pointing to a different event table; note that the table size also needs to be updated.
Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-25-seanjc@google.com
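A minimal sketch of the routing described, with abbreviated, illustrative tables; the struct layout, variable names and encodings are assumptions:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Minimal event-table shape; field names are assumptions. */
struct pmu_event {
	const char *name;
	uint32_t unit_sel;   /* event select + umask encoding */
};

/* Abbreviated per-vendor tables: Intel architectural events and their
 * common AMD (K7+) equivalents as I understand them. */
static const struct pmu_event intel_gp_events[] = {
	{ "core cycles",  0x003c },
	{ "instructions", 0x00c0 },
};

static const struct pmu_event amd_gp_events[] = {
	{ "core cycles",  0x0076 },
	{ "instructions", 0x00c0 },
};

/* Route every test through one pointer + size, chosen once at init time. */
static const struct pmu_event *gp_events;
static size_t gp_events_size;

static void select_event_table(bool is_amd)
{
	gp_events      = is_amd ? amd_gp_events : intel_gp_events;
	gp_events_size = is_amd ? ARRAY_SIZE(amd_gp_events)
				: ARRAY_SIZE(intel_gp_events);
}
```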
62ba5036 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Add global helpers to cover Intel Arch PMU Version 1
To test Intel arch PMU version 1, most of the basic framework and the test cases that exercise any PMU counter do not require changes, except that registers introduced only in PMU version 2 must not be accessed.
Adding a few guard checks seamlessly supports version 1, while opening the door for normal AMD PMU tests.
Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-24-seanjc@google.com
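A small sketch of the kind of guard check meant here, assuming a snapshotted pmu.version field; wrmsr() is the test-lib helper:

```c
#include <stdint.h>

#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f

extern void wrmsr(uint32_t index, uint64_t value);   /* test-lib helper */

/* Assumed global PMU snapshot; only the version field matters here. */
static struct {
	unsigned int version;
} pmu;

/* Guarded access: PERF_GLOBAL_CTRL only exists on Arch PMU v2+, so the
 * same test body also runs on version 1 (and, later, on AMD PMUs). */
static void pmu_enable_all(uint64_t enable_mask)
{
	if (pmu.version >= 2)
		wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, enable_mask);
}
```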
8a2866d1 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Track global status/control/clear MSRs in pmu_caps
Track the global PMU MSRs in pmu_caps so that tests don't need to manually differentiate between AMD and Intel. Although AMD and Intel PMUs have the same semantics in terms of global control features (including ctl and status), their MSR indexes are not the same.
Signed-off-by: Like Xu <likexu@tencent.com> [sean: drop most getters/setters] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-22-seanjc@google.com
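An illustrative sketch of the per-vendor MSR tracking, with assumed field names; the MSR values match my reading of the Intel SDM and AMD PerfMonV2 documentation:

```c
#include <stdint.h>

/* Intel "core" global PMU MSRs. */
#define MSR_CORE_PERF_GLOBAL_STATUS           0x38e
#define MSR_CORE_PERF_GLOBAL_CTRL             0x38f
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL         0x390
/* AMD PerfMonV2 equivalents (values assumed per the APM). */
#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS     0xc0000300u
#define MSR_AMD64_PERF_CNTR_GLOBAL_CTL        0xc0000301u
#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR 0xc0000302u

/* Same semantics, different indexes per vendor. */
struct pmu_global_msrs {
	uint32_t msr_global_status;
	uint32_t msr_global_ctl;
	uint32_t msr_global_status_clr;
};

static void pmu_set_global_msrs(struct pmu_global_msrs *p, int is_intel)
{
	if (is_intel) {
		p->msr_global_status     = MSR_CORE_PERF_GLOBAL_STATUS;
		p->msr_global_ctl        = MSR_CORE_PERF_GLOBAL_CTRL;
		p->msr_global_status_clr = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
	} else {
		p->msr_global_status     = MSR_AMD64_PERF_CNTR_GLOBAL_STATUS;
		p->msr_global_ctl        = MSR_AMD64_PERF_CNTR_GLOBAL_CTL;
		p->msr_global_status_clr = MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR;
	}
}
```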
3f914933 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Add helper to get fixed counter MSR index
Add a helper to get the index of a fixed counter instead of manually calculating the index; a future patch will add more users of the fixed counter MSRs.
No functional change intended.
Signed-off-by: Like Xu <likexu@tencent.com> [sean: move to separate patch, write changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-20-seanjc@google.com
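A minimal sketch of such a helper (name assumed), using the IA32_FIXED_CTR0 index:

```c
#include <stdint.h>

#define MSR_CORE_PERF_FIXED_CTR0 0x309

/* Derive which fixed counter an MSR refers to instead of open-coding
 * "msr - base" at every call site. */
static inline unsigned int fixed_counter_index(uint32_t fixed_ctr_msr)
{
	return fixed_ctr_msr - MSR_CORE_PERF_FIXED_CTR0;
}

/* e.g. fixed_counter_index(0x30a) == 1 (IA32_FIXED_CTR1). */
```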
cda64e80 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Track GP counter and event select base MSRs in pmu_caps
Snapshot the base MSRs for GP counters and event selects during pmu_init() so that tests don't need to manually compute the bases.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> [sean: rename helpers to look more like macros, drop wrmsr wrappers] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-19-seanjc@google.com
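A sketch of what the snapshotted bases buy, with assumed field and macro names:

```c
#include <stdint.h>

/* Assumed pmu_caps-style snapshot of the per-vendor bases. */
static struct {
	uint32_t msr_gp_counter_base;
	uint32_t msr_gp_event_select_base;
} pmu;

/* Tests say "counter i" and never compute raw MSR indexes themselves.
 * (AMD's core-extension PMU interleaves CTL/CTR pairs, so a complete
 * implementation also needs a per-vendor stride; omitted here.) */
#define MSR_GP_COUNTERx(i)       (pmu.msr_gp_counter_base + (i))
#define MSR_GP_EVENT_SELECTx(i)  (pmu.msr_gp_event_select_base + (i))
```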
414ee7d1 | 02-Nov-2022 | Sean Christopherson <seanjc@google.com>
x86/pmu: Drop wrappers that just passthrough pmu_caps fields
Drop wrappers that are and always will be pure passthroughs of pmu_caps fields, e.g. the number of fixed/general_purpose counters can always be determined during PMU initialization and doesn't need runtime logic.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-18-seanjc@google.com
879e7f07 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Snapshot PMU perf_capabilities during BSP initialization
Add a global "struct pmu_caps pmu" to snapshot PMU capabilities during the final stages of BSP initialization. Use the new hooks to snapshot PERF_CAPABILITIES instead of re-reading the MSR every time a test wants to query capabilities. A software-defined struct will also simplify extending support to AMD CPUs, as many of the differences between AMD and Intel can be handled during pmu_init().
Init the PMU caps for all tests so that tests don't need to remember to call pmu_init() before using any of the PMU helpers, e.g. the nVMX test uses this_cpu_has_pmu(), which will be converted to rely on the global struct in a future patch.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> [sean: reword changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-16-seanjc@google.com
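A hedged sketch of the snapshot idea, with assumed names; only the PDCM-guarded read and a sample query are shown:

```c
#include <stdint.h>
#include <stdbool.h>

#define MSR_IA32_PERF_CAPABILITIES 0x345

extern uint64_t rdmsr(uint32_t index);   /* test-lib helper                       */
extern bool this_cpu_has_pdcm(void);     /* stand-in for the PDCM CPUID check     */

/* Assumed shape of the global snapshot: read the MSR once at init instead
 * of on every query. */
static struct {
	uint64_t perf_cap;
} pmu;

static void pmu_init(void)
{
	if (this_cpu_has_pdcm())
		pmu.perf_cap = rdmsr(MSR_IA32_PERF_CAPABILITIES);
}

/* Later queries are cheap and VM-exit free, e.g.: */
static bool pmu_has_full_width_writes(void)
{
	return pmu.perf_cap & (1ull << 13);   /* FW_WRITE bit, per the SDM */
}
```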
9f17508d | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Add lib/x86/pmu.[c.h] and move common code to header files
Given all the PMU stuff coming in, we need e.g. lib/x86/pmu.h to hold all of the hardware-defined stuff, e.g. #defines, accessors, helpers and structs that are dictated by hardware. This will greatly help with code reuse and reduce unnecessary VM-exits.
Opportunistically move the LBR MSR definitions to processor.h.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-14-seanjc@google.com
5a2cb3e6 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Rename PC_VECTOR to PMI_VECTOR for better readability
The original name "PC_VECTOR" comes from the LVT Performance Counter Register. Rename it to PMI_VECTOR. That's much more familiar for KVM developers and it's still correct, e.g. it's the PMI vector that's programmed into the LVT PC register.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-13-seanjc@google.com
85c21181 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Update rdpmc testcase to cover #GP path
Specifying an unsupported PMC encoding will cause a #GP(0).
There are multiple reasons RDPMC can #GP; the one being relied on to guarantee a #GP here is specifically that the PMC is invalid. The most extensible solution is to provide a safe variant.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-12-seanjc@google.com
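A sketch of a safe RDPMC variant in the style of the test lib's existing *_safe helpers, assuming its ASM_TRY()/exception_vector() try-catch plumbing; not a copy of the actual patch:

```c
/* Assumes the kvm-unit-tests environment: ASM_TRY(), exception_vector()
 * and GP_VECTOR come from the x86 test library headers. */
static inline int rdpmc_safe(uint32_t index, uint64_t *value)
{
	uint32_t lo, hi;

	asm volatile(ASM_TRY("1f")
		     "rdpmc\n\t"
		     "1:"
		     : "=a"(lo), "=d"(hi) : "c"(index));
	*value = lo | ((uint64_t)hi << 32);

	return exception_vector();      /* 0 on success, GP_VECTOR on #GP */
}

/* The testcase can then assert that a bogus PMC encoding faults, e.g.:
 *     report(rdpmc_safe(0xffff, &val) == GP_VECTOR, "rdpmc(bogus) #GPs");
 */
```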
03041e97 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Refine info to clarify the current support
Existing unit tests cover neither the AMD PMU nor non-architectural Intel PMUs (found on some obsolete CPUs). AMD PMU support will be added in subsequent commits.
Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-11-seanjc@google.com
7ec3b67a | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Reset the expected count of the fixed counter 0 when i386
The PMU test check_counter_overflow() always fails with 32-bit binaries. The cnt.count obtained from the latter run of measure() (based on fixed counter 0) is not equal to the expected value (based on GP counter 0); there is a positive error with a value of 2.
The two extra instructions come from the inline wrmsr() and rdmsr() inside the global_disable() binary code block. Specifically, for each MSR access, the i386 code has two assembly mov instructions before rdmsr/wrmsr (to mark it for fixed counter 0, bit 32), but only one mov is needed for x86_64, and for GP counter 0 on i386.
The sequence of instructions used to count events differs between the GP and fixed counters. Thus the fix is quite high level: use the same counter (with the same instruction sequence) to set the initial value for that counter. Fix the expected initial cnt.count for fixed counter 0 overflow based on fixed counter 0 itself, not always on GP counter 0.
The difference of 1 for this count enables the interrupt to be generated immediately after the selected event count has been reached, instead of waiting for the overflow to be propagated through the counter.
Add a helper to measure/compute the overflow preset value. It provides a convenient location to document the weird behavior that's necessary to ensure immediate event delivery.
Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-9-seanjc@google.com
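A minimal sketch of such a preset helper; the name and exact form are assumptions:

```c
#include <stdint.h>

/* Sketch: given a measured event count for the workload, return the value
 * to preload into the counter so it overflows as soon as the target count
 * is reached again.  The extra 1 makes the PMI fire immediately instead of
 * waiting for the overflow to propagate through the counter. */
static inline uint64_t overflow_preset(uint64_t measured_count)
{
	return 1 - measured_count;   /* wraps to a value just below overflow */
}
```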
8554261f | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Introduce multiple_{one, many}() to improve readability
The current measure_one() forces the common case to pass in unnecessary information in order to give flexibility to a single use case. It's just syntactic sugar, but it really does help readers, as it's not obvious that the "1" specifies the number of events, whereas multiple_many() and measure_one() are relatively self-explanatory.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-8-seanjc@google.com
e9e7577b | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Introduce __start_event() to drop all of the manual zeroing
Most invocations of start_event() and measure() first set evt.count = 0. Instead of forcing each caller to ensure the count is zeroed, zero the count during start_event(), then drop all of the manual zeroing.
Accumulating counts can be handled by reading the current count before start_event(), and something like stuffing a high count to test an edge case can be handled by an inner helper, __start_event().
For overflow, just open code measure() for that one-off case. Requiring callers to zero out a field in most common cases isn't exactly flexible.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> [sean: tag __measure() noinline so its count is stable] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-7-seanjc@google.com
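A sketch of the inner-helper split, with an assumed counter struct layout:

```c
#include <stdint.h>

typedef struct {
	uint32_t ctr;       /* counter MSR index          */
	uint64_t config;    /* event select value         */
	uint64_t count;     /* software copy of the count */
} pmu_counter_t;

extern void wrmsr(uint32_t index, uint64_t value);   /* test-lib helper */

/* Inner helper keeps the flexibility of a caller-chosen starting value... */
static void __start_event(pmu_counter_t *evt, uint64_t count)
{
	evt->count = count;
	wrmsr(evt->ctr, evt->count);
	/* ... program evt->config into the event select, enable, etc. ... */
}

/* ...while the common case starts from zero, with no manual zeroing at
 * the call sites. */
static void start_event(pmu_counter_t *evt)
{
	__start_event(evt, 0);
}
```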
4070b9c6 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Fix printed messages for emulated instruction test
This test case uses MSR_IA32_PERFCTR0 to count branch instructions and PERFCTR1 to count instruction events. The same correspondence should be maintained in report(); specifically, use status bit 1 for instructions and bit 0 for branches.
Fixes: 20cf914 ("x86/pmu: Test PMU virtualization on emulated instructions") Reported-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-6-seanjc@google.com
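For illustration, the corrected correspondence might look like this sketch (status handling simplified; report() is the test-lib API):

```c
#include <stdint.h>
#include <stdbool.h>

extern void report(bool pass, const char *fmt, ...);   /* test-lib API */

static void check_emulated_instr_status(uint64_t global_status)
{
	/* PERFCTR0 counts branches and PERFCTR1 counts instructions, so
	 * status bit 0 maps to the branch check and bit 1 to the
	 * instruction check. */
	report(global_status & (1ull << 0), "branch counter");
	report(global_status & (1ull << 1), "instruction counter");
}
```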
d7714e16 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Pop up FW prefix to avoid out-of-context propagation
The inappropriate prefix "full-width writes" may be propagated to later test cases if it is not popped out.
Signed-off-by: Like Xu <likexu@tencent.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-4-seanjc@google.com
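A sketch of the push/pop scoping using the libcflat report-prefix API:

```c
extern void report_prefix_push(const char *prefix);   /* libcflat API */
extern void report_prefix_pop(void);

/* Scope the "full-width writes" prefix to the sub-tests that actually run
 * with full-width writes enabled. */
static void test_full_width_writes(void)
{
	report_prefix_push("full-width writes");
	/* ... run the counter tests with full-width writes enabled ... */
	report_prefix_pop();   /* don't leak the prefix into later test cases */
}
```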
00dca75c | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Test emulation instructions on full-width counters
Move check_emulated_instr() into check_counters() so that full-width counters could be tested with ease by the same test case.
Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-3-seanjc@google.com
c3cde0a5 | 02-Nov-2022 | Like Xu <likexu@tencent.com>
x86/pmu: Add PDCM check before accessing PERF_CAP register
On virtual platforms without PDCM support (e.g. AMD), #GP failure on MSR_IA32_PERF_CAPABILITIES is completely avoidable.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102225110.3023543-2-seanjc@google.com
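A minimal sketch of the guard, with an assumed stand-in for the CPUID check:

```c
#include <stdint.h>
#include <stdbool.h>

#define MSR_IA32_PERF_CAPABILITIES 0x345

extern uint64_t rdmsr(uint32_t index);   /* test-lib helper                              */
extern bool cpu_has_pdcm(void);          /* stand-in for this_cpu_has(X86_FEATURE_PDCM)  */

/* Only read PERF_CAPABILITIES when CPUID advertises PDCM; on CPUs without
 * it (e.g. AMD) the MSR doesn't exist and the read would #GP. */
static uint64_t read_perf_capabilities(void)
{
	return cpu_has_pdcm() ? rdmsr(MSR_IA32_PERF_CAPABILITIES) : 0;
}
```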
73d9d850 | 01-Jun-2022 | Bill Wendling <isanbard@gmail.com>
x86/pmu: Disable inlining of measure()
Clang can be more aggressive at inlining than GCC and will fully inline calls to measure(). This can mess with the counter overflow check. To set up the PMC overflow, check_counter_overflow() first records the number of instructions retired in an invocation of measure() and checks to see that subsequent calls to measure() retire the same number of instructions. If inlining occurs, those numbers can be different and the overflow test fails.
FAIL: overflow: cntr-0
PASS: overflow: status-0
PASS: overflow: status clear-0
PASS: overflow: irq-0
FAIL: overflow: cntr-1
PASS: overflow: status-1
PASS: overflow: status clear-1
PASS: overflow: irq-1
FAIL: overflow: cntr-2
PASS: overflow: status-2
PASS: overflow: status clear-2
PASS: overflow: irq-2
FAIL: overflow: cntr-3
PASS: overflow: status-3
PASS: overflow: status clear-3
PASS: overflow: irq-3
Disabling inlining of measure() keeps the assumption that all calls to measure() retire the same number of instructions.
Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Bill Wendling <morbo@google.com> Message-Id: <20220601163012.3404212-1-morbo@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
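A sketch of the change's shape; the counter type is opaque here and the function body is elided:

```c
struct pmu_counter;   /* stands in for the test's counter type */

/* "noinline" keeps every call to measure() identical at the instruction
 * level, so the overflow check's assumption that each invocation retires
 * the same number of instructions also holds under Clang. */
static __attribute__((noinline)) void measure(struct pmu_counter *evt, int count)
{
	/* ... start the event(s), run the workload loop, stop and record ... */
	(void)evt;
	(void)count;
}
```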
7362976d | 08-Aug-2022 | Sean Christopherson <seanjc@google.com>
x86/pmu: Run the "emulation" test iff forced emulation is available
Run the PMU's emulation testcase if and only if forced emulation is available, and do so without requiring the user to manually specify they want to run the emulation testcase.
Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220808164707.537067-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
dfb0ec0f | 08-Aug-2022 | Michal Luczaj <mhal@rbox.co>
x86: Introduce ASM_TRY_FEP() to handle exceptions on forced emulation
Introduce ASM_TRY_FEP() to allow using the try-catch method to handle exceptions that occur on forced emulation. ASM_TRY() mishandles exceptions thrown by the forced-emulation-triggered emulator: the faulting address stored in the exception table points at the forced emulation prefix, but when an exception comes, RIP is 5 bytes (the size of KVM_FEP) ahead because KVM advances RIP to skip the prefix, so the exception ends up unhandled.
Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220808164707.537067-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
14b54ed7 | 26-Jul-2022 | Paolo Bonzini <pbonzini@redhat.com>
Merge tag 'for_paolo' of https://github.com/sean-jc/kvm-unit-tests into HEAD
x86 fixes, cleanups, and new sub-tests:
- Bug fix for the VMX-preemption timer expiration test
- Refactor SVM tests to split out NPT tests
- Add tests for MCE banks to MSR test
- Add SMP Support for x86 UEFI tests
- x86: nVMX: Add VMXON #UD test (and exception cleanup)
- PMU cleanup and related nVMX bug fixes