- 08 June 2022 (6 commits)
-
-
Submitted by Like Xu
Bit 12 of IA32_MISC_ENABLE represents "Processor Event Based Sampling Unavailable (RO)": 1 = PEBS is not supported; 0 = PEBS is supported. A write to this read-only PEBS_UNAVL bit raises #GP(0) when guest PEBS is enabled. Some PEBS drivers in the guest may care about this bit.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20220411101946.20262-13-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
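A minimal sketch of the #GP condition in standalone C; the helper name and the guest_pebs_enabled flag are illustrative, not the actual KVM handler:

    #include <stdbool.h>
    #include <stdint.h>

    #define MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL (1ULL << 12)

    /* Reject the WRMSR with #GP(0) if it tries to toggle the RO bit. */
    static bool misc_enable_write_faults(uint64_t old_val, uint64_t new_val,
                                         bool guest_pebs_enabled)
    {
        return guest_pebs_enabled &&
               ((old_val ^ new_val) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL);
    }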
-
Submitted by Like Xu
If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, adaptive PEBS is supported. The PEBS_DATA_CFG MSR and the adaptive record enable bits (IA32_PERFEVTSELx.Adaptive_Record and IA32_FIXED_CTR_CTRL.FCx_Adaptive_Record) are also supported. Adaptive PEBS gives software the capability to configure PEBS records to capture only the data of interest, keeping the record size compact. An overflow of PMCx results in the generation of an adaptive PEBS record with state information based on the selections specified in MSR_PEBS_DATA_CFG. By default, the record contains only the Basic group. When guest adaptive PEBS is enabled, the IA32_PEBS_ENABLE MSR is added to perf_guest_switch_msr() and switched during the VMX transitions, just like the CORE_PERF_GLOBAL_CTRL MSR. According to the Intel SDM, software is recommended to use PEBS Baseline when the following is true: IA32_PERF_CAPABILITIES.PEBS_BASELINE[14] && IA32_PERF_CAPABILITIES.PEBS_FMT[11:8] ≥ 4.

Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220411101946.20262-12-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
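The SDM recommendation quoted above, as a standalone C sketch (field names follow the SDM; the helper itself is illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    #define PERF_CAP_PEBS_BASELINE  (1ULL << 14)
    #define PERF_CAP_PEBS_FMT_SHIFT 8
    #define PERF_CAP_PEBS_FMT_MASK  0xfULL

    static bool pebs_baseline_recommended(uint64_t perf_capabilities)
    {
        uint64_t fmt = (perf_capabilities >> PERF_CAP_PEBS_FMT_SHIFT) &
                       PERF_CAP_PEBS_FMT_MASK;

        return (perf_capabilities & PERF_CAP_PEBS_BASELINE) && fmt >= 4;
    }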
-
Submitted by Like Xu
When CPUID.01H:EDX.DS[21] is set, the IA32_DS_AREA MSR exists and points to the linear address of the first byte of the DS buffer management area, which is used to manage PEBS records. When guest PEBS is enabled, the MSR_IA32_DS_AREA MSR is added to perf_guest_switch_msr() and switched during the VMX transitions, just like the CORE_PERF_GLOBAL_CTRL MSR. A WRMSR to the IA32_DS_AREA MSR raises #GP(0) if the source register contains a non-canonical address.

Originally-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-11-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
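A sketch of the canonical-address check, assuming a 48-bit virtual address space; the real KVM code also accounts for 5-level paging (LA57):

    #include <stdbool.h>
    #include <stdint.h>

    /* An address is canonical iff bits 63:47 are a sign extension of bit 47. */
    static bool is_noncanonical_48(uint64_t addr)
    {
        return (uint64_t)((int64_t)(addr << 16) >> 16) != addr;
    }

    /* A WRMSR handler would do:
     *   if (is_noncanonical_48(data)) inject #GP(0); else ds_area = data;
     */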
-
Submitted by Like Xu
If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the IA32_PEBS_ENABLE MSR exists, and all architecturally enumerated fixed and general-purpose counters have corresponding bits in IA32_PEBS_ENABLE that enable generation of PEBS records. The general-purpose counter bits start at bit IA32_PEBS_ENABLE[0], and the fixed counter bits start at bit IA32_PEBS_ENABLE[32]. When guest PEBS is enabled, the IA32_PEBS_ENABLE MSR is added to perf_guest_switch_msr() and atomically switched during the VMX transitions, just like the CORE_PERF_GLOBAL_CTRL MSR. Based on whether the platform supports x86_pmu.pebs_ept, the way additional MSRs are added to arr[] in intel_guest_get_msrs() is also refactored for extensibility.

Originally-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-8-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
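A sketch of appending an entry to the atomically switched MSR list; the struct layout mirrors perf's guest-switch array but is simplified, and the helper is illustrative:

    #include <stdint.h>

    #define MSR_IA32_PEBS_ENABLE 0x3f1

    struct perf_guest_switch_msr {
        unsigned int msr;
        uint64_t host, guest;
    };

    /* Append IA32_PEBS_ENABLE to the MSRs switched at VM-entry/VM-exit;
     * returns the new entry count. */
    static int add_pebs_enable(struct perf_guest_switch_msr *arr, int nr,
                               uint64_t host_val, uint64_t guest_val)
    {
        arr[nr].msr = MSR_IA32_PEBS_ENABLE;
        arr[nr].host = host_val;
        arr[nr].guest = guest_val;
        return nr + 1;
    }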
-
Submitted by Like Xu
The mask value of the fixed counter control register should be dynamically adjusted with the number of fixed counters. This patch introduces a variable that holds the reserved bits of the fixed counter control registers. This is a generic code refactoring.

Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-6-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
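A sketch of computing such a reserved-bits mask. Each fixed counter owns a 4-bit control field in MSR_CORE_PERF_FIXED_CTR_CTRL; this sketch assumes the EN_OS/EN_USR/PMI bits (0xb per field) are the valid ones and keeps the AnyThread bit reserved:

    #include <stdint.h>

    static uint64_t fixed_ctr_ctrl_reserved_bits(int nr_fixed_counters)
    {
        uint64_t mask = ~0ULL; /* start with everything reserved */
        int i;

        for (i = 0; i < nr_fixed_counters; i++)
            mask &= ~(0xbULL << (i * 4)); /* clear valid bits per counter */
        return mask;
    }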
-
Submitted by Like Xu
On Intel platforms, software can use the IA32_MISC_ENABLE[7] bit to detect whether the processor supports the performance monitoring facility. Its value depends on whether the PMU is enabled for the guest, and a software write to this availability bit is ignored. Ignoring the toggle in KVM is the way to go, and that behavior matches bare metal.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220411101946.20262-5-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 22 April 2022 (1 commit)
-
-
Submitted by Like Xu
The NMI watchdog is one of the favorite features of kernel developers, but it does not work in an AMD guest even with the vPMU enabled; worse, the system misrepresents this capability via /proc. This is a PMC emulation error. KVM does not pass the latest valid value to the perf_event in time when the guest NMI watchdog is running, so the perf_event corresponding to the watchdog counter falls back to a stale state at some point after the first guest NMI injection, forcing the hardware register PMC0 to be constantly written with 0x800000000001. Instead, the running counter should accurately reflect its new value based on the latest coordinated pmc->counter (from the vPMC's point of view) rather than the value written directly by the guest.

Fixes: 168d918f ("KVM: x86: Adjust counter sample period after a wrmsr")
Reported-by: Dongli Cao <caodongli@kingsoft.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220409015226.38619-1-likexu@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
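The spirit of the fix, as a standalone C sketch (struct and helper names are illustrative, not the KVM code): every guest write re-derives the perf state from the latest pmc->counter, so the backing event never keeps running on a stale period.

    #include <stdint.h>

    struct pmc { uint64_t counter, bitmask, sample_period; };

    /* Re-derive the overflow distance from the up-to-date counter value. */
    static void reprogram_period(struct pmc *pmc)
    {
        pmc->sample_period = (-pmc->counter) & pmc->bitmask;
    }

    static void pmc_write_counter(struct pmc *pmc, uint64_t val)
    {
        pmc->counter = val & pmc->bitmask;
        reprogram_period(pmc);
    }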
-
- 14 April 2022 (1 commit)
-
-
Submitted by Like Xu
Move pmu_ops to kvm_x86_init_ops and tag it as __initdata. That saves those precious few bytes and, more importantly, makes the original ops unreachable, i.e. makes it harder to sneak in post-init modification bugs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220329235054.3534728-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 02 April 2022 (2 commits)
-
-
Submitted by Like Xu
The HSW_IN_TX* bits are used in generic code but are not supported on AMD. Worse, these bits overlap with AMD's EventSelect[11:8], so using the HSW_IN_TX* bits unconditionally in generic code results in unintended PMU behavior on AMD. For example, if EventSelect[11:8] is 0x2, pmc_reprogram_counter() wrongly assumes that HSW_IN_TX_CHECKPOINTED is set and thus forces the sampling period to 0. Also, per the SDM, both bits 32 and 33 "may only be set if the processor supports HLE or RTM", and for "IN_TXCP (bit 33): this bit may only be set for IA32_PERFEVTSEL2". Opportunistically eliminate code redundancy: if an HSW_IN_TX* bit is set in pmc->eventsel, it is already set in attr.config.

Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reported-by: Jim Mattson <jmattson@google.com>
Fixes: 103af0a9 ("perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5")
Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220309084257.88931-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
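A small standalone demonstration of the overlap described above (AMD's extended EventSelect[11:8] lives in MSR bits 35:32; the bit positions come from the respective vendor manuals):

    #include <assert.h>
    #include <stdint.h>

    #define HSW_IN_TX              (1ULL << 32) /* Intel TSX modifier */
    #define HSW_IN_TX_CHECKPOINTED (1ULL << 33) /* Intel TSX modifier */

    int main(void)
    {
        /* EventSelect[11:8] = 0x2 puts a 1 in bit 33, which generic code
         * would misread as IN_TXCP on AMD: */
        uint64_t amd_eventsel = 0x2ULL << 32;

        assert(amd_eventsel & HSW_IN_TX_CHECKPOINTED);
        return 0;
    }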
-
Submitted by Jim Mattson
The third nybble of AMD's event select overlaps with Intel's IN_TX and IN_TXCP bits, so AMD64_RAW_EVENT_MASK cannot be used on Intel platforms that support TSX. Declare a raw_event_mask in the kvm_pmu structure, initialize it in the vendor-specific pmu_refresh() functions, and use that mask for PERF_TYPE_RAW configurations in reprogram_gp_counter().

Fixes: 710c4765 ("KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW")
Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220308012452.3468611-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 25 February 2022 (1 commit)
-
-
Submitted by David Dunn
Add a new capability, KVM_CAP_PMU_CAPABILITY, that takes a bitmask of settings/features, allowing userspace to configure PMU virtualization on a per-VM basis. For now, support a single flag, KVM_PMU_CAP_DISABLE, to allow disabling PMU virtualization for a VM even when KVM is configured with enable_pmu=true at the module level. To keep KVM simple, disallow changing a VM's PMU configuration after vCPUs have been created.

Signed-off-by: David Dunn <daviddunn@google.com>
Message-Id: <20220223225743.2703915-2-daviddunn@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
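A sketch of how userspace might use the capability (the constants come from the kvm uapi headers on a kernel that has this capability; error handling omitted):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    /* vm_fd is an open VM file descriptor; per the restriction above,
     * this must be called before any vCPU is created. */
    static int disable_vpmu(int vm_fd)
    {
        struct kvm_enable_cap cap = {
            .cap = KVM_CAP_PMU_CAPABILITY,
            .args = { KVM_PMU_CAP_DISABLE },
        };

        return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }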
-
- 11 February 2022 (1 commit)
-
-
Submitted by Sean Christopherson
Refactor the nested VMX PMU refresh helper to pass it a flag stating whether or not the vCPU has PERF_GLOBAL_CTRL, instead of having the nVMX helper query the information by bouncing through kvm_x86_ops.pmu_ops. This will allow a future patch to use static_call() for the PMU ops without having to export any static-call definitions from common x86, and it is also a step toward an unexported kvm_x86_ops. Alternatively, nVMX could call kvm_pmu_is_valid_msr() to indirectly use kvm_x86_ops.pmu_ops, but that would incur an extra layer of indirection and would require exporting kvm_pmu_is_valid_msr(). Opportunistically rename the helper to keep line lengths somewhat reasonable and to better capture its high-level role. No functional change intended.

Cc: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220128005208.4008533-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 02 February 2022 (1 commit)
-
-
Submitted by Wei Wang
The KVM vPMU doesn't support emulating all the fixed counters that the host PMU driver supports; e.g. fixed counter 3, used by Topdown metrics, hasn't been supported by KVM so far. Rename MAX_FIXED_COUNTERS to KVM_PMC_MAX_FIXED for a more straightforward naming convention, matching the INTEL_PMC_MAX_FIXED used by the host PMU driver, and fix the vPMU to use the KVM-side KVM_PMC_MAX_FIXED for virtual fixed counter emulation instead of the host-side INTEL_PMC_MAX_FIXED.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1643750603-100733-2-git-send-email-kan.liang@linux.intel.com
-
- 18 January 2022 (2 commits)
-
-
Submitted by Like Xu
The new module parameter to control PMU virtualization should apply to Intel as well as AMD, for situations where userspace is not trusted. If the module parameter allows PMU virtualization, there could be a new KVM_CAP or guest CPUID bits whereby userspace can enable/disable PMU virtualization on a per-VM basis. If the module parameter does not allow PMU virtualization, there should be no userspace override, since we have no precedent for authorizing that kind of override. If it's false, other counter-based profiling features (such as LBR, including the associated CPUID bits, if any) will not be exposed. Change its name from "pmu" to "enable_pmu", as we have temporary variables with the same name in our code, like "struct kvm_pmu *pmu".

Fixes: b1d66dad ("KVM: x86/svm: Add module param to control PMU virtualization")
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220111073823.21885-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Like Xu
According to the CPUID 0x0A.EBX bit vector, event [7] should be the unrealized event "Topdown Slots" rather than the *kernel* generalized common hardware event "REF_CPU_CYCLES", so we need to skip the CPUID unavailability check in intel_pmc_perf_hw_id() for the last REF_CPU_CYCLES event and update the confusing comment. If an event is marked as unavailable in the Intel guest CPUID 0AH.EBX leaf, we need to avoid any perf_event creation, whether it's for a gp or a fixed counter. To distinguish a rejected event from an event that needs to be programmed with the PERF_TYPE_RAW type, a new special return value of "PERF_COUNT_HW_MAX + 1" is introduced.

Fixes: 62079d8a ("KVM: PMU: add proper support for fixed counter 2")
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220105051509.69437-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
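A sketch of the sentinel scheme; which of the two cases gets which value is this sketch's assumption, not something the message spells out:

    #include <linux/perf_event.h>
    #include <stdbool.h>

    /* ids below PERF_COUNT_HW_MAX: generic hardware events;
     * PERF_COUNT_HW_MAX: no generic mapping, program as PERF_TYPE_RAW;
     * PERF_COUNT_HW_MAX + 1: marked unavailable in guest CPUID 0AH.EBX. */
    static bool event_rejected(unsigned int hw_id)
    {
        return hw_id == PERF_COUNT_HW_MAX + 1;
    }

    static bool event_needs_raw_type(unsigned int hw_id)
    {
        return hw_id == PERF_COUNT_HW_MAX;
    }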
-
- 07 January 2022 (4 commits)
-
-
Submitted by Like Xu
Since we set the same semantic event value for a fixed counter in pmc->eventsel, returning the perf_hw_id for a fixed counter via find_fixed_event() can be painlessly replaced by pmc_perf_hw_id() with the help of a pmc_is_fixed() check.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-4-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Like Xu
find_arch_event() returns an "unsigned int" value, which is used by pmc_reprogram_counter() to program a PERF_TYPE_HARDWARE-type perf_event. The returned value is actually the kernel-defined generic perf_hw_id; rename the function to pmc_perf_hw_id(), with simpler incoming parameters, for better self-explanation.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-3-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Like Xu
The current pmc->eventsel for fixed counters is underutilized. pmc->eventsel can be set up for all known available fixed counters, since we have a mapping between the fixed pmc index and the intel_arch_events array. Whether for a gp or a fixed counter, this simplifies the later consistency checks between eventsel and perf_hw_id.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini
Because Ice Lake has 4 fixed performance counters but KVM supports only 3, it is possible for reprogram_fixed_counters to pass reprogram_fixed_counter an index that is out of bounds for the fixed_pmc_events array. Ultimately intel_find_fixed_event, the only place that uses fixed_pmc_events, handles this correctly because it checks against the size of fixed_pmc_events anyway. Every other place operates on the fixed_counters[] array, which is sized according to INTEL_PMC_MAX_FIXED. However, it is cleaner if the unsupported performance counters are culled early in reprogram_fixed_counters.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
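A sketch of culling early: zero the 4-bit control fields of fixed counters beyond what KVM supports before iterating (the helper is illustrative, not the actual reprogram_fixed_counters code):

    #include <stdint.h>

    /* e.g. nr_supported = 3 on a 4-fixed-counter Ice Lake host. */
    static uint64_t cull_fixed_ctr_ctrl(uint64_t data, int nr_supported)
    {
        uint64_t valid = (nr_supported >= 16)
                       ? ~0ULL
                       : (1ULL << (nr_supported * 4)) - 1;
        return data & valid;
    }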
-
- 11 November 2021 (1 commit)
-
-
Submitted by Jim Mattson
These function names sound like predicates, and they have siblings, *is_valid_msr(), which _are_ predicates. Moreover, there are comments that essentially warn that these functions behave unexpectedly. Flip the polarity of the return values so that they become predicates, and convert the boolean result to a success/failure code at the outer call site.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211105202058.1048757-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 22 October 2021 (1 commit)
-
-
Submitted by Wanpeng Li
SDM section 18.2.3 mentions: "The IA32_PERF_GLOBAL_OVF_CTL MSR allows software to clear overflow indicator(s) of any general-purpose or fixed-function counters via a single WRMSR." It is documented as R/W by the SDM; reading this MSR on bare metal during perf testing, the value is always 0 for the ICX/SKX boxes on hand. Let's have get_msr fill MSR_CORE_PERF_GLOBAL_OVF_CTRL with 0, matching hardware behavior, and drop the global_ovf_ctrl variable.

Tested-by: Like Xu <likexu@tencent.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1634631160-67276-2-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
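A sketch of the resulting behavior (simplified standalone C; the handler names and the bare global variable are illustrative):

    #include <stdint.h>

    static uint64_t global_status; /* MSR_CORE_PERF_GLOBAL_STATUS image */

    /* WRMSR clears the selected overflow indicators... */
    static void write_global_ovf_ctrl(uint64_t data)
    {
        global_status &= ~data;
    }

    /* ...while RDMSR simply observes 0, matching bare metal on ICX/SKX. */
    static uint64_t read_global_ovf_ctrl(void)
    {
        return 0;
    }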
-
- 04 August 2021 (1 commit)
-
-
Submitted by Like Xu
Based on our observations, after any vm-exit associated with the vPMU, there are at least two or more perf interfaces to be called for guest counter emulation, such as perf_event_{pause, read_value, period}(), and each one will {lock, unlock} the same perf_event_ctx. The frequency of calls becomes more severe when the guest uses counters in a multiplexed manner. Holding a lock once and completing the KVM request operations in the perf context would introduce a set of impractical new interfaces. So we can further optimize the vPMU implementation by avoiding repeated calls to these interfaces in the KVM context, for at least one pattern: after we call perf_event_pause() once, the event is disabled and its internal count is reset to 0, so there is no need to pause it again or read its value. Once the event is paused, the event period will not be updated until the next time it's resumed or reprogrammed. There is also no need to call perf_event_period() twice for a non-running counter, considering that the perf_event for a running counter is never paused. Based on this implementation, for the following common usage of sampling 4 events using perf on a 4u8g guest:

    echo 0 > /proc/sys/kernel/watchdog
    echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
    echo 10000 > /proc/sys/kernel/perf_event_max_sample_rate
    echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
    for i in `seq 1 1 10`
    do
        taskset -c 0 perf record \
            -e cpu-cycles -e instructions -e branch-instructions -e cache-misses \
            /root/br_instr a
    done

the average latency of the guest NMI handler is reduced from 37646.7 ns to 32929.3 ns (~1.14x speedup) on the Intel ICX server. Also, in addition to collecting more samples, no loss of sampling accuracy was observed compared to before the optimization.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20210728120705.6855-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
-
- 24 February 2021 (1 commit)
-
-
Submitted by Like Xu
If lbr_desc->event is successfully created, intel_pmu_create_guest_lbr_event() returns 0; otherwise it returns -ENOENT and then jumps to the LBR MSRs dummy handling.

Fixes: 1b5ac322 ("KVM: vmx/pmu: Pass-through LBR msrs when the guest LBR event is ACTIVE")
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210223013958.1280444-1-like.xu@linux.intel.com>
[Add "< 0" and PTR_ERR to make the code clearer. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 04 February 2021 (8 commits)
-
-
Submitted by Like Xu
The vPMU uses GUEST_LBR_IN_USE_IDX (bit 58) in 'pmu->pmc_in_use' to indicate whether a guest LBR event is still needed by the vcpu. If the vcpu no longer accesses LBR-related registers within a scheduling time slice and the LBR enable bit has been unset, the vPMU treats the guest LBR event as a bland event of a vPMC counter and releases it as usual. The pass-through state of the LBR records MSRs is also cancelled.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-10-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Like Xu
The current vPMU only supports Architecture Version 2. According to Intel SDM "17.4.7 Freezing LBR and Performance Counters on PMI", if IA32_DEBUGCTL.Freeze_LBR_On_PMI = 1, the LBR is frozen on the virtual PMI, and KVM emulates this by clearing the LBR bit (bit 0) in IA32_DEBUGCTL. The guest then needs to re-enable IA32_DEBUGCTL.LBR to resume recording branches.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-9-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
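A sketch of the emulated freeze (bit positions per the SDM DEBUGCTL layout; the helper is illustrative):

    #include <stdint.h>

    #define DEBUGCTLMSR_LBR                (1ULL << 0)
    #define DEBUGCTLMSR_FREEZE_LBRS_ON_PMI (1ULL << 11)

    /* On a virtual PMI, clear DEBUGCTL.LBR so the guest sees frozen records
     * and must set the bit again to resume recording. */
    static uint64_t debugctl_on_pmi(uint64_t debugctl)
    {
        if (debugctl & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)
            debugctl &= ~DEBUGCTLMSR_LBR;
        return debugctl;
    }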
-
Submitted by Like Xu
When the LBR records MSRs have already been passed through, there is no need to call vmx_update_intercept_for_lbr_msrs() again and again, and vice versa.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-8-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
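The idempotence pattern in sketch form (the struct and field names are illustrative, not the exact KVM ones):

    #include <stdbool.h>

    struct lbr_desc { bool msr_passthrough; };

    /* Only rewrite the MSR intercept bitmap on an actual state change. */
    static void lbr_set_passthrough(struct lbr_desc *lbr, bool enable)
    {
        if (lbr->msr_passthrough == enable)
            return;
        /* vmx_update_intercept_for_lbr_msrs(...) would run here */
        lbr->msr_passthrough = enable;
    }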
-
Submitted by Like Xu
In addition to DEBUGCTLMSR_LBR, any KVM trap caused by an LBR MSRs access will result in the creation of a guest LBR event per vcpu. If the guest LBR event is scheduled on with the corresponding vcpu context, KVM passes through all LBR records MSRs to the guest. The LBR callstack mechanism implemented in the host helps save/restore the guest LBR records during the event context switches, which saves a lot of overhead compared to saving/restoring tens of LBR MSRs (e.g. 32 LBR records entries) in the much more frequent VMX transitions. To avoid reclaiming LBR resources from any higher-priority event on the host, KVM always checks the existence of the guest LBR event and its state before vm-entry, as late as possible. A negative result cancels the pass-through state, and it also prevents real register accesses and potential data leakage. If the host reclaims the LBR between two checks, the interception state and LBR records can be safely preserved thanks to native save/restore support from the guest LBR event. KVM emits a pr_warn() when the LBR hardware is unavailable to the guest LBR event. The administrator is supposed to remind users that the guest result may be inaccurate if someone is using the LBR to record the hypervisor on the host side.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-7-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Like Xu
When the vcpu sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR, the KVM handler creates a guest LBR event which enables the callstack mode and to which no hardware counter is assigned. The host perf schedules and enables this event as usual, but in an exclusive way. The guest LBR event is released when the vPMU is reset; soon, though, the lazy release mechanism will be applied to this event like a vPMC.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-6-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
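A sketch of the shape of such an event request, assuming the perf uapi branch-sampling flags; the exact attr fields KVM uses may differ, this only illustrates callstack-mode branch sampling without consuming a hardware counter:

    #include <linux/perf_event.h>
    #include <string.h>

    static void init_guest_lbr_attr(struct perf_event_attr *attr)
    {
        memset(attr, 0, sizeof(*attr));
        attr->type = PERF_TYPE_RAW;
        attr->size = sizeof(*attr);
        attr->pinned = 1; /* scheduled exclusively, as described above */
        attr->sample_type = PERF_SAMPLE_BRANCH_STACK;
        attr->branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK |
                                   PERF_SAMPLE_BRANCH_USER;
    }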
-
Submitted by Like Xu
Userspace could set bits [0, 5] of the IA32_PERF_CAPABILITIES MSR, which tell about the record format stored in the LBR records. LBR will be enabled on the guest if host perf supports LBR (checked via x86_perf_get_lbr()) and the vcpu model is compatible with the host one.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
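A sketch of the format-compatibility check (the field position follows the SDM; the helper is illustrative, the real code obtains the host format via x86_perf_get_lbr()):

    #include <stdbool.h>
    #include <stdint.h>

    #define PERF_CAP_LBR_FMT 0x3fULL /* IA32_PERF_CAPABILITIES[5:0] */

    static bool lbr_fmt_compatible(uint64_t guest_caps, uint64_t host_caps)
    {
        return (guest_caps & PERF_CAP_LBR_FMT) ==
               (host_caps & PERF_CAP_LBR_FMT);
    }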
-
Submitted by Paolo Bonzini
Userspace could set bits [0, 5] of the IA32_PERF_CAPABILITIES MSR, which tell about the record format stored in the LBR records. LBR will be enabled on the guest if host perf supports LBR (checked via x86_perf_get_lbr()) and the vcpu model is compatible with the host one.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini
Once MSR_IA32_PERF_CAPABILITIES is changed via vmx_set_msr(), the value should not be changed by cpuid(). To ensure that the new value is kept, the default initialization path is moved to intel_pmu_init(). The effective value of the MSR will be 0 if PDCM is clear, however.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 26 January 2021 (2 commits)
-
-
Submitted by Like Xu
The HW_REF_CPU_CYCLES event on fixed counter 2 is pseudo-encoded as 0x0300 in intel_perfmon_event_map[]. Correct its usage.

Fixes: 62079d8a ("KVM: PMU: add proper support for fixed counter 2")
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20201230081916.63417-1-like.xu@linux.intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
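The pseudo-encoding spelled out as a sketch (struct and variable names are illustrative): 0x0300 is unit mask 0x03 with event code 0x00, a combination that names the fixed counter 2 event but is not programmable in IA32_PERFEVTSELx.

    #include <stdint.h>

    struct pmu_event { uint8_t eventsel; uint8_t unit_mask; };

    static const struct pmu_event hw_ref_cpu_cycles = {
        .eventsel  = 0x00, /* low byte of 0x0300 */
        .unit_mask = 0x03, /* high byte of 0x0300 */
    };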
-
Submitted by Like Xu
Since we know the vPMU will not work properly when (1) the guest bit_width(s) of the [gp|fixed] counters are greater than the host ones, or (2) the guest requested architectural events exceed the range supported by the host, we can set up a smaller left shift value and refresh the guest cpuid entry, thus fixing the following UBSAN shift-out-of-bounds warning:

    shift exponent 197 is too large for 64-bit type 'long long unsigned int'

    Call Trace:
     __dump_stack lib/dump_stack.c:79 [inline]
     dump_stack+0x107/0x163 lib/dump_stack.c:120
     ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
     __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
     intel_pmu_refresh.cold+0x75/0x99 arch/x86/kvm/vmx/pmu_intel.c:348
     kvm_vcpu_after_set_cpuid+0x65a/0xf80 arch/x86/kvm/cpuid.c:177
     kvm_vcpu_ioctl_set_cpuid2+0x160/0x440 arch/x86/kvm/cpuid.c:308
     kvm_arch_vcpu_ioctl+0x11b6/0x2d70 arch/x86/kvm/x86.c:4709
     kvm_vcpu_ioctl+0x7b9/0xdb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3386
     vfs_ioctl fs/ioctl.c:48 [inline]
     __do_sys_ioctl fs/ioctl.c:753 [inline]
     __se_sys_ioctl fs/ioctl.c:739 [inline]
     __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
     do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+ae488dc136a4cc6ba32b@syzkaller.appspotmail.com
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210118025800.34620-1-like.xu@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 11 July 2020 (1 commit)
-
-
Submitted by Vitaly Kuznetsov
The state_test/smm_test selftests are failing on AMD with: "Unexpected result from KVM_GET_MSRS, r: 51 (failed MSR was 0x345)". MSR_IA32_PERF_CAPABILITIES is an emulated MSR on Intel, but it is not known to the AMD code; we can move the emulation to common x86 code. For AMD, we basically just allow the host to read and write zero to the MSR.

Fixes: 27461da3 ("KVM: x86/pmu: Support full width counting")
Suggested-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710152559.1645827-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 June 2020 (1 commit)
-
-
Submitted by Sean Christopherson
Unconditionally return true when querying the validity of MSR_IA32_PERF_CAPABILITIES, so as to defer the validity check to intel_pmu_{get,set}_msr(), which can properly give the MSR a pass when the access is initiated from host userspace. The MSR is emulated, so there is no underlying hardware dependency to worry about.

Fixes: 27461da3 ("KVM: x86/pmu: Support full width counting")
Cc: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200603203303.28545-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 01 June 2020 (2 commits)
-
-
Submitted by Like Xu
Intel CPUs have a new alternative MSR range (starting from MSR_IA32_PMC0) for GP counters that allows writing the full counter width. Enable this range from a new capability bit (IA32_PERF_CAPABILITIES.FW_WRITE[bit 13]). The guest queries CPUID to get the counter width and sign-extends the counter values as needed. The traditional MSRs always limit writes to 32 bits, even though the counter is internally larger (48 or 57 bits). When the new capability is set, use the alternative range, which does not have these restrictions. This lowers the overhead of perf stat slightly, because it has to take fewer interrupts to accumulate the counter value.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20200529074347.124619-3-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
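A sketch of the two write behaviors (standalone C; the helper is illustrative, the bit position of FW_WRITE follows the message above):

    #include <stdbool.h>
    #include <stdint.h>

    #define PERF_CAP_FW_WRITES (1ULL << 13) /* IA32_PERF_CAPABILITIES.FW_WRITE */

    /* Legacy counter MSR writes keep 32 bits, sign-extended into the counter
     * width; full-width writes (the MSR_IA32_PMC0 range) take every bit. */
    static uint64_t counter_write(uint64_t data, int width, bool full_width)
    {
        uint64_t mask = (width < 64) ? (1ULL << width) - 1 : ~0ULL;

        if (!full_width)
            data = (uint64_t)(int64_t)(int32_t)data;
        return data & mask;
    }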
-
Submitted by Wei Wang
Change kvm_pmu_get_msr() to take the msr_data struct, as the host_initiated field from the struct can be used by get_msr. This also makes the API consistent with kvm_pmu_set_msr. No functional changes.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Message-Id: <20200529074347.124619-2-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 17 March 2020 (2 commits)
-
-
Submitted by Sean Christopherson
Use vmx_pt_mode_is_host_guest() in intel_pmu_refresh() instead of bouncing through kvm_x86_ops->pt_supported, and remove ->pt_supported(), as the PMU code was the last remaining user. Opportunistically clean up the wording of a comment that referenced kvm_x86_ops->pt_supported(). No functional change intended.

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Eric Hankland
The sample_period of a counter tracks when that counter will overflow and set the global status/trigger a PMI. However, this currently only gets set when the initial counter is created or when a counter is resumed; update the sample period after a wrmsr so that running counters accurately reflect their new value.

Signed-off-by: Eric Hankland <ehankland@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 February 2020 (1 commit)
-
-
Submitted by Eric Hankland
Correct the logic in intel_pmu_set_msr() for fixed and general-purpose counters. This was recently changed to set pmc->counter without taking into account the value of pmc_read_counter(), which will be incorrect if the counter is currently running and non-zero; this changes back to the old logic, which accounted for the value of currently running counters.

Signed-off-by: Eric Hankland <ehankland@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-