1. 08 Jun 2022 (6 commits)
  2. 22 Apr 2022 (1 commit)
  3. 14 Apr 2022 (1 commit)
  4. 02 Apr 2022 (2 commits)
  5. 25 Feb 2022 (1 commit)
  6. 11 Feb 2022 (1 commit)
    • KVM: nVMX: Refactor PMU refresh to avoid referencing kvm_x86_ops.pmu_ops · 0bcd556e
      Committed by Sean Christopherson
      Refactor the nested VMX PMU refresh helper to pass it a flag stating
      whether or not the vCPU has PERF_GLOBAL_CTRL instead of having the nVMX
      helper query the information by bouncing through kvm_x86_ops.pmu_ops.
      This will allow a future patch to use static_call() for the PMU ops
      without having to export any static call definitions from common x86, and
      it is also a step toward unexported kvm_x86_ops.
      
      Alternatively, nVMX could call kvm_pmu_is_valid_msr() to indirectly use
      kvm_x86_ops.pmu_ops, but that would incur an extra layer of indirection
      and would require exporting kvm_pmu_is_valid_msr().
      
      Opportunistically rename the helper to keep line lengths somewhat
      reasonable, and to better capture its high-level role.
      
      No functional change intended.
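
      The shape of the change, as a minimal sketch (abbreviated; names as
      used by this patch, details approximate):

        /* Before: nVMX bounced through the ops pointer for the answer. */
        if (!kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL))
                return;

        /* After: Intel PMU code, already behind pmu_ops, passes it in. */
        static void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
                                           bool vcpu_has_perf_global_ctrl);

        /* Call site in intel_pmu_refresh(): */
        nested_vmx_pmu_refresh(vcpu,
                               intel_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL));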
      
      Cc: Like Xu <like.xu.linux@gmail.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220128005208.4008533-9-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  7. 02 Feb 2022 (1 commit)
  8. 18 Jan 2022 (2 commits)
    • KVM: x86: Making the module parameter of vPMU more common · 4732f244
      Committed by Like Xu
      The new module parameter to control PMU virtualization should apply
      to Intel as well as AMD, for situations where userspace is not trusted.
      If the module parameter allows PMU virtualization, there could be a
      new KVM_CAP or guest CPUID bits whereby userspace can enable/disable
      PMU virtualization on a per-VM basis.
      
      If the module parameter does not allow PMU virtualization, there
      should be no userspace override, since we have no precedent for
      authorizing that kind of override. If the parameter is false, other
      counter-based profiling features (such as LBR, including any associated
      CPUID bits) are not exposed either.
      
      Change its name from "pmu" to "enable_pmu", as the code already has
      local variables with the same name, such as "struct kvm_pmu *pmu".
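
      A minimal sketch of the renamed, now-common knob (file placement per
      this series is assumed; treat details as approximate):

        /* arch/x86/kvm/x86.c: one knob shared by Intel and AMD. */
        bool __read_mostly enable_pmu = true;
        EXPORT_SYMBOL_GPL(enable_pmu);
        module_param(enable_pmu, bool, 0444);  /* read-only at runtime */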
      
      Fixes: b1d66dad ("KVM: x86/svm: Add module param to control PMU virtualization")
      Suggested-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220111073823.21885-1-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event · a2186448
      Committed by Like Xu
      According to the CPUID 0x0A.EBX bit vector, event [7] should be the
      not-yet-implemented "Topdown Slots" event rather than the *kernel*
      generalized common hardware event REF_CPU_CYCLES, so we need to skip
      the CPUID unavailability check in intel_pmc_perf_hw_id() for the last
      REF_CPU_CYCLES event and update the confusing comment.
      
      If the event is marked as unavailable in the Intel guest CPUID
      0AH.EBX leaf, we need to avoid any perf_event creation, whether
      it's a gp or fixed counter. To distinguish whether it is a rejected
      event or an event that needs to be programmed with PERF_TYPE_RAW type,
      a new special returned value of "PERF_COUNT_HW_MAX + 1" is introduced.
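
      A sketch of the two return paths in intel_pmc_perf_hw_id() (indices
      and predicates approximate; event [7] is exempted as described above):

        if (i == ARRAY_SIZE(intel_arch_events))
                return PERF_COUNT_HW_MAX;  /* no match: program as PERF_TYPE_RAW */

        /* Matched but marked unavailable in guest CPUID 0AH.EBX: reject,
         * skipping the check for the last (REF_CPU_CYCLES) entry.
         */
        if (i < 7 && !(pmu->available_event_types & (1 << i)))
                return PERF_COUNT_HW_MAX + 1;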
      
      Fixes: 62079d8a ("KVM: PMU: add proper support for fixed counter 2")
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220105051509.69437-1-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  9. 07 Jan 2022 (4 commits)
    • KVM: x86/pmu: Reuse pmc_perf_hw_id() and drop find_fixed_event() · 6ed1298e
      Committed by Like Xu
      Since we set the same semantic event value for a fixed counter in
      pmc->eventsel, returning the perf_hw_id for a fixed counter via
      find_fixed_event() can be painlessly replaced by pmc_perf_hw_id()
      with the help of a pmc_is_fixed() check.
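
      The caller-side effect, as a sketch (one lookup path for both kinds
      of counters; names per this series):

        /* Before: fixed counters had their own index-keyed lookup. */
        type = find_fixed_event(idx);

        /* After: the pmc itself carries enough state for both cases. */
        type = pmc_perf_hw_id(pmc);  /* works for gp and fixed PMCs alike */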
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-4-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/pmu: Refactoring find_arch_event() to pmc_perf_hw_id() · 7c174f30
      Committed by Like Xu
      find_arch_event() returns an "unsigned int" value, which
      pmc_reprogram_counter() uses to program a PERF_TYPE_HARDWARE type
      perf_event.

      The returned value is actually the kernel-defined generic perf_hw_id,
      so rename the helper to pmc_perf_hw_id(), with simpler incoming
      parameters, for better self-explanation.
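
      The signature change, sketched (exact prototypes approximate):

        /* Before: callers dug the event fields out themselves. */
        static unsigned find_arch_event(struct kvm_pmu *pmu,
                                        u8 event_select, u8 unit_mask);

        /* After: pass the pmc; the helper derives what it needs. */
        static unsigned int pmc_perf_hw_id(struct kvm_pmc *pmc);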
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-3-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs · 76187563
      Committed by Like Xu
      The current pmc->eventsel for fixed counters is underutilised.
      pmc->eventsel can be set up for all known available fixed counters,
      since we have a mapping between the fixed pmc index and the
      intel_arch_events array.

      For either a gp or a fixed counter, this will simplify the later
      checks for consistency between eventsel and perf_hw_id.
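
      A sketch of the setup loop (close to the patch, minus index
      hardening; field order of intel_arch_events[] assumed):

        static void setup_fixed_pmc_eventsel(struct kvm_pmu *pmu)
        {
                int i;

                for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
                        struct kvm_pmc *pmc = &pmu->fixed_counters[i];
                        u32 event = fixed_pmc_events[i];

                        /* Encode umask/event select, gp-counter style. */
                        pmc->eventsel =
                                (intel_arch_events[event].unit_mask << 8) |
                                intel_arch_events[event].eventsel;
                }
        }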
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-2-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: avoid out of bounds indices for fixed performance counters · 006a0f06
      Committed by Paolo Bonzini
      Because IceLake has 4 fixed performance counters but KVM only
      supports 3, it is possible for reprogram_fixed_counters to pass
      to reprogram_fixed_counter an index that is out of bounds for the
      fixed_pmc_events array.
      
      Ultimately intel_find_fixed_event, which is the only place that uses
      fixed_pmc_events, handles this correctly because it checks against the
      size of fixed_pmc_events anyway.  Every other place operates on the
      fixed_counters[] array which is sized according to INTEL_PMC_MAX_FIXED.
      However, it is cleaner if the unsupported performance counters are culled
      early on in reprogram_fixed_counters.
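
      One way to cull early, sketched (the exact clamp in the patch may
      differ; this only illustrates the bound):

        /* Never hand reprogram_fixed_counter() an index beyond what
         * fixed_pmc_events[] can describe.
         */
        int num = min_t(int, pmu->nr_arch_fixed_counters,
                        ARRAY_SIZE(fixed_pmc_events));

        for (i = 0; i < num; i++) {
                /* ...reprogram fixed counter i as before... */
        }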
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  10. 11 Nov 2021 (1 commit)
  11. 22 Oct 2021 (1 commit)
  12. 04 Aug 2021 (1 commit)
    • KVM: x86/pmu: Introduce pmc->is_paused to reduce the call time of perf interfaces · e79f49c3
      Committed by Like Xu
      Based on our observations, after any vm-exit associated with the vPMU,
      at least two perf interfaces are called for guest counter emulation,
      such as perf_event_{pause, read_value, period}(), and each one
      {locks, unlocks} the same perf_event_ctx. The calls become even more
      frequent when the guest uses counters in a multiplexed manner.
      
      Holding a lock once and completing the KVM request operations in the perf
      context would introduce a set of impractical new interfaces. So we can
      further optimize the vPMU implementation by avoiding repeated calls to
      these interfaces in the KVM context for at least one pattern:
      
      After we call perf_event_pause() once, the event will be disabled and its
      internal count will be reset to 0. So there is no need to pause it again
      or read its value. Once the event is paused, event period will not be
      updated until the next time it's resumed or reprogrammed. And there is
      also no need to call perf_event_period twice for a non-running counter,
      considering the perf_event for a running counter is never paused.
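
      The gist of the new flag, sketched close to the patch
      (perf_event_pause() disables the event and returns-and-resets its
      count):

        static void pmc_pause_counter(struct kvm_pmc *pmc)
        {
                u64 counter = pmc->counter;

                /* Already paused: the event is disabled and its count is
                 * zero, so skip another perf_event_ctx lock round trip.
                 */
                if (!pmc->perf_event || pmc->is_paused)
                        return;

                counter += perf_event_pause(pmc->perf_event, true);
                pmc->counter = counter & pmc_bitmask(pmc);
                pmc->is_paused = true;
        }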
      
      Based on this implementation, for the following common usage of
      sampling 4 events using perf on a 4u8g guest:
      
        echo 0 > /proc/sys/kernel/watchdog
        echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
        echo 10000 > /proc/sys/kernel/perf_event_max_sample_rate
        echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
        for i in `seq 1 1 10`
        do
        taskset -c 0 perf record \
        -e cpu-cycles -e instructions -e branch-instructions -e cache-misses \
        /root/br_instr a
        done
      
      the average latency of the guest NMI handler is reduced from
      37646.7 ns to 32929.3 ns (~1.14x speed up) on the Intel ICX server.
      Also, in addition to collecting more samples, no loss of sampling
      accuracy was observed compared to before the optimization.
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20210728120705.6855-1-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
  13. 24 Feb 2021 (1 commit)
  14. 04 Feb 2021 (8 commits)
  15. 26 Jan 2021 (2 commits)
    • KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[] · 98dd2f10
      Committed by Like Xu
      The HW_REF_CPU_CYCLES event on fixed counter 2 is pseudo-encoded as
      0x0300 in intel_perfmon_event_map[]. Correct its usage in
      intel_arch_events[].
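
      The corrected table entry, sketched (assuming intel_arch_events[]
      uses { event_select, unit_mask, event_type } ordering; 0x0300 decodes
      to unit mask 0x03, event select 0x00):

        [7] = { 0x00, 0x03, PERF_COUNT_HW_REF_CPU_CYCLES },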
      
      Fixes: 62079d8a ("KVM: PMU: add proper support for fixed counter 2")
      Signed-off-by: Like Xu <like.xu@linux.intel.com>
      Message-Id: <20201230081916.63417-1-like.xu@linux.intel.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/pmu: Fix UBSAN shift-out-of-bounds warning in intel_pmu_refresh() · e61ab2a3
      Committed by Like Xu
      We know the vPMU will not work properly when (1) the guest bit_width(s)
      of the [gp|fixed] counters are greater than the host's, or (2) the
      guest's requested architectural events exceed the range supported by
      the host. We can therefore set up a smaller left-shift value and
      refresh the guest CPUID entry, fixing the following UBSAN
      shift-out-of-bounds warning:
      
      shift exponent 197 is too large for 64-bit type 'long long unsigned int'
      
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x107/0x163 lib/dump_stack.c:120
       ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
       __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
       intel_pmu_refresh.cold+0x75/0x99 arch/x86/kvm/vmx/pmu_intel.c:348
       kvm_vcpu_after_set_cpuid+0x65a/0xf80 arch/x86/kvm/cpuid.c:177
       kvm_vcpu_ioctl_set_cpuid2+0x160/0x440 arch/x86/kvm/cpuid.c:308
       kvm_arch_vcpu_ioctl+0x11b6/0x2d70 arch/x86/kvm/x86.c:4709
       kvm_vcpu_ioctl+0x7b9/0xdb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3386
       vfs_ioctl fs/ioctl.c:48 [inline]
       __do_sys_ioctl fs/ioctl.c:753 [inline]
       __se_sys_ioctl fs/ioctl.c:739 [inline]
       __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
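
      The clamping approach in intel_pmu_refresh(), sketched (field and
      variable names approximate):

        /* Never shift by more bits than the host PMU actually has. */
        eax.split.bit_width = min_t(int, eax.split.bit_width,
                                    x86_pmu.bit_width_gp);
        pmu->counter_bitmask[KVM_PMC_GP] =
                ((u64)1 << eax.split.bit_width) - 1;

        eax.split.mask_length = min_t(int, eax.split.mask_length,
                                      x86_pmu.events_mask_len);
        pmu->available_event_types = ~entry->ebx &
                ((1ull << eax.split.mask_length) - 1);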
      
      Reported-by: syzbot+ae488dc136a4cc6ba32b@syzkaller.appspotmail.com
      Signed-off-by: Like Xu <like.xu@linux.intel.com>
      Message-Id: <20210118025800.34620-1-like.xu@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  16. 11 Jul 2020 (1 commit)
  17. 05 Jun 2020 (1 commit)
  18. 01 Jun 2020 (2 commits)
    • KVM: x86/pmu: Support full width counting · 27461da3
      Committed by Like Xu
      Intel CPUs have a new alternative MSR range (starting at MSR_IA32_PMC0)
      for GP counters that allows writing the full counter width. Enable this
      range via a new capability bit (IA32_PERF_CAPABILITIES.FW_WRITE[bit 13]).

      The guest queries CPUID to get the counter width and sign-extends the
      counter values as needed. The traditional MSRs are always limited to
      32 bits, even though the counter internally is larger (48 or 57 bits).

      When the new capability is set, use the alternative range, which does
      not have these restrictions. This lowers the overhead of perf stat
      slightly because it needs fewer interrupts to accumulate the counter
      value.
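
      A simplified sketch of the write-side distinction (not the literal
      patch; in_full_width_range() is a hypothetical predicate for brevity):

        /* Legacy MSR_IA32_PERFCTR0-range writes carry only 32 bits and
         * are sign-extended; MSR_IA32_PMC0-range writes (available when
         * IA32_PERF_CAPABILITIES.FW_WRITE is exposed) take the full value.
         */
        if (!in_full_width_range(msr) || !fw_writes_is_enabled(vcpu))
                data = (s64)(s32)data;
        pmc->counter += data - pmc_read_counter(pmc);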
      Signed-off-by: Like Xu <like.xu@linux.intel.com>
      Message-Id: <20200529074347.124619-3-like.xu@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in · cbd71758
      Committed by Wei Wang
      Change kvm_pmu_get_msr() to take the msr_data struct, so that get_msr
      can use the struct's host_initiated field. This also makes the API
      consistent with kvm_pmu_set_msr(). No functional changes.
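
      The signature change, sketched:

        /* Before */
        int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data);

        /* After: mirrors kvm_pmu_set_msr(); host_initiated rides along. */
        int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);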
      Signed-off-by: Wei Wang <wei.w.wang@intel.com>
      Message-Id: <20200529074347.124619-2-like.xu@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  19. 17 Mar 2020 (2 commits)
  20. 05 Feb 2020 (1 commit)