提交 · b4e12b2d70fd9eccdb3cef8015dc1788ca38e3fd · openeuler / Kernel

13 9月, 2022 1 次提交

perf: Kill __PERF_SAMPLE_CALLCHAIN_EARLY · b4e12b2d

由 Namhyung Kim 提交于 9月 08, 2022

There's no in-tree user anymore.  Let's get rid of it.
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220908214104.3851807-3-namhyung@kernel.org

b4e12b2d

08 9月, 2022 5 次提交

perf/x86/intel: Optimize FIXED_CTR_CTRL access · fae9ebde

由 Kan Liang 提交于 8月 04, 2022

All the fixed counters share a fixed control register. The current
perf reads and re-writes the fixed control register for each fixed
counter disable/enable, which is unnecessary.

When changing the fixed control register, the entire PMU must be
disabled via the global control register. The changing cannot be taken
effect until the entire PMU is re-enabled. Only updating the fixed
control register once right before the entire PMU re-enabling is
enough.

The read of the fixed control register is not necessary either. The
value can be cached in the per CPU cpu_hw_events.

Test results:

Counting all the fixed counters with the perf bench sched pipe as below
on a SPR machine.

 $perf stat -e cycles,instructions,ref-cycles,slots --no-inherit --
  taskset -c 1 perf bench sched pipe

The Total elapsed time reduces from 5.36s (without the patch) to 4.99s
(with the patch), which is ~6.9% improvement.
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220804140729.2951259-1-kan.liang@linux.intel.com

fae9ebde

perf/x86/intel: Remove x86_pmu::update_topdown_event · 1acab2e0

由 Peter Zijlstra 提交于 5月 11, 2022

Now that it is all internal to the intel driver, remove
x86_pmu::update_topdown_event.

Assumes that is_topdown_count(event) can only be true when the
hardware has topdown stuff and the function is set.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.771635301@infradead.org

1acab2e0

perf/x86/intel: Remove x86_pmu::set_topdown_event_period · 23685167

由 Peter Zijlstra 提交于 5月 11, 2022

Now that it is all internal to the intel driver, remove
x86_pmu::set_topdown_event_period.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.706354189@infradead.org

23685167

perf/x86: Change x86_pmu::limit_period signature · 28f0f3c4

由 Peter Zijlstra 提交于 5月 10, 2022

In preparation for making it a static_call, change the signature.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.573713839@infradead.org

28f0f3c4

perf/x86/intel: Move the topdown stuff into the intel driver · e577bb17

由 Peter Zijlstra 提交于 5月 10, 2022

Use the new x86_pmu::{set_period,update}() methods to push the topdown
stuff into the Intel driver, where it belongs.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.505933457@infradead.org

e577bb17

06 9月, 2022 1 次提交

perf: Use sample_flags for branch stack · a9a931e2

由 Kan Liang 提交于 9月 01, 2022

Use the new sample_flags to indicate whether the branch stack is filled
by the PMU driver.

Remove the br_stack from the perf_sample_data_init() to minimize the number
of cache lines touched.
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901130959.1285717-4-kan.liang@linux.intel.com

a9a931e2

04 7月, 2022 2 次提交

perf/x86/intel: Fix PEBS data source encoding for ADL · ccf170e9

由 Kan Liang 提交于 6月 29, 2022

The PEBS data source encoding for the e-core is different from the
p-core.

Add the pebs_data_source[] in the struct x86_hybrid_pmu to store the
data source encoding for each type of the core.

Add intel_pmu_pebs_data_source_grt() for the e-core.
There is nothing changed for the data source encoding of the p-core,
which still reuse the intel_pmu_pebs_data_source_skl().

Fixes: f83d2f91 ("perf/x86/intel: Add Alder Lake Hybrid support")
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/20220629150840.2235741-2-kan.liang@linux.intel.com

ccf170e9

perf/x86/intel: Fix PEBS memory access info encoding for ADL · 39a41278

由 Kan Liang 提交于 6月 29, 2022

The PEBS memory access latency encoding for the e-core is slightly
different from the p-core. The bit 4 is Lock, while the bit 5 is TLB
access.

Add a new flag to indicate the load/store latency event on a hybrid
platform.
Add a new function pointer to retrieve the latency data for a hybrid
platform. Only implement the new flag and function for the e-core on
ADL. Still use the existing PERF_X86_EVENT_PEBS_LDLAT/STLAT flag for the
p-core on ADL.

Factor out pebs_set_tlb_lock() to set the generic memory data source
information of the TLB access and lock for both load and store latency.

Move the intel_get_event_constraints() to ahead of the :ppp check,
otherwise the new flag never gets a chance to be set for the :ppp
events.

Fixes: f83d2f91 ("perf/x86/intel: Add Alder Lake Hybrid support")
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/20220629150840.2235741-1-kan.liang@linux.intel.com

39a41278

09 6月, 2022 1 次提交

perf/x86/intel: Fix the comment about guest LBR support on KVM · 92d80178

由 Like Xu 提交于 5月 17, 2022

Starting from v5.12, KVM reports guest LBR and extra_regs support
when the host has relevant support. Just delete this part of the
comment and fix a typo incidentally.

Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: NKan Liang <kan.liang@linux.intel.com>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Signed-off-by: NYang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220517154100.29983-2-weijiang.yang@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

92d80178

08 6月, 2022 9 次提交

KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations · 85425032

由 Like Xu 提交于 4月 11, 2022

The guest PEBS will be disabled when some users try to perf KVM and
its user-space through the same PEBS facility OR when the host perf
doesn't schedule the guest PEBS counter in a one-to-one mapping manner
(neither of these are typical scenarios).

The PEBS records in the guest DS buffer are still accurate and the
above two restrictions will be checked before each vm-entry only if
guest PEBS is deemed to be enabled.
Suggested-by: NWei Wang <wei.w.wang@intel.com>
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-15-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

85425032

KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS · 902caeb6

由 Like Xu 提交于 4月 11, 2022

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the adaptive
PEBS is supported. The PEBS_DATA_CFG MSR and adaptive record enable
bits (IA32_PERFEVTSELx.Adaptive_Record and IA32_FIXED_CTR_CTRL.
FCx_Adaptive_Record) are also supported.

Adaptive PEBS provides software the capability to configure the PEBS
records to capture only the data of interest, keeping the record size
compact. An overflow of PMCx results in generation of an adaptive PEBS
record with state information based on the selections specified in
MSR_PEBS_DATA_CFG.By default, the record only contain the Basic group.

When guest adaptive PEBS is enabled, the IA32_PEBS_ENABLE MSR will
be added to the perf_guest_switch_msr() and switched during the VMX
transitions just like CORE_PERF_GLOBAL_CTRL MSR.

According to Intel SDM, software is recommended to  PEBS Baseline
when the following is true. IA32_PERF_CAPABILITIES.PEBS_BASELINE[14]
&& IA32_PERF_CAPABILITIES.PEBS_FMT[11:8] ≥ 4.
Co-developed-by: NLuwei Kang <luwei.kang@intel.com>
Signed-off-by: NLuwei Kang <luwei.kang@intel.com>
Signed-off-by: NLike Xu <likexu@tencent.com>
Message-Id: <20220411101946.20262-12-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

902caeb6

KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS · 8183a538

由 Like Xu 提交于 4月 11, 2022

When CPUID.01H:EDX.DS[21] is set, the IA32_DS_AREA MSR exists and points
to the linear address of the first byte of the DS buffer management area,
which is used to manage the PEBS records.

When guest PEBS is enabled, the MSR_IA32_DS_AREA MSR will be added to the
perf_guest_switch_msr() and switched during the VMX transitions just like
CORE_PERF_GLOBAL_CTRL MSR. The WRMSR to IA32_DS_AREA MSR brings a #GP(0)
if the source register contains a non-canonical address.
Originally-by: NAndi Kleen <ak@linux.intel.com>
Co-developed-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-11-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8183a538

KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter · 6ebe4436

由 Like Xu 提交于 4月 11, 2022

The PEBS-PDIR facility on Ice Lake server is supported on IA31_FIXED0 only.
If the guest configures counter 32 and PEBS is enabled, the PEBS-PDIR
facility is supposed to be used, in which case KVM adjusts attr.precise_ip
to 3 and request host perf to assign the exactly requested counter or fail.

The CPU model check is also required since some platforms may place the
PEBS-PDIR facility in another counter index.
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-10-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6ebe4436

KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS · c59a1f10

由 Like Xu 提交于 4月 11, 2022

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the
IA32_PEBS_ENABLE MSR exists and all architecturally enumerated fixed
and general-purpose counters have corresponding bits in IA32_PEBS_ENABLE
that enable generation of PEBS records. The general-purpose counter bits
start at bit IA32_PEBS_ENABLE[0], and the fixed counter bits start at
bit IA32_PEBS_ENABLE[32].

When guest PEBS is enabled, the IA32_PEBS_ENABLE MSR will be
added to the perf_guest_switch_msr() and atomically switched during
the VMX transitions just like CORE_PERF_GLOBAL_CTRL MSR.

Based on whether the platform supports x86_pmu.pebs_ept, it has also
refactored the way to add more msrs to arr[] in intel_guest_get_msrs()
for extensibility.
Originally-by: NAndi Kleen <ak@linux.intel.com>
Co-developed-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Co-developed-by: NLuwei Kang <luwei.kang@intel.com>
Signed-off-by: NLuwei Kang <luwei.kang@intel.com>
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-8-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c59a1f10

x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value · 0d23dc34

由 Peter Zijlstra (Intel) 提交于 4月 11, 2022

The value of pebs_counter_mask will be accessed frequently
for repeated use in the intel_guest_get_msrs(). So it can be
optimized instead of endlessly mucking about with branches.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-7-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0d23dc34

perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values · 39a4d779

由 Like Xu 提交于 4月 11, 2022

Splitting the logic for determining the guest values is unnecessarily
confusing, and potentially fragile. Perf should have full knowledge and
control of what values are loaded for the guest.

If we change .guest_get_msrs() to take a struct kvm_pmu pointer, then it
can generate the full set of guest values by grabbing guest ds_area and
pebs_data_cfg. Alternatively, .guest_get_msrs() could take the desired
guest MSR values directly (ds_area and pebs_data_cfg), but kvm_pmu is
vendor agnostic, so we don't see any reason to not just pass the pointer.
Suggested-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <20220411101946.20262-4-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

39a4d779

perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest · 69e575dd

由 Like Xu 提交于 4月 11, 2022

With PEBS virtualization, the guest PEBS records get delivered to the
guest DS, and the host pmi handler uses perf_guest_cbs->is_in_guest()
to distinguish whether the PMI comes from the guest code like Intel PT.

No matter how many guest PEBS counters are overflowed, only triggering
one fake event is enough. The fake event causes the KVM PMI callback to
be called, thereby injecting the PEBS overflow PMI into the guest.

KVM may inject the PMI with BUFFER_OVF set, even if the guest DS is
empty. That should really be harmless. Thus guest PEBS handler would
retrieve the correct information from its own PEBS records buffer.

Cc: linux-perf-users@vger.kernel.org
Originally-by: NAndi Kleen <ak@linux.intel.com>
Co-developed-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NLike Xu <likexu@tencent.com>
Message-Id: <20220411101946.20262-3-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

69e575dd

perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server · fb358e0b

由 Like Xu 提交于 4月 11, 2022

Add support for EPT-Friendly PEBS, a new CPU feature that enlightens PEBS
to translate guest linear address through EPT, and facilitates handling
VM-Exits that occur when accessing PEBS records.  More information can
be found in the December 2021 release of Intel's SDM, Volume 3,
18.9.5 "EPT-Friendly PEBS". This new hardware facility makes sure the
guest PEBS records will not be lost, which is available on Intel Ice Lake
Server platforms (and later).

KVM will check this field through perf_get_x86_pmu_capability() instead
of hard coding the CPU models in the KVM code. If it is supported, the
guest PEBS capability will be exposed to the guest. Guest PEBS can be
enabled when and only when "EPT-Friendly PEBS" is supported and
EPT is enabled.

Cc: linux-perf-users@vger.kernel.org
Signed-off-by: NLike Xu <likexu@tencent.com>
Message-Id: <20220411101946.20262-2-likexu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fb358e0b

25 5月, 2022 1 次提交

perf/x86/intel: Fix event constraints for ICL · 86dca369

由 Kan Liang 提交于 5月 25, 2022

According to the latest event list, the event encoding 0x55
INST_DECODED.DECODERS and 0x56 UOPS_DECODED.DEC0 are only available on
the first 4 counters. Add them into the event constraints table.

Fixes: 60176089 ("perf/x86/intel: Add Icelake support")
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NPeter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220525133952.1660658-1-kan.liang@linux.intel.com

86dca369

11 5月, 2022 1 次提交

perf/x86: Add new Alder Lake and Raptor Lake support · c2a960f7

由 Kan Liang 提交于 5月 04, 2022

From PMU's perspective, there is no difference for the new Alder Lake N
and Raptor Lake P.
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220504194413.1003071-1-kan.liang@linux.intel.com

c2a960f7

05 4月, 2022 3 次提交

perf/x86/intel: Update the FRONTEND MSR mask on Sapphire Rapids · e590928d

由 Kan Liang 提交于 3月 28, 2022

On Sapphire Rapids, the FRONTEND_RETIRED.MS_FLOWS event requires the
FRONTEND MSR value 0x8. However, the current FRONTEND MSR mask doesn't
support it.

Update intel_spr_extra_regs[] to support it.

Fixes: 61b985e3 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids")
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1648482543-14923-2-git-send-email-kan.liang@linux.intel.com

e590928d

perf/x86/intel: Don't extend the pseudo-encoding to GP counters · 4a263bf3

由 Kan Liang 提交于 3月 28, 2022

The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

 Performance counter stats for 'CPU(s) 0':

           607,246      cpu/event=0xc0,umask=0x0/
                 0      cpu/event=0x0,umask=0x1/

The encoding for INST_RETIRED.PREC_DIST is pseudo-encoding, which
doesn't work on the generic counters. However, current perf extends its
mask to the generic counters.

The pseudo event-code for a fixed counter must be 0x00. Check and avoid
extending the mask for the fixed counter event which using the
pseudo-encoding, e.g., ref-cycles and PREC_DIST event.

With the patch,
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

 Performance counter stats for 'CPU(s) 0':

           583,184      cpu/event=0xc0,umask=0x0/
           583,048      cpu/event=0x0,umask=0x1/

Fixes: 2de71ee1 ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1648482543-14923-1-git-send-email-kan.liang@linux.intel.com

4a263bf3

perf/x86: Add Intel Raptor Lake support · c61759e5

由 Kan Liang 提交于 3月 15, 2022

From PMU's perspective, Raptor Lake is the same as the Alder Lake. The
only difference is the event list, which will be supported in the perf
tool later.
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/1647366360-82824-1-git-send-email-kan.liang@linux.intel.com

c61759e5

02 2月, 2022 2 次提交

perf/x86/intel: Increase max number of the fixed counters · ee28855a

由 Kan Liang 提交于 2月 01, 2022

The new PEBS format 5 implies that the number of the fixed counters can
be up to 16. The current INTEL_PMC_MAX_FIXED is still 4. If the current
kernel runs on a future platform which has more than 4 fixed counters,
a warning will be triggered. The number of the fixed counters will be
clipped to 4. Users have to upgrade the kernel to access the new fixed
counters.

Add a new default constraint for PerfMon v5 and up, which can support
up to 16 fixed counters. The pseudo-encoding is applied for the fixed
counters 4 and later. The user can have generic support for the new
fixed counters on the future platfroms without updating the kernel.

Increase the INTEL_PMC_MAX_FIXED to 16.
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1643750603-100733-3-git-send-email-kan.liang@linux.intel.com

ee28855a

x86/perf: Default set FREEZE_ON_SMI for all · a01994f5

由 Peter Zijlstra 提交于 1月 27, 2022

Kyle reported that rr[0] has started to malfunction on Comet Lake and
later CPUs due to EFI starting to make use of CPL3 [1] and the PMU
event filtering not distinguishing between regular CPL3 and SMM CPL3.

Since this is a privilege violation, default disable SMM visibility
where possible.

Administrators wanting to observe SMM cycles can easily change this
using the sysfs attribute while regular users don't have access to
this file.

[0] https://rr-project.org/

[1] See the Intel white paper "Trustworthy SMM on the Intel vPro Platform"
at https://bugzilla.kernel.org/attachment.cgi?id=300300, particularly the
end of page 5.
Reported-by: NKyle Huey <me@kylehuey.com>
Suggested-by: NAndrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
Link: https://lkml.kernel.org/r/YfKChjX61OW4CkYm@hirez.programming.kicks-ass.net

a01994f5

18 1月, 2022 2 次提交

perf/x86/intel/lbr: Support LBR format V7 · 1ac7fd81

由 Peter Zijlstra (Intel) 提交于 1月 04, 2022

The Goldmont plus and Tremont have LBR format V7. The V7 has LBR_INFO,
which is the same as LBR format V5. But V7 doesn't support TSX.

Without the patch, the associated misprediction and cycles information
in the LBR_INFO may be lost on a Goldmont plus platform.
For Tremont, the patch only impacts the non-PEBS events. Because of the
adaptive PEBS, the LBR_INFO is always processed for a PEBS event.

Currently, two different ways are used to check the LBR capabilities,
which make the codes complex and confusing.
For the LBR format V4 and earlier, the global static lbr_desc array is
used to store the flags for the LBR capabilities in each LBR format.
For LBR format V5 and V6, the current code checks the version number
for the LBR capabilities.

There are common LBR capabilities among LBR format versions. Several
flags for the LBR capabilities are introduced into the struct x86_pmu.
The flags, which can be shared among LBR formats, are used to check
the LBR capabilities. Add intel_pmu_lbr_init() to set the flags
accordingly at boot time.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: NKan Liang <kan.liang@linux.intel.com>
Link: https://lkml.kernel.org/r/1641315077-96661-1-git-send-email-peterz@infradead.org

1ac7fd81

perf/x86/intel: Add a quirk for the calculation of the number of counters on Alder Lake · 7fa981ca

由 Kan Liang 提交于 1月 11, 2022

For some Alder Lake machine with all E-cores disabled in a BIOS, the
below warning may be triggered.

[ 2.010766] hw perf events fixed 5 > max(4), clipping!

Current perf code relies on the CPUID leaf 0xA and leaf 7.EDX[15] to
calculate the number of the counters and follow the below assumption.

For a hybrid configuration, the leaf 7.EDX[15] (X86_FEATURE_HYBRID_CPU)
is set. The leaf 0xA only enumerate the common counters. Linux perf has
to manually add the extra GP counters and fixed counters for P-cores.
For a non-hybrid configuration, the X86_FEATURE_HYBRID_CPU should not
be set. The leaf 0xA enumerates all counters.

However, that's not the case when all E-cores are disabled in a BIOS.
Although there are only P-cores in the system, the leaf 7.EDX[15]
(X86_FEATURE_HYBRID_CPU) is still set. But the leaf 0xA is updated
to enumerate all counters of P-cores. The inconsistency triggers the
warning.

Several software ways were considered to handle the inconsistency.
- Drop the leaf 0xA and leaf 7.EDX[15] CPUID enumeration support.
  Hardcode the number of counters. This solution may be a problem for
  virtualization. A hypervisor cannot control the number of counters
  in a Linux guest via changing the guest CPUID enumeration anymore.
- Find another CPUID bit that is also updated with E-cores disabled.
  There may be a problem in the virtualization environment too. Because
  a hypervisor may disable the feature/CPUID bit.
- The P-cores have a maximum of 8 GP counters and 4 fixed counters on
  ADL. The maximum number can be used to detect the case.
  This solution is implemented in this patch.

Fixes: ee72a94e ("perf/x86/intel: Fix fixed counter check warning for some Alder Lake")
Reported-by: NDamjan Marion (damarion) <damarion@cisco.com>
Reported-by: Chan Edison <edison_chan_gz@hotmail.com>
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: NDamjan Marion (damarion) <damarion@cisco.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1641925238-149288-1-git-send-email-kan.liang@linux.intel.com

7fa981ca

17 11月, 2021 4 次提交

perf: Add wrappers for invoking guest callbacks · 1c343051

由 Sean Christopherson 提交于 11月 11, 2021

Add helpers for the guest callbacks to prepare for burying the callbacks
behind a Kconfig (it's a lot easier to provide a few stubs than to #ifdef
piles of code), and also to prepare for converting the callbacks to
static_call().  perf_instruction_pointer() in particular will have subtle
semantics with static_call(), as the "no callbacks" case will return 0 if
the callbacks are unregistered between querying guest state and getting
the IP.  Implement the change now to avoid a functional change when adding
static_call() support, and because the new helper needs to return
_something_ in this case.
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-8-seanjc@google.com

1c343051

perf/core: Rework guest callbacks to prepare for static_call support · b9f5621c

由 Like Xu 提交于 11月 11, 2021

To prepare for using static_calls to optimize perf's guest callbacks,
replace ->is_in_guest and ->is_user_mode with a new multiplexed hook
->state, tweak ->handle_intel_pt_intr to play nice with being called when
there is no active guest, and drop "guest" from ->get_guest_ip.

Return '0' from ->state and ->handle_intel_pt_intr to indicate "not in
guest" so that DEFINE_STATIC_CALL_RET0 can be used to define the static
calls, i.e. no callback == !guest.

[sean: extracted from static_call patch, fixed get_ip() bug, wrote changelog]
Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Originally-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NLike Xu <like.xu@linux.intel.com>
Signed-off-by: NZhu Lingshan <lingshan.zhu@intel.com>
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-7-seanjc@google.com

b9f5621c

perf: Protect perf_guest_cbs with RCU · ff083a2d

由 Sean Christopherson 提交于 11月 11, 2021

Protect perf_guest_cbs with RCU to fix multiple possible errors.  Luckily,
all paths that read perf_guest_cbs already require RCU protection, e.g. to
protect the callback chains, so only the direct perf_guest_cbs touchpoints
need to be modified.

Bug #1 is a simple lack of WRITE_ONCE/READ_ONCE behavior to ensure
perf_guest_cbs isn't reloaded between a !NULL check and a dereference.
Fixed via the READ_ONCE() in rcu_dereference().

Bug #2 is that on weakly-ordered architectures, updates to the callbacks
themselves are not guaranteed to be visible before the pointer is made
visible to readers.  Fixed by the smp_store_release() in
rcu_assign_pointer() when the new pointer is non-NULL.

Bug #3 is that, because the callbacks are global, it's possible for
readers to run in parallel with an unregisters, and thus a module
implementing the callbacks can be unloaded while readers are in flight,
resulting in a use-after-free.  Fixed by a synchronize_rcu() call when
unregistering callbacks.

Bug #1 escaped notice because it's extremely unlikely a compiler will
reload perf_guest_cbs in this sequence.  perf_guest_cbs does get reloaded
for future derefs, e.g. for ->is_user_mode(), but the ->is_in_guest()
guard all but guarantees the consumer will win the race, e.g. to nullify
perf_guest_cbs, KVM has to completely exit the guest and teardown down
all VMs before KVM start its module unload / unregister sequence.  This
also makes it all but impossible to encounter bug #3.

Bug #2 has not been a problem because all architectures that register
callbacks are strongly ordered and/or have a static set of callbacks.

But with help, unloading kvm_intel can trigger bug #1 e.g. wrapping
perf_guest_cbs with READ_ONCE in perf_misc_flags() while spamming
kvm_intel module load/unload leads to:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP
  CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:perf_misc_flags+0x1c/0x70
  Call Trace:
   perf_prepare_sample+0x53/0x6b0
   perf_event_output_forward+0x67/0x160
   __perf_event_overflow+0x52/0xf0
   handle_pmi_common+0x207/0x300
   intel_pmu_handle_irq+0xcf/0x410
   perf_event_nmi_handler+0x28/0x50
   nmi_handle+0xc7/0x260
   default_do_nmi+0x6b/0x170
   exc_nmi+0x103/0x130
   asm_exc_nmi+0x76/0xbf

Fixes: 39447b38 ("perf: Enhance perf to allow for guest statistic collection from host")
Signed-off-by: NSean Christopherson <seanjc@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211111020738.2512932-2-seanjc@google.com

ff083a2d

x86/perf: Fix snapshot_branch_stack warning in VM · f3fd84a3

由 Song Liu 提交于 11月 11, 2021

When running in VM intel_pmu_snapshot_branch_stack triggers WRMSR warning
like:

 [ ] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0000000000000000) at rIP: 0xffffffff81011a5b (intel_pmu_snapshot_branch_stack+0x3b/0xd0)

This can be triggered with BPF selftests:

  tools/testing/selftests/bpf/test_progs -t get_branch_snapshot

This warning is caused by __intel_pmu_pebs_disable_all() in the VM.
Since it is not necessary to disable PEBS for LBR, remove it from
intel_pmu_snapshot_branch_stack and intel_pmu_snapshot_arch_branch_stack.

Fixes: c22ac2a3 ("perf: Enable branch record for software events")
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: NLike Xu <likexu@tencent.com>
Link: https://lore.kernel.org/r/20211112054510.2667030-1-songliubraving@fb.com

f3fd84a3

11 11月, 2021 1 次提交

perf/x86/vlbr: Add c->flags to vlbr event constraints · 58637025

由 Like Xu 提交于 11月 03, 2021

Just like what we do in the x86_get_event_constraints(), the
PERF_X86_EVENT_LBR_SELECT flag should also be propagated
to event->hw.flags so that the host lbr driver can save/restore
MSR_LBR_SELECT for the special vlbr event created by KVM or BPF.

Fixes: 097e4311 ("perf/x86: Add constraint to create guest LBR event without hw counter")
Reported-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NLike Xu <likexu@tencent.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: NWanpeng Li <wanpengli@tencent.com>
Link: https://lore.kernel.org/r/20211103091716.59906-1-likexu@tencent.com

58637025

30 10月, 2021 1 次提交

perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings · 2de71ee1

由 Stephane Eranian 提交于 10月 13, 2021

This patch fixes the encoding for INST_RETIRED.PREC_DIST as published by Intel
(download.01.org/perfmon/) for Icelake. The official encoding
is event code 0x00 umask 0x1, a change from Skylake where it was code 0xc0
umask 0x1.

With this patch applied it is possible to run:
$ perf record -a -e cpu/event=0x00,umask=0x1/pp .....

Whereas before this would fail.

To avoid problems with tools which may use the old code, we maintain the old
encoding for Icelake.
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211014001214.2680534-1-eranian@google.com

2de71ee1

15 10月, 2021 1 次提交

perf/x86: Add new event for AUX output counter index · 8b8ff8cc

由 Adrian Hunter 提交于 9月 07, 2021

PEBS-via-PT records contain a mask of applicable counters. To identify
which event belongs to which counter, a side-band event is needed. Until
now, there has been no side-band event, and consequently users were limited
to using a single event.

Add such a side-band event. Note the event is optimised to output only
when the counter index changes for an event. That works only so long as
all PEBS-via-PT events are scheduled together, which they are for a
recording session because they are in a single group.

Also no attribute bit is used to select the new event, so a new
kernel is not compatible with older perf tools.  The assumption
being that PEBS-via-PT is sufficiently esoteric that users will not
be troubled by this.
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210907163903.11820-2-adrian.hunter@intel.com

8b8ff8cc

01 10月, 2021 1 次提交

perf/x86/intel: Update event constraints for ICX · ecc2123e

由 Kan Liang 提交于 9月 28, 2021

According to the latest event list, the event encoding 0xEF is only
available on the first 4 counters. Add it into the event constraints
table.

Fixes: 60176089 ("perf/x86/intel: Add Icelake support")
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1632842343-25862-1-git-send-email-kan.liang@linux.intel.com

ecc2123e

14 9月, 2021 1 次提交

perf: Enable branch record for software events · c22ac2a3

由 Song Liu 提交于 9月 10, 2021

The typical way to access branch record (e.g. Intel LBR) is via hardware
perf_event. For CPUs with FREEZE_LBRS_ON_PMI support, PMI could capture
reliable LBR. On the other hand, LBR could also be useful in non-PMI
scenario. For example, in kretprobe or bpf fexit program, LBR could
provide a lot of information on what happened with the function. Add API
to use branch record for software use.

Note that, when the software event triggers, it is necessary to stop the
branch record hardware asap. Therefore, static_call is used to remove some
branch instructions in this process.
Suggested-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-2-songliubraving@fb.com

c22ac2a3

26 8月, 2021 1 次提交

perf/x86/intel: Replace deprecated CPU-hotplug functions · eda8a2c5

由 Sebastian Andrzej Siewior 提交于 8月 03, 2021

The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210803141621.780504-11-bigeasy@linutronix.de

eda8a2c5

06 8月, 2021 1 次提交

perf/x86/intel: Apply mid ACK for small core · acade637

由 Kan Liang 提交于 8月 03, 2021

A warning as below may be occasionally triggered in an ADL machine when
these conditions occur:

 - Two perf record commands run one by one. Both record a PEBS event.
 - Both runs on small cores.
 - They have different adaptive PEBS configuration (PEBS_DATA_CFG).

  [ ] WARNING: CPU: 4 PID: 9874 at arch/x86/events/intel/ds.c:1743 setup_pebs_adaptive_sample_data+0x55e/0x5b0
  [ ] RIP: 0010:setup_pebs_adaptive_sample_data+0x55e/0x5b0
  [ ] Call Trace:
  [ ]  <NMI>
  [ ]  intel_pmu_drain_pebs_icl+0x48b/0x810
  [ ]  perf_event_nmi_handler+0x41/0x80
  [ ]  </NMI>
  [ ]  __perf_event_task_sched_in+0x2c2/0x3a0

Different from the big core, the small core requires the ACK right
before re-enabling counters in the NMI handler, otherwise a stale PEBS
record may be dumped into the later NMI handler, which trigger the
warning.

Add a new mid_ack flag to track the case. Add all PMI handler bits in
the struct x86_hybrid_pmu to track the bits for different types of
PMUs.  Apply mid ACK for the small cores on an Alder Lake machine.

The existing hybrid() macro has a compile error when taking address of
a bit-field variable. Add a new macro hybrid_bit() to get the
bit-field value of a given PMU.

Fixes: f83d2f91 ("perf/x86/intel: Add Alder Lake Hybrid support")
Reported-by: NAmmy Yi <ammy.yi@intel.com>
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Tested-by: NAmmy Yi <ammy.yi@intel.com>
Link: https://lkml.kernel.org/r/1627997128-57891-1-git-send-email-kan.liang@linux.intel.com

acade637

24 6月, 2021 1 次提交

perf/x86/intel: Fix instructions:ppp support in Sapphire Rapids · 1d5c7880

由 Kan Liang 提交于 6月 18, 2021

Perf errors out when sampling instructions:ppp.

$ perf record -e instructions:ppp -- true
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (instructions:ppp).

The instruction PDIR is only available on the fixed counter 0. The event
constraint has been updated to fixed0_constraint in
icl_get_event_constraints(). The Sapphire Rapids codes unconditionally
error out for the event which is not available on the GP counter 0.

Make the instructions:ppp an exception.

Fixes: 61b985e3 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids")
Reported-by: NYasin, Ahmad <ahmad.yasin@intel.com>
Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1624029174-122219-4-git-send-email-kan.liang@linux.intel.com

1d5c7880

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功