- 22 3月, 2021 2 次提交
-
-
由 Ingo Molnar 提交于
Fix another ~42 single-word typos in arch/x86/ code comments, missed a few in the first pass, in particular in .S files. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
-
由 Ingo Molnar 提交于
We've accumulated a few unusual Unicode characters in arch/x86/ over the years, substitute them with their proper ASCII equivalents. A few of them were a whitespace equivalent: ' ' - the use was harmless. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org
-
- 18 3月, 2021 1 次提交
-
-
由 Ingo Molnar 提交于
Fix ~144 single-word typos in arch/x86/ code comments. Doing this in a single commit should reduce the churn. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
-
- 17 3月, 2021 2 次提交
-
-
由 Kan Liang 提交于
On a Haswell machine, the perf_fuzzer managed to trigger this message: [117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0400000000000000) at rIP: 0xffffffff8106e4f4 (native_write_msr+0x4/0x20) [117248.089957] Call Trace: [117248.092685] intel_pmu_pebs_enable_all+0x31/0x40 [117248.097737] intel_pmu_enable_all+0xa/0x10 [117248.102210] __perf_event_task_sched_in+0x2df/0x2f0 [117248.107511] finish_task_switch.isra.0+0x15f/0x280 [117248.112765] schedule_tail+0xc/0x40 [117248.116562] ret_from_fork+0x8/0x30 A fake event called VLBR_EVENT may use the bit 58 of the PEBS_ENABLE, if the precise_ip is set. The bit 58 is reserved by the HW. Accessing the bit causes the unchecked MSR access error. The fake event doesn't support PEBS. The case should be rejected. Fixes: 097e4311 ("perf/x86: Add constraint to create guest LBR event without hw counter") Reported-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1615555298-140216-2-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
A repeatable crash can be triggered by the perf_fuzzer on some Haswell system. https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/ For some old CPUs (HSW and earlier), the PEBS status in a PEBS record may be mistakenly set to 0. To minimize the impact of the defect, the commit was introduced to try to avoid dropping the PEBS record for some cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates the local pebs_status accordingly. However, it doesn't correct the PEBS status in the PEBS record, which may trigger the crash, especially for the large PEBS. It's possible that all the PEBS records in a large PEBS have the PEBS status 0. If so, the first get_next_pebs_record_by_bit() in the __intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large PEBS, the 'count' parameter must > 1. The second get_next_pebs_record_by_bit() will crash. Besides the local pebs_status, correct the PEBS status in the PEBS record as well. Fixes: 01330d72 ("perf/x86: Allow zero PEBS status with only single active event") Reported-by: NVince Weaver <vincent.weaver@maine.edu> Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com
-
- 06 3月, 2021 1 次提交
-
-
由 Kan Liang 提交于
To supply a PID/TID for large PEBS, it requires flushing the PEBS buffer in a context switch. For normal LBRs, a context switch can flip the address space and LBR entries are not tagged with an identifier, we need to wipe the LBR, even for per-cpu events. For LBR callstack, save/restore the stack is required during a context switch. Set PERF_ATTACH_SCHED_CB for the event with large PEBS & LBR. Fixes: 9c964efa ("perf/x86/intel: Drain the PEBS buffer during context switches") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20201130193842.10569-2-kan.liang@linux.intel.com
-
- 10 2月, 2021 1 次提交
-
-
由 Jim Mattson 提交于
Cascade Lake Xeon parts have the same model number as Skylake Xeon parts, so they are tagged with the intel_pebs_isolation quirk. However, as with Skylake Xeon H0 stepping parts, the PEBS isolation issue is fixed in all microcode versions. Add the Cascade Lake Xeon steppings (5, 6, and 7) to the isolation_ucodes[] table so that these parts benefit from Andi's optimization in commit 9b545c04 ("perf/x86/kvm: Avoid unnecessary work in guest filtering"). Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/20210205191324.2889006-1-jmattson@google.com
-
- 01 2月, 2021 5 次提交
-
-
由 Kan Liang 提交于
With Architectural Performance Monitoring Version 5, CPUID 10.ECX cpu leaf indicates the fixed counter enumeration. This extends the previous count to a bitmap which allows disabling even lower fixed counters. It could be used by a Hypervisor. The existing intel_ctrl variable is used to remember the bitmask of the counters. All code that reads all counters is fixed to check this extra bitmask. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Originally-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-6-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Add perf core PMU support for the Intel Sapphire Rapids server, which is the successor of the Intel Ice Lake server. The enabling code is based on Ice Lake, but there are several new features introduced. The event encoding is changed and simplified, e.g., the event codes which are below 0x90 are restricted to counters 0-3. The event codes which above 0x90 are likely to have no restrictions. The event constraints, extra_regs(), and hardware cache events table are changed accordingly. A new Precise Distribution (PDist) facility is introduced, which further minimizes the skid when a precise event is programmed on the GP counter 0. Enable the Precise Distribution (PDist) facility with :ppp event. For this facility to work, the period must be initialized with a value larger than 127. Add spr_limit_period() to apply the limit for :ppp event. Two new data source fields, data block & address block, are added in the PEBS Memory Info Record for the load latency event. To enable the feature, - An auxiliary event has to be enabled together with the load latency event on Sapphire Rapids. A new flag PMU_FL_MEM_LOADS_AUX is introduced to indicate the case. A new event, mem-loads-aux, is exposed to sysfs for the user tool. Add a check in hw_config(). If the auxiliary event is not detected, return an unique error -ENODATA. - The union perf_mem_data_src is extended to support the new fields. - Ice Lake and earlier models do not support block information, but the fields may be set by HW on some machines. Add pebs_no_block to explicitly indicate the previous platforms which don't support the new block fields. Accessing the new block fields are ignored on those platforms. A new store Latency facility is introduced, which leverages the PEBS facility where it can provide additional information about sampled stores. The additional information includes the data address, memory auxiliary info (e.g. Data Source, STLB miss) and the latency of the store access. To enable the facility, the new event (0x02cd) has to be programed on the GP counter 0. A new flag PERF_X86_EVENT_PEBS_STLAT is introduced to indicate the event. The store_latency_data() is introduced to parse the memory auxiliary info. The layout of access latency field of PEBS Memory Info Record has been changed. Two latency, instruction latency (bit 15:0) and cache access latency (bit 47:32) are recorded. - The cache access latency is similar to previous memory access latency. For loads, the latency starts by the actual cache access until the data is returned by the memory subsystem. For stores, the latency starts when the demand write accesses the L1 data cache and lasts until the cacheline write is completed in the memory subsystem. The cache access latency is stored in low 32bits of the sample type PERF_SAMPLE_WEIGHT_STRUCT. - The instruction latency starts by the dispatch of the load operation for execution and lasts until completion of the instruction it belongs to. Add a new flag PMU_FL_INSTR_LATENCY to indicate the instruction latency support. The instruction latency is stored in the bit 47:32 of the sample type PERF_SAMPLE_WEIGHT_STRUCT. Extends the PERF_METRICS MSR to feature TMA method level 2 metrics. The lower half of the register is the TMA level 1 metrics (legacy). The upper half is also divided into four 8-bit fields for the new level 2 metrics. Expose all eight Topdown metrics events to user space. The full description for the SPR features can be found at Intel Architecture Instruction Set Extensions and Future Features Programming Reference, 319433-041. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-5-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Intel Sapphire Rapids server will introduce 8 metrics events. Intel Ice Lake only supports 4 metrics events. A perf tool user may mistakenly use the unsupported events via RAW format on Ice Lake. The user can still get a value from the unsupported Topdown metrics event once the following Sapphire Rapids enabling patch is applied. To enable the 8 metrics events on Intel Sapphire Rapids, the INTEL_TD_METRIC_MAX has to be updated, which impacts the is_metric_event(). The is_metric_event() is a generic function. On Ice Lake, the newly added SPR metrics events will be mistakenly accepted as metric events on creation. At runtime, the unsupported Topdown metrics events will be updated. Add a variable num_topdown_events in x86_pmu to indicate the available number of the Topdown metrics event on the platform. Apply the number into is_metric_event(). Only the supported Topdown metrics events should be created as metrics events. Apply the num_topdown_events in icl_update_topdown_event() as well. The function can be reused by the following patch. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-4-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Similar to Ice Lake, Intel Sapphire Rapids server also supports the topdown performance metrics feature. The difference is that Intel Sapphire Rapids server extends the PERF_METRICS MSR to feature TMA method level two metrics, which will introduce 8 metrics events. Current icl_update_topdown_event() only check 4 level one metrics events. Factor out intel_update_topdown_event() to facilitate the code sharing between Ice Lake and Sapphire Rapids. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-3-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Current PERF_SAMPLE_WEIGHT sample type is very useful to expresses the cost of an action represented by the sample. This allows the profiler to scale the samples to be more informative to the programmer. It could also help to locate a hotspot, e.g., when profiling by memory latencies, the expensive load appear higher up in the histograms. But current PERF_SAMPLE_WEIGHT sample type is solely determined by one factor. This could be a problem, if users want two or more factors to contribute to the weight. For example, Golden Cove core PMU can provide both the instruction latency and the cache Latency information as factors for the memory profiling. For current X86 platforms, although meminfo::latency is defined as a u64, only the lower 32 bits include the valid data in practice (No memory access could last than 4G cycles). The higher 32 bits can be used to store new factors. Add a new sample type, PERF_SAMPLE_WEIGHT_STRUCT, to indicate the new sample weight structure. It shares the same space as the PERF_SAMPLE_WEIGHT sample type. Users can apply either the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type to retrieve the sample weight, but they cannot apply both sample types simultaneously. Currently, only X86 and PowerPC use the PERF_SAMPLE_WEIGHT sample type. - For PowerPC, there is nothing changed for the PERF_SAMPLE_WEIGHT sample type. There is no effect for the new PERF_SAMPLE_WEIGHT_STRUCT sample type. PowerPC can re-struct the weight field similarly later. - For X86, the same value will be dumped for the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type for now. The following patches will apply the new factors for the PERF_SAMPLE_WEIGHT_STRUCT sample type. The field in the union perf_sample_weight should be shared among different architectures. A generic name is required, but it's hard to abstract a name that applies to all architectures. For example, on X86, the fields are to store all kinds of latency. While on PowerPC, it stores MMCRA[TECX/TECM], which should not be latency. So a general name prefix 'var$NUM' is used here. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-2-git-send-email-kan.liang@linux.intel.com
-
- 28 1月, 2021 2 次提交
-
-
由 Peter Zijlstra 提交于
Perfmon-v4 counter freezing is fundamentally broken; remove this default disabled code to make sure nobody uses it. The feature is called Freeze-on-PMI in the SDM, and if it would do that, there wouldn't actually be a problem, *however* it does something subtly different. It globally disables the whole PMU when it raises the PMI, not when the PMI hits. This means there's a window between the PMI getting raised and the PMI actually getting served where we loose events and this violates the perf counter independence. That is, a counting event should not result in a different event count when there is a sampling event co-scheduled. This is known to break existing software (RR). Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
由 Like Xu 提交于
Clean up that CONFIG_RETPOLINE crud and replace the indirect call x86_pmu.guest_get_msrs with static_call(). Reported-by: Nkernel test robot <lkp@intel.com> Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NLike Xu <like.xu@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210125121458.181635-1-like.xu@linux.intel.com
-
- 14 1月, 2021 2 次提交
-
-
由 Steve Wahl 提交于
The registers used to determine which die a pci bus belongs to don't contain enough information to uniquely specify more than 8 dies, so when more than 8 dies are present, use NUMA information instead. Continue to use the previous method for 8 or fewer because it works there, and covers cases of NUMA being disabled. Signed-off-by: NSteve Wahl <steve.wahl@hpe.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210108153549.108989-3-steve.wahl@hpe.com
-
由 Steve Wahl 提交于
The phys_id isn't really used other than to map to a logical die id. Calculate the logical die id earlier, and store that instead of the phys_id. Signed-off-by: NSteve Wahl <steve.wahl@hpe.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210108153549.108989-2-steve.wahl@hpe.com
-
- 10 12月, 2020 3 次提交
-
-
由 Kan Liang 提交于
Tremont has four L1 Topdown events, TOPDOWN_FE_BOUND.ALL, TOPDOWN_BAD_SPECULATION.ALL, TOPDOWN_BE_BOUND.ALL and TOPDOWN_RETIRING.ALL. They are available on GP counters. Export them to sysfs and facilitate the perf stat tool. $perf stat --topdown -- sleep 1 Performance counter stats for 'sleep 1': retiring bad speculation frontend bound backend bound 24.9% 16.8% 31.7% 26.6% 1.001224610 seconds time elapsed 0.001150000 seconds user 0.000000000 seconds sys Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1607457952-3519-1-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The cycle count of a timed LBR is always 1 in perf record -D. The cycle count is stored in the first 16 bits of the IA32_LBR_x_INFO register, but the get_lbr_cycles() return Boolean type. Use u16 to replace the Boolean type. Fixes: 47125db2 ("perf/x86/intel/lbr: Support Architectural LBR") Reported-by: NStephane Eranian <eranian@google.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-2-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
According to the event list from icelake_core_v1.09.json, the encoding of the RTM_RETIRED.ABORTED event on Ice Lake should be, "EventCode": "0xc9", "UMask": "0x04", "EventName": "RTM_RETIRED.ABORTED", Correct the wrong encoding. Fixes: 60176089 ("perf/x86/intel: Add Icelake support") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-1-kan.liang@linux.intel.com
-
- 03 12月, 2020 2 次提交
-
-
由 Stephane Eranian 提交于
The kernel cannot disambiguate when 2+ PEBS counters overflow at the same time. This is what the comment for this code suggests. However, I see the comparison is done with the unfiltered p->status which is a copy of IA32_PERF_GLOBAL_STATUS at the time of the sample. This register contains more than the PEBS counter overflow bits. It also includes many other bits which could also be set. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126110922.317681-2-namhyung@kernel.org
-
由 Namhyung Kim 提交于
The commit 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler") introduced this. It seems x86_pmu_stop can be called recursively (like when it losts some samples) like below: x86_pmu_stop intel_pmu_disable_event (x86_pmu_disable) intel_pmu_pebs_disable intel_pmu_drain_pebs_nhm (x86_pmu_drain_pebs_buffer) x86_pmu_stop While commit 35d1ce6b ("perf/x86/intel/ds: Fix x86_pmu_stop warning for large PEBS") fixed it for the normal cases, there's another path to call x86_pmu_stop() recursively when a PEBS error was detected (like two or more counters overflowed at the same time). Like in the Kan's previous fix, we can skip the interrupt accounting for large PEBS, so check the iregs which is set for PMI only. Fixes: 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler") Reported-by: NJohn Sperbeck <jsperbeck@google.com> Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126110922.317681-1-namhyung@kernel.org
-
- 17 11月, 2020 1 次提交
-
-
由 Sami Tolvanen 提交于
This change switches rapl to use PMU_FORMAT_ATTR, and fixes two other macros to use device_attribute instead of kobj_attribute to avoid callback type mismatches that trip indirect call checking with Clang's Control-Flow Integrity (CFI). Reported-by: NSedat Dilek <sedat.dilek@gmail.com> Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20201113183126.1239404-1-samitolvanen@google.com
-
- 11 11月, 2020 1 次提交
-
-
由 Arnd Bergmann 提交于
gcc -Wextra points out a duplicate initialization of one array member: arch/x86/events/intel/uncore_snb.c:478:37: warning: initialized field overwritten [-Woverride-init] 478 | [SNB_PCI_UNCORE_IMC_DATA_READS] = { SNB_UNCORE_PCI_IMC_DATA_WRITES_BASE, The only sensible explanation is that a duplicate 'READS' was used instead of the correct 'WRITES', so change it back. Fixes: 24633d90 ("perf/x86/intel/uncore: Add BW counters for GT, IA and IO breakdown") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201026215203.3893972-1-arnd@kernel.org
-
- 10 11月, 2020 4 次提交
-
-
由 Stephane Eranian 提交于
Starting with Arch Perfmon v5, the anythread filter on generic counters may be deprecated. The current kernel was exporting the any filter without checking. On Icelake, it means you could do cpu/event=0x3c,any/ even though the filter does not exist. This patch corrects the problem by relying on the CPUID 0xa leaf function to determine if anythread is supported or not as described in the Intel SDM Vol3b 18.2.5.1 AnyThread Deprecation section. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201028194247.3160610-1-eranian@google.com
-
由 Peter Zijlstra 提交于
Having pt_regs on-stack is unfortunate, it's 168 bytes. Since it isn't actually used, make it a static variable. This both gets if off the stack and ensures it gets 0 initialized, just in case someone does look at it. Reported-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151955.324273677@infradead.org
-
由 Peter Zijlstra 提交于
intel_pmu_drain_pebs_*() is typically called from handle_pmi_common(), both have an on-stack struct perf_sample_data, which is *big*. Rewire things so that drain_pebs() can use the one handle_pmi_common() has. Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151955.054099690@infradead.org
-
由 Peter Zijlstra 提交于
__perf_output_begin() has an on-stack struct perf_sample_data in the unlikely case it needs to generate a LOST record. However, every call to perf_output_begin() must already have a perf_sample_data on-stack. Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151954.985416146@infradead.org
-
- 29 10月, 2020 5 次提交
-
-
由 Kan Liang 提交于
The event CYCLE_ACTIVITY.STALLS_MEM_ANY (0x14a3) should be available on all 8 GP counters on ICL, but it's only scheduled on the first four counters due to the current ICL constraint table. Add a line for the CYCLE_ACTIVITY.STALLS_MEM_ANY event in the ICL constraint table. Correct the comments for the CYCLE_ACTIVITY.CYCLES_MEM_ANY event. Fixes: 60176089 ("perf/x86/intel: Add Icelake support") Reported-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201019164529.32154-1-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
For Rocket Lake, the MSR uncore, e.g., CBOX, ARB and CLOCKBOX, are the same as Tiger Lake. Share the perf code with it. For Rocket Lake and Tiger Lake, the 8th CBOX is not mapped into a different MSR space anymore. Add rkl_uncore_msr_init_box() to replace skl_uncore_msr_init_box(). The IMC uncore is the similar to Ice Lake. Add new PCIIDs of IMC for Rocket Lake. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201019153528.13850-4-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
From the perspective of Intel cstate residency counters, Rocket Lake is the same as Ice Lake and Tiger Lake. Share the code with them. Update the comments for Rocket Lake. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201019153528.13850-2-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
From the perspective of Intel PMU, Rocket Lake is the same as Ice Lake and Tiger Lake. Share the perf code with them. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201019153528.13850-1-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The new sample type, PERF_SAMPLE_DATA_PAGE_SIZE, requires the virtual address. Update the data->addr if the sample type is set. The large PEBS is disabled with the sample type, because perf doesn't support munmap tracking yet. The PEBS buffer for large PEBS cannot be flushed for each munmap. Wrong page size may be calculated. The large PEBS can be enabled later separately when munmap tracking is supported. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201001135749.2804-3-kan.liang@linux.intel.com
-
- 26 10月, 2020 1 次提交
-
-
由 Gabriel Krisman Bertazi 提交于
In preparation to remove TIF_IA32, stop using it in perf events code. Tested by running perf on 32-bit, 64-bit and x32 applications. Suggested-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NGabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20201004032536.1229030-2-krisman@collabora.com
-
- 03 10月, 2020 1 次提交
-
-
由 Kan Liang 提交于
It might be possible that different CPUs have different CPU metrics on a platform. In this case, writing the GLOBAL_CTRL_EN_PERF_METRICS bit to the GLOBAL_CTRL register of a CPU, which doesn't support the TopDown perf metrics feature, causes MSR access error. Current TopDown perf metrics feature is enumerated using the boot CPU's PERF_CAPABILITIES MSR. The MSR only indicates the boot CPU supports this feature. Check the PERF_CAPABILITIES MSR for each CPU. If any CPU doesn't support the perf metrics feature, disable the feature globally. Fixes: 59a854e2 ("perf/x86/intel: Support TopDown metrics on Ice Lake") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201001211711.25708-1-kan.liang@linux.intel.com
-
- 29 9月, 2020 6 次提交
-
-
由 Kan Liang 提交于
An error occues when sampling non-PEBS INST_RETIRED.PREC_DIST(0x01c0) event. perf record -e cpu/event=0xc0,umask=0x01/ -- sleep 1 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/event=0xc0,umask=0x01/). /bin/dmesg | grep -i perf may provide additional information. The idxmsk64 of the event is set to 0. The event never be successfully scheduled. The event should be limit to the fixed counter 0. Fixes: 60176089 ("perf/x86/intel: Add Icelake support") Reported-by: NYi, Ammy <ammy.yi@intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200928134726.13090-1-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The "MiB" result of the IMC free-running bandwidth events, uncore_imc_free_running/read/ and uncore_imc_free_running/write/ are 16 times too small. The "MiB" value equals the raw IMC free-running bandwidth counter value times a "scale" which is inaccurate. The IMC free-running bandwidth events should be incremented per 64B cache line, not DWs (4 bytes). The "scale" should be 6.103515625e-5. Fix the "scale" for both Snow Ridge and Ice Lake. Fixes: 2b3b76b5 ("perf/x86/intel/uncore: Add Ice Lake server uncore support") Fixes: ee49532b ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200928133240.12977-1-kan.liang@linux.intel.com
-
由 Alexander Antonov 提交于
Introduced early attributes /sys/devices/uncore_iio_<pmu_idx>/die* are initialized by skx_iio_set_mapping(), however, for example, for multiple segment platforms skx_iio_get_topology() returns -EPERM before a list of attributes in skx_iio_mapping_group will have been initialized. As a result the list is being NULL. Thus the warning "sysfs: (bin_)attrs not set by subsystem for group: uncore_iio_*/" appears and uncore_iio pmus are not available in sysfs. Clear IIO attr_update to properly handle the cases when topology information cannot be retrieved. Fixes: bb42b3d3 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping") Reported-by: NKyle Meyer <kyle.meyer@hpe.com> Suggested-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NAlexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAlexei Budankov <alexey.budankov@linux.intel.com> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20200928102133.61041-1-alexander.antonov@linux.intel.com
-
由 Kan Liang 提交于
The Jasper Lake processor is also a Tremont microarchitecture. From the perspective of Intel PMU, there is nothing changed compared with Elkhart Lake. Share the perf code with Elkhart Lake. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1601296242-32763-1-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
An oops is triggered by the fuzzy test. [ 327.853081] unchecked MSR access error: RDMSR from 0x70c at rIP: 0xffffffffc082c820 (uncore_msr_read_counter+0x10/0x50 [intel_uncore]) [ 327.853083] Call Trace: [ 327.853085] <IRQ> [ 327.853089] uncore_pmu_event_start+0x85/0x170 [intel_uncore] [ 327.853093] uncore_pmu_event_add+0x1a4/0x410 [intel_uncore] [ 327.853097] ? event_sched_in.isra.118+0xca/0x240 There are 2 GP counters for each CBOX, but the current code claims 4 counters. Accessing the invalid registers triggers the oops. Fixes: 6e394376 ("perf/x86/intel/uncore: Add Intel Icelake uncore support") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200925134905.8839-3-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
There are some updates for the Icelake model specific uncore performance monitors. (The update can be found at 10th generation intel core processors families specification update Revision 004, ICL068) 1) Counter 0 of ARB uncore unit is not available for software use 2) The global 'enable bit' (bit 29) and 'freeze bit' (bit 31) of MSR_UNC_PERF_GLOBAL_CTRL cannot be used to control counter behavior. Needs to use local enable in event select MSR. Accessing the modified bit/registers will be ignored by HW. Users may observe inaccurate results with the current code. The changes of the MSR_UNC_PERF_GLOBAL_CTRL imply that groups cannot be read atomically anymore. Although the error of the result for a group becomes a bit bigger, it still far lower than not using a group. The group support is still kept. Only Remove the *_box() related implementation. Since the counter 0 of ARB uncore unit is not available, update the MSR address for the ARB uncore unit. There is no change for IMC uncore unit, which only include free-running counters. Fixes: 6e394376 ("perf/x86/intel/uncore: Add Intel Icelake uncore support") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200925134905.8839-2-kan.liang@linux.intel.com
-