- 20 4月, 2021 9 次提交
-
-
由 Kan Liang 提交于
Each Hybrid PMU has to check and update its own event constraints before registration. The intel_pmu_check_event_constraints will be reused later to check the event constraints of each hybrid PMU. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-13-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Each Hybrid PMU has to check its own number of counters and mask fixed counters before registration. The intel_pmu_check_num_counters will be reused later to check the number of the counters for each hybrid PMU. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-12-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Different hybrid PMU may have different extra registers, e.g. Core PMU may have offcore registers, frontend register and ldlat register. Atom core may only have offcore registers and ldlat register. Each hybrid PMU should use its own extra_regs. An Intel Hybrid system should always have extra registers. Unconditionally allocate shared_regs for Intel Hybrid system. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-11-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The events are different among hybrid PMUs. Each hybrid PMU should use its own event constraints. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-10-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The unconstrained value depends on the number of GP and fixed counters. Each hybrid PMU should use its own unconstrained. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-8-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The number of GP and fixed counters are different among hybrid PMUs. Each hybrid PMU should use its own counter related information. When handling a certain hybrid PMU, apply the number of counters from the corresponding hybrid PMU. When reserving the counters in the initialization of a new event, reserve all possible counters. The number of counter recored in the global x86_pmu is for the architecture counters which are available for all hybrid PMUs. KVM doesn't support the hybrid PMU yet. Return the number of the architecture counters for now. For the functions only available for the old platforms, e.g., intel_pmu_drain_pebs_nhm(), nothing is changed. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-7-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The intel_ctrl is the counter mask of a PMU. The PMU counter information may be different among hybrid PMUs, each hybrid PMU should use its own intel_ctrl to check and access the counters. When handling a certain hybrid PMU, apply the intel_ctrl from the corresponding hybrid PMU. When checking the HW existence, apply the PMU and number of counters from the corresponding hybrid PMU as well. Perf will check the HW existence for each Hybrid PMU before registration. Expose the check_hw_exists() for a later patch. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-6-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Some platforms, e.g. Alder Lake, have hybrid architecture. Although most PMU capabilities are the same, there are still some unique PMU capabilities for different hybrid PMUs. Perf should register a dedicated pmu for each hybrid PMU. Add a new struct x86_hybrid_pmu, which saves the dedicated pmu and capabilities for each hybrid PMU. The architecture MSR, MSR_IA32_PERF_CAPABILITIES, only indicates the architecture features which are available on all hybrid PMUs. The architecture features are stored in the global x86_pmu.intel_cap. For Alder Lake, the model-specific features are perf metrics and PEBS-via-PT. The corresponding bits of the global x86_pmu.intel_cap should be 0 for these two features. Perf should not use the global intel_cap to check the features on a hybrid system. Add a dedicated intel_cap in the x86_hybrid_pmu to store the model-specific capabilities. Use the dedicated intel_cap to replace the global intel_cap for thse two features. The dedicated intel_cap will be set in the following "Add Alder Lake Hybrid support" patch. Add is_hybrid() to distinguish a hybrid system. ADL may have an alternative configuration. With that configuration, the X86_FEATURE_HYBRID_CPU is not set. Perf cannot rely on the feature bit. Add a new static_key_false, perf_is_hybrid, to indicate a hybrid system. It will be assigned in the following "Add Alder Lake Hybrid support" patch as well. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-5-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Some platforms, e.g. Alder Lake, have hybrid architecture. In the same package, there may be more than one type of CPU. The PMU capabilities are different among different types of CPU. Perf will register a dedicated PMU for each type of CPU. Add a 'pmu' variable in the struct cpu_hw_events to track the dedicated PMU of the current CPU. Current x86_get_pmu() use the global 'pmu', which will be broken on a hybrid platform. Modify it to apply the 'pmu' of the specific CPU. Initialize the per-CPU 'pmu' variable with the global 'pmu'. There is nothing changed for the non-hybrid platforms. The is_x86_event() will be updated in the later patch ("perf/x86: Register hybrid PMUs") for hybrid platforms. For the non-hybrid platforms, nothing is changed here. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-4-git-send-email-kan.liang@linux.intel.com
-
- 16 4月, 2021 1 次提交
-
-
由 Kan Liang 提交于
The 'running' variable is only used in the P4 PMU. Current perf sets the variable in the critical function x86_pmu_start(), which wastes cycles for everybody not running on P4. Move cpuc->running into the P4 specific p4_pmu_enable_event(). Add a static per-CPU 'p4_running' variable to replace the 'running' variable in the struct cpu_hw_events. Saves space for the generic structure. The p4_pmu_enable_all() also invokes the p4_pmu_enable_event(), but it should not set cpuc->running. Factor out __p4_pmu_enable_event() for p4_pmu_enable_all(). Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618410990-21383-1-git-send-email-kan.liang@linux.intel.com
-
- 02 4月, 2021 6 次提交
-
-
由 Alexander Antonov 提交于
IIO stacks to PMON mapping on Skylake servers is exposed through introduced early attributes /sys/devices/uncore_iio_<pmu_idx>/dieX, where dieX is a file which holds "Segment:Root Bus" for PCIe root port which can be monitored by that IIO PMON block. These sysfs attributes are disabled for multiple segment topologies except VMD domains which start at 0x10000. This patch removes the limitation and enables IIO stacks to PMON mapping for multi-segment Skylake servers by introducing segment-aware intel_uncore_topology structure and attributing the topology configuration to the segment in skx_iio_get_topology() function. Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NAlexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NKyle Meyer <kyle.meyer@hpe.com> Link: https://lkml.kernel.org/r/20210323150507.2013-1-alexander.antonov@linux.intel.com
-
由 Kan Liang 提交于
The discovery table provides the generic uncore block information for the MMIO type of uncore blocks, which is good enough to provide basic uncore support. The box control field is composed of the BAR address and box control offset. When initializing the uncore blocks, perf should ioremap the address from the box control field. Implement the generic support for the MMIO type of uncore block. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-6-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The discovery table provides the generic uncore block information for the PCI type of uncore blocks, which is good enough to provide basic uncore support. The PCI BUS and DEVFN information can be retrieved from the box control field. Introduce the uncore_pci_pmus_register() to register all the PCICFG type of uncore blocks. The old PCI probe/remove way is dropped. The PCI BUS and DEVFN information are different among dies. Add box_ctls to store the box control field of each die. Add a new BUS notifier for the PCI type of uncore block to support the hotplug. If the device is "hot remove", the corresponding registered PMU has to be unregistered. Perf cannot locate the PMU by searching a const pci_device_id table, because the discovery tables don't provide such information. Introduce uncore_pci_find_dev_pmu_from_types() to search the whole uncore_pci_uncores for the PMU. Implement generic support for the PCI type of uncore block. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-5-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Perf will use a similar method to the PCI sub driver to register the PMUs for the PCI type of uncore blocks. The method requires a BUS notifier to support hotplug. The current BUS notifier cannot be reused, because it searches a const id_table for the corresponding registered PMU. The PCI type of uncore blocks in the discovery tables doesn't provide an id_table. Factor out uncore_bus_notify() and add the pointer of an id_table as a parameter. The uncore_bus_notify() will be reused in the following patch. The current BUS notifier is only used by the PCI sub driver. Its name is too generic. Rename it to uncore_pci_sub_notifier, which is specific for the PCI sub driver. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-4-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The discovery table provides the generic uncore block information for the MSR type of uncore blocks, e.g., the counter width, the number of counters, the location of control/counter registers, which is good enough to provide basic uncore support. It can be used as a fallback solution when the kernel doesn't support a platform. The name of the uncore box cannot be retrieved from the discovery table. uncore_type_&typeID_&boxID will be used as its name. Save the type ID and the box ID information in the struct intel_uncore_type. Factor out uncore_get_pmu_name() to handle different naming methods. Implement generic support for the MSR type of uncore block. Some advanced features, such as filters and constraints, cannot be retrieved from discovery tables. Features that rely on that information are not be supported here. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-3-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
A self-describing mechanism for the uncore PerfMon hardware has been introduced with the latest Intel platforms. By reading through an MMIO page worth of information, perf can 'discover' all the standard uncore PerfMon registers in a machine. The discovery mechanism relies on BIOS's support. With a proper BIOS, a PCI device with the unique capability ID 0x23 can be found on each die. Perf can retrieve the information of all available uncore PerfMons from the device via MMIO. The information is composed of one global discovery table and several unit discovery tables. - The global discovery table includes global uncore information of the die, e.g., the address of the global control register, the offset of the global status register, the number of uncore units, the offset of unit discovery tables, etc. - The unit discovery table includes generic uncore unit information, e.g., the access type, the counter width, the address of counters, the address of the counter control, the unit ID, the unit type, etc. The unit is also called "box" in the code. Perf can provide basic uncore support based on this information with the following patches. To locate the PCI device with the discovery tables, check the generic PCI ID first. If it doesn't match, go through the entire PCI device tree and locate the device with the unique capability ID. The uncore information is similar among dies. To save parsing time and space, only completely parse and store the discovery tables on the first die and the first box of each die. The parsed information is stored in an RB tree structure, intel_uncore_discovery_type. The size of the stored discovery tables varies among platforms. It's around 4KB for a Sapphire Rapids server. If a BIOS doesn't support the 'discovery' mechanism, the uncore driver will exit with -ENODEV. There is nothing changed. Add a module parameter to disable the discovery feature. If a BIOS gets the discovery tables wrong, users can have an option to disable the feature. For the current patchset, the uncore driver will exit with -ENODEV. In the future, it may fall back to the hardcode uncore driver on a known platform. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-2-git-send-email-kan.liang@linux.intel.com
-
- 06 3月, 2021 1 次提交
-
-
由 Kan Liang 提交于
To supply a PID/TID for large PEBS, it requires flushing the PEBS buffer in a context switch. For normal LBRs, a context switch can flip the address space and LBR entries are not tagged with an identifier, we need to wipe the LBR, even for per-cpu events. For LBR callstack, save/restore the stack is required during a context switch. Set PERF_ATTACH_SCHED_CB for the event with large PEBS & LBR. Fixes: 9c964efa ("perf/x86/intel: Drain the PEBS buffer during context switches") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20201130193842.10569-2-kan.liang@linux.intel.com
-
- 10 2月, 2021 1 次提交
-
-
由 Jim Mattson 提交于
Cascade Lake Xeon parts have the same model number as Skylake Xeon parts, so they are tagged with the intel_pebs_isolation quirk. However, as with Skylake Xeon H0 stepping parts, the PEBS isolation issue is fixed in all microcode versions. Add the Cascade Lake Xeon steppings (5, 6, and 7) to the isolation_ucodes[] table so that these parts benefit from Andi's optimization in commit 9b545c04 ("perf/x86/kvm: Avoid unnecessary work in guest filtering"). Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/20210205191324.2889006-1-jmattson@google.com
-
- 01 2月, 2021 5 次提交
-
-
由 Kan Liang 提交于
With Architectural Performance Monitoring Version 5, CPUID 10.ECX cpu leaf indicates the fixed counter enumeration. This extends the previous count to a bitmap which allows disabling even lower fixed counters. It could be used by a Hypervisor. The existing intel_ctrl variable is used to remember the bitmask of the counters. All code that reads all counters is fixed to check this extra bitmask. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Originally-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-6-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Add perf core PMU support for the Intel Sapphire Rapids server, which is the successor of the Intel Ice Lake server. The enabling code is based on Ice Lake, but there are several new features introduced. The event encoding is changed and simplified, e.g., the event codes which are below 0x90 are restricted to counters 0-3. The event codes which above 0x90 are likely to have no restrictions. The event constraints, extra_regs(), and hardware cache events table are changed accordingly. A new Precise Distribution (PDist) facility is introduced, which further minimizes the skid when a precise event is programmed on the GP counter 0. Enable the Precise Distribution (PDist) facility with :ppp event. For this facility to work, the period must be initialized with a value larger than 127. Add spr_limit_period() to apply the limit for :ppp event. Two new data source fields, data block & address block, are added in the PEBS Memory Info Record for the load latency event. To enable the feature, - An auxiliary event has to be enabled together with the load latency event on Sapphire Rapids. A new flag PMU_FL_MEM_LOADS_AUX is introduced to indicate the case. A new event, mem-loads-aux, is exposed to sysfs for the user tool. Add a check in hw_config(). If the auxiliary event is not detected, return an unique error -ENODATA. - The union perf_mem_data_src is extended to support the new fields. - Ice Lake and earlier models do not support block information, but the fields may be set by HW on some machines. Add pebs_no_block to explicitly indicate the previous platforms which don't support the new block fields. Accessing the new block fields are ignored on those platforms. A new store Latency facility is introduced, which leverages the PEBS facility where it can provide additional information about sampled stores. The additional information includes the data address, memory auxiliary info (e.g. Data Source, STLB miss) and the latency of the store access. To enable the facility, the new event (0x02cd) has to be programed on the GP counter 0. A new flag PERF_X86_EVENT_PEBS_STLAT is introduced to indicate the event. The store_latency_data() is introduced to parse the memory auxiliary info. The layout of access latency field of PEBS Memory Info Record has been changed. Two latency, instruction latency (bit 15:0) and cache access latency (bit 47:32) are recorded. - The cache access latency is similar to previous memory access latency. For loads, the latency starts by the actual cache access until the data is returned by the memory subsystem. For stores, the latency starts when the demand write accesses the L1 data cache and lasts until the cacheline write is completed in the memory subsystem. The cache access latency is stored in low 32bits of the sample type PERF_SAMPLE_WEIGHT_STRUCT. - The instruction latency starts by the dispatch of the load operation for execution and lasts until completion of the instruction it belongs to. Add a new flag PMU_FL_INSTR_LATENCY to indicate the instruction latency support. The instruction latency is stored in the bit 47:32 of the sample type PERF_SAMPLE_WEIGHT_STRUCT. Extends the PERF_METRICS MSR to feature TMA method level 2 metrics. The lower half of the register is the TMA level 1 metrics (legacy). The upper half is also divided into four 8-bit fields for the new level 2 metrics. Expose all eight Topdown metrics events to user space. The full description for the SPR features can be found at Intel Architecture Instruction Set Extensions and Future Features Programming Reference, 319433-041. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-5-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Intel Sapphire Rapids server will introduce 8 metrics events. Intel Ice Lake only supports 4 metrics events. A perf tool user may mistakenly use the unsupported events via RAW format on Ice Lake. The user can still get a value from the unsupported Topdown metrics event once the following Sapphire Rapids enabling patch is applied. To enable the 8 metrics events on Intel Sapphire Rapids, the INTEL_TD_METRIC_MAX has to be updated, which impacts the is_metric_event(). The is_metric_event() is a generic function. On Ice Lake, the newly added SPR metrics events will be mistakenly accepted as metric events on creation. At runtime, the unsupported Topdown metrics events will be updated. Add a variable num_topdown_events in x86_pmu to indicate the available number of the Topdown metrics event on the platform. Apply the number into is_metric_event(). Only the supported Topdown metrics events should be created as metrics events. Apply the num_topdown_events in icl_update_topdown_event() as well. The function can be reused by the following patch. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-4-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Similar to Ice Lake, Intel Sapphire Rapids server also supports the topdown performance metrics feature. The difference is that Intel Sapphire Rapids server extends the PERF_METRICS MSR to feature TMA method level two metrics, which will introduce 8 metrics events. Current icl_update_topdown_event() only check 4 level one metrics events. Factor out intel_update_topdown_event() to facilitate the code sharing between Ice Lake and Sapphire Rapids. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-3-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Current PERF_SAMPLE_WEIGHT sample type is very useful to expresses the cost of an action represented by the sample. This allows the profiler to scale the samples to be more informative to the programmer. It could also help to locate a hotspot, e.g., when profiling by memory latencies, the expensive load appear higher up in the histograms. But current PERF_SAMPLE_WEIGHT sample type is solely determined by one factor. This could be a problem, if users want two or more factors to contribute to the weight. For example, Golden Cove core PMU can provide both the instruction latency and the cache Latency information as factors for the memory profiling. For current X86 platforms, although meminfo::latency is defined as a u64, only the lower 32 bits include the valid data in practice (No memory access could last than 4G cycles). The higher 32 bits can be used to store new factors. Add a new sample type, PERF_SAMPLE_WEIGHT_STRUCT, to indicate the new sample weight structure. It shares the same space as the PERF_SAMPLE_WEIGHT sample type. Users can apply either the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type to retrieve the sample weight, but they cannot apply both sample types simultaneously. Currently, only X86 and PowerPC use the PERF_SAMPLE_WEIGHT sample type. - For PowerPC, there is nothing changed for the PERF_SAMPLE_WEIGHT sample type. There is no effect for the new PERF_SAMPLE_WEIGHT_STRUCT sample type. PowerPC can re-struct the weight field similarly later. - For X86, the same value will be dumped for the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type for now. The following patches will apply the new factors for the PERF_SAMPLE_WEIGHT_STRUCT sample type. The field in the union perf_sample_weight should be shared among different architectures. A generic name is required, but it's hard to abstract a name that applies to all architectures. For example, on X86, the fields are to store all kinds of latency. While on PowerPC, it stores MMCRA[TECX/TECM], which should not be latency. So a general name prefix 'var$NUM' is used here. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-2-git-send-email-kan.liang@linux.intel.com
-
- 28 1月, 2021 2 次提交
-
-
由 Peter Zijlstra 提交于
Perfmon-v4 counter freezing is fundamentally broken; remove this default disabled code to make sure nobody uses it. The feature is called Freeze-on-PMI in the SDM, and if it would do that, there wouldn't actually be a problem, *however* it does something subtly different. It globally disables the whole PMU when it raises the PMI, not when the PMI hits. This means there's a window between the PMI getting raised and the PMI actually getting served where we loose events and this violates the perf counter independence. That is, a counting event should not result in a different event count when there is a sampling event co-scheduled. This is known to break existing software (RR). Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
由 Like Xu 提交于
Clean up that CONFIG_RETPOLINE crud and replace the indirect call x86_pmu.guest_get_msrs with static_call(). Reported-by: Nkernel test robot <lkp@intel.com> Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NLike Xu <like.xu@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210125121458.181635-1-like.xu@linux.intel.com
-
- 14 1月, 2021 2 次提交
-
-
由 Steve Wahl 提交于
The registers used to determine which die a pci bus belongs to don't contain enough information to uniquely specify more than 8 dies, so when more than 8 dies are present, use NUMA information instead. Continue to use the previous method for 8 or fewer because it works there, and covers cases of NUMA being disabled. Signed-off-by: NSteve Wahl <steve.wahl@hpe.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210108153549.108989-3-steve.wahl@hpe.com
-
由 Steve Wahl 提交于
The phys_id isn't really used other than to map to a logical die id. Calculate the logical die id earlier, and store that instead of the phys_id. Signed-off-by: NSteve Wahl <steve.wahl@hpe.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210108153549.108989-2-steve.wahl@hpe.com
-
- 10 12月, 2020 3 次提交
-
-
由 Kan Liang 提交于
Tremont has four L1 Topdown events, TOPDOWN_FE_BOUND.ALL, TOPDOWN_BAD_SPECULATION.ALL, TOPDOWN_BE_BOUND.ALL and TOPDOWN_RETIRING.ALL. They are available on GP counters. Export them to sysfs and facilitate the perf stat tool. $perf stat --topdown -- sleep 1 Performance counter stats for 'sleep 1': retiring bad speculation frontend bound backend bound 24.9% 16.8% 31.7% 26.6% 1.001224610 seconds time elapsed 0.001150000 seconds user 0.000000000 seconds sys Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1607457952-3519-1-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The cycle count of a timed LBR is always 1 in perf record -D. The cycle count is stored in the first 16 bits of the IA32_LBR_x_INFO register, but the get_lbr_cycles() return Boolean type. Use u16 to replace the Boolean type. Fixes: 47125db2 ("perf/x86/intel/lbr: Support Architectural LBR") Reported-by: NStephane Eranian <eranian@google.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-2-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
According to the event list from icelake_core_v1.09.json, the encoding of the RTM_RETIRED.ABORTED event on Ice Lake should be, "EventCode": "0xc9", "UMask": "0x04", "EventName": "RTM_RETIRED.ABORTED", Correct the wrong encoding. Fixes: 60176089 ("perf/x86/intel: Add Icelake support") Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-1-kan.liang@linux.intel.com
-
- 03 12月, 2020 2 次提交
-
-
由 Stephane Eranian 提交于
The kernel cannot disambiguate when 2+ PEBS counters overflow at the same time. This is what the comment for this code suggests. However, I see the comparison is done with the unfiltered p->status which is a copy of IA32_PERF_GLOBAL_STATUS at the time of the sample. This register contains more than the PEBS counter overflow bits. It also includes many other bits which could also be set. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126110922.317681-2-namhyung@kernel.org
-
由 Namhyung Kim 提交于
The commit 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler") introduced this. It seems x86_pmu_stop can be called recursively (like when it losts some samples) like below: x86_pmu_stop intel_pmu_disable_event (x86_pmu_disable) intel_pmu_pebs_disable intel_pmu_drain_pebs_nhm (x86_pmu_drain_pebs_buffer) x86_pmu_stop While commit 35d1ce6b ("perf/x86/intel/ds: Fix x86_pmu_stop warning for large PEBS") fixed it for the normal cases, there's another path to call x86_pmu_stop() recursively when a PEBS error was detected (like two or more counters overflowed at the same time). Like in the Kan's previous fix, we can skip the interrupt accounting for large PEBS, so check the iregs which is set for PMI only. Fixes: 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler") Reported-by: NJohn Sperbeck <jsperbeck@google.com> Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126110922.317681-1-namhyung@kernel.org
-
- 17 11月, 2020 1 次提交
-
-
由 Sami Tolvanen 提交于
This change switches rapl to use PMU_FORMAT_ATTR, and fixes two other macros to use device_attribute instead of kobj_attribute to avoid callback type mismatches that trip indirect call checking with Clang's Control-Flow Integrity (CFI). Reported-by: NSedat Dilek <sedat.dilek@gmail.com> Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20201113183126.1239404-1-samitolvanen@google.com
-
- 11 11月, 2020 1 次提交
-
-
由 Arnd Bergmann 提交于
gcc -Wextra points out a duplicate initialization of one array member: arch/x86/events/intel/uncore_snb.c:478:37: warning: initialized field overwritten [-Woverride-init] 478 | [SNB_PCI_UNCORE_IMC_DATA_READS] = { SNB_UNCORE_PCI_IMC_DATA_WRITES_BASE, The only sensible explanation is that a duplicate 'READS' was used instead of the correct 'WRITES', so change it back. Fixes: 24633d90 ("perf/x86/intel/uncore: Add BW counters for GT, IA and IO breakdown") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201026215203.3893972-1-arnd@kernel.org
-
- 10 11月, 2020 4 次提交
-
-
由 Stephane Eranian 提交于
Starting with Arch Perfmon v5, the anythread filter on generic counters may be deprecated. The current kernel was exporting the any filter without checking. On Icelake, it means you could do cpu/event=0x3c,any/ even though the filter does not exist. This patch corrects the problem by relying on the CPUID 0xa leaf function to determine if anythread is supported or not as described in the Intel SDM Vol3b 18.2.5.1 AnyThread Deprecation section. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201028194247.3160610-1-eranian@google.com
-
由 Peter Zijlstra 提交于
Having pt_regs on-stack is unfortunate, it's 168 bytes. Since it isn't actually used, make it a static variable. This both gets if off the stack and ensures it gets 0 initialized, just in case someone does look at it. Reported-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151955.324273677@infradead.org
-
由 Peter Zijlstra 提交于
intel_pmu_drain_pebs_*() is typically called from handle_pmi_common(), both have an on-stack struct perf_sample_data, which is *big*. Rewire things so that drain_pebs() can use the one handle_pmi_common() has. Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151955.054099690@infradead.org
-
由 Peter Zijlstra 提交于
__perf_output_begin() has an on-stack struct perf_sample_data in the unlikely case it needs to generate a LOST record. However, every call to perf_output_begin() must already have a perf_sample_data on-stack. Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151954.985416146@infradead.org
-
- 29 10月, 2020 2 次提交
-
-
由 Kan Liang 提交于
The event CYCLE_ACTIVITY.STALLS_MEM_ANY (0x14a3) should be available on all 8 GP counters on ICL, but it's only scheduled on the first four counters due to the current ICL constraint table. Add a line for the CYCLE_ACTIVITY.STALLS_MEM_ANY event in the ICL constraint table. Correct the comments for the CYCLE_ACTIVITY.CYCLES_MEM_ANY event. Fixes: 60176089 ("perf/x86/intel: Add Icelake support") Reported-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201019164529.32154-1-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
For Rocket Lake, the MSR uncore, e.g., CBOX, ARB and CLOCKBOX, are the same as Tiger Lake. Share the perf code with it. For Rocket Lake and Tiger Lake, the 8th CBOX is not mapped into a different MSR space anymore. Add rkl_uncore_msr_init_box() to replace skl_uncore_msr_init_box(). The IMC uncore is the similar to Ice Lake. Add new PCIIDs of IMC for Rocket Lake. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201019153528.13850-4-kan.liang@linux.intel.com
-