“6f09801ef97d58ac9a2cc4962edbb812287a9df5”上不存在“projects/ThrowTheSwitch/imports.yml”
- 18 8月, 2020 3 次提交
-
-
由 Kan Liang 提交于
Intro ===== The TopDown Microarchitecture Analysis (TMA) Method is a structured analysis methodology to identify critical performance bottlenecks in out-of-order processors. Current perf has supported the method. The method works well, but there is one problem. To collect the TopDown events, several GP counters have to be used. If a user wants to collect other events at the same time, the multiplexing probably be triggered, which impacts the accuracy. To free up the scarce GP counters, the hardware TopDown metrics feature is introduced from Ice Lake. The hardware implements an additional "metrics" register and a new Fixed Counter 3 that measures pipeline "slots". The TopDown events can be calculated from them instead. Events ====== The level 1 TopDown has four metrics. There is no event-code assigned to the TopDown metrics. Four metric events are exported as separate perf events, which map to the internal "metrics" counter register. Those events do not exist in hardware, but can be allocated by the scheduler. For the event mapping, a special 0x00 event code is used, which is reserved for fake events. The metric events start from umask 0x10. When setting up the metric events, they point to the Fixed Counter 3. They have to be specially handled. - Add the update_topdown_event() callback to read the additional metrics MSR and generate the metrics. - Add the set_topdown_event_period() callback to initialize metrics MSR and the fixed counter 3. - Add a variable n_metric_event to track the number of the accepted metrics events. The sharing between multiple users of the same metric without multiplexing is not allowed. - Only enable/disable the fixed counter 3 when there are no other active TopDown events, which avoid the unnecessary writing of the fixed control register. - Disable the PMU when reading the metrics event. The metrics MSR and the fixed counter 3 are read separately. The values may be modified by an NMI. All four metric events don't support sampling. Since they will be handled specially for event update, a flag PERF_X86_EVENT_TOPDOWN is introduced to indicate this case. The slots event can support both sampling and counting. For counting, the flag is also applied. For sampling, it will be handled normally as other normal events. Groups ====== The slots event is required in a Topdown group. To avoid reading the METRICS register multiple times, the metrics and slots value can only be updated by slots event in a group. All active slots and metrics events will be updated one time. Therefore, the slots event must be before any metric events in a Topdown group. NMI ====== The METRICS related register may be overflow. The bit 48 of the STATUS register will be set. If so, PERF_METRICS and Fixed counter 3 are required to be reset. The patch also update all active slots and metrics events in the NMI handler. The update_topdown_event() has to read two registers separately. The values may be modified by an NMI. PMU has to be disabled before calling the function. RDPMC ====== RDPMC is temporarily disabled. A later patch will enable it. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200723171117.9918-9-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Currently, the if-else is used in the intel_pmu_disable/enable_event to check the type of an event. It works well, but with more and more types added later, e.g., perf metrics, compared to the switch statement, the if-else may impair the readability of the code. There is no harm to use the switch statement to replace the if-else here. Also, some optimizing compilers may compile a switch statement into a jump-table which is more efficient than if-else for a large number of cases. The performance gain may not be observed for now, because the number of cases is only 5, but the benefits may be observed with more and more types added in the future. Use switch to replace the if-else in the intel_pmu_disable/enable_event. If the idx is invalid, print a warning. For the case INTEL_PMC_IDX_FIXED_BTS in intel_pmu_disable_event, don't need to check the event->attr.precise_ip. Use return for the case. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200723171117.9918-7-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Magic numbers are used in the current NMI handler for the global status bit. Use a meaningful name to replace the magic numbers to improve the readability of the code. Remove a Tab for all GLOBAL_STATUS_* and INTEL_PMC_IDX_FIXED_BTS macros to reduce the length of the line. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200723171117.9918-3-kan.liang@linux.intel.com
-
- 08 7月, 2020 15 次提交
-
-
由 Kan Liang 提交于
Reading LBR registers in a perf NMI handler for a non-PEBS event causes a high overhead because the number of LBR registers is huge. To reduce the overhead, the XSAVES instruction should be used to replace the LBR registers' reading method. The XSAVES buffer used for LBR read has to be per-CPU because the NMI handler invoked the lbr_read(). The existing task_ctx_data buffer cannot be used which is per-task and only be allocated for the LBR call stack mode. A new lbr_xsave pointer is introduced in the cpu_hw_events as an XSAVES buffer for LBR read. The XSAVES buffer should be allocated only when LBR is used by a non-PEBS event on the CPU because the total size of the lbr_xsave is not small (~1.4KB). The XSAVES buffer is allocated when a non-PEBS event is added, but it is lazily released in x86_release_hardware() when perf releases the entire PMU hardware resource, because perf may frequently schedule the event, e.g. high context switch. The lazy release method reduces the overhead of frequently allocate/free the buffer. If the lbr_xsave fails to be allocated, roll back to normal Arch LBR lbr_read(). Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NDave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-24-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
In the LBR call stack mode, LBR information is used to reconstruct a call stack. To get the complete call stack, perf has to save/restore all LBR registers during a context switch. Due to a large number of the LBR registers, this process causes a high CPU overhead. To reduce the CPU overhead during a context switch, use the XSAVES/XRSTORS instructions. Every XSAVE area must follow a canonical format: the legacy region, an XSAVE header and the extended region. Although the LBR information is only kept in the extended region, a space for the legacy region and XSAVE header is still required. Add a new dedicated structure for LBR XSAVES support. Before enabling XSAVES support, the size of the LBR state has to be sanity checked, because: - the size of the software structure is calculated from the max number of the LBR depth, which is enumerated by the CPUID leaf for Arch LBR. The size of the LBR state is enumerated by the CPUID leaf for XSAVE support of Arch LBR. If the values from the two CPUID leaves are not consistent, it may trigger a buffer overflow. For example, a hypervisor may unconsciously set inconsistent values for the two emulated CPUID. - unlike other state components, the size of an LBR state depends on the max number of LBRs, which may vary from generation to generation. Expose the function xfeature_size() for the sanity check. The LBR XSAVES support will be disabled if the size of the LBR state enumerated by CPUID doesn't match with the size of the software structure. The XSAVE instruction requires 64-byte alignment for state buffers. A new macro is added to reflect the alignment requirement. A 64-byte aligned kmem_cache is created for architecture LBR. Currently, the structure for each state component is maintained in fpu/types.h. The structure for the new LBR state component should be maintained in the same place. Move structure lbr_entry to fpu/types.h as well for broader sharing. Add dedicated lbr_save/lbr_restore functions for LBR XSAVES support, which invokes the corresponding xstate helpers to XSAVES/XRSTORS LBR information at the context switch when the call stack mode is enabled. Since the XSAVES/XRSTORS instructions will be eventually invoked, the dedicated functions is named with '_xsaves'/'_xrstors' postfix. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NDave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-23-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
A new kmem_cache method has replaced the kzalloc() to allocate the PMU specific data. The task_ctx_size is not required anymore. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-19-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
A new kmem_cache method is introduced to allocate the PMU specific data task_ctx_data, which requires the PMU specific code to create a kmem_cache. Currently, the task_ctx_data is only used by the Intel LBR call stack feature, which is introduced since Haswell. The kmem_cache should be only created for Haswell and later platforms. There is no alignment requirement for the existing platforms. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-18-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Last Branch Records (LBR) enables recording of software path history by logging taken branches and other control flows within architectural registers now. Intel CPUs have had model-specific LBR for quite some time, but this evolves them into an architectural feature now. The main improvements of Architectural LBR implemented includes: - Linux kernel can support the LBR features without knowing the model number of the current CPU. - Architectural LBR capabilities can be enumerated by CPUID. The lbr_ctl_map is based on the CPUID Enumeration. - The possible LBR depth can be retrieved from CPUID enumeration. The max value is written to the new MSR_ARCH_LBR_DEPTH as the number of LBR entries. - A new IA32_LBR_CTL MSR is introduced to enable and configure LBRs, which replaces the IA32_DEBUGCTL[bit 0] and the LBR_SELECT MSR. - Each LBR record or entry is still comprised of three MSRs, IA32_LBR_x_FROM_IP, IA32_LBR_x_TO_IP and IA32_LBR_x_TO_IP. But they become the architectural MSRs. - Architectural LBR is stack-like now. Entry 0 is always the youngest branch, entry 1 the next youngest... The TOS MSR has been removed. The way to enable/disable Architectural LBR is similar to the previous model-specific LBR. __intel_pmu_lbr_enable/disable() can be reused, but some modifications are required, which include: - MSR_ARCH_LBR_CTL is used to enable and configure the Architectural LBR. - When checking the value of the IA32_DEBUGCTL MSR, ignoring the DEBUGCTLMSR_LBR (bit 0) for Architectural LBR, which has no meaning and always return 0. - The FREEZE_LBRS_ON_PMI has to be explicitly set/clear, because MSR_IA32_DEBUGCTLMSR is not touched in __intel_pmu_lbr_disable() for Architectural LBR. - Only MSR_ARCH_LBR_CTL is cleared in __intel_pmu_lbr_disable() for Architectural LBR. Some Architectural LBR dedicated functions are implemented to reset/read/save/restore LBR. - For reset, writing to the ARCH_LBR_DEPTH MSR clears all Arch LBR entries, which is a lot faster and can improve the context switch latency. - For read, the branch type information can be retrieved from the MSR_ARCH_LBR_INFO_*. But it's not fully compatible due to OTHER_BRANCH type. The software decoding is still required for the OTHER_BRANCH case. LBR records are stored in the age order as well. Reuse intel_pmu_store_lbr(). Check the CPUID enumeration before accessing the corresponding bits in LBR_INFO. - For save/restore, applying the fast reset (writing ARCH_LBR_DEPTH). Reading 'lbr_from' of entry 0 instead of the TOS MSR to check if the LBR registers are reset in the deep C-state. If 'the deep C-state reset' bit is not set in CPUID enumeration, ignoring the check. XSAVE support for Architectural LBR will be implemented later. The number of LBR entries cannot be hardcoded anymore, which should be retrieved from CPUID enumeration. A new structure x86_perf_task_context_arch_lbr is introduced for Architectural LBR. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-15-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The way to store the LBR information from a PEBS LBR record can be reused in Architecture LBR, because - The LBR information is stored like a stack. Entry 0 is always the youngest branch. - The layout of the LBR INFO MSR is similar. The LBR information may be retrieved from either the LBR registers (non-PEBS event) or a buffer (PEBS event). Extend rdlbr_*() to support both methods. Explicitly check the invalid entry (0s), which can avoid unnecessary MSR access if using a non-PEBS event. For a PEBS event, the check should slightly improve the performance as well. The invalid entries are cut. The intel_pmu_lbr_filter() doesn't need to check and filter them out. Cannot share the function with current model-specific LBR read, because the direction of the LBR growth is opposite. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-14-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The previous model-specific LBR and Architecture LBR (legacy way) use a similar method to save/restore the LBR information, which directly accesses the LBR registers. The codes which read/write a set of LBR registers can be shared between them. Factor out two functions which are used to read/write a set of LBR registers. Add lbr_info into structure x86_pmu, and use it to replace the hardcoded LBR INFO MSR, because the LBR INFO MSR address of the previous model-specific LBR is different from Architecture LBR. The MSR address should be assigned at boot time. For now, only Sky Lake and later platforms have the LBR INFO MSR. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-13-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The {rd,wr}lbr_{to,from} wrappers are invoked in hot paths, e.g. context switch and NMI handler. They should be always inline to achieve better performance. However, the CONFIG_OPTIMIZE_INLINING allows the compiler to uninline functions marked 'inline'. Mark the {rd,wr}lbr_{to,from} wrappers as __always_inline to force inline the wrappers. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-12-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Current LBR information in the structure x86_perf_task_context is stored in a different format from the PEBS LBR record and Architecture LBR, which prevents the sharing of the common codes. Use the format of the PEBS LBR record as a unified format. Use a generic name lbr_entry to replace pebs_lbr_entry. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-11-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
An IA32_LBR_CTL is introduced for Architecture LBR to enable and config LBR registers to replace the previous LBR_SELECT. All the related members in struct cpu_hw_events and struct x86_pmu have to be renamed. Some new macros are added to reflect the layout of LBR_CTL. The mapping from PERF_SAMPLE_BRANCH_* to the corresponding bits in LBR_CTL MSR is saved in lbr_ctl_map now, which is not a const value. The value relies on the CPUID enumeration. For the previous model-specific LBR, most of the bits in LBR_SELECT operate in the suppressed mode. For the bits in LBR_CTL, the polarity is inverted. For the previous model-specific LBR format 5 (LBR_FORMAT_INFO), if the NO_CYCLES and NO_FLAGS type are set, the flag LBR_NO_INFO will be set to avoid the unnecessary LBR_INFO MSR read. Although Architecture LBR also has a dedicated LBR_INFO MSR, perf doesn't need to check and set the flag LBR_NO_INFO. For Architecture LBR, XSAVES instruction will be used as the default way to read the LBR MSRs all together. The overhead which the flag tries to avoid doesn't exist anymore. Dropping the flag can save the extra check for the flag in the lbr_read() later, and make the code cleaner. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-10-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The type of task_ctx is hardcoded as struct x86_perf_task_context, which doesn't apply for Architecture LBR. For example, Architecture LBR doesn't have the TOS MSR. The number of LBR entries is variable. A new struct will be introduced for Architecture LBR. Perf has to determine the type of task_ctx at run time. The type of task_ctx pointer is changed to 'void *', which will be determined at run time. The generic LBR optimization can be shared between Architecture LBR and model-specific LBR. Both need to access the structure for the generic LBR optimization. A helper task_context_opt() is introduced to retrieve the pointer of the structure at run time. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-7-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
To reduce the overhead of a context switch with LBR enabled, some generic optimizations were introduced, e.g. avoiding restore LBR if no one else touched them. The generic optimizations can also be used by Architecture LBR later. Currently, the fields for the generic optimizations are part of structure x86_perf_task_context, which will be deprecated by Architecture LBR. A new structure should be introduced for the common fields of generic optimization, which can be shared between Architecture LBR and model-specific LBR. Both 'valid_lbrs' and 'tos' are also used by the generic optimizations, but they are not moved into the new structure, because Architecture LBR is stack-like. The 'valid_lbrs' which records the index of the valid LBR is not required anymore. The TOS MSR will be removed. LBR registers may be cleared in the deep Cstate. If so, the generic optimizations should not be applied. Perf has to unconditionally restore the LBR registers. A generic function is required to detect the reset due to the deep Cstate. lbr_is_reset_in_cstate() is introduced. Currently, for the model-specific LBR, the TOS MSR is used to detect the reset. There will be another method introduced for Architecture LBR later. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-6-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The MSRs of Architectural LBR are different from previous model-specific LBR. Perf has to implement different functions to save and restore them. The function pointers for LBR save and restore are introduced. Perf should initialize the corresponding functions at boot time. The generic optimizations, e.g. avoiding restore LBR if no one else touched them, still apply for Architectural LBRs. The related codes are not moved to model-specific functions. Current model-specific LBR functions are set as default. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-5-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The method to read Architectural LBRs is different from previous model-specific LBR. Perf has to implement a different function. A function pointer for LBR read is introduced. Perf should initialize the corresponding function at boot time, and avoid checking lbr_format at run time. The current 64-bit LBR read function is set as default. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-4-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The method to reset Architectural LBRs is different from previous model-specific LBR. Perf has to implement a different function. A function pointer is introduced for LBR reset. The enum of LBR_FORMAT_* is also moved to perf_event.h. Perf should initialize the corresponding functions at boot time, and avoid checking lbr_format at run time. The current 64-bit LBR reset function is set as default. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-3-git-send-email-kan.liang@linux.intel.com
-
- 02 7月, 2020 4 次提交
-
-
由 Like Xu 提交于
When a guest wants to use the LBR registers, its hypervisor creates a guest LBR event and let host perf schedules it. The LBR records msrs are accessible to the guest when its guest LBR event is scheduled on by the perf subsystem. Before scheduling this event out, we should avoid host changes on IA32_DEBUGCTLMSR or LBR_SELECT. Otherwise, some unexpected branch operations may interfere with guest behavior, pollute LBR records, and even cause host branches leakage. In addition, the read operation on host is also avoidable. To ensure that guest LBR records are not lost during the context switch, the guest LBR event would enable the callstack mode which could save/restore guest unread LBR records with the help of intel_pmu_lbr_sched_task() naturally. However, the guest LBR_SELECT may changes for its own use and the host LBR event doesn't save/restore it. To ensure that we doesn't lost the guest LBR_SELECT value when the guest LBR event is running, the vlbr_constraint is bound up with a new constraint flag PERF_X86_EVENT_LBR_SELECT. Signed-off-by: NLike Xu <like.xu@linux.intel.com> Signed-off-by: NWei Wang <wei.w.wang@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200514083054.62538-6-like.xu@linux.intel.com
-
由 Like Xu 提交于
The hypervisor may request the perf subsystem to schedule a time window to directly access the LBR records msrs for its own use. Normally, it would create a guest LBR event with callstack mode enabled, which is scheduled along with other ordinary LBR events on the host but in an exclusive way. To avoid wasting a counter for the guest LBR event, the perf tracks its hw->idx via INTEL_PMC_IDX_FIXED_VLBR and assigns it with a fake VLBR counter with the help of new vlbr_constraint. As with the BTS event, there is actually no hardware counter assigned for the guest LBR event. Signed-off-by: NLike Xu <like.xu@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200514083054.62538-5-like.xu@linux.intel.com
-
由 Like Xu 提交于
The LBR records msrs are model specific. The perf subsystem has already obtained the base addresses of LBR records based on the cpu model. Therefore, an interface is added to allow callers outside the perf subsystem to obtain these LBR information. It's useful for hypervisors to emulate the LBR feature for guests with less code. Signed-off-by: NLike Xu <like.xu@linux.intel.com> Signed-off-by: NWei Wang <wei.w.wang@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200613080958.132489-4-like.xu@linux.intel.com
-
由 Like Xu 提交于
For intel_pmu_en/disable_event(), reorder the branches checks for hw->idx and make them sorted by probability: gp,fixed,bts,others. Clean up the x86_assign_hw_event() by converting multiple if-else statements to a switch statement. To skip x86_perf_event_update() and x86_perf_event_set_period(), it's generic to replace "idx == INTEL_PMC_IDX_FIXED_BTS" check with '!hwc->event_base' because that should be 0 for all non-gp/fixed cases. Wrap related bit operations into intel_set/clear_masks() and make the main path more cleaner and readable. No functional changes. Signed-off-by: NLike Xu <like.xu@linux.intel.com> Original-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200613080958.132489-3-like.xu@linux.intel.com
-
- 15 6月, 2020 7 次提交
-
-
由 Roman Sudarikov 提交于
Current version supports a server line starting Intel® Xeon® Processor Scalable Family and introduces mapping for IIO Uncore units only. Other units can be added on demand. IIO stack to PMON mapping is exposed through: /sys/devices/uncore_iio_<pmu_idx>/dieX where dieX is file which holds "Segment:Root Bus" for PCIe root port, which can be monitored by that IIO PMON block. Details are explained in Documentation/ABI/testing/sysfs-devices-mapping Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NAlexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: NRoman Sudarikov <roman.sudarikov@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Reviewed-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lkml.kernel.org/r/20200601083543.30011-4-alexander.antonov@linux.intel.com
-
由 Roman Sudarikov 提交于
The accessor to return number of dies on the platform. Signed-off-by: NAlexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: NRoman Sudarikov <roman.sudarikov@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Reviewed-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lkml.kernel.org/r/20200601083543.30011-3-alexander.antonov@linux.intel.com
-
由 Roman Sudarikov 提交于
Each Uncore unit type, by its nature, can be mapped to its own context - which platform component each PMON block of that type is supposed to monitor. Intel® Xeon® Scalable processor family (code name Skylake-SP) makes significant changes in the integrated I/O (IIO) architecture. The new solution introduces IIO stacks which are responsible for managing traffic between the PCIe domain and the Mesh domain. Each IIO stack has its own PMON block and can handle either DMI port, x16 PCIe root port, MCP-Link or various built-in accelerators. IIO PMON blocks allow concurrent monitoring of I/O flows up to 4 x4 bifurcation within each IIO stack. Software is supposed to program required perf counters within each IIO stack and gather performance data. The tricky thing here is that IIO PMON reports data per IIO stack but users have no idea what IIO stacks are - they only know devices which are connected to the platform. Understanding IIO stack concept to find which IIO stack that particular IO device is connected to, or to identify an IIO PMON block to program for monitoring specific IIO stack assumes a lot of implicit knowledge about given Intel server platform architecture. Usage example: ls /sys/devices/uncore_<type>_<pmu_idx>/die* Signed-off-by: NAlexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: NRoman Sudarikov <roman.sudarikov@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Reviewed-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lkml.kernel.org/r/20200601083543.30011-2-alexander.antonov@linux.intel.com
-
由 Kan Liang 提交于
An oops will be triggered, if perf tries to access an invalid address which exceeds the mapped area. Check the address before the actual access to MMIO sapce of an uncore unit. Suggested-by: NDavid Laight <David.Laight@ACULAB.COM> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1590679169-61823-3-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Perf cannot validate an address before the actual access to MMIO space of some uncore units, e.g. IMC on TGL. Accessing an invalid address, which exceeds mapped area, can trigger oops. Perf never records the size of mapped area. Generic functions, e.g. uncore_mmio_read_counter(), cannot get the correct size for address validation. Add mmio_map_size in intel_uncore_type to record the size of mapped area. Print warning message if ioremap fails. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1590679169-61823-2-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
When counting IMC uncore events on some TGL machines, an oops will be triggered. [ 393.101262] BUG: unable to handle page fault for address: ffffb45200e15858 [ 393.101269] #PF: supervisor read access in kernel mode [ 393.101271] #PF: error_code(0x0000) - not-present page Current perf uncore driver still use the IMC MAP SIZE inherited from SNB, which is 0x6000. However, the offset of IMC uncore counters is larger than 0x6000, e.g. 0xd8a0. Enlarge the IMC MAP SIZE for TGL to 0xe000. Fixes: fdb64822 ("perf/x86: Add Intel Tiger Lake uncore support") Reported-by: NAmmy Yi <ammy.yi@intel.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NAmmy Yi <ammy.yi@intel.com> Tested-by: NChao Qin <chao.qin@intel.com> Link: https://lkml.kernel.org/r/1590679169-61823-1-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The uncore subsystem on Comet Lake is similar to Sky Lake. The only difference is the new PCI IDs for IMC. Share the perf code with Sky Lake. Add new PCI IDs in the table. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1589915905-55870-1-git-send-email-kan.liang@linux.intel.com
-
- 28 5月, 2020 1 次提交
-
-
由 Stephane Eranian 提交于
To prepare for support of both Intel and AMD RAPL. As per the AMD PPR, Fam17h support Package RAPL counters to monitor power usage. The RAPL counter operates as with Intel RAPL, and as such it is beneficial to share the code. No change in functionality. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200527224659.206129-2-eranian@google.com
-
- 20 5月, 2020 3 次提交
-
-
由 Gustavo A. R. Silva 提交于
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200511200911.GA13149@embeddedor
-
由 Kan Liang 提交于
The mask in the extra_regs for Intel Tremont need to be extended to allow more defined bits. "Outstanding Requests" (bit 63) is only available on MSR_OFFCORE_RSP0; Fixes: 6daeb873 ("perf/x86/intel: Add Tremont core PMU support") Reported-by: NStephane Eranian <eranian@google.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200501125442.7030-1-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
Enable RAPL support for Intel Ice Lake X and Ice Lake D. For RAPL support, it is identical to Sky Lake X. Reported-by: NStephane Eranian <eranian@google.com> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1588857258-38213-1-git-send-email-kan.liang@linux.intel.com
-
- 01 5月, 2020 1 次提交
-
-
由 Paul Gortmaker 提交于
Only a few lines below this removed line is this: attrs = kzalloc(size, GFP_KERNEL); and since there is no code path where this could be avoided, the NULL assignment is a pointless relic of history and can be removed. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200408235216.108980-1-paul.gortmaker@windriver.com
-
- 23 4月, 2020 1 次提交
-
-
由 Harry Pan 提交于
The Jasper Lake processor is Tremont microarchitecture, reuse the glm_cstates table of Goldmont and Goldmont Plus to enable the C-states residency profiling. Signed-off-by: NHarry Pan <harry.pan@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200402190658.1.Ic02e891daac41303aed1f2fc6c64f6110edd27bd@changeid
-
- 08 4月, 2020 1 次提交
-
-
由 Kan Liang 提交于
The uncore subsystem in Ice Lake server is similar to previous server. There are some differences in config register encoding and pci device IDs. The uncore PMON units in Ice Lake server include Ubox, Chabox, IIO, IRP, M2PCIE, PCU, M2M, PCIE3 and IMC. - For CHA, filter 1 register has been removed. The filter 0 register can be used by and of CHA events to be filterd by Thread/Core-ID. To do so, the control register's tid_en bit must be set to 1. - For IIO, there are some changes on event constraints. The MSR address and MSR offsets among counters are also changed. - For IRP, the MSR address and MSR offsets among counters are changed. - For M2PCIE, the counters are accessed by MSR now. Add new MSR address and MSR offsets. Change event constraints. - To determine the number of CHAs, have to read CAPID6(Low) and CAPID7 (High) now. - For M2M, update the PCICFG address and Device ID. - For UPI, update the PCICFG address, Device ID and counter address. - For M3UPI, update the PCICFG address, Device ID, counter address and event constraints. - For IMC, update the formular to calculate MMIO BAR address, which is MMIO_BASE + specific MEM_BAR offset. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/1585842411-150452-1-git-send-email-kan.liang@linux.intel.com
-
- 25 3月, 2020 1 次提交
-
-
由 Thomas Gleixner 提交于
The new macro set has a consistent namespace and uses C99 initializers instead of the grufty C89 ones. Get rid the of the local macro wrappers for consistency. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/20200320131509.029267418@linutronix.de
-
- 20 3月, 2020 2 次提交
-
-
由 Kan Liang 提交于
The IMC uncore unit in Ice Lake server can only be accessed by MMIO, which is similar as Snow Ridge. Factor out __snr_uncore_mmio_init_box which can be shared with Ice Lake server in the following patch. No functional changes. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1584470314-46657-2-git-send-email-kan.liang@linux.intel.com
-
由 Kan Liang 提交于
The offset between uncore boxes of free-running counters varies, e.g. IIO free-running counters on Ice Lake server. Add box_offsets, an array of offsets between adjacent uncore boxes. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1584470314-46657-1-git-send-email-kan.liang@linux.intel.com
-
- 11 2月, 2020 1 次提交
-
-
由 Kan Liang 提交于
For MSR type of uncore units, there is no difference between Ice Lake and Tiger Lake. Share the same code with Ice Lake. Tiger Lake has two MCs. Both of them are located at 0:0:0. The BAR offset is still 0x48. The offset of the two MCs is 0x10000. Each MC has three counters to count every read/write/total issued by the Memory Controller to DRAM. The counters can be accessed by MMIO. They are free-running counters. The offset of counters are different for TIGERLAKE_L and TIGERLAKE. Add separated mmio_init() functions. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NAndi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/20200206161527.3529-1-kan.liang@linux.intel.com
-