  1. 18 Aug 2020 (3 commits)
    • perf/x86/intel: Generic support for hardware TopDown metrics · 7b2c05a1
      Committed by Kan Liang
      Intro
      =====
      
      The TopDown Microarchitecture Analysis (TMA) Method is a structured
      analysis methodology to identify critical performance bottlenecks in
      out-of-order processors. Perf already supports this method.
      
      The method works well, but there is one problem. To collect the TopDown
      events, several GP counters have to be used. If a user wants to collect
      other events at the same time, multiplexing will probably be triggered,
      which impacts the accuracy.
      
      To free up the scarce GP counters, the hardware TopDown metrics feature
      is introduced from Ice Lake. The hardware implements an additional
      "metrics" register and a new Fixed Counter 3 that measures pipeline
      "slots". The TopDown events can be calculated from them instead.
      
      Events
      ======
      
      The level 1 TopDown has four metrics. There is no event-code assigned to
      the TopDown metrics. Four metric events are exported as separate perf
      events, which map to the internal "metrics" counter register. Those
      events do not exist in hardware, but can be allocated by the scheduler.
      
      For the event mapping, a special 0x00 event code is used, which is
      reserved for fake events. The metric events start from umask 0x10.
      
      When setting up the metric events, they point to the Fixed Counter 3.
      They have to be specially handled.
      - Add the update_topdown_event() callback to read the additional metrics
        MSR and generate the metrics.
      - Add the set_topdown_event_period() callback to initialize metrics MSR
        and the fixed counter 3.
      - Add a variable n_metric_event to track the number of the accepted
        metrics events. The sharing between multiple users of the same metric
        without multiplexing is not allowed.
      - Only enable/disable the fixed counter 3 when there are no other active
        TopDown events, which avoids unnecessary writes to the fixed control
        register.
      - Disable the PMU when reading the metrics event. The metrics MSR and
        the fixed counter 3 are read separately. The values may be modified by
        an NMI.
      
      All four metric events don't support sampling. Since they will be
      handled specially for event update, a flag PERF_X86_EVENT_TOPDOWN is
      introduced to indicate this case.
      
      The slots event can support both sampling and counting.
      For counting, the flag is also applied.
      For sampling, it will be handled normally as other normal events.
      
      Groups
      ======
      
      The slots event is required in a Topdown group.
      To avoid reading the METRICS register multiple times, the metrics and
      slots values can only be updated by the slots event in a group.
      All active slots and metrics events are updated at the same time.
      Therefore, the slots event must come before any metric events in a
      Topdown group.
      
      NMI
      ======
      
      The METRICS-related register may overflow, in which case bit 48 of the
      STATUS register is set. If so, PERF_METRICS and Fixed Counter 3 have to
      be reset. The patch also updates all active slots and metrics events in
      the NMI handler.
      
      The update_topdown_event() has to read two registers separately. The
      values may be modified by an NMI. PMU has to be disabled before calling
      the function.
      
      RDPMC
      ======
      
      RDPMC is temporarily disabled. A later patch will enable it.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200723171117.9918-9-kan.liang@linux.intel.com
      7b2c05a1
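
      A minimal userspace sketch of the grouping rule described in this entry:
      the slots event is opened first as the group leader and the metric event
      joins its group. The raw config values are placeholders built from the
      commit text (event code 0x00, metric umasks starting at 0x10); on a real
      system the encodings should be taken from the kernel's exported events
      under /sys/bus/event_source/devices/cpu/events/.

        #include <linux/perf_event.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/syscall.h>
        #include <unistd.h>

        static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                   int cpu, int group_fd, unsigned long flags)
        {
            return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
        }

        int main(void)
        {
            struct perf_event_attr slots, metric;
            int leader, member;

            memset(&slots, 0, sizeof(slots));
            slots.type = PERF_TYPE_RAW;
            slots.size = sizeof(slots);
            slots.config = 0x0400;      /* placeholder: event=0x00, umask=0x04 ("slots") */
            slots.disabled = 1;

            memset(&metric, 0, sizeof(metric));
            metric.type = PERF_TYPE_RAW;
            metric.size = sizeof(metric);
            metric.config = 0x1000;     /* placeholder: event=0x00, umask=0x10 (first metric) */

            /* the slots event must come first, as the group leader */
            leader = perf_event_open(&slots, 0, -1, -1, 0);
            if (leader < 0) {
                perror("perf_event_open (slots)");
                return 1;
            }

            /* the metric event then joins the slots event's group */
            member = perf_event_open(&metric, 0, -1, leader, 0);
            if (member < 0) {
                perror("perf_event_open (metric)");
                return 1;
            }

            printf("slots fd=%d, metric fd=%d\n", leader, member);
            return 0;
        }
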
    • perf/x86/intel: Use switch in intel_pmu_disable/enable_event · 58da7dbe
      Committed by Kan Liang
      Currently, an if-else chain is used in intel_pmu_disable/enable_event()
      to check the type of an event. It works well, but as more and more
      types are added later, e.g. perf metrics, the if-else chain impairs the
      readability of the code compared to a switch statement.
      
      There is no harm in using a switch statement to replace the if-else
      here. Also, some optimizing compilers may compile a switch statement
      into a jump table, which is more efficient than if-else for a large
      number of cases. The performance gain may not be observable for now,
      because there are only 5 cases, but the benefit may show up as more
      types are added in the future.
      
      Use switch to replace the if-else in the intel_pmu_disable/enable_event.
      
      If the idx is invalid, print a warning.
      
      For the INTEL_PMC_IDX_FIXED_BTS case in intel_pmu_disable_event(), there
      is no need to check event->attr.precise_ip; use return for that case.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200723171117.9918-7-kan.liang@linux.intel.com
      58da7dbe
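
      A self-contained sketch of the dispatch pattern this entry describes,
      using hypothetical index constants rather than the real INTEL_PMC_IDX_*
      values; the case ranges ("lo ... hi") are a GCC/Clang extension that the
      kernel relies on.

        #include <stdio.h>

        #define IDX_GP_BASE     0
        #define IDX_FIXED       32  /* hypothetical: first fixed-counter index */
        #define IDX_BTS         47  /* hypothetical: BTS pseudo-counter index */
        #define IDX_METRIC_BASE 48  /* hypothetical: first TopDown metric index */
        #define IDX_METRIC_END  51

        static void disable_event_sketch(int idx)
        {
            switch (idx) {
            case IDX_GP_BASE ... IDX_FIXED - 1:
                printf("disable GP counter %d\n", idx);
                break;
            case IDX_FIXED ... IDX_BTS - 1:
                printf("disable fixed counter %d\n", idx - IDX_FIXED);
                break;
            case IDX_BTS:
                printf("disable BTS\n");
                return;     /* no precise_ip check needed for this case */
            case IDX_METRIC_BASE ... IDX_METRIC_END:
                printf("disable TopDown metric %d\n", idx - IDX_METRIC_BASE);
                break;
            default:
                fprintf(stderr, "warning: invalid counter index %d\n", idx);
                return;
            }
        }

        int main(void)
        {
            disable_event_sketch(3);
            disable_event_sketch(IDX_BTS);
            disable_event_sketch(99);
            return 0;
        }
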
    • perf/x86/intel: Name the global status bit in NMI handler · 60a2a271
      Committed by Kan Liang
      Magic numbers are used in the current NMI handler for the global status
      bit. Use a meaningful name to replace the magic numbers to improve the
      readability of the code.
      
      Remove a tab from all GLOBAL_STATUS_* and INTEL_PMC_IDX_FIXED_BTS macros
      to reduce the line length.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200723171117.9918-3-kan.liang@linux.intel.com
      60a2a271
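
      A small sketch of the readability change: a named macro instead of a
      magic bit number in the NMI handler. Bit 48 is the PERF_METRICS overflow
      bit mentioned in the TopDown entry above; the macro name used here is
      illustrative and may not match the exact kernel spelling.

        #include <stdint.h>
        #include <stdio.h>

        #define BIT_ULL(n)                      (1ULL << (n))
        #define GLOBAL_STATUS_PERF_METRICS_OVF  BIT_ULL(48)

        static void handle_status(uint64_t status)
        {
            /* before: if (status & (1ULL << 48)) ... */
            if (status & GLOBAL_STATUS_PERF_METRICS_OVF)
                printf("PERF_METRICS overflow: reset PERF_METRICS and fixed counter 3\n");
        }

        int main(void)
        {
            handle_status(GLOBAL_STATUS_PERF_METRICS_OVF);
            return 0;
        }
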
  2. 08 Jul 2020 (15 commits)
    • perf/x86/intel/lbr: Support XSAVES for arch LBR read · c085fb87
      Committed by Kan Liang
      Reading LBR registers in a perf NMI handler for a non-PEBS event
      causes a high overhead because the number of LBR registers is huge.
      To reduce the overhead, the XSAVES instruction should be used to replace
      the LBR registers' reading method.
      
      The XSAVES buffer used for LBR read has to be per-CPU because the NMI
      handler invokes lbr_read(). The existing task_ctx_data buffer cannot be
      used, since it is per-task and only allocated for the LBR call stack
      mode. A new lbr_xsave pointer is introduced in cpu_hw_events as an
      XSAVES buffer for LBR read.
      
      The XSAVES buffer should be allocated only when LBR is used by a
      non-PEBS event on the CPU because the total size of the lbr_xsave is
      not small (~1.4KB).
      
      The XSAVES buffer is allocated when a non-PEBS event is added, but it
      is lazily released in x86_release_hardware() when perf releases the
      entire PMU hardware resource, because perf may schedule the event
      frequently, e.g. under a high context-switch rate. The lazy release
      method reduces the overhead of frequently allocating/freeing the buffer.
      
      If the lbr_xsave buffer cannot be allocated, fall back to the normal
      Arch LBR lbr_read().
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Link: https://lkml.kernel.org/r/1593780569-62993-24-git-send-email-kan.liang@linux.intel.com
      c085fb87
    • perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch · ce711ea3
      Committed by Kan Liang
      In the LBR call stack mode, LBR information is used to reconstruct a
      call stack. To get the complete call stack, perf has to save/restore
      all LBR registers during a context switch. Due to a large number of the
      LBR registers, this process causes a high CPU overhead. To reduce the
      CPU overhead during a context switch, use the XSAVES/XRSTORS
      instructions.
      
      Every XSAVE area must follow a canonical format: the legacy region, an
      XSAVE header and the extended region. Although the LBR information is
      only kept in the extended region, a space for the legacy region and
      XSAVE header is still required. Add a new dedicated structure for LBR
      XSAVES support.
      
      Before enabling XSAVES support, the size of the LBR state has to be
      sanity checked, because:
      - the size of the software structure is calculated from the max number
      of the LBR depth, which is enumerated by the CPUID leaf for Arch LBR.
      The size of the LBR state is enumerated by the CPUID leaf for XSAVE
      support of Arch LBR. If the values from the two CPUID leaves are not
      consistent, it may trigger a buffer overflow. For example, a hypervisor
      may inadvertently set inconsistent values for the two emulated CPUID
      leaves.
      - unlike other state components, the size of an LBR state depends on the
      max number of LBRs, which may vary from generation to generation.
      
      Expose the function xfeature_size() for the sanity check.
      The LBR XSAVES support will be disabled if the size of the LBR state
      enumerated by CPUID doesn't match with the size of the software
      structure.
      
      The XSAVE instruction requires 64-byte alignment for state buffers. A
      new macro is added to reflect the alignment requirement. A 64-byte
      aligned kmem_cache is created for architecture LBR.
      
      Currently, the structure for each state component is maintained in
      fpu/types.h. The structure for the new LBR state component should be
      maintained in the same place. Move structure lbr_entry to fpu/types.h as
      well for broader sharing.
      
      Add dedicated lbr_save/lbr_restore functions for LBR XSAVES support,
      which invokes the corresponding xstate helpers to XSAVES/XRSTORS LBR
      information at the context switch when the call stack mode is enabled.
      Since the XSAVES/XRSTORS instructions will eventually be invoked, the
      dedicated functions are named with an '_xsaves'/'_xrstors' suffix.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Link: https://lkml.kernel.org/r/1593780569-62993-23-git-send-email-kan.liang@linux.intel.com
      ce711ea3
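
      A sketch of the dedicated LBR XSAVES structure described above: a
      canonical XSAVE image needs the 512-byte legacy region and the 64-byte
      XSAVE header even though only the extended (Arch LBR) region is used,
      and the whole buffer must be 64-byte aligned. Field names and the fixed
      entry count are illustrative; the real structures live in fpu/types.h.

        #include <stdint.h>

        struct lbr_entry_sketch {
            uint64_t from, to, info;             /* one record per taken branch */
        };

        struct arch_lbr_state_sketch {
            uint64_t lbr_ctl;
            uint64_t lbr_depth;
            uint64_t ler_from, ler_to, ler_info;
            struct lbr_entry_sketch entries[32]; /* real depth comes from CPUID */
        };

        struct lbr_xsave_sketch {
            uint8_t legacy_region[512];          /* legacy x87/SSE area */
            uint8_t xsave_header[64];            /* XSTATE_BV etc. */
            struct arch_lbr_state_sketch lbr;    /* extended region: Arch LBR state */
        } __attribute__((aligned(64)));          /* XSAVES needs 64-byte alignment */
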
    • perf/x86: Remove task_ctx_size · 5a09928d
      Committed by Kan Liang
      A new kmem_cache method has replaced kzalloc() for allocating the
      PMU-specific data, so task_ctx_size is not required anymore.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-19-git-send-email-kan.liang@linux.intel.com
      5a09928d
    • perf/x86/intel/lbr: Create kmem_cache for the LBR context data · 33cad284
      Committed by Kan Liang
      A new kmem_cache method is introduced to allocate the PMU specific data
      task_ctx_data, which requires the PMU specific code to create a
      kmem_cache.
      
      Currently, the task_ctx_data is only used by the Intel LBR call stack
      feature, which was introduced with Haswell. The kmem_cache should only
      be created for Haswell and later platforms. There is no alignment
      requirement for the existing platforms.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-18-git-send-email-kan.liang@linux.intel.com
      33cad284
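
      A kernel-context sketch (not the literal patch) of creating the
      dedicated cache for task_ctx_data. kmem_cache_create() takes the object
      size, alignment, slab flags and an optional constructor; per the commit
      text, no alignment is required on these platforms, so 0 is passed. The
      cache and helper names are hypothetical.

        #include <linux/errno.h>
        #include <linux/slab.h>

        static struct kmem_cache *lbr_ctx_cache;

        static int create_lbr_ctx_cache(size_t ctx_size)
        {
            lbr_ctx_cache = kmem_cache_create("x86_lbr_task_ctx", ctx_size,
                                              0,     /* no alignment requirement */
                                              0,     /* no special slab flags */
                                              NULL); /* no constructor */
            return lbr_ctx_cache ? 0 : -ENOMEM;
        }
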
    • perf/x86/intel/lbr: Support Architectural LBR · 47125db2
      Committed by Kan Liang
      Last Branch Records (LBR) enables recording of software path history by
      logging taken branches and other control flow transfers in
      architectural registers. Intel CPUs have had model-specific LBRs for
      quite some time; this evolves them into an architectural feature.
      
      The main improvements of the Architectural LBR implementation include:
      - Linux kernel can support the LBR features without knowing the model
        number of the current CPU.
      - Architectural LBR capabilities can be enumerated by CPUID. The
        lbr_ctl_map is based on the CPUID Enumeration.
      - The possible LBR depth can be retrieved from CPUID enumeration. The
        max value is written to the new MSR_ARCH_LBR_DEPTH as the number of
        LBR entries.
      - A new IA32_LBR_CTL MSR is introduced to enable and configure LBRs,
        which replaces the IA32_DEBUGCTL[bit 0] and the LBR_SELECT MSR.
      - Each LBR record or entry still comprises three MSRs,
        IA32_LBR_x_FROM_IP, IA32_LBR_x_TO_IP and IA32_LBR_x_INFO,
        but they become architectural MSRs.
      - Architectural LBR is stack-like now. Entry 0 is always the youngest
        branch, entry 1 the next youngest... The TOS MSR has been removed.
      
      The way to enable/disable Architectural LBR is similar to the previous
      model-specific LBR. __intel_pmu_lbr_enable/disable() can be reused, but
      some modifications are required, which include:
      - MSR_ARCH_LBR_CTL is used to enable and configure the Architectural
        LBR.
      - When checking the value of the IA32_DEBUGCTL MSR, ignore the
        DEBUGCTLMSR_LBR bit (bit 0) for Architectural LBR; it has no meaning
        there and always reads as 0.
      - The FREEZE_LBRS_ON_PMI has to be explicitly set/clear, because
        MSR_IA32_DEBUGCTLMSR is not touched in __intel_pmu_lbr_disable() for
        Architectural LBR.
      - Only MSR_ARCH_LBR_CTL is cleared in __intel_pmu_lbr_disable() for
        Architectural LBR.
      
      Some Architectural LBR dedicated functions are implemented to
      reset/read/save/restore LBR.
      - For reset, writing to the ARCH_LBR_DEPTH MSR clears all Arch LBR
        entries, which is a lot faster and can improve the context switch
        latency.
      - For read, the branch type information can be retrieved from
        MSR_ARCH_LBR_INFO_*, but it is not fully compatible because of the
        OTHER_BRANCH type; software decoding is still required for the
        OTHER_BRANCH case.
        LBR records are stored in age order as well. Reuse
        intel_pmu_store_lbr(). Check the CPUID enumeration before accessing
        the corresponding bits in LBR_INFO.
      - For save/restore, apply the fast reset (writing ARCH_LBR_DEPTH).
        Read 'lbr_from' of entry 0 instead of the TOS MSR to check whether
        the LBR registers were reset in a deep C-state. If the 'deep C-state
        reset' bit is not set in the CPUID enumeration, skip the check.
        XSAVE support for Architectural LBR will be implemented later.
      
      The number of LBR entries cannot be hardcoded anymore, which should be
      retrieved from CPUID enumeration. A new structure
      x86_perf_task_context_arch_lbr is introduced for Architectural LBR.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-15-git-send-email-kan.liang@linux.intel.com
      47125db2
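
      A kernel-context sketch of the fast reset mentioned above: a single
      write to the Arch LBR depth MSR clears every LBR entry. The depth value
      stands in for whatever perf derived from the CPUID enumeration; the real
      reset helper in the kernel may differ in detail.

        #include <asm/msr.h>
        #include <asm/msr-index.h>

        static void arch_lbr_reset_sketch(unsigned int lbr_depth)
        {
            /* writing the depth clears all Arch LBR entries in one go */
            wrmsrl(MSR_ARCH_LBR_DEPTH, lbr_depth);
        }
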
    • perf/x86/intel/lbr: Factor out intel_pmu_store_lbr · 631618a0
      Committed by Kan Liang
      The way to store the LBR information from a PEBS LBR record can be
      reused in Architecture LBR, because
      - The LBR information is stored like a stack. Entry 0 is always the
        youngest branch.
      - The layout of the LBR INFO MSR is similar.
      
      The LBR information may be retrieved from either the LBR registers
      (non-PEBS event) or a buffer (PEBS event). Extend rdlbr_*() to support
      both methods.
      
      Explicitly check the invalid entry (0s), which can avoid unnecessary MSR
      access if using a non-PEBS event. For a PEBS event, the check should
      slightly improve the performance as well. The invalid entries are cut.
      The intel_pmu_lbr_filter() doesn't need to check and filter them out.
      
      The function cannot be shared with the current model-specific LBR read,
      because the LBRs grow in the opposite direction.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-14-git-send-email-kan.liang@linux.intel.com
      631618a0
    • perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all() · fda1f99f
      Committed by Kan Liang
      The previous model-specific LBR and Architecture LBR (legacy way) use a
      similar method to save/restore the LBR information, which directly
      accesses the LBR registers. The code which reads/writes a set of LBR
      registers can be shared between them.
      
      Factor out two functions which are used to read/write a set of LBR
      registers.
      
      Add lbr_info into structure x86_pmu, and use it to replace the hardcoded
      LBR INFO MSR, because the LBR INFO MSR address of the previous
      model-specific LBR is different from Architecture LBR. The MSR address
      should be assigned at boot time. For now, only Sky Lake and later
      platforms have the LBR INFO MSR.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-13-git-send-email-kan.liang@linux.intel.com
      fda1f99f
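
      A kernel-context sketch of the factored-out "read one whole LBR record"
      helper: from/to are always read, info only when the caller needs it. The
      MSR base addresses are passed as parameters here for illustration; in
      the real code they come from the lbr_from/lbr_to fields and the new
      lbr_info field of x86_pmu.

        #include <linux/types.h>
        #include <asm/msr.h>

        struct lbr_entry_sk {
            u64 from, to, info;
        };

        static void rdlbr_all_sketch(struct lbr_entry_sk *e, unsigned int idx,
                                     bool need_info, unsigned int lbr_from,
                                     unsigned int lbr_to, unsigned int lbr_info)
        {
            rdmsrl(lbr_from + idx, e->from);
            rdmsrl(lbr_to + idx, e->to);
            if (need_info)
                rdmsrl(lbr_info + idx, e->info);
        }
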
    • perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline · 020d91e5
      Committed by Kan Liang
      The {rd,wr}lbr_{to,from} wrappers are invoked in hot paths, e.g. the
      context switch and the NMI handler. They should always be inlined to
      achieve better performance. However, CONFIG_OPTIMIZE_INLINING allows
      the compiler to uninline functions marked 'inline'.
      
      Mark the {rd,wr}lbr_{to,from} wrappers as __always_inline to force
      inline the wrappers.
      Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-12-git-send-email-kan.liang@linux.intel.com
      020d91e5
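
      A minimal illustration of the point above: plain 'inline' may be ignored
      when CONFIG_OPTIMIZE_INLINING is set, so the hot-path wrappers are
      marked __always_inline. The wrapper body is a stand-in for the real MSR
      access.

        #include <stdint.h>
        #include <stdio.h>

        /* in the kernel this macro comes from <linux/compiler_types.h> */
        #define __always_inline inline __attribute__((__always_inline__))

        static __always_inline uint64_t rdlbr_from_sketch(unsigned int idx,
                                                          const uint64_t *lbr_from)
        {
            return lbr_from[idx];   /* stand-in for reading the FROM_IP MSR */
        }

        int main(void)
        {
            uint64_t fake_lbr_from[4] = { 0x1000, 0x2000, 0x3000, 0x4000 };

            printf("entry 2 from = 0x%llx\n",
                   (unsigned long long)rdlbr_from_sketch(2, fake_lbr_from));
            return 0;
        }
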
    • perf/x86/intel/lbr: Unify the stored format of LBR information · 5624986d
      Committed by Kan Liang
      Currently, the LBR information in the structure x86_perf_task_context is
      stored in a different format from the PEBS LBR record and Architecture
      LBR, which prevents sharing of the common code.
      
      Use the format of the PEBS LBR record as a unified format. Use a generic
      name lbr_entry to replace pebs_lbr_entry.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-11-git-send-email-kan.liang@linux.intel.com
      5624986d
    • perf/x86/intel/lbr: Support LBR_CTL · 49d8184f
      Committed by Kan Liang
      An IA32_LBR_CTL MSR is introduced for Architecture LBR to enable and
      configure the LBR registers, replacing the previous LBR_SELECT.
      
      All the related members in struct cpu_hw_events and struct x86_pmu
      have to be renamed.
      
      Some new macros are added to reflect the layout of LBR_CTL.
      
      The mapping from PERF_SAMPLE_BRANCH_* to the corresponding bits in
      LBR_CTL MSR is saved in lbr_ctl_map now, which is not a const value.
      The value relies on the CPUID enumeration.
      
      For the previous model-specific LBR, most of the bits in LBR_SELECT
      operate in the suppressed mode. For the bits in LBR_CTL, the polarity is
      inverted.
      
      For the previous model-specific LBR format 5 (LBR_FORMAT_INFO), if the
      NO_CYCLES and NO_FLAGS type are set, the flag LBR_NO_INFO will be set to
      avoid the unnecessary LBR_INFO MSR read. Although Architecture LBR also
      has a dedicated LBR_INFO MSR, perf doesn't need to check and set the
      flag LBR_NO_INFO. For Architecture LBR, XSAVES instruction will be used
      as the default way to read the LBR MSRs all together. The overhead which
      the flag tries to avoid doesn't exist anymore. Dropping the flag can
      save the extra check for the flag in the lbr_read() later, and make the
      code cleaner.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-10-git-send-email-kan.liang@linux.intel.com
      49d8184f
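
      A sketch of the run-time-built mapping described above: unlike the const
      LBR_SELECT table, the LBR_CTL encoding is only known after CPUID
      enumeration, so the map is filled at boot. The bit positions below are
      placeholders, not the architectural encoding.

        #include <linux/perf_event.h>   /* PERF_SAMPLE_BRANCH_*_SHIFT */
        #include <stdint.h>

        #define ARCH_LBR_CTL_USER_SK    (1ULL << 1)     /* placeholder bit */
        #define ARCH_LBR_CTL_KERNEL_SK  (1ULL << 2)     /* placeholder bit */

        static uint64_t lbr_ctl_map_sk[PERF_SAMPLE_BRANCH_MAX_SHIFT];

        static void init_lbr_ctl_map_sk(void)
        {
            /* note the inverted polarity vs. LBR_SELECT: these bits enable a type */
            lbr_ctl_map_sk[PERF_SAMPLE_BRANCH_USER_SHIFT]   = ARCH_LBR_CTL_USER_SK;
            lbr_ctl_map_sk[PERF_SAMPLE_BRANCH_KERNEL_SHIFT] = ARCH_LBR_CTL_KERNEL_SK;
            /* remaining branch types are filled in based on CPUID capability bits */
        }
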
    • perf/x86/intel/lbr: Use dynamic data structure for task_ctx · f42be865
      Committed by Kan Liang
      The type of task_ctx is hardcoded as struct x86_perf_task_context,
      which doesn't apply for Architecture LBR. For example, Architecture LBR
      doesn't have the TOS MSR. The number of LBR entries is variable. A new
      struct will be introduced for Architecture LBR. Perf has to determine
      the type of task_ctx at run time.
      
      The type of task_ctx pointer is changed to 'void *', which will be
      determined at run time.
      
      The generic LBR optimization can be shared between Architecture LBR and
      model-specific LBR. Both need to access the structure for the generic
      LBR optimization. A helper task_context_opt() is introduced to retrieve
      the pointer of the structure at run time.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-7-git-send-email-kan.liang@linux.intel.com
      f42be865
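
      A sketch of the run-time dispatch described above: task_ctx becomes a
      void pointer, and a small helper hands back the generic-optimization
      fields regardless of which layout is in use. Structure and field names
      are simplified stand-ins for the kernel's.

        #include <stdbool.h>

        struct task_ctx_opt_sk {            /* generic-optimization fields */
            int lbr_callstack_users;
            int lbr_stack_state;
        };

        struct model_lbr_ctx_sk {           /* model-specific layout: tos, saved LBRs, ... */
            unsigned int tos;
            struct task_ctx_opt_sk opt;
        };

        struct arch_lbr_ctx_sk {            /* Arch LBR layout: entries follow */
            struct task_ctx_opt_sk opt;
        };

        static bool arch_lbr_enabled;       /* stand-in for the boot-time CPUID check */

        static struct task_ctx_opt_sk *task_context_opt_sk(void *ctx)
        {
            if (arch_lbr_enabled)
                return &((struct arch_lbr_ctx_sk *)ctx)->opt;
            return &((struct model_lbr_ctx_sk *)ctx)->opt;
        }
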
    • perf/x86/intel/lbr: Factor out a new struct for generic optimization · 530bfff6
      Committed by Kan Liang
      To reduce the overhead of a context switch with LBR enabled, some
      generic optimizations were introduced, e.g. avoiding restoring the LBRs
      if nothing else touched them. The generic optimizations can also be used by
      Architecture LBR later. Currently, the fields for the generic
      optimizations are part of structure x86_perf_task_context, which will be
      deprecated by Architecture LBR. A new structure should be introduced
      for the common fields of generic optimization, which can be shared
      between Architecture LBR and model-specific LBR.
      
      Both 'valid_lbrs' and 'tos' are also used by the generic optimizations,
      but they are not moved into the new structure, because Architecture LBR
      is stack-like. The 'valid_lbrs' field, which records the index of the
      valid LBRs, is not required anymore. The TOS MSR will be removed.
      
      LBR registers may be cleared in a deep C-state. If so, the generic
      optimizations should not be applied and perf has to unconditionally
      restore the LBR registers. A generic function is required to detect a
      reset due to a deep C-state; lbr_is_reset_in_cstate() is introduced.
      Currently, for the model-specific LBR, the TOS MSR is used to detect the
      reset. There will be another method introduced for Architecture LBR
      later.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-6-git-send-email-kan.liang@linux.intel.com
      530bfff6
    • perf/x86/intel/lbr: Add the function pointers for LBR save and restore · 799571bf
      Committed by Kan Liang
      The MSRs of Architectural LBR are different from previous model-specific
      LBR. Perf has to implement different functions to save and restore them.
      
      The function pointers for LBR save and restore are introduced. Perf
      should initialize the corresponding functions at boot time.
      
      The generic optimizations, e.g. avoiding restoring the LBRs if nothing
      else touched them, still apply to Architectural LBR. The related code is
      not moved into the model-specific functions.
      
      Current model-specific LBR functions are set as default.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-5-git-send-email-kan.liang@linux.intel.com
      799571bf
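
      A sketch of the boot-time selection described above: save/restore become
      callbacks, defaulted to the model-specific implementations and
      overridden when Architectural LBR is enumerated. All names here are
      simplified stand-ins.

        struct lbr_ops_sk {
            void (*lbr_save)(void *ctx);
            void (*lbr_restore)(void *ctx);
        };

        static void model_lbr_save(void *ctx)    { (void)ctx; /* write the LBR MSRs */ }
        static void model_lbr_restore(void *ctx) { (void)ctx; }
        static void arch_lbr_save(void *ctx)     { (void)ctx; /* Arch LBR variant */ }
        static void arch_lbr_restore(void *ctx)  { (void)ctx; }

        /* the model-specific functions are the default ... */
        static struct lbr_ops_sk lbr_ops = {
            .lbr_save    = model_lbr_save,
            .lbr_restore = model_lbr_restore,
        };

        /* ... and are overridden at boot if CPUID enumerates Architectural LBR */
        static void arch_lbr_boot_init_sk(void)
        {
            lbr_ops.lbr_save    = arch_lbr_save;
            lbr_ops.lbr_restore = arch_lbr_restore;
        }
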
    • perf/x86/intel/lbr: Add a function pointer for LBR read · c301b1d8
      Committed by Kan Liang
      The method to read Architectural LBRs is different from previous
      model-specific LBR. Perf has to implement a different function.
      
      A function pointer for LBR read is introduced. Perf should initialize
      the corresponding function at boot time, and avoid checking lbr_format
      at run time.
      
      The current 64-bit LBR read function is set as default.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-4-git-send-email-kan.liang@linux.intel.com
      c301b1d8
    • perf/x86/intel/lbr: Add a function pointer for LBR reset · 9f354a72
      Committed by Kan Liang
      The method to reset Architectural LBRs is different from previous
      model-specific LBR. Perf has to implement a different function.
      
      A function pointer is introduced for LBR reset. The enum of
      LBR_FORMAT_* is also moved to perf_event.h. Perf should initialize the
      corresponding functions at boot time, and avoid checking lbr_format at
      run time.
      
      The current 64-bit LBR reset function is set as default.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1593780569-62993-3-git-send-email-kan.liang@linux.intel.com
      9f354a72
  3. 02 Jul 2020 (4 commits)
  4. 15 Jun 2020 (7 commits)
  5. 28 May 2020 (1 commit)
  6. 20 May 2020 (3 commits)
  7. 01 May 2020 (1 commit)
  8. 23 Apr 2020 (1 commit)
  9. 08 Apr 2020 (1 commit)
    • perf/x86/intel/uncore: Add Ice Lake server uncore support · 2b3b76b5
      Committed by Kan Liang
      The uncore subsystem in the Ice Lake server is similar to that of
      previous servers.
      There are some differences in config register encoding and pci device
      IDs. The uncore PMON units in Ice Lake server include Ubox, Chabox, IIO,
      IRP, M2PCIE, PCU, M2M, PCIE3 and IMC.
      
       - For CHA, the filter 1 register has been removed. The filter 0 register
         can be used by any of the CHA events to be filtered by Thread/Core-ID.
         To do so, the control register's tid_en bit must be set to 1.
       - For IIO, there are some changes on event constraints. The MSR address
         and MSR offsets among counters are also changed.
       - For IRP, the MSR address and MSR offsets among counters are changed.
       - For M2PCIE, the counters are accessed by MSR now. Add new MSR address
         and MSR offsets. Change event constraints.
       - To determine the number of CHAs, CAPID6 (low) and CAPID7 (high) have
         to be read now.
       - For M2M, update the PCICFG address and Device ID.
       - For UPI, update the PCICFG address, Device ID and counter address.
       - For M3UPI, update the PCICFG address, Device ID, counter address and
         event constraints.
       - For IMC, update the formula to calculate the MMIO BAR address, which
         is MMIO_BASE + a specific MEM_BAR offset.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lkml.kernel.org/r/1585842411-150452-1-git-send-email-kan.liang@linux.intel.com
      2b3b76b5
  10. 25 Mar 2020 (1 commit)
  11. 20 Mar 2020 (2 commits)
  12. 11 Feb 2020 (1 commit)