1. 08 Jun 2022, 4 commits
  2. 25 May 2022, 1 commit
  3. 11 May 2022, 1 commit
  4. 05 Apr 2022, 3 commits
  5. 02 Feb 2022, 2 commits
  6. 18 Jan 2022, 2 commits
    • perf/x86/intel/lbr: Support LBR format V7 · 1ac7fd81
      Committed by Peter Zijlstra (Intel)
      Goldmont Plus and Tremont have LBR format V7. V7 has LBR_INFO,
      the same as LBR format V5, but V7 does not support TSX.
      
      Without the patch, the associated misprediction and cycles information
      in the LBR_INFO may be lost on a Goldmont plus platform.
      For Tremont, the patch only impacts the non-PEBS events. Because of the
      adaptive PEBS, the LBR_INFO is always processed for a PEBS event.
      
      Currently, two different ways are used to check the LBR capabilities,
      which makes the code complex and confusing.
      For the LBR format V4 and earlier, the global static lbr_desc array is
      used to store the flags for the LBR capabilities in each LBR format.
      For LBR format V5 and V6, the current code checks the version number
      for the LBR capabilities.
      
      There are common LBR capabilities among LBR format versions. Several
      flags for the LBR capabilities are introduced into the struct x86_pmu.
      The flags, which can be shared among LBR formats, are used to check
      the LBR capabilities. Add intel_pmu_lbr_init() to set the flags
      accordingly at boot time.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Link: https://lkml.kernel.org/r/1641315077-96661-1-git-send-email-peterz@infradead.org
    • perf/x86/intel: Add a quirk for the calculation of the number of counters on Alder Lake · 7fa981ca
      Committed by Kan Liang
      For some Alder Lake machines with all E-cores disabled in the BIOS,
      the warning below may be triggered.
      
      [ 2.010766] hw perf events fixed 5 > max(4), clipping!
      
      Current perf code relies on CPUID leaf 0xA and leaf 7.EDX[15] to
      calculate the number of counters, based on the assumptions below.
      
      For a hybrid configuration, leaf 7.EDX[15] (X86_FEATURE_HYBRID_CPU)
      is set, and leaf 0xA only enumerates the common counters, so Linux
      perf has to manually add the extra GP counters and fixed counters
      for P-cores.
      For a non-hybrid configuration, X86_FEATURE_HYBRID_CPU should not
      be set, and leaf 0xA enumerates all counters.
      
      However, that's not the case when all E-cores are disabled in the BIOS.
      Although there are only P-cores in the system, the leaf 7.EDX[15]
      (X86_FEATURE_HYBRID_CPU) is still set. But the leaf 0xA is updated
      to enumerate all counters of P-cores. The inconsistency triggers the
      warning.
      
      Several software approaches were considered to handle the inconsistency:
      - Drop the leaf 0xA and leaf 7.EDX[15] CPUID enumeration support.
        Hardcode the number of counters. This solution may be a problem for
        virtualization. A hypervisor cannot control the number of counters
        in a Linux guest via changing the guest CPUID enumeration anymore.
      - Find another CPUID bit that is also updated when E-cores are disabled.
        This may be a problem in a virtualization environment too, because
        a hypervisor may disable the feature/CPUID bit.
      - The P-cores have a maximum of 8 GP counters and 4 fixed counters on
        ADL. The maximum number can be used to detect the case.
        This solution is implemented in this patch.
      
      Fixes: ee72a94e ("perf/x86/intel: Fix fixed counter check warning for some Alder Lake")
      Reported-by: Damjan Marion (damarion) <damarion@cisco.com>
      Reported-by: Chan Edison <edison_chan_gz@hotmail.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Damjan Marion (damarion) <damarion@cisco.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/1641925238-149288-1-git-send-email-kan.liang@linux.intel.com
  7. 17 Nov 2021, 4 commits
    • perf: Add wrappers for invoking guest callbacks · 1c343051
      Committed by Sean Christopherson
      Add helpers for the guest callbacks to prepare for burying the callbacks
      behind a Kconfig (it's a lot easier to provide a few stubs than to #ifdef
      piles of code), and also to prepare for converting the callbacks to
      static_call().  perf_instruction_pointer() in particular will have subtle
      semantics with static_call(), as the "no callbacks" case will return 0 if
      the callbacks are unregistered between querying guest state and getting
      the IP.  Implement the change now to avoid a functional change when adding
      static_call() support, and because the new helper needs to return
      _something_ in this case.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Link: https://lore.kernel.org/r/20211111020738.2512932-8-seanjc@google.com
    • perf/core: Rework guest callbacks to prepare for static_call support · b9f5621c
      Committed by Like Xu
      To prepare for using static_calls to optimize perf's guest callbacks,
      replace ->is_in_guest and ->is_user_mode with a new multiplexed hook
      ->state, tweak ->handle_intel_pt_intr to play nice with being called when
      there is no active guest, and drop "guest" from ->get_guest_ip.
      
      Return '0' from ->state and ->handle_intel_pt_intr to indicate "not in
      guest" so that DEFINE_STATIC_CALL_RET0 can be used to define the static
      calls, i.e. no callback == !guest.
      
      [sean: extracted from static_call patch, fixed get_ip() bug, wrote changelog]
      Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Like Xu <like.xu@linux.intel.com>
      Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Link: https://lore.kernel.org/r/20211111020738.2512932-7-seanjc@google.com
    • perf: Protect perf_guest_cbs with RCU · ff083a2d
      Committed by Sean Christopherson
      Protect perf_guest_cbs with RCU to fix multiple possible errors.  Luckily,
      all paths that read perf_guest_cbs already require RCU protection, e.g. to
      protect the callback chains, so only the direct perf_guest_cbs touchpoints
      need to be modified.
      
      Bug #1 is a simple lack of WRITE_ONCE/READ_ONCE behavior to ensure
      perf_guest_cbs isn't reloaded between a !NULL check and a dereference.
      Fixed via the READ_ONCE() in rcu_dereference().
      
      Bug #2 is that on weakly-ordered architectures, updates to the callbacks
      themselves are not guaranteed to be visible before the pointer is made
      visible to readers.  Fixed by the smp_store_release() in
      rcu_assign_pointer() when the new pointer is non-NULL.
      
      Bug #3 is that, because the callbacks are global, it's possible for
      readers to run in parallel with an unregister, and thus a module
      implementing the callbacks can be unloaded while readers are in flight,
      resulting in a use-after-free.  Fixed by a synchronize_rcu() call when
      unregistering callbacks.
      
      Bug #1 escaped notice because it's extremely unlikely a compiler will
      reload perf_guest_cbs in this sequence.  perf_guest_cbs does get reloaded
      for future derefs, e.g. for ->is_user_mode(), but the ->is_in_guest()
      guard all but guarantees the consumer will win the race, e.g. to nullify
      perf_guest_cbs, KVM has to completely exit the guest and tear down all
      VMs before KVM starts its module unload / unregister sequence.  This
      also makes it all but impossible to encounter bug #3.
      
      Bug #2 has not been a problem because all architectures that register
      callbacks are strongly ordered and/or have a static set of callbacks.
      
      But with help, unloading kvm_intel can trigger bug #1: e.g. wrapping
      perf_guest_cbs with READ_ONCE in perf_misc_flags() while spamming
      kvm_intel module load/unload leads to:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000000
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 0 P4D 0
        Oops: 0000 [#1] PREEMPT SMP
        CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        RIP: 0010:perf_misc_flags+0x1c/0x70
        Call Trace:
         perf_prepare_sample+0x53/0x6b0
         perf_event_output_forward+0x67/0x160
         __perf_event_overflow+0x52/0xf0
         handle_pmi_common+0x207/0x300
         intel_pmu_handle_irq+0xcf/0x410
         perf_event_nmi_handler+0x28/0x50
         nmi_handle+0xc7/0x260
         default_do_nmi+0x6b/0x170
         exc_nmi+0x103/0x130
         asm_exc_nmi+0x76/0xbf
      
      Fixes: 39447b38 ("perf: Enhance perf to allow for guest statistic collection from host")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20211111020738.2512932-2-seanjc@google.com
    • x86/perf: Fix snapshot_branch_stack warning in VM · f3fd84a3
      Committed by Song Liu
      When running in a VM, intel_pmu_snapshot_branch_stack triggers a WRMSR
      warning like:
      
       [ ] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0000000000000000) at rIP: 0xffffffff81011a5b (intel_pmu_snapshot_branch_stack+0x3b/0xd0)
      
      This can be triggered with BPF selftests:
      
        tools/testing/selftests/bpf/test_progs -t get_branch_snapshot
      
      This warning is caused by __intel_pmu_pebs_disable_all() in the VM.
      Since it is not necessary to disable PEBS for LBR, remove it from
      intel_pmu_snapshot_branch_stack and intel_pmu_snapshot_arch_branch_stack.
      
      Fixes: c22ac2a3 ("perf: Enable branch record for software events")
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Like Xu <likexu@tencent.com>
      Link: https://lore.kernel.org/r/20211112054510.2667030-1-songliubraving@fb.com
  8. 11 Nov 2021, 1 commit
  9. 30 Oct 2021, 1 commit
  10. 15 Oct 2021, 1 commit
  11. 01 Oct 2021, 1 commit
  12. 14 Sep 2021, 1 commit
  13. 26 Aug 2021, 1 commit
  14. 06 Aug 2021, 1 commit
    • perf/x86/intel: Apply mid ACK for small core · acade637
      Committed by Kan Liang
      The warning below may occasionally be triggered on an ADL machine when
      all of these conditions occur:

       - Two perf record commands run one after the other. Both record a PEBS event.
       - Both run on small cores.
       - They have different adaptive PEBS configurations (PEBS_DATA_CFG).
      
        [ ] WARNING: CPU: 4 PID: 9874 at arch/x86/events/intel/ds.c:1743 setup_pebs_adaptive_sample_data+0x55e/0x5b0
        [ ] RIP: 0010:setup_pebs_adaptive_sample_data+0x55e/0x5b0
        [ ] Call Trace:
        [ ]  <NMI>
        [ ]  intel_pmu_drain_pebs_icl+0x48b/0x810
        [ ]  perf_event_nmi_handler+0x41/0x80
        [ ]  </NMI>
        [ ]  __perf_event_task_sched_in+0x2c2/0x3a0
      
      Unlike the big core, the small core requires the ACK right before
      re-enabling counters in the NMI handler; otherwise a stale PEBS record
      may be dumped into a later NMI handler, which triggers the warning.
      
      Add a new mid_ack flag to track the case. Add all PMI handler bits in
      the struct x86_hybrid_pmu to track the bits for different types of
      PMUs.  Apply mid ACK for the small cores on an Alder Lake machine.
      
      The existing hybrid() macro causes a compile error when taking the
      address of a bit-field variable. Add a new macro, hybrid_bit(), to
      get the bit-field value of a given PMU.
      
      Fixes: f83d2f91 ("perf/x86/intel: Add Alder Lake Hybrid support")
      Reported-by: Ammy Yi <ammy.yi@intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Ammy Yi <ammy.yi@intel.com>
      Link: https://lkml.kernel.org/r/1627997128-57891-1-git-send-email-kan.liang@linux.intel.com
  15. 24 Jun 2021, 3 commits
  16. 15 Jun 2021, 1 commit
  17. 18 May 2021, 1 commit
  18. 22 Apr 2021, 1 commit
  19. 20 Apr 2021, 10 commits