1. 17 April 2019 (3 commits)
    • x86/perf/amd: Remove need to check "running" bit in NMI handler · 05626465
      Committed by Lendacky, Thomas
      commit 3966c3feca3fd10b2935caa0b4a08c7dd59469e5 upstream.
      
      Spurious interrupt support was added to perf in the following commit, almost
      a decade ago:
      
        63e6be6d ("perf, x86: Catch spurious interrupts after disabling counters")
      
      The two previous patches (resolving the race condition when disabling a
      PMC and NMI latency mitigation) allow for the removal of this older
      spurious interrupt support.
      
      Currently in x86_pmu_stop(), the bit for the PMC in the active_mask bitmap
      is cleared before disabling the PMC, which sets up a race condition. This
      race condition was mitigated by introducing the running bitmap. That race
      condition can be eliminated by first disabling the PMC, waiting for PMC
      reset on overflow and then clearing the bit for the PMC in the active_mask
      bitmap. The NMI handler will not re-enable a disabled counter.
      
      If x86_pmu_stop() is called from the perf NMI handler, the NMI latency
      mitigation support will guard against any unhandled NMI messages.
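      
      A minimal sketch of the reordering described above (illustrative names only: disable_pmc() and active_mask stand in for the real x86_pmu disable callback and cpuc->active_mask; this is not the upstream diff):
      
      #include <stdint.h>
      
      /* Illustrative stand-ins for the hardware disable and the active bitmap. */
      extern void disable_pmc(int idx);
      extern uint64_t active_mask;
      
      static void stop_pmc(int idx)
      {
              /* Old order: clear active_mask first, then disable -> NMI race. */
      
              /* New order: disable the counter while it is still marked active. */
              disable_pmc(idx);
      
              /*
               * The NMI handler may still run here and reset the counter if it
               * had already overflowed; it will not re-enable a disabled one.
               */
      
              /* Only then drop the counter from active_mask. */
              active_mask &= ~(1ULL << idx);
      }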
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> # 4.14.x-
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/Message-ID:
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/perf/amd: Resolve NMI latency issues for active PMCs · 23d39b0a
      Committed by Lendacky, Thomas
      commit 6d3edaae16c6c7d238360f2841212c2b26774d5e upstream.
      
      On AMD processors, the detection of an overflowed PMC counter in the NMI
      handler relies on the current value of the PMC. So, for example, to check
      for overflow on a 48-bit counter, bit 47 is checked to see if it is 1 (not
      overflowed) or 0 (overflowed).
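      
      A small self-contained illustration of that check (the helper name is illustrative, not the kernel's):
      
      #include <stdbool.h>
      #include <stdint.h>
      
      /* A 48-bit PMC counts upward; once it wraps, bit 47 reads as 0. */
      static bool pmc_overflowed(uint64_t pmc_value)
      {
              return !(pmc_value & (1ULL << 47));
      }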
      
      When the perf NMI handler executes, it does not know in advance which PMC
      counters have overflowed, so it processes all active PMC counters that
      have overflowed. NMI latency in newer AMD processors can result in
      multiple overflowed PMC counters being processed in one NMI, and then a
      subsequent NMI, which does not appear to be a back-to-back NMI, finding
      no overflowed PMC counters. This may appear to be an unhandled NMI,
      resulting in either a panic or a series of messages, depending on how
      the kernel was configured.
      
      To mitigate this issue, add an AMD handle_irq callback function,
      amd_pmu_handle_irq(), that invokes the common x86_pmu_handle_irq()
      function and, upon return, performs additional processing to indicate
      whether the NMI has been handled or would have been handled had an
      earlier NMI not already handled the overflowed PMC. Using a per-CPU
      variable, the minimum of the number of active PMCs and 2 is set whenever
      a PMC is active; this indicates the possible number of NMIs that can
      still occur. The value of 2 covers the case where an NMI does not arrive
      at the LAPIC in time to be collapsed into an already pending NMI. Each
      time the function is called without having handled an overflowed counter,
      the per-CPU value is checked: if it is non-zero, it is decremented and
      the handler reports that it handled the NMI; if it is zero, the handler
      reports that it did not handle the NMI.
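      
      A standalone model of that bookkeeping (the names and the plain global are illustrative; the real code uses a per-CPU variable and the NMI_HANDLED/NMI_DONE return values):
      
      static int perf_nmi_window;     /* per-CPU in the real implementation */
      
      /* After a handled NMI, allow up to min(active PMCs, 2) "empty" NMIs. */
      static void arm_nmi_window(int active_pmcs)
      {
              perf_nmi_window = active_pmcs < 2 ? active_pmcs : 2;
      }
      
      /* Decide whether an NMI that found no overflowed counter is still ours. */
      static int claim_empty_nmi(void)
      {
              if (perf_nmi_window) {
                      perf_nmi_window--;
                      return 1;       /* handled: latent NMI for work already done */
              }
              return 0;               /* not ours: let the unknown-NMI path run */
      }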
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> # 4.14.x-
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/Message-ID:
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/perf/amd: Resolve race condition when disabling PMC · e5a791b4
      Committed by Lendacky, Thomas
      commit 914123fa39042e651d79eaf86bbf63a1b938dddf upstream.
      
      On AMD processors, the detection of an overflowed counter in the NMI
      handler relies on the current value of the counter. So, for example, to
      check for overflow on a 48-bit counter, bit 47 is checked to see if it
      is 1 (not overflowed) or 0 (overflowed).
      
      There is currently a race condition present when disabling and then
      updating the PMC. Increased NMI latency in newer AMD processors makes this
      race condition more pronounced. If the counter value has overflowed, it is
      possible to update the PMC value before the NMI handler can run. The
      updated PMC value is not an overflowed value, so when the perf NMI handler
      does run, it will not find an overflowed counter. This may appear as an
      unknown NMI resulting in either a panic or a series of messages, depending
      on how the kernel is configured.
      
      To eliminate this race condition, the PMC value must be checked after
      disabling the counter. Add an AMD function, amd_pmu_disable_all(), that
      will wait for the NMI handler to reset any active and overflowed counter
      after calling x86_pmu_disable_all().
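      
      A sketch of the wait described above (read_pmc() and the iteration bound are illustrative stand-ins, not the kernel's interface):
      
      #include <stdint.h>
      
      #define OVERFLOW_WAIT_MAX  50           /* illustrative bound on the busy-wait */
      
      extern uint64_t read_pmc(int idx);      /* stand-in for reading the counter MSR */
      
      /*
       * After x86_pmu_disable_all(), give the NMI handler a chance to observe
       * and reset a counter that had already overflowed (bit 47 clear) before
       * the caller goes on to rewrite the counter value.
       */
      static void wait_for_overflow_reset(int idx)
      {
              unsigned int i;
      
              for (i = 0; i < OVERFLOW_WAIT_MAX; i++) {
                      if (read_pmc(idx) & (1ULL << 47))
                              break;  /* reset by the NMI handler, or never overflowed */
              }
      }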
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> # 4.14.x-
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/Message-ID:
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 01 March 2017 (1 commit)
  3. 18 November 2016 (1 commit)
  4. 16 September 2016 (1 commit)
  5. 14 July 2016 (1 commit)
  6. 28 April 2016 (1 commit)
    • perf/x86/amd: Set the size of event map array to PERF_COUNT_HW_MAX · 0a25556f
      Committed by Adam Borowski
      The entry for PERF_COUNT_HW_REF_CPU_CYCLES is not used on AMD, but is
      referenced by filter_events() which expects undefined events to have a
      value of 0.
      
      Found via KASAN:
      
        UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:30
        index 9 is out of range for type 'u64 [9]'
        UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:9
        load of address ffffffff81c021c8 with insufficient space for an object of type 'const u64'
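      
      A sketch of the shape of the fix (the encodings below are placeholders, not AMD's actual table): sizing the array by PERF_COUNT_HW_MAX makes every generic event that is not explicitly listed read back as 0, which filter_events() treats as unsupported.
      
      #include <stdint.h>
      
      #define PERF_COUNT_HW_MAX  10   /* number of generic hardware events (illustrative) */
      
      static const uint64_t hw_event_map[PERF_COUNT_HW_MAX] = {
              [0] = 0x0076,           /* cpu-cycles (placeholder encoding) */
              [1] = 0x00c0,           /* instructions (placeholder encoding) */
              /* PERF_COUNT_HW_REF_CPU_CYCLES and other unlisted entries stay 0 */
      };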
      Signed-off-by: Adam Borowski <kilobyte@angband.pl>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1461749731-30979-1-git-send-email-kilobyte@angband.pl
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 29 March 2016 (1 commit)
  8. 17 February 2016 (1 commit)
  9. 09 February 2016 (1 commit)
  10. 06 January 2016 (1 commit)
  11. 19 December 2015 (1 commit)
  12. 02 April 2015 (2 commits)
  13. 27 August 2014 (1 commit)
    • x86: Replace __get_cpu_var uses · 89cbc767
      Committed by Christoph Lameter
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processor's percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and fewer
      registers are used when code is generated.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  14. 02 September 2013 (1 commit)
  15. 28 May 2013 (1 commit)
  16. 21 April 2013 (1 commit)
  17. 16 February 2013 (1 commit)
  18. 07 February 2013 (5 commits)
  19. 24 October 2012 (1 commit)
  20. 06 July 2012 (1 commit)
  21. 18 May 2012 (1 commit)
  22. 09 May 2012 (1 commit)
    • perf/x86-ibs: Precise event sampling with IBS for AMD CPUs · 450bbd49
      Committed by Robert Richter
      This patch adds support for precise event sampling with IBS. There are
      two counting modes to count either cycles or micro-ops. If the
      corresponding performance counter events (hw events) are set up with
      the precise flag set, the request is redirected to the ibs pmu:
      
       perf record -a -e cpu-cycles:p ...    # use ibs op counting cycle count
       perf record -a -e r076:p ...          # same as -e cpu-cycles:p
       perf record -a -e r0C1:p ...          # use ibs op counting micro-ops
      
      Each ibs sample contains a linear address that points to the
      instruction that was causing the sample to trigger. With ibs we have
      skid 0. Thus, ibs supports precise levels 1 and 2. Samples are marked
      with the PERF_EFLAGS_EXACT flag set. In rare cases the rip is invalid
      when IBS was not able to record the rip correctly. Then the
      PERF_EFLAGS_EXACT flag is cleared and the rip is taken from pt_regs.
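      
      On the consumer side, a zero-skid sample shows up as the exact-IP flag in the record header; a minimal check (record parsing omitted) might look like this:
      
      #include <linux/perf_event.h>
      #include <stdbool.h>
      
      /* True if the sampled IP is exact (IBS delivered a precise rip). */
      static bool sample_ip_is_exact(const struct perf_event_header *hdr)
      {
              return hdr->misc & PERF_RECORD_MISC_EXACT_IP;
      }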
      
      V2:
      * don't drop samples in precise level 2 if rip is invalid, instead
        support the PERF_EFLAGS_EXACT flag
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20120502103309.GP18810@erda.amd.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  23. 26 April 2012 (1 commit)
  24. 17 March 2012 (1 commit)
  25. 05 March 2012 (1 commit)
  26. 02 March 2012 (1 commit)
  27. 06 December 2011 (1 commit)
    • perf, x86: Fix event scheduler for constraints with overlapping counters · bc1738f6
      Committed by Robert Richter
      The current x86 event scheduler fails to resolve scheduling problems
      of certain combinations of events and constraints. This happens if the
      counter mask of such an event is not a subset of any other counter
      mask of a constraint with an equal or higher weight, e.g. constraints
      of the AMD family 15h pmu:
      
                              counter mask    weight
      
       amd_f15_PMC30          0x09            2  <--- overlapping counters
       amd_f15_PMC20          0x07            3
       amd_f15_PMC53          0x38            3
      
      The scheduler then fails to find an existing solution. Here is an
      example:
      
       event code     counter         failure         possible solution
      
       0x02E          PMC[3,0]        0               3
       0x043          PMC[2:0]        1               0
       0x045          PMC[2:0]        2               1
       0x046          PMC[2:0]        FAIL            2
      
      The event scheduler may not select the correct counter in the first
      cycle because it would need to know which events will be scheduled
      subsequently. It may then fail to schedule the events.
      
      To solve this, we now save the scheduler state of events with
      overlapping counter constraints.  If we fail to schedule the events
      we roll back to those states and try to use another free counter.
      
      Constraints with overlapping counters are marked with a new introduced
      overlap flag. We set the overlap flag for such constraints to give the
      scheduler a hint which events to select for counter rescheduling. The
      EVENT_CONSTRAINT_OVERLAP() macro can be used for this.
      
      Care must be taken, as the rescheduling algorithm is O(n!), which can
      dramatically increase scheduling cycles on an over-committed system.
      The number of such EVENT_CONSTRAINT_OVERLAP() macros and their counter
      masks must be kept to a minimum. Thus, the current stack is limited to
      2 states, which bounds the number of loops the algorithm takes in the
      worst case.
      
      On systems with no overlapping-counter constraints, this
      implementation does not increase the loop count compared to the
      previous algorithm.
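      
      A sketch of how such a constraint is declared, mirroring the family 15h counter masks quoted above (kernel-internal types and macros, shown for illustration rather than as a standalone program):
      
      /*
       * Constraints whose counter mask is not a subset of any equal- or
       * higher-weight mask get the new overlap flag so the scheduler knows
       * it may have to save state, roll back and retry with another counter.
       */
      static struct event_constraint amd_f15_PMC30 = EVENT_CONSTRAINT_OVERLAP(0, 0x09, 0);
      static struct event_constraint amd_f15_PMC20 = EVENT_CONSTRAINT(0, 0x07, 0);
      static struct event_constraint amd_f15_PMC53 = EVENT_CONSTRAINT(0, 0x38, 0);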
      
      V2:
      * Renamed redo -> overlap.
      * Reimplementation using perf scheduling helper functions.
      
      V3:
      * Added WARN_ON_ONCE() if out of save states.
      * Changed function interface of perf_sched_restore_state() to use bool
        as return value.
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1321616122-1533-3-git-send-email-robert.richter@amd.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  28. 10 October 2011 (1 commit)
  29. 06 October 2011 (1 commit)
  30. 28 September 2011 (1 commit)
  31. 26 September 2011 (1 commit)
  32. 14 August 2011 (1 commit)
  33. 01 July 2011 (1 commit)
    • perf, arch: Add generic NODE cache events · 89d6c0b5
      Committed by Peter Zijlstra
      Add a NODE level to the generic cache events, which is used to measure
      local vs. remote memory accesses. Like all other cache events, an
      ACCESS is HIT+MISS; if there is no way to distinguish between reads
      and writes, count reads only, etc.
      
      The below needs filling out for !x86 (which I filled out with
      unsupported events).
      
      I'm fairly sure ARM can leave it like that since it doesn't strike me as
      an architecture that even has NUMA support. SH might have something since
      it does appear to have some NUMA bits.
      
      Sparc64, PowerPC and MIPS certainly want a good look there since they
      clearly are NUMA capable.
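      
      From user space the new level is requested through the usual generic cache-event encoding; a minimal perf_event_open() example (error handling omitted) might look like:
      
      #include <linux/perf_event.h>
      #include <string.h>
      #include <sys/syscall.h>
      #include <unistd.h>
      
      /* Count node-level read misses for the calling task on any CPU. */
      static int open_node_read_misses(void)
      {
              struct perf_event_attr attr;
      
              memset(&attr, 0, sizeof(attr));
              attr.size = sizeof(attr);
              attr.type = PERF_TYPE_HW_CACHE;
              attr.config = PERF_COUNT_HW_CACHE_NODE |
                            (PERF_COUNT_HW_CACHE_OP_READ << 8) |
                            (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
      
              /* current task, any CPU, no group leader, no flags */
              return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
      }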
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: David Miller <davem@davemloft.net>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: David Daney <ddaney@caviumnetworks.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop
      Signed-off-by: Ingo Molnar <mingo@elte.hu>