1. 11 1月, 2017 2 次提交
  2. 05 1月, 2017 1 次提交
    • D
      perf/x86: Set pmu->module in Intel PMU modules · 74545f63
      David Carrillo-Cisneros 提交于
      The conversion of Intel PMU drivers into modules did not include reference
      counting. The machine will crash when attempting to  access deleted code
      if an event from a module PMU is started and the module removed before the
      event is destroyed.
      
      i.e. this crashes the machine:
      
      	$ insmod intel-rapl-perf.ko
      	$ perf stat -e power/energy-cores/ -C 0 &
      	$ rmmod intel-rapl-perf.ko
      
      Set THIS_MODULE to pmu->module in Intel module PMUs so that generic code
      can handle reference counting and deny rmmod while an event still exists.
      Signed-off-by: NDavid Carrillo-Cisneros <davidcc@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1482455860-116269-1-git-send-email-davidcc@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      74545f63
  3. 25 12月, 2016 2 次提交
  4. 23 12月, 2016 2 次提交
    • P
      perf/x86: Fix overlap counter scheduling bug · 1134c2b5
      Peter Zijlstra 提交于
      Jiri reported the overlap scheduling exceeding its max stack.
      
      Looking at the constraint that triggered this, it turns out the
      overlap marker isn't needed.
      
      The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
      the counter mask of such an event is not a subset of any other counter
      mask of a constraint with an equal or higher weight".
      
      Esp. that latter part is of interest here I think, our overlapping mask
      is 0x0e, that has 3 bits set and is the highest weight mask in on the
      PMU, therefore it will be placed last. Can we still create a scenario
      where we would need to rewind that?
      
      The scenario for AMD Fam15h is we're having masks like:
      
      	0x3F -- 111111
      	0x38 -- 111000
      	0x07 -- 000111
      
      	0x09 -- 001001
      
      And we mark 0x09 as overlapping, because it is not a direct subset of
      0x38 or 0x07 and has less weight than either of those. This means we'll
      first try and place the 0x09 event, then try and place 0x38/0x07 events.
      Now imagine we have:
      
      	3 * 0x07 + 0x09
      
      and the initial pick for the 0x09 event is counter 0, then we'll fail to
      place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
      event, and then re-try all 0x07 events, which will now work.
      
      The masks on the PMU in question are:
      
        0x01 - 0001
        0x03 - 0011
        0x0e - 1110
        0x0c - 1100
      
      But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
      0x1) are of heavier weight, it should all work out.
      Reported-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NJiri Olsa <jolsa@kernel.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Liang Kan <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20161109155153.GQ3142@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1134c2b5
    • S
      perf/x86/pebs: Fix handling of PEBS buffer overflows · daa864b8
      Stephane Eranian 提交于
      This patch solves a race condition between PEBS and the PMU handler.
      
      In case multiple PEBS events are sampled at the same time,
      it is possible to have GLOBAL_STATUS bit 62 set indicating
      PEBS buffer overflow and also seeing at most 3 PEBS counters
      having their bits set in the status register. This is a sign
      that there was at least one PEBS record pending at the time
      of the PMU interrupt. PEBS counters must only be processed
      via the drain_pebs() calls, and not via the regular sample
      processing loop coming after that the function, otherwise
      phony regular samples may be generated in the sampling buffer
      not marked with the EXACT tag.
      
      Another possibility is to have one PEBS event and at least
      one non-PEBS event whic hoverflows while PEBS has armed. In this
      case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
      status bit for the PEBS counter will be on Skylake.
      
      To avoid this problem, we systematically ignore the PEBS-enabled
      counters from the GLOBAL_STATUS mask and we always process PEBS
      events via drain_pebs().
      
      The problem manifested itself by having non-exact samples when
      sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
      not have the EXACT flag set.
      
      Note that this problem is only present on Skylake processor.
      This fix is harmless on older processors.
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      daa864b8
  5. 06 12月, 2016 2 次提交
  6. 22 11月, 2016 2 次提交
    • P
      perf/x86/intel/uncore: Allow only a single PMU/box within an events group · 033ac60c
      Peter Zijlstra 提交于
      Group validation expects all events to be of the same PMU; however
      is_uncore_pmu() is too wide, it matches _all_ uncore events, even
      across PMUs.
      
      This triggers failure when we group different events from different
      uncore PMUs, like:
      
        perf stat -vv -e '{uncore_cbox_0/config=0x0334/,uncore_qpi_0/event=1/}' -a sleep 1
      
      Fix is_uncore_pmu() by only matching events to the box at hand.
      
      Note that generic code; ran after this step; will disallow this
      mixture of PMU events.
      Reported-by: NJiri Olsa <jolsa@redhat.com>
      Tested-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20161118125354.GQ3117@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      033ac60c
    • P
      perf/x86/intel: Cure bogus unwind from PEBS entries · b8000586
      Peter Zijlstra 提交于
      Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event
      unwinds sometimes do 'weird' things. In particular, we seemed to be
      ending up unwinding from random places on the NMI stack.
      
      While it was somewhat expected that the event record BP,SP would not
      match the interrupt BP,SP in that the interrupt is strictly later than
      the record event, it was overlooked that it could be on an already
      overwritten stack.
      
      Therefore, don't copy the recorded BP,SP over the interrupted BP,SP
      when we need stack unwinds.
      
      Note that its still possible the unwind doesn't full match the actual
      event, as its entirely possible to have done an (I)RET between record
      and interrupt, but on average it should still point in the general
      direction of where the event came from. Also, it's the best we can do,
      considering.
      
      The particular scenario that triggered the bogus NMI stack unwind was
      a PEBS event with very short period, upon enabling the event at the
      tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly
      triggers a record (while still on the NMI stack) which in turn
      triggers the next PMI. This then causes back-to-back NMIs and we'll
      try and unwind the stack-frame from the last NMI, which obviously is
      now overwritten by our own.
      Analyzed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk>
      Cc: dvyukov@google.com <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Fixes: ca037701 ("perf, x86: Add PEBS infrastructure")
      Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b8000586
  7. 16 11月, 2016 2 次提交
  8. 11 11月, 2016 1 次提交
  9. 28 10月, 2016 1 次提交
    • I
      perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors · f92b7604
      Imre Palik 提交于
      perf doesn't seem to honour the number of fixed counters specified by CPUID
      leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters.
      
      So if some of the fixed counters are masked out by the hypervisor, it still
      tries to check/set them.
      
      This patch makes perf behave nicer when the kernel is running under a
      hypervisor that doesn't expose all the counters.
      
      This patch contains some ideas from Matt Wilson.
      Signed-off-by: NImre Palik <imrep@amazon.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Kozyrev <alexander.kozyrev@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Artyom Kuanbekov <artyom.kuanbekov@intel.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Wilson <msw@amazon.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f92b7604
  10. 27 10月, 2016 2 次提交
    • F
      x86/cqm: Share PQR_ASSOC related data between CQM and CAT · 6b281569
      Fenghua Yu 提交于
      PQR_ASSOC MSR contains the RMID used for preformance monitoring of cache
      occupancy and memory bandwidth. The upper 32bit of this MSR contain the
      CLOSID for cache allocation. So we need to share the information between
      the two facilities.
      
      Move the rdt data structure declaration into the shared header file and
      make the per cpu data structure containing the MSR values global.
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
      Cc: "Tony Luck" <tony.luck@intel.com>
      Cc: "David Carrillo-Cisneros" <davidcc@google.com>
      Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
      Cc: "Peter Zijlstra" <peterz@infradead.org>
      Cc: "Stephane Eranian" <eranian@google.com>
      Cc: "Dave Hansen" <dave.hansen@intel.com>
      Cc: "Shaohua Li" <shli@fb.com>
      Cc: "Nilay Vaish" <nilayvaish@gmail.com>
      Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
      Cc: "Ingo Molnar" <mingo@elte.hu>
      Cc: "Borislav Petkov" <bp@suse.de>
      Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
      Link: http://lkml.kernel.org/r/1477142405-32078-10-git-send-email-fenghua.yu@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6b281569
    • F
      x86/intel_rdt: Add Haswell feature discovery · 113c6097
      Fenghua Yu 提交于
      Some Haswell generation CPUs support RDT, but they don't enumerate this via
      CPUID.  Use rdmsr_safe() and wrmsr_safe() to probe the MSRs on cpu model 63
      (INTEL_FAM6_HASWELL_X)
      
      Move the relevant defines into a common header file which is shared between
      RDT/CQM and RDT/Allocation to avoid duplication.
      Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
      Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
      Cc: "Tony Luck" <tony.luck@intel.com>
      Cc: "David Carrillo-Cisneros" <davidcc@google.com>
      Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
      Cc: "Peter Zijlstra" <peterz@infradead.org>
      Cc: "Stephane Eranian" <eranian@google.com>
      Cc: "Dave Hansen" <dave.hansen@intel.com>
      Cc: "Shaohua Li" <shli@fb.com>
      Cc: "Nilay Vaish" <nilayvaish@gmail.com>
      Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
      Cc: "Ingo Molnar" <mingo@elte.hu>
      Cc: "Borislav Petkov" <bp@suse.de>
      Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
      Link: http://lkml.kernel.org/r/1477142405-32078-8-git-send-email-fenghua.yu@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      113c6097
  11. 19 10月, 2016 1 次提交
  12. 17 10月, 2016 3 次提交
  13. 16 10月, 2016 1 次提交
    • D
      perf/x86/intel: Remove an inconsistent NULL check · 5c38181c
      Dan Carpenter 提交于
      Smatch complains that we don't check "event->ctx" consistently.  It's
      never NULL so we can just remove the check.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kernel-janitors@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5c38181c
  14. 22 9月, 2016 1 次提交
  15. 20 9月, 2016 2 次提交
  16. 16 9月, 2016 3 次提交
  17. 15 9月, 2016 1 次提交
  18. 10 9月, 2016 6 次提交
  19. 06 9月, 2016 1 次提交
  20. 05 9月, 2016 2 次提交
  21. 18 8月, 2016 2 次提交