1. 18 Jul 2017, 1 commit
    • perf/x86/intel: Enable C-state residency events for Apollo Lake · 5c10b048
      Authored by Harry Pan
      The Goldmont microarchitecture supports C1/C3/C6 core and
      PC2/PC3/PC6/PC10 package state residency counters; this patch enables
      them for the Apollo Lake platform.
      
      The MSR information is based on the Intel Software Developer's Manual,
      Vol. 4, Order No. 335592, Tables 2-6 and 2-12.
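
      As a rough sketch of what these counters expose, the standalone program
      below samples one residency MSR directly through the msr driver
      (requires root and "modprobe msr"). The MSR address is an assumption
      taken from the SDM tables cited above, not from the patch itself; in
      practice the events are consumed through perf's cstate_core/cstate_pkg
      PMUs.

        /* hedged sketch: 0x3fd (core C6 residency) is assumed from the SDM */
        #include <fcntl.h>
        #include <inttypes.h>
        #include <stdio.h>
        #include <unistd.h>

        static uint64_t rdmsr(int fd, uint32_t msr)
        {
                uint64_t v = 0;
                pread(fd, &v, sizeof(v), msr);  /* file offset selects the MSR */
                return v;
        }

        int main(void)
        {
                const uint32_t CORE_C6_RES = 0x3fd;
                int fd = open("/dev/cpu/0/msr", O_RDONLY);

                if (fd < 0) {
                        perror("open /dev/cpu/0/msr");
                        return 1;
                }
                uint64_t a = rdmsr(fd, CORE_C6_RES);
                sleep(1);
                uint64_t b = rdmsr(fd, CORE_C6_RES);
                printf("core C6 residency delta: %" PRIu64 " TSC ticks\n", b - a);
                close(fd);
                return 0;
        }
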
      Signed-off-by: Harry Pan <harry.pan@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bp@suse.de
      Cc: davidcc@google.com
      Cc: gs0622@gmail.com
      Cc: lukasz.odzioba@intel.com
      Cc: piotr.luc@intel.com
      Cc: srinivas.pandruvada@linux.intel.com
      Link: http://lkml.kernel.org/r/20170717103749.24337-1-harry.pan@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 30 Jun 2017, 2 commits
  3. 22 Jun 2017, 1 commit
  4. 08 Jun 2017, 1 commit
  5. 05 Jun 2017, 1 commit
    • x86/mm: Rework lazy TLB to track the actual loaded mm · 3d28ebce
      Authored by Andy Lutomirski
      Lazy TLB state is currently managed in a rather baroque manner.
      AFAICT, there are three possible states:
      
       - Non-lazy.  This means that we're running a user thread or a
         kernel thread that has called use_mm().  current->mm ==
         current->active_mm == cpu_tlbstate.active_mm and
         cpu_tlbstate.state == TLBSTATE_OK.
      
       - Lazy with user mm.  We're running a kernel thread without an mm
         and we're borrowing an mm_struct.  We have current->mm == NULL,
         current->active_mm == cpu_tlbstate.active_mm, cpu_tlbstate.state
         != TLBSTATE_OK (i.e. TLBSTATE_LAZY or 0).  The current cpu is set
         in mm_cpumask(current->active_mm).  CR3 points to
         current->active_mm->pgd.  The TLB is up to date.
      
       - Lazy with init_mm.  This happens when we call leave_mm().  We
         have current->mm == NULL, current->active_mm ==
          cpu_tlbstate.active_mm, but that mm is only relevant insofar as
         the scheduler is tracking it for refcounting.  cpu_tlbstate.state
         != TLBSTATE_OK.  The current cpu is clear in
         mm_cpumask(current->active_mm).  CR3 points to swapper_pg_dir,
         i.e. init_mm->pgd.
      
      This patch simplifies the situation.  Other than perf, x86 stops
      caring about current->active_mm at all.  We have
      cpu_tlbstate.loaded_mm pointing to the mm that CR3 references.  The
      TLB is always up to date for that mm.  leave_mm() just switches us
      to init_mm.  There are no longer any special cases for mm_cpumask,
      and switch_mm() switches mms without worrying about laziness.
      
      After this patch, cpu_tlbstate.state serves only to tell the TLB
      flush code whether it may switch to init_mm instead of doing a
      normal flush.
      
      This makes fairly extensive changes to xen_exit_mmap(), which used
      to look a bit like black magic.
      
      Perf is unchanged.  With or without this change, perf may behave a bit
      erratically if it tries to read user memory in kernel thread context.
      We should build on this patch to teach perf to never look at user
      memory when cpu_tlbstate.loaded_mm != current->mm.
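
      The post-patch invariant is easiest to see in miniature. Below is a toy
      userspace model of the simplified scheme - not kernel code; only the
      names loaded_mm, init_mm, switch_mm() and leave_mm() come from the
      patch, everything else is illustrative:

        /* toy model: one "CPU"; loaded_mm always matches what CR3 holds */
        #include <stdio.h>

        struct mm_struct { const char *name; };

        static struct mm_struct init_mm = { "init_mm" };

        /* per-CPU in the kernel; a single global here */
        static struct { struct mm_struct *loaded_mm; } cpu_tlbstate = { &init_mm };

        /* after the patch: switching mms is unconditional, no laziness games */
        static void switch_mm(struct mm_struct *next)
        {
                cpu_tlbstate.loaded_mm = next;
                printf("CR3 <- %s->pgd\n", next->name);  /* kernel writes CR3 */
        }

        /* leave_mm() is now just "switch to init_mm" */
        static void leave_mm(void)
        {
                if (cpu_tlbstate.loaded_mm == &init_mm)
                        return;          /* already on init_mm, nothing to do */
                switch_mm(&init_mm);
        }

        int main(void)
        {
                struct mm_struct user_mm = { "user_mm" };
                switch_mm(&user_mm);     /* run a user thread */
                leave_mm();              /* e.g. the xen_exit_mmap() path */
                return 0;
        }
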
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 26 May 2017, 3 commits
  7. 23 May 2017, 1 commit
    • perf/x86: Add sysfs entry to freeze counters on SMI · 6089327f
      Authored by Kan Liang
      Currently, SMIs are visible to all performance counters, because many
      users want to measure everything, including SMIs. But in some cases
      the SMI cycles should not be counted - for example, when calculating
      the cost of an SMI itself. So a knob is needed.
      
      Setting the FREEZE_WHILE_SMM bit in IA32_DEBUGCTL affects all
      performance counters; there is no way to freeze individual counters on
      SMI. A per-event interface (e.g. an ioctl or an event attribute) is
      therefore the wrong place to set the FREEZE_WHILE_SMM bit.

      Add the sysfs entry /sys/device/cpu/freeze_on_smi to set the
      FREEZE_WHILE_SMM bit in IA32_DEBUGCTL. When set, perfmon and trace
      messages are frozen while in SMM.

      The value has to be 0 or 1, and it applies to all processors.
      
      Also serialize the entire setting so we don't get multiple concurrent
      threads trying to update to different values.
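
      A minimal usage sketch, assuming the sysfs path above and - per the
      SDM, not stated in this changelog - that FREEZE_WHILE_SMM is bit 14 of
      IA32_DEBUGCTL (MSR 0x1d9); needs root and the msr driver:

        #include <fcntl.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
                /* flip the knob; the kernel applies it to all CPUs */
                int sysfs = open("/sys/device/cpu/freeze_on_smi", O_WRONLY);
                if (sysfs >= 0) {
                        write(sysfs, "1", 1);
                        close(sysfs);
                }

                /* confirm via IA32_DEBUGCTL; 0x1d9 and bit 14 per the SDM */
                int msr = open("/dev/cpu/0/msr", O_RDONLY);
                uint64_t debugctl = 0;
                if (msr >= 0 && pread(msr, &debugctl, 8, 0x1d9) == 8) {
                        printf("FREEZE_WHILE_SMM: %s\n",
                               (debugctl >> 14) & 1 ? "set" : "clear");
                        close(msr);
                }
                return 0;
        }
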
      Signed-off-by: Kan Liang <Kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1494600673-244667-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 15 May 2017, 1 commit
    • x86/tsc: Remodel cyc2ns to use seqcount_latch() · 59eaef78
      Authored by Peter Zijlstra
      Replace the custom multi-value scheme with the more regular
      seqcount_latch() scheme. Along with scrapping a lot of lines, the latch
      scheme is better documented and used in more places.
      
      The immediate benefit, however, is that the update side is no longer
      limited: the current code has a point at which writers block, and
      upcoming changes would hit that limit.
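
      For reference, the latch idiom (as documented in the kernel's
      seqlock.h) keeps two copies of the data; readers pick data[seq & 1]
      and retry if the sequence moved under them, so they never block the
      writer. A hedged userspace model, with the cyc2ns payload and the
      memory barriers simplified relative to the kernel:

        #include <stdatomic.h>
        #include <stdint.h>
        #include <stdio.h>

        struct cyc2ns_data { uint32_t mul, shift; uint64_t offset; };

        static _Atomic unsigned int seq;
        static struct cyc2ns_data data[2];

        static void latch_write(struct cyc2ns_data d)
        {
                /* odd seq: readers use data[1] while data[0] is updated */
                atomic_fetch_add_explicit(&seq, 1, memory_order_release);
                data[0] = d;
                /* even seq: readers use data[0] while data[1] is updated */
                atomic_fetch_add_explicit(&seq, 1, memory_order_release);
                data[1] = d;
        }

        static uint64_t cyc2ns(uint64_t cyc)
        {
                unsigned int s;
                uint64_t ns;

                do {
                        s = atomic_load_explicit(&seq, memory_order_acquire);
                        struct cyc2ns_data *d = &data[s & 1];
                        /* the kernel uses a 64x32->96 bit multiply here */
                        ns = d->offset +
                             (uint64_t)(((__uint128_t)cyc * d->mul) >> d->shift);
                } while (s != atomic_load_explicit(&seq, memory_order_acquire));
                return ns;
        }

        int main(void)
        {
                latch_write((struct cyc2ns_data){ .mul = 1 << 10, .shift = 10 });
                printf("%llu ns\n", (unsigned long long)cyc2ns(3000000000ull));
                return 0;
        }
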
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  9. 03 May 2017, 1 commit
    • perf/x86: Fix Broadwell-EP DRAM RAPL events · 33b88e70
      Authored by Vince Weaver
      It appears as though the Broadwell-EP DRAM units share the special
      units quirk with Haswell-EP/KNL.
      
      Without this patch, the results are implausibly high (a single DRAM
      reporting 20 W of power).
      
      The powercap driver in drivers/powercap/intel_rapl.c already has this
      change.
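
      A sketch of what the quirk amounts to, assuming the SDM layout of
      MSR_RAPL_POWER_UNIT (0x606): the energy status unit is normally
      1/2^ESU joules with ESU in bits 12:8, but the quirked DRAM domains use
      a fixed 2^-16 J (~15.3 uJ) regardless of ESU:

        #include <stdint.h>
        #include <stdio.h>

        /* joules per counter increment for one RAPL domain */
        static double energy_unit(uint64_t unit_msr, int dram_quirk)
        {
                unsigned int esu = (unit_msr >> 8) & 0x1f;  /* bits 12:8 */
                if (dram_quirk)
                        esu = 16;       /* fixed 2^-16 J, ignore ESU */
                return 1.0 / (double)(1u << esu);
        }

        int main(void)
        {
                uint64_t msr = 0x0a0e03;   /* example raw value, ESU = 14 */
                printf("pkg unit:  %.9f J\n", energy_unit(msr, 0));
                printf("dram unit: %.9f J\n", energy_unit(msr, 1));
                return 0;
        }
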
      Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 14 Apr 2017, 2 commits
    • perf/x86: Fix spurious NMI with PEBS Load Latency event · fd583ad1
      Authored by Kan Liang
      Spurious NMIs will be observed with the following command:
      
        while :; do
          perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp" \
                      -e "cpu/umask=0x03,event=0x0/" \
                      -e "cpu/umask=0x02,event=0x0/" \
                      -e cycles,branches,cache-misses \
                      -e cache-references -- sleep 10
        done
      
      The bug was introduced by commit:
      
        8077eca0 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
      
      That commit clears the status bits for the counters used for PEBS
      events, by masking the whole 64 bits pebs_enabled. However, only the
      low 32 bits of both status and pebs_enabled are reserved for PEBS-able
      counters.
      
      In the status register, bits 32-34 are the fixed-counter overflow
      bits; in pebs_enabled, bits 32-34 control PEBS Load Latency.

      In the test case, the PEBS Load Latency event and a fixed counter
      event can overflow at the same time. The fixed counter's overflow bit
      is then cleared by mistake; once cleared, the fixed counter overflow
      is never processed, which finally triggers the spurious NMI.
      
      Correct the PEBS enabled mask by ignoring the non-PEBS bits.
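
      The fix in miniature (PEBS_COUNTER_MASK mirrors the constant the patch
      introduces; the bit assignments follow the description above):

        #include <inttypes.h>
        #include <stdio.h>

        #define MAX_PEBS_EVENTS    8
        #define PEBS_COUNTER_MASK  ((1ULL << MAX_PEBS_EVENTS) - 1)

        int main(void)
        {
                /* fixed counter 0 overflow (bit 32) and GP counter 0 (bit 0) */
                uint64_t status       = (1ULL << 32) | (1ULL << 0);
                /* PEBS Load Latency (bit 32) and GP counter 0 PEBS (bit 0) */
                uint64_t pebs_enabled = (1ULL << 32) | (1ULL << 0);

                uint64_t buggy = status & ~pebs_enabled;
                uint64_t fixed = status & ~(pebs_enabled & PEBS_COUNTER_MASK);

                printf("buggy: %#018" PRIx64 " (fixed overflow bit lost)\n", buggy);
                printf("fixed: %#018" PRIx64 " (bit 32 preserved)\n", fixed);
                return 0;
        }
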
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 8077eca0 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
      Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Avoid exposing wrong/stale data in intel_pmu_lbr_read_32() · f2200ac3
      Authored by Peter Zijlstra
      When the perf_branch_entry::{in_tx,abort,cycles} fields were added,
      intel_pmu_lbr_read_32() wasn't updated to initialize them.
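
      A sketch of the fix (struct layout abbreviated from the perf_event
      uapi header): the 32-bit LBR format carries no TSX or cycle
      information, so those fields must be zeroed explicitly rather than
      left stale:

        #include <stdint.h>

        struct perf_branch_entry {
                uint64_t from, to;
                uint64_t mispred:1, predicted:1, in_tx:1, abort:1,
                         cycles:16, reserved:44;
        };

        static void lbr_read_32_entry(struct perf_branch_entry *e,
                                      uint32_t from, uint32_t to)
        {
                e->from      = from;
                e->to        = to;
                e->mispred   = 0;
                e->predicted = 0;
                e->in_tx     = 0;   /* the fields the old code left stale */
                e->abort     = 0;
                e->cycles    = 0;
                e->reserved  = 0;
        }

        int main(void)
        {
                struct perf_branch_entry e;
                lbr_read_32_entry(&e, 0x1000, 0x2000);
                return 0;
        }
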
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Fixes: 135c5612 ("perf/x86/intel: Support Haswell/v4 LBR format")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  11. 11 Apr 2017, 3 commits
  12. 30 Mar 2017, 10 commits
  13. 23 Mar 2017, 1 commit
  14. 17 Mar 2017, 2 commits
  15. 16 Mar 2017, 3 commits
  16. 02 Mar 2017, 2 commits
  17. 01 Mar 2017, 1 commit
  18. 12 Feb 2017, 1 commit
  19. 01 Feb 2017, 3 commits
    • perf/x86/intel/pt: Add format strings for PTWRITE and power event tracing · 5443624b
      Authored by Alexander Shishkin
      Commit:
      
        8ee83b2a ("perf/x86/intel/pt: Add support for PTWRITE and power event tracing")
      
      forgot to add format strings to the PT driver. As a result, these
      features could be enabled by setting the corresponding bits in the
      event config, but not by their mnemonic names.
      
      This patch adds the format strings.
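
      Once the format strings are present, the mnemonics show up under the
      PMU's sysfs format directory; each file maps a name to config bits
      (e.g. ptwrite to the RTIT_CTL PTWEn bit, per the SDM - an assumption,
      not stated here). A quick check, assuming the mainline intel_pt sysfs
      layout:

        #include <stdio.h>

        int main(void)
        {
                /* attribute names added by this patch */
                const char *attrs[] = { "ptwrite", "pwr_evt" };
                char path[128], line[32];

                for (int i = 0; i < 2; i++) {
                        snprintf(path, sizeof(path),
                                 "/sys/bus/event_source/devices/intel_pt/format/%s",
                                 attrs[i]);
                        FILE *f = fopen(path, "r");
                        if (f && fgets(line, sizeof(line), f))
                                printf("%s: %s", attrs[i], line);
                        if (f)
                                fclose(f);
                }
                return 0;
        }
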
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: vince@deater.net
      Fixes: 8ee83b2a ("perf/x86/intel/pt: Add support for PTWRITE and power event tracing")
      Link: http://lkml.kernel.org/r/20170127151644.8585-2-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel/uncore: Make package handling more robust · fff4b87e
      Authored by Thomas Gleixner
      The package management code in uncore relies on package mapping being
      available before a CPU is started. This changed with:
      
        9d85eb91 ("x86/smpboot: Make logical package management more robust")
      
      because the ACPI/BIOS information turned out to be unreliable, but it
      left uncore in a broken state. The breakage went unnoticed because on
      a regular boot all CPUs are online before uncore is initialized.
      
      Move the allocation to the CPU online callback and simplify the hotplug
      handling. At this point the package mapping is established and correct.
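
      A minimal module sketch of the pattern the fix adopts - allocate
      per-package state in the online callback, where
      topology_logical_package_id() is guaranteed valid - using a dynamic
      hotplug state; illustrative only, not the uncore code:

        #include <linux/module.h>
        #include <linux/cpuhotplug.h>
        #include <linux/topology.h>

        static int pkg_online(unsigned int cpu)
        {
                /* package mapping is established by the time we run here */
                pr_info("cpu %u online, logical package %d\n",
                        cpu, topology_logical_package_id(cpu));
                /* allocate/init per-package state here */
                return 0;
        }

        static int pkg_offline(unsigned int cpu)
        {
                /* tear down per-package state when the last CPU leaves */
                return 0;
        }

        static int state;

        static int __init pkg_init(void)
        {
                state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "demo/pkg:online",
                                          pkg_online, pkg_offline);
                return state < 0 ? state : 0;
        }

        static void __exit pkg_exit(void)
        {
                cpuhp_remove_state(state);
        }

        module_init(pkg_init);
        module_exit(pkg_exit);
        MODULE_LICENSE("GPL");
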
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
      Fixes: 9d85eb91 ("x86/smpboot: Make logical package management more robust")
      Link: http://lkml.kernel.org/r/20170131230141.377156255@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel/uncore: Clean up hotplug conversion fallout · 1aa6cfd3
      Authored by Thomas Gleixner
      The recent conversion to the hotplug state machine kept two mechanisms from
      the original code:
      
       1) The first_init logic, which adds the number of online CPUs in a
          package to the refcount. That's wrong because the callbacks are
          executed for every online CPU anyway.

          Remove it so the refcounting is correct.

       2) The on_each_cpu() call to undo box->init() in the error handling
          path. That's bogus because when the prepare callback fails, no box
          has been initialized yet.
      
          Remove it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
      Fixes: 1a246b9f ("perf/x86/intel/uncore: Convert to hotplug state machine")
      Link: http://lkml.kernel.org/r/20170131230141.298032324@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>