1. 10 Aug 2016 (5 commits)
• perf/x86/intel: Clean up LBR state tracking · 3e2c1a67
  Committed by Peter Zijlstra
      The lbr_context logic confused me; it appears to me to try and do the
      same thing the pmu::sched_task() callback does now, but limited to
      per-task events.
      
      So rip it out. Afaict this should also improve performance, because I
      think the current code can end up doing lbr_reset() twice, once from
      the pmu::add() and then again from pmu::sched_task(), and MSR writes
      (all 3*16 of them) are expensive!!
      
While thinking through the cases that need the reset, it occurred to me
that the first install of an event in an active context needs to reset
the LBR (who knows what crap is in there), but detecting this case is
somewhat hard.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel: Remove redundant test from intel_pmu_lbr_add() · a5dcff62
  Committed by Peter Zijlstra
By the time we call pmu::add(), event->ctx must be set, and we
already rely on this, so remove that test from
intel_pmu_lbr_add().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel: Eliminate dead code in intel_pmu_lbr_del() · c3a61a2c
  Committed by Peter Zijlstra
      Since pmu::del() is always called under perf_pmu_disable(), the block
      conditional on cpuc->enabled is dead.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86: Ensure perf_sched_cb_{inc,dec}() is only called from pmu::{add,del}() · 68f7082f
  Committed by Peter Zijlstra
Currently perf_sched_cb_{inc,dec}() are called from
pmu::{start,stop}(), which has the problem that this can happen from
NMI context; this makes it hard to optimize perf_pmu_sched_task().
      
      Furthermore, we really only need this accounting on pmu::{add,del}(),
      so doing it from pmu::{start,stop}() is doing more work than we really
      need.
      
      Introduce x86_pmu::{add,del}() and wire up the LBR and PEBS.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
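A hedged sketch of the resulting shape (simplified, with the event
collection and counter scheduling elided; the surrounding code is
illustrative, not the verbatim diff): struct x86_pmu grows optional
add/del callbacks, invoked from the generic add/del path rather than
from start/stop, so they never run in NMI context:

  struct x86_pmu {
          /* ... existing fields elided ... */
          void    (*add)(struct perf_event *event);
          void    (*del)(struct perf_event *event);
  };

  static int x86_pmu_add(struct perf_event *event, int flags)
  {
          /* ... collect the event and schedule counters (elided) ... */

          /*
           * Vendor bookkeeping (e.g. LBR/PEBS reference counting and
           * perf_sched_cb_inc()) runs here, in add(), never from the
           * NMI-reachable start() path.
           */
          if (x86_pmu.add)
                  x86_pmu.add(event);

          return 0;
  }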
• perf/x86/intel: Rework the large PEBS setup code · 09e61b4f
  Committed by Peter Zijlstra
      In order to allow optimizing perf_pmu_sched_task() we must ensure
      perf_sched_cb_{inc,dec}() are no longer called from NMI context; this
      means that pmu::{start,stop}() can no longer use them.
      
      Prepare for this by reworking the whole large PEBS setup code.
      
The current code relied on the cpuc->pebs_enabled state; however, since
that reflects the current active state as per pmu::{start,stop}(), we
can no longer rely on it.
      
Introduce two counters: cpuc->n_pebs and cpuc->n_large_pebs, which
count the total number of PEBS events and the number of PEBS events
that have FREERUNNING set, respectively. With this we can tell whether
the current setup requires a single-record interrupt threshold or can
use a larger buffer.

This also improves the code in that it re-enables the large threshold
once the PEBS event that required single-record mode is removed.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
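A minimal sketch of the counting scheme in plain C (simplified from
the kernel code; the names follow the commit message, and FREERUNNING
handling is reduced to a flag):

  #include <stdbool.h>

  struct cpu_hw_events {
          int n_pebs;             /* all active PEBS events */
          int n_large_pebs;       /* PEBS events with FREERUNNING set */
  };

  static void pebs_add(struct cpu_hw_events *cpuc, bool freerunning)
  {
          cpuc->n_pebs++;
          if (freerunning)
                  cpuc->n_large_pebs++;
  }

  static void pebs_del(struct cpu_hw_events *cpuc, bool freerunning)
  {
          cpuc->n_pebs--;
          if (freerunning)
                  cpuc->n_large_pebs--;
  }

  /*
   * The large buffer is only safe while every active PEBS event is
   * free-running; otherwise fall back to the single-record threshold.
   * This also naturally re-enables the large threshold once the
   * single-record event goes away.
   */
  static bool use_large_pebs(const struct cpu_hw_events *cpuc)
  {
          return cpuc->n_pebs == cpuc->n_large_pebs;
  }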
2. 01 Aug 2016 (1 commit)
• perf/x86: Modify error message in virtualized environment · 005bd007
  Committed by Juergen Gross
It is known that the PMU isn't working in some virtualized environments.
      
      Modify the message issued in that case to mention why hardware PMU
      isn't usable instead of reporting it to be broken.
      
As a side effect this will correct a little bug in the error message:
the error message was meant to be either of level err or info
depending on the environment (native or virtualized). As the level is
taken from the format string and not the printed string, specifying
it via %s and a conditional argument didn't work as intended.
Signed-off-by: Juergen Gross <jgross@suse.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/1470051427-16795-1-git-send-email-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
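An illustration of the printk level bug described above (hedged: the
message text and the 'virtualized' flag are simplified stand-ins).
printk() parses the level prefix out of the format string itself,
before any arguments are substituted, so supplying KERN_ERR/KERN_INFO
through "%s" leaves the message at the default level:

  /* Broken: the level never reaches printk's level parser. */
  printk("%sPMU not available\n",
         virtualized ? KERN_INFO : KERN_ERR);

  /* Works: the level is decided per call site. */
  if (virtualized)
          pr_info("PMU not available due to virtualization\n");
  else
          pr_err("PMU not available\n");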
3. 16 Jul 2016 (1 commit)
• perf, events: add non-linear data support for raw records · 7e3f977e
  Committed by Daniel Borkmann
      This patch adds support for non-linear data on raw records. It
      extends raw records to have one or multiple fragments that will
      be written linearly into the ring slot, where each fragment can
      optionally have a custom callback handler to walk and extract
      complex, possibly non-linear data.
      
      If a callback handler is provided for a fragment, then the new
      __output_custom() will be used instead of __output_copy() for
      the perf_output_sample() part. perf_prepare_sample() does all
      the size calculation only once, so perf_output_sample() doesn't
      need to redo the same work anymore, meaning real_size and padding
will be cached in the raw record. The raw record becomes 32 bytes
in size without holes; to avoid increasing it further and to avoid
unnecessary recalculations in the fast path, we reuse the next
pointer of the last fragment, an idea borrowed from
ZERO_OR_NULL_PTR(). This should keep the perf_output_sample()
path for PERF_SAMPLE_RAW minimal.
      
This facility is needed for BPF's event output helper as a first
user that will, in a follow-up, add an additional perf_raw_frag
to its perf_raw_record in order to be able to more efficiently
dump skb context after the linear head metadata related to it.
skbs can be non-linear and thus need a custom output function to
dump buffers. Currently, the skb data needs to be copied twice;
with the help of __output_custom() this work only needs to be
done once. Future users could be things like XDP/BPF programs
that work on a different context and would thus also have a
different callback function.
      
The few users of raw records are adapted to initialize their frag
data from the raw record itself; there is no change in behavior for
them. The code is based upon a PoC diff provided by Peter Zijlstra [1].
      
  [1] http://thread.gmane.org/gmane.linux.network/421294

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
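A hedged sketch of the extended raw-record layout (close to the
upstream definitions in include/linux/perf_event.h, trimmed for
brevity): each fragment describes one linear chunk, and an optional
copy callback lets a fragment walk non-linear data while it is
written out:

  typedef unsigned long (*perf_copy_f)(void *dst, const void *src,
                                       unsigned long off, unsigned long len);

  struct perf_raw_frag {
          union {
                  struct perf_raw_frag    *next;  /* further fragments */
                  unsigned long           pad;    /* reused as terminator */
          };
          perf_copy_f                     copy;   /* NULL: plain copy path */
          void                            *data;
          u32                             size;
  } __packed;

  struct perf_raw_record {
          struct perf_raw_frag            frag;   /* first fragment */
          u32                             size;   /* cached total size */
  };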
4. 14 Jul 2016 (10 commits)
5. 11 Jul 2016 (1 commit)
6. 07 Jul 2016 (2 commits)
7. 06 Jul 2016 (1 commit)
8. 03 Jul 2016 (2 commits)
• perf/x86: Fix 32-bit perf user callgraph collection · fc188225
  Committed by Josh Poimboeuf
      A basic perf callgraph record operation causes an immediate panic on a
      32-bit kernel compiled with CONFIG_CC_STACKPROTECTOR=y:
      
        $ perf record -g ls
        Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c0404fbd
      
        CPU: 0 PID: 998 Comm: ls Not tainted 4.7.0-rc5+ #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
         c0dd5967 ff7afe1c 00000086 f41dbc2c c07445a0 464c457f f41dbca8 f41dbc44
         c05646f4 f41dbca8 464c457f f41dbca8 464c457f f41dbc54 c04625be c0ce56fc
         c0404fbd f41dbc88 c0404fbd b74668f0 f41dc000 00000000 c0000000 00000000
        Call Trace:
         [<c07445a0>] dump_stack+0x58/0x78
         [<c05646f4>] panic+0x8e/0x1c6
         [<c04625be>] __stack_chk_fail+0x1e/0x30
         [<c0404fbd>] ? perf_callchain_user+0x22d/0x230
         [<c0404fbd>] perf_callchain_user+0x22d/0x230
         [<c055f89f>] get_perf_callchain+0x1ff/0x270
         [<c055f988>] perf_callchain+0x78/0x90
         [<c055c7eb>] perf_prepare_sample+0x24b/0x370
         [<c055c934>] perf_event_output_forward+0x24/0x70
         [<c05531c0>] __perf_event_overflow+0xa0/0x210
         [<c0550a93>] ? cpu_clock_event_read+0x43/0x50
         [<c0553431>] perf_swevent_hrtimer+0x101/0x180
         [<c0456235>] ? kmap_atomic_prot+0x35/0x140
         [<c056dc69>] ? get_page_from_freelist+0x279/0x950
         [<c058fdd8>] ? vma_interval_tree_remove+0x158/0x230
         [<c05939f4>] ? wp_page_copy.isra.82+0x2f4/0x630
         [<c05a050d>] ? page_add_file_rmap+0x1d/0x50
         [<c0565611>] ? unlock_page+0x61/0x80
         [<c0566755>] ? filemap_map_pages+0x305/0x320
         [<c059769f>] ? handle_mm_fault+0xb7f/0x1560
         [<c074cbeb>] ? timerqueue_del+0x1b/0x70
         [<c04cfefe>] ? __remove_hrtimer+0x2e/0x60
         [<c04d017b>] __hrtimer_run_queues+0xcb/0x2a0
         [<c0553330>] ? __perf_event_overflow+0x210/0x210
         [<c04d0a2a>] hrtimer_interrupt+0x8a/0x180
         [<c043ecc2>] local_apic_timer_interrupt+0x32/0x60
         [<c043f643>] smp_apic_timer_interrupt+0x33/0x50
         [<c0b0cd38>] apic_timer_interrupt+0x34/0x3c
        Kernel Offset: disabled
        ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c0404fbd
      
      The panic is caused by the fact that perf_callchain_user() mistakenly
      assumes it's 64-bit only and ends up corrupting the stack.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@vger.kernel.org # v4.5+
      Fixes: 75925e1a ("perf/x86: Optimize stack walk user accesses")
Link: http://lkml.kernel.org/r/1a547f5077ec30f75f9b57074837c3c80df86e5e.1467432113.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel: Update event constraints when HT is off · 9010ae4a
  Committed by Stephane Eranian
This patch updates the event constraints for non-PEBS mode for
Intel Broadwell and Skylake processors. When HT is off, each
CPU gets 8 generic counters. However, not all events can be
programmed on any of the 8 counters. This patch adds the
constraints for the MEM_* events, which can only be measured on the
bottom 4 counters. The constraints are also valid when HT is on
because, then, there are only 4 generic counters and they are the
bottom counters.
Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1467411742-13245-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
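Illustrative only (the exact event codes added by the commit may
differ, and the table name is hypothetical): a constraint table entry
of this kind pins a MEM_* event to the bottom four generic counters
via the 0xf counter mask:

  static struct event_constraint my_skl_event_constraints[] = {
          /* hypothetical MEM_* event code, counters 0-3 only */
          INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf),
          EVENT_CONSTRAINT_END
  };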
9. 27 Jun 2016 (5 commits)
• perf/x86/intel: Add {rd,wr}lbr_{to,from} wrappers · d4cf1949
  Committed by Peter Zijlstra
The whole rdmsr()/wrmsr() for lbr_from got a little unwieldy with the
sign extension quirk; provide a few simple wrappers to clean things up.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
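A hedged sketch of the wrappers (close to the upstream helpers): they
centralize the MSR index arithmetic and give the lbr_from accessors a
single place to apply the sign-extension quirk:

  static inline void wrlbr_from(unsigned int idx, u64 val)
  {
          wrmsrl(x86_pmu.lbr_from + idx, lbr_from_signext_quirk_wr(val));
  }

  static inline void wrlbr_to(unsigned int idx, u64 val)
  {
          wrmsrl(x86_pmu.lbr_to + idx, val);
  }

  static inline u64 rdlbr_from(unsigned int idx)
  {
          u64 val;

          rdmsrl(x86_pmu.lbr_from + idx, val);
          return lbr_from_signext_quirk_rd(val);
  }

  static inline u64 rdlbr_to(unsigned int idx)
  {
          u64 val;

          rdmsrl(x86_pmu.lbr_to + idx, val);
          return val;
  }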
• perf/x86/intel: Add MSR_LAST_BRANCH_FROM_x quirk for ctx switch · 71adae99
  Committed by David Carrillo-Cisneros
Add a quirk for context switch to save/restore the value of
MSR_LAST_BRANCH_FROM_x when LBR is enabled and there is potential for
kernel addresses to be in the lbr_from register.
      
To test this patch, use a perf tool and kernel carrying the next
patch in this series. That patch removes the workaround that masked
the hw bug:

  $ ./lbr_perf record --call-graph lbr -e cycles:k sleep 1

where lbr_perf is the patched perf tool, which allows specifying :k
in lbr mode. The above command will trigger a #GPF:
      
       WARNING: CPU: 28 PID: 14096 at arch/x86/mm/extable.c:65 ex_handler_wrmsr_unsafe+0x70/0x80
       unchecked MSR access error: WRMSR to 0x681 (tried to write 0x1fffffff81010794)
       ...
       Call Trace:
        [<ffffffff8167af49>] dump_stack+0x4d/0x63
        [<ffffffff810b9b15>] __warn+0xe5/0x100
        [<ffffffff810b9be9>] warn_slowpath_fmt+0x49/0x50
        [<ffffffff810abb40>] ex_handler_wrmsr_unsafe+0x70/0x80
        [<ffffffff810abc42>] fixup_exception+0x42/0x50
        [<ffffffff81079d1a>] do_general_protection+0x8a/0x160
        [<ffffffff81684ec2>] general_protection+0x22/0x30
        [<ffffffff810101b9>] ? intel_pmu_lbr_sched_task+0xc9/0x380
        [<ffffffff81009d7c>] intel_pmu_sched_task+0x3c/0x60
        [<ffffffff81003a2b>] x86_pmu_sched_task+0x1b/0x20
        [<ffffffff81192a5b>] perf_pmu_sched_task+0x6b/0xb0
        [<ffffffff8119746d>] __perf_event_task_sched_in+0x7d/0x150
        [<ffffffff810dd9dc>] finish_task_switch+0x15c/0x200
        [<ffffffff8167f894>] __schedule+0x274/0x6cc
        [<ffffffff8167fdd9>] schedule+0x39/0x90
        [<ffffffff81675398>] exit_to_usermode_loop+0x39/0x89
        [<ffffffff810028ce>] prepare_exit_to_usermode+0x2e/0x30
        [<ffffffff81683c1b>] retint_user+0x8/0x10
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1466533874-52003-5-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel: Fix trivial formatting and style bug · 3812bba8
  Committed by David Carrillo-Cisneros
Replace spaces with tabs in the LBR_FROM_* constants to align with a
newly defined constant. Use BIT_ULL.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1466533874-52003-4-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel: Fix MSR_LAST_BRANCH_FROM_x bug when no TSX · 19fc9ddd
  Committed by David Carrillo-Cisneros
      Intel's SDM states that bits 61:62 in MSR_LAST_BRANCH_FROM_x are the
      TSX flags for formats with LBR_TSX flags (i.e. LBR_FORMAT_EIP_EFLAGS2).
      
However, when the CPU has TSX support deactivated, bits 61:62 actually
behave as follows:

  - For wrmsr(), bits 61:62 are considered part of the sign extension.
  - When capturing branches, the LBR hw will always clear bits 61:62,
    regardless of the sign extension.
      
      Therefore, if:
      
        1) LBR has TSX format.
        2) CPU has no TSX support enabled.
      
... then any value passed to wrmsr() must be sign extended to 63 bits
and any value from rdmsr() must be converted to have a sign extension
of 61 bits, ignoring the TSX flag bits.
      
This bug was masked by the work-around for Intel's CPU erratum BJ94,
"LBR May Contain Incorrect Information When Using FREEZE_LBRS_ON_PMI",
in Document Number: 324643-037US.
      
      The aforementioned work-around uses hw flags to filter out all kernel
      branches, limiting LBR callstack to user level execution only.
      
      Since user addresses are not sign extended, they do not trigger the wrmsr()
      bug in MSR_LAST_BRANCH_FROM_x when saved/restored at context switch.
      
      To verify the hw bug:
      
        $ perf record -b -e cycles sleep 1
        $ rdmsr -p 0 0x680
        0x1fffffffb0b9b0cc
        $ wrmsr -p 0 0x680 0x1fffffffb0b9b0cc
        write(): Input/output error
      
The quirk for the LBR_FROM_ MSRs is required before calls to wrmsrl()
and after rdmsrl().

This patch introduces it for the wrmsrl() calls done when testing LBR
support.

A future patch in this series adds the quirk for context switch, which
would be required if LBR callstack is to be enabled for ring 0.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1466533874-52003-3-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
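The write-side quirk boils down to a two-bit copy (hedged sketch,
close to the upstream helper): with TSX off, bits 59:60 are guaranteed
to be sign-extension bits already, so replicating them into 61:62
produces the full 63-bit sign extension that wrmsr() demands:

  #define LBR_FROM_SIGNEXT_2MSB   (BIT_ULL(60) | BIT_ULL(59))

  static u64 lbr_from_signext_quirk_wr(u64 val)
  {
          /* copy bits 59:60 into 61:62 before the wrmsr() */
          val |= (LBR_FROM_SIGNEXT_2MSB & val) << 2;
          return val;
  }

The read side is the mirror image: mask the TSX flag bits back off the
rdmsr() result, since the hardware always stores them as zero here.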
• perf/x86/intel: Print LBR support statement after validation · f09509b9
  Committed by David Carrillo-Cisneros
      The following commit:
      
        338b522c ("perf/x86/intel: Protect LBR and extra_regs against KVM lying")
      
      added an additional test to LBR support detection that is performed after
      printing the LBR support statement to dmesg.
      
      Move the LBR support output after the very last test, to make sure we
      print the true status of LBR support.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1466533874-52003-2-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10. 14 Jun 2016 (1 commit)
11. 08 Jun 2016 (7 commits)
• perf/x86/rapl: Add Skylake server model detection · 348c5ac6
  Committed by Jacob Pan
SKX uses a similar RAPL interface to the Broadwell server.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001953.38848836@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/uncore: Use Intel family name macros for uncore · a07301ab
  Committed by Dave Hansen
Another straightforward replacement of magic numbers.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001942.537570B6@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/cstate: Use Intel Model name macros · bf4ad541
  Committed by Dave Hansen
This should be getting old by now. Use the new macros instead of
open-coded magic numbers.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001940.FE69D646@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/msr: Add missing Intel models · 5134596c
  Committed by Dave Hansen
      This patch presumes that Kabylake and Skylake Server will be the
      same as the existing Skylake parts and adds them to the MSR
      events code.
      
      Also add handling for "WESTMERE2".
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001935.FE6B3847@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/msr: Use Intel family macros for MSR events code · 353bf605
  Committed by Dave Hansen
      Use the new INTEL_MODEL_* macros for arch/x86/events/msr.c.
      
      This code appears to be missing handling for "WESTMERE2" and
      "SKYLAKE_X".
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001933.99A402B0@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/rapl: Use Intel family macros for RAPL · 7f2236d0
  Committed by Dave Hansen
      Use the new INTEL_FAM6_* macros for rapl.c.
      
Note that this is missing at least one Westmere model and Skylake
Server, which will be fixed later in this series.
      
      The resulting binary structure 'rapl_cpu_match' is the same
      before and after this patch.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001930.6AC50BE3@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
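A hedged sketch of the pattern used across this macro series (the
init-function names here are illustrative; the match macro mirrors
the one in the upstream driver): raw model numbers in the x86_cpu_id
table are replaced with INTEL_FAM6_* names from <asm/intel-family.h>:

  #include <asm/cpu_device_id.h>
  #include <asm/intel-family.h>

  #define X86_RAPL_MODEL_MATCH(model, init)       \
          { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&init }

  static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
          X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE,     snb_rapl_init),
          X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_CORE,    hsw_rapl_init),
          X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
          {},
  };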
• perf/x86/intel: Use Intel family macros for core perf events · ef5f9f47
  Committed by Dave Hansen
      Use the new model number macros instead of spelling things out
      in the comments.
      
      Note that this is missing a Nehalem model that is mentioned in
      intel_idle which is fixed up in a later patch.
      
      The resulting binary (arch/x86/events/intel/core.o) is exactly
      the same with and without this patch modulo some harmless changes
      to restoring %esi in the return path of functions, even those
      untouched by this patch.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
Link: http://lkml.kernel.org/r/20160603001929.C5F1C079@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12. 03 Jun 2016 (4 commits)
• perf/x86/intel: Use new topology_max_smt_threads() in HT leak workaround · 030ba6cd
  Committed by Andi Kleen
Now that we have topology_max_smt_threads(), use it
to detect the HT workarounds for older CPUs.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel: Add topdown events to Intel Atom · eb12b8ec
  Committed by Andi Kleen
Add topdown event declarations to Silvermont / Airmont.
These cores do not support the full Top Down metrics, but a useful
subset (FrontendBound, Retiring, BackendBound/BadSpeculation).
      
      The perf stat tool automatically handles the missing events
      and combines the available metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
• perf/x86/intel: Add topdown events to Intel Core · a39fcae7
  Committed by Andi Kleen
      Add declarations for the events needed for topdown to the
      Intel big core CPUs starting with Sandy Bridge. We need
      to report different values if HyperThreading is on or off.
      
      The only thing this patch does is to export some events
      in sysfs.
      
topdown level 1 uses a set of abstracted metrics which
are generic to out-of-order CPU cores (although some
CPUs may not implement all of them):
      
      topdown-total-slots	  Available slots in the pipeline
      topdown-slots-issued	  Slots issued into the pipeline
      topdown-slots-retired	  Slots successfully retired
      topdown-fetch-bubbles	  Pipeline gaps in the frontend
      topdown-recovery-bubbles  Pipeline gaps during recovery
      			  from misspeculation
      
      A slot is a single operation in the CPU pipe line.
      
These metrics then allow computing four useful derived metrics:
FrontendBound, BackendBound, Retiring, BadSpeculation.
      
The formulas to compute the metrics are generic; they
only change based on the availability of the abstracted
input values.
      
The kernel declares the events supported by the current
CPU and their scaling factors (such as the pipeline width),
and perf stat then computes the formulas based on the
available metrics. This is similar to how existing
perf metrics, such as TSC metrics or IPC, are implemented.

This keeps all CPU-pipeline-specific knowledge in the
kernel driver, but still avoids the need for larger-scale perf
interface changes.
      
For HyperThreading, the ANY bit is needed to get accurate
values when both threads are executing. This implies that,
for now, the events can only be collected as root or with
perf_event_paranoid=-1.
      
The basic scheme is based on the following paper:
  Yasin, "A Top-Down Method for Performance Analysis and Counters Architecture", ISPASS 2014 (PDF available via Google)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
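For reference, the standard level-1 formulas from Yasin's paper, as a
plain-C sketch (hedged: this mirrors what perf stat computes from the
exported events, not a quote of the tool source):

  struct topdown {
          double frontend_bound, bad_speculation, retiring, backend_bound;
  };

  static struct topdown topdown_level1(double total_slots,
                                       double slots_issued,
                                       double slots_retired,
                                       double fetch_bubbles,
                                       double recovery_bubbles)
  {
          struct topdown t;

          t.retiring        = slots_retired / total_slots;
          t.frontend_bound  = fetch_bubbles / total_slots;
          t.bad_speculation = (slots_issued - slots_retired
                               + recovery_bubbles) / total_slots;
          /* whatever is left is stalled behind the backend */
          t.backend_bound   = 1.0 - (t.retiring + t.frontend_bound
                                     + t.bad_speculation);
          return t;
  }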
• perf/x86: Support sysfs files depending on SMT status · fc07e9f9
  Committed by Andi Kleen
Add a way to show different sysfs event attributes depending on
whether HyperThreading is on or off. This is difficult to determine
early at boot, so we just do it dynamically when the sysfs
attribute is read.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
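A hedged sketch of the dynamic show() routine (close to the upstream
helper added by this commit): the HT or no-HT event string is chosen
at read time, so SMT status never has to be known at boot:

  ssize_t events_ht_sysfs_show(struct device *dev,
                               struct device_attribute *attr, char *page)
  {
          struct perf_pmu_events_ht_attr *pmu_attr =
                  container_of(attr, struct perf_pmu_events_ht_attr, attr);

          /* pick the event string matching the current SMT state */
          return sprintf(page, "%s",
                         topology_max_smt_threads() > 1 ?
                         pmu_attr->event_str_ht :
                         pmu_attr->event_str_noht);
  }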