1. 07 6月, 2016 11 次提交
  2. 06 6月, 2016 1 次提交
  3. 04 6月, 2016 4 次提交
    • H
      perf script: Show call graphs when 1st event doesn't have it but some other has · 40f20e50
      He Kuang 提交于
      There's a display inconsistency when there are multiple tracepoint
      events, some of which have the 'call-graph' config option set but the
      first one hasn't, i.e. the whole logic for call graph processing is
      enabled only if the first tracepoint event has call-graph set.
      
      For instance, if we record signal_deliver with call-graph and
      signal_generate without:
      
        $ perf record -g -a -e signal:signal_deliver -e signal:signal_generate/call-graph=no/
      
        [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
      
        $ perf script
      
        kworker/u2:1    13 [000]  6563.875949: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1313 grp=1 res=0 ff61cc __send_signal+0x3ec ([kernel.kallsyms])
        perf  1313 [000]  6563.877584:  signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000
                    7ffff314 get_signal+0x80007f0023a4 ([kernel.kallsyms])
                    7fffe358 do_signal+0x80007f002028 ([kernel.kallsyms])
                    7fffa5e8 exit_to_usermode_loop+0x80007f002053 ([kernel.kallsyms])
                    ...
      
      Then we exchange the order of these two events in commandline, and keep
      signal_generate without call-graph.
      
        $ perf record -g -a -e signal:signal_generate/call-graph=no/ -e signal:signal_deliver
      
        [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
      
        $ perf script
      
          kworker/u2:2  1314 [000]  6933.353060: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1321 grp=1 res=0
                  perf  1321 [000]  6933.353872:  signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000
      
      This time, the callchain of the event signal_deliver disappeared. The
      problem is caused by that perf only checks for the first evsel in evlist
      and decides if callchain should be printed.
      
      This patch traverses all evsels in evlist to see if any of them have
      callchains, and shows the right result:
      
        $ perf script
      
        kworker/u2:2  1314 [000]  6933.353060: signal:signal_generate: sig=2 errno=0 code=128 comm=perf pid=1321 grp=1 res=0 ff61cc __send_signal+0x3ec ([kernel.kallsyms])
        perf  1321 [000]  6933.353872:  signal:signal_deliver: sig=2 errno=0 code=128 sa_handler=43115e sa_flags=14000000
                    7ffff314 get_signal+0x80007f0023a4 ([kernel.kallsyms])
                    7fffe358 do_signal+0x80007f002028 ([kernel.kallsyms])
                    7fffa5e8 exit_to_usermode_loop+0x80007f002053 ([kernel.kallsyms])
                    ...
      Signed-off-by: NHe Kuang <hekuang@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1463374279-97209-1-git-send-email-hekuang@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      40f20e50
    • L
      tools lib api: Respect CROSS_COMPILE for the linker · 703e0165
      Lucas Stach 提交于
      This fixes cross compilation of libapi.
      Signed-off-by: NLucas Stach <l.stach@pengutronix.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: kernel@pengutronix.de
      Cc: patchwork-lst@pengutronix.de
      Link: http://lkml.kernel.org/r/1458235670-27341-1-git-send-email-l.stach@pengutronix.deSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      703e0165
    • W
      perf evlist: Fix alloc_mmap() failure path · 946ae1d4
      Wang Nan 提交于
      If zalloc fail, setting evlist->mmap[i].fd is unsafe and
      perf_evlist__alloc_mmap() should bail out right after that.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Fixes: d4c6fb36 ("perf evsel: Record fd into perf_mmap")
      Link: http://lkml.kernel.org/r/1464699975-230440-1-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      946ae1d4
    • A
      perf evsel: Provide way to extract integer value from format_field · 90525176
      Arnaldo Carvalho de Melo 提交于
      Out of perf_evsel__intval(), that requires passing the variable name,
      that will then be searched in the list of tracepoint variables for the
      given evsel.
      
      In cases such as syscall file descriptor ("fd") tracking, this is
      wasteful, we need just to use perf_evsel__field() and cache the
      format_field.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-r6f89jx9j5nkx037d0naviqy@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      90525176
  4. 03 6月, 2016 13 次提交
    • A
      perf/x86/intel: Use new topology_max_smt_threads() in HT leak workaround · 030ba6cd
      Andi Kleen 提交于
      Now that we have topology_max_smt_threads() use it
      to detect the HT workarounds for older CPUs.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      030ba6cd
    • A
      perf/x86/intel: Add topdown events to Intel Atom · eb12b8ec
      Andi Kleen 提交于
      Add topdown event declarations to Silvermont / Airmont.
      These cores do not support the full Top Down metrics, but an useful
      subset (FrontendBound, Retiring, Backend Bound/Bad Speculation).
      
      The perf stat tool automatically handles the missing events
      and combines the available metrics.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      eb12b8ec
    • A
      perf/x86/intel: Add topdown events to Intel Core · a39fcae7
      Andi Kleen 提交于
      Add declarations for the events needed for topdown to the
      Intel big core CPUs starting with Sandy Bridge. We need
      to report different values if HyperThreading is on or off.
      
      The only thing this patch does is to export some events
      in sysfs.
      
      topdown level 1 uses a set of abstracted metrics which
      are generic to out of order CPU cores (although some
      CPUs may not implement all of them):
      
      topdown-total-slots	  Available slots in the pipeline
      topdown-slots-issued	  Slots issued into the pipeline
      topdown-slots-retired	  Slots successfully retired
      topdown-fetch-bubbles	  Pipeline gaps in the frontend
      topdown-recovery-bubbles  Pipeline gaps during recovery
      			  from misspeculation
      
      A slot is a single operation in the CPU pipe line.
      
      These metrics then allow to compute four useful metrics:
      FrontendBound, BackendBound, Retiring, BadSpeculation.
      
      The formulas to compute the metrics are generic, they
      only change based on the availability on the abstracted
      input values.
      
      The kernel declares the events supported by the current
      CPU and their scaling factors (such as the pipeline width)
      and perf stat then computes the formulas based on the
      available metrics.  This is similar how existing
      perf metrics, such as TSC metrics or IPC, are implemented.
      
      This abstracts all CPU pipe line specific knowledge in the
      kernel driver, but still avoids the need for larger scale perf
      interface changes.
      
      For HyperThreading the any bit is needed to get accurate
      values when both threads are executing. This implies that
      the events can only be collected as root or with
      perf_event_paranoid=-1 for now.
      
      The basic scheme is based on the following paper:
        Yasin, A Top Down Method for Performance analysis and Counter architecture ISPASS14 (pdf available via google)
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a39fcae7
    • A
      perf/x86: Support sysfs files depending on SMT status · fc07e9f9
      Andi Kleen 提交于
      Add a way to show different sysfs events attributes depending on
      HyperThreading is on or off. This is difficult to determine
      early at boot, so we just do it dynamically when the sysfs
      attribute is read.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fc07e9f9
    • A
      x86/topology: Add topology_max_smt_threads() · 70b8301f
      Andi Kleen 提交于
      For SMT specific workarounds it is useful to know if SMT is active
      on any online CPU in the system. This currently requires a loop
      over all online CPUs.
      
      Add a global variable that is updated with the maximum number
      of smt threads on any CPU on online/offline, and use it for
      topology_max_smt_threads()
      
      The single call is easier to use than a loop.
      
      Not exported to user space because user space already can use
      the existing sibling interfaces to find this out.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-2-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      70b8301f
    • V
      tools/perf: Handle -EOPNOTSUPP for sampling events · dc89e75a
      Vineet Gupta 提交于
      This allows (with a previous change to the perf error return ABI) for
      calling out in userspace the exact reason for perf record failing
      when PMU doesn't support overflow interrupts.
      
      Note that this needs to be put ahead of existing precise_ip check as
      that gets hit otherwise for the sampling fail case as well.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <acme@redhat.com>
      Cc: <linux-snps-arc@lists.infradead.org>
      Cc: <vincent.weaver@maine.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Link: http://lkml.kernel.org/r/1462786660-2900-2-git-send-email-vgupta@synopsys.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      dc89e75a
    • V
      perf/abi: Change the errno for sampling event not supported in hardware · a1396555
      Vineet Gupta 提交于
      Change the return code for sampling event not supported from -ENOTSUPP
      to -EOPNOTSUPP.
      
      This allows userspace to identify this case specifically, instead of
      printing the catch-all error message it did previously.
      
      Technically this is an ABI change, but we think we can get away
      with it.
      
      Old behavior:
       -------
       | # perf record ls
       | Error:
       | The sys_perf_event_open() syscall returned with 524 (Unknown error 524)
       | for event (cycles:ppp).
       | /bin/dmesg may provide additional information.
       | No CONFIG_PERF_EVENTS=y kernel support configured?
      
      New behavior:
       -------
       | # perf record ls
       | Error:
       | PMU Hardware doesn't support sampling/overflow-interrupts.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <acme@redhat.com>
      Cc: <linux-snps-arc@lists.infradead.org>
      Cc: <vincent.weaver@maine.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Link: http://lkml.kernel.org/r/1462786660-2900-3-git-send-email-vgupta@synopsys.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a1396555
    • K
      perf/x86/intel/uncore: Locate specific box by checking full device info · a54fa079
      Kan Liang 提交于
      Some platforms, e.g. Knights Landing, use a common PCI device ID for
      multiple instances of an uncore PMU device type. So it is impossible to
      locate the specific instances only by PCI device ID.
      
      The current code specially handles Knights Landing by arbitrarily pointing
      an instance to an unused uncore box. However, we still have no idea
      which uncore device is mapped to which box.
      
      Furthermore, there could be more platforms which use a common PCI device ID
      for uncore devices. We have to specially handle them one by one.
      
      This patch records full device information (slot, func, and device ID)
      in id_table[]. So the probe function can point the instance to a specific
      uncore box by checking the full device information.
      Tested-by: NLukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: tglx@linutronix.de
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bp@suse.de
      Cc: harish.chegondi@intel.com
      Cc: hubert.chrzaniuk@intel.com
      Cc: lawrence.f.meadows@intel.com
      Link: http://lkml.kernel.org/r/1463379504-39003-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a54fa079
    • L
      perf/x86/intel: Change offcore response masks for Knights Landing · 9c489fce
      Lukasz Odzioba 提交于
      Due to change in register definition we need to update OCR mask:
      
      MSR_OFFCORE_RESP0 reserved bits: 3,4,18,29,30,33,34, 8,11,14
      MSR_OFFCORE_RESP1 reserved bits: 3,4,18,29,30,33,34, 38
      Reported-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NLukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akpm@linux-foundation.org
      Cc: hpa@zytor.com
      Cc: kan.liang@intel.com
      Cc: lukasz.anaczkowski@intel.com
      Cc: zheng.z.yan@intel.com
      Link: http://lkml.kernel.org/r/1463433419-16893-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9c489fce
    • L
      perf/x86/intel: Add 'static' keyword to locally used arrays · 20f36278
      Lukasz Odzioba 提交于
      Add the 'static' keyword to intel_bdw_event_constraints[], snb_events_attrs[],
      nhm_events_attrs[] and intel_skl_event_constraints arrays[], because
      they are only used locally.
      Signed-off-by: NLukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akpm@linux-foundation.org
      Cc: hpa@zytor.com
      Cc: kan.liang@intel.com
      Cc: lukasz.anaczkowski@intel.com
      Cc: zheng.z.yan@intel.com
      Link: http://lkml.kernel.org/r/1463433378-16816-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      20f36278
    • K
      perf/core: Fix implicitly enable dynamic interrupt throttle · ab7fdefb
      Kan Liang 提交于
      This patch fixes an issue which was introduced by commit:
      
        91a612ee ("perf/core: Fix dynamic interrupt throttle")
      
      ... which commit unconditionally sets the perf_sample_allowed_ns value
      to !0. But that could trigger a bug in the following corner case:
      
      The user can disable the dynamic interrupt throttle mechanism by setting
      perf_cpu_time_max_percent to 0. Then they change perf_event_max_sample_rate.
      For this case, the mechanism will be enabled implicitly, because
      perf_sample_allowed_ns becomes !0 - which is not what we want.
      
      This patch only updates perf_sample_allowed_ns when the dynamic
      interrupt throttle mechanism is enabled.
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/1462260366-3160-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ab7fdefb
    • P
      perf/core: Rename the perf_event_aux*() APIs to perf_event_sb*(), to separate... · aab5b71e
      Peter Zijlstra 提交于
      perf/core: Rename the perf_event_aux*() APIs to perf_event_sb*(), to separate them from AUX ring-buffer records
      
      There are now two different things called AUX in perf, the
      infrastructure to deliver the mmap/comm/task records and the
      AUX part in the mmap buffer (with associated AUX_RECORD).
      
      Since the former is internal, rename it to side-band to reduce
      the confusion factor.
      
      No change in functionality.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      aab5b71e
    • K
      perf/core: Optimize side-band event delivery · f2fb6bef
      Kan Liang 提交于
      The perf_event_aux() function iterates all PMUs and all events in
      their respective per-CPU contexts to find the events to deliver
      side-band records to.
      
      For example, the brk test case in lkp triggers many mmap() operations,
      which, if we're also running perf, results in many perf_event_aux()
      invocations.
      
      If we enable uncore PMU support (even when uncore events are not used),
      dozens of uncore PMUs will be iterated, which can significantly
      decrease brk_test's throughput.
      
      For example, the brk throughput:
      
        without uncore PMUs: 2647573 ops_per_sec
        with    uncore PMUs: 1768444 ops_per_sec
      
      ... a 33% reduction.
      
      To get at the per-CPU events that need side-band records, this patch
      puts these events on a per-CPU list, this avoids iterating the PMUs
      and any events that do not need side-band records.
      
      Per task events are unchanged to avoid extra overhead on the context
      switch paths.
      Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reported-by: NHuang, Ying <ying.huang@linux.intel.com>
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1458757477-3781-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f2fb6bef
  5. 31 5月, 2016 4 次提交
  6. 30 5月, 2016 7 次提交