1. 27 3月, 2015 4 次提交
    • A
      perf/x86/intel: Add INST_RETIRED.ALL workarounds · 294fe0f5
      Andi Kleen 提交于
      On Broadwell INST_RETIRED.ALL cannot be used with any period
      that doesn't have the lowest 6 bits cleared. And the period
      should not be smaller than 128.
      
      This is erratum BDM11 and BDM55:
      
        http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf
      
      BDM11: When using a period < 100; we may get incorrect PEBS/PMI
      interrupts and/or an invalid counter state.
      BDM55: When bit0-5 of the period are !0 we may get redundant PEBS
      records on overflow.
      
      Add a new callback to enforce this, and set it for Broadwell.
      
      How does this handle the case when an app requests a specific
      period with some of the bottom bits set?
      
      Short answer:
      
      Any useful instruction sampling period needs to be 4-6 orders
      of magnitude larger than 128, as an PMI every 128 instructions
      would instantly overwhelm the system and be throttled.
      So the +-64 error from this is really small compared to the
      period, much smaller than normal system jitter.
      
      Long answer (by Peterz):
      
      IFF we guarantee perf_event_attr::sample_period >= 128.
      
      Suppose we start out with sample_period=192; then we'll set period_left
      to 192, we'll end up with left = 128 (we truncate the lower bits). We
      get an interrupt, find that period_left = 64 (>0 so we return 0 and
      don't get an overflow handler), up that to 128. Then we trigger again,
      at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get
      an overflow). We increment with sample_period so we get left = 128. We
      fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an
      overflow). And on and on.
      
      So while the individual interrupts are 'wrong' we get then with
      interval=256,128 in exactly the right ratio to average out at 192. And
      this works for everything >=128.
      
      So the num_samples*fixed_period thing is still entirely correct +- 127,
      which is good enough I'd say, as you already have that error anyhow.
      
      So no need to 'fix' the tools, al we need to do is refuse to create
      INST_RETIRED:ALL events with sample_period < 128.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      [ Updated comments and changelog a bit. ]
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      294fe0f5
    • A
      perf/x86/intel: Add Broadwell core support · 91f1b705
      Andi Kleen 提交于
      Add Broadwell support for Broadwell to perf.
      
      The basic support is very similar to Haswell. We use the new cache
      event list added for Haswell earlier. The only differences
      are a few bits related to remote nodes. To avoid an extra,
      mostly identical, table these are patched up in the initialization code.
      
      The constraint list has one new event that needs to be handled over Haswell.
      
      Includes code and testing from Kan Liang.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1424225886-18652-2-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      91f1b705
    • A
      perf/x86/intel: Add new cache events table for Haswell · 0f1b5ca2
      Andi Kleen 提交于
      Haswell offcore events are quite different from Sandy Bridge.
      Add a new table to handle Haswell properly.
      
      Note that the offcore bits listed in the SDM are not quite correct
      (this is currently being fixed). An uptodate list of bits is
      in the patch.
      
      The basic setup is similar to Sandy Bridge. The prefetch columns
      have been removed, as prefetch counting is not very reliable
      on Haswell. One L1 event that is not in the event list anymore
      has been also removed.
      
      - data reads do not include code reads (comparable to earlier Sandy Bridge tables)
      - data counts include speculative execution (except L1 write, dtlb, bpu)
      - remote node access includes both remote memory, remote cache, remote mmio.
      - prefetches are not included in the counts for consistency
        (different from Sandy Bridge, which includes prefetches in the remote node)
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      [ Removed the HSM30 comments; we don't have them for SNB/IVB either. ]
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1424225886-18652-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0f1b5ca2
    • I
      Merge tag 'perf-core-for-mingo' of... · 30fdaa6b
      Ingo Molnar 提交于
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Show the first event with an invalid filter (David Ahern, Arnaldo Carvalho de Melo)
      
        - Fix garbage output when intermixing syscalls from different threads in 'perf trace' (Arnaldo Carvalho de Melo)
      
        - Fix 'perf timechart' SIBGUS error on sparc64 (David Ahern)
      
      Infrastructure changes:
      
        - Set JOBS based on CPU or processor, making it work on SPARC, where
          /proc/cpuinfo has "CPU", not "processor" (David Ahern)
      
        - Zero should not be considered "not found" in libtraceevent's eval_flag() (Steven Rostedt)
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      30fdaa6b
  2. 26 3月, 2015 6 次提交
  3. 25 3月, 2015 3 次提交
  4. 24 3月, 2015 16 次提交
  5. 23 3月, 2015 7 次提交
  6. 22 3月, 2015 4 次提交
    • I
      Merge tag 'perf-core-for-mingo-2' of... · 963a70b8
      Ingo Molnar 提交于
      Merge tag 'perf-core-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
        - Handle legacy syscalls tracepoints (David Ahern, Arnaldo Carvalho de Melo)
      
        - Indicate which callchain entries are annotated in the
          TUI hists browser (report/top) (Arnaldo Carvalho de Melo)
      
        - Fix failure to add multiple probes without debuginfo (He Kuang)
      
        - Fix 'trace' summary_only option (David Ahern)
      
        - Fix race in build_id_cache__add_s() in 'buildid-cache' (Milos Vyletel)
      
        - Don't allow empty argument for field-separator, fixing segfault (Wang Nan)
      
      Infrastructure:
      
        - Add destructor for format_field in libtraceevent (David Ahern)
      
        - Prep work for support lzma compressed kernel modules (Jiri Olsa)
      
        - Update .gitignore with recently added/renamed feature detection files (Yunlong Song)
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      963a70b8
    • I
      Merge tag 'perf-core-for-mingo' of... · 08b3f913
      Ingo Molnar 提交于
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Bash completion for subcommands (Yunlong Song)
      
        - Allow annotating entries in callchains in the hists browser (top/report).
          TODO: give some visual cue to what entries in callchains have samples and thus
          can be annotated and/or allow showing the source code for functions without
          samples (Arnaldo Carvalho de Melo)
      
        - Don't allow empty argument for '-t' in perf report, fixing segfault (Wang Nan)
      
      Infrastructure:
      
        - Prep work for moving the perf feature tests build system to tools/build (Jiri Olsa)
      
        - Fix perf-read-vdsox32 not building and lib64 install dir (H.J. Lu)
      
        - ARM64: fix building error and eh/debug frame offset cache fixes (Wang Nan)
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      08b3f913
    • J
      perf tools: Use kmod_path__parse for machine__new_dso · ca33380a
      Jiri Olsa 提交于
      Using kmod_path__parse to get the module name and update the dso short
      name within machine__new_dso function.
      
      This way it's done only first time when dso is created, unlike the
      current way when we update it all the time we process memory map of the
      kernel module.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-8gjmt1ggf5ls1xkk7qi2ko4k@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ca33380a
    • J
      perf tools: Add machine__module_dso function · da17ea33
      Jiri Olsa 提交于
      Separate the dso object addition and update when adding new kernel
      module.
      
      Currently we update dso's symtab_type any time we find it in the list,
      because we can't distinguish between new and found dso from
      __dsos__findnew function.
      
      Adding machine__module_dso that separates finding and adding new dso
      objects, so there's no superfluous update of dso.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-uvqgs5tyq4wssnq6fm43hgvk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      da17ea33