1. 15 Sep, 2015 4 commits
  2. 12 Sep, 2015 2 commits
  3. 09 Sep, 2015 6 commits
  4. 05 Sep, 2015 2 commits
  5. 02 Sep, 2015 4 commits
  6. 01 Sep, 2015 7 commits
  7. 29 Aug, 2015 11 commits
  8. 28 Aug, 2015 4 commits
    • K
      perf stat: Get correct cpu id for print_aggr · 601083cf
      Kan Liang authored
      print_aggr() fails to print per-core/per-socket statistics after commit
      582ec082 ("perf stat: Fix per-socket output bug for uncore events")
      if events have different cpus, because in print_aggr(), aggr_get_id needs
      an index (not a cpu id) to find the core/pkg id. Also, evsel cpu maps
      should be used to get the aggregated id.
      
      Here is an example:
      
      Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
      cpumask 0,18)
      
        $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2
      
      Without this patch, it fails to get the CPU 18 result.
      
         Performance counter stats for 'CPU(s) 0,18':
      
        S0-C0           1            7526851      cycles
        S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
        S1-C0           0      <not counted>      cycles
        S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/
      
      With this patch, it gets both the CPU 0 and CPU 18 results.
      
         Performance counter stats for 'CPU(s) 0,18':
      
        S0-C0           1            6327768      cycles
        S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
        S1-C0           1             330228      cycles
        S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/
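
The index-versus-cpu-id distinction described above can be illustrated with a minimal sketch. It is not the perf code itself: `cpu_map_sketch`, `socket_of_cpu()` and `socket_of_index()` are hypothetical names, and the topology of 18 CPUs per socket is an assumption chosen only so that CPU 0 lands on S0 and CPU 18 on S1, as in the example output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event's CPU map: map[idx] holds a CPU id. */
struct cpu_map_sketch {
    int nr;          /* number of entries in map */
    const int *map;  /* CPU ids, e.g. {0, 18} for the uncore event */
};

/* Assumed topology for illustration only: 18 CPUs per socket. */
static int socket_of_cpu(int cpu)
{
    return cpu / 18;
}

/*
 * Look up the socket for position idx in the event's own CPU map.
 * The index is translated to a CPU id first; passing the index
 * straight to socket_of_cpu() would give the wrong socket.
 */
static int socket_of_index(const struct cpu_map_sketch *cpus, int idx)
{
    assert(cpus != NULL);
    if (idx < 0 || idx >= cpus->nr)
        return -1;  /* index is outside this event's map */
    return socket_of_cpu(cpus->map[idx]);
}
```

With a map of `{0, 18}`, `socket_of_index(&m, 1)` resolves to socket 1, while treating the index 1 as a CPU id (`socket_of_cpu(1)`) yields socket 0, which is the kind of mismatch the commit message describes.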
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Stephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Fixes: 582ec082 ("perf stat: Fix per-socket output bug for uncore events")
      Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • S
      tools lib traceevent: Allow for negative numbers in print format · 1d945012
      Steven Rostedt authored
      It was reported that "%-8s" does not parse well when used in the printk
      format. The '-' is what is throwing it off. Allow that to be included.
      
      Reporter note:
      
      Example before:
      
        transhuge-stres-10730 [004]  5897.713989: mm_compaction_finished: node=0
        zone=>-<8s order=-2119871790 ret=
      
      Example after:
      
        transhuge-stres-4235  [000]   453.149280: mm_compaction_finished: node=0
        zone=ffffffff81815d7a order=9 ret=
      
      (I will send patches to fix the string handling in the tracepoints so
      it's on par with in-kernel printing via trace_pipe:)
      
        transhuge-stres-10921 [007] ...1  6307.140205: mm_compaction_finished: node=0
        zone=Normal   order=9 ret=partial
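
As a rough illustration of what accepting the '-' means, here is a minimal sketch of parsing one conversion such as "%-8s". It is not the libtraceevent parser: `conv_sketch` and `parse_conv()` are hypothetical names, and only an optional '-' flag, an optional decimal width, and a single type character are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result of parsing one conversion such as "%-8s". */
struct conv_sketch {
    int left_justify;  /* 1 if the '-' flag was present */
    int width;         /* minimum field width, 0 if none given */
    char type;         /* conversion character, e.g. 's' or 'd' */
};

/*
 * Parse "%[-][width]<type>" at the start of fmt.
 * Returns the number of characters consumed, or 0 if fmt does not
 * start with a complete conversion.
 */
static size_t parse_conv(const char *fmt, struct conv_sketch *out)
{
    const char *p = fmt;

    assert(fmt != NULL && out != NULL);
    if (*p++ != '%')
        return 0;

    out->left_justify = 0;
    out->width = 0;

    /* Accept the '-' flag instead of treating it as the end of the spec. */
    if (*p == '-') {
        out->left_justify = 1;
        p++;
    }
    while (isdigit((unsigned char)*p))
        out->width = out->width * 10 + (*p++ - '0');

    if (*p == '\0')
        return 0;  /* format ended before the type character */
    out->type = *p++;
    return (size_t)(p - fmt);
}
```

Calling `parse_conv("%-8s", &c)` consumes all four characters and records a left-justified width of 8, rather than stopping at the '-'.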
      Reported-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Tested-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20150827094601.46518bcc@gandalf.local.home
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf script: Add --[no-]demangle/--[no-]demangle-kernel · 77e0070d
      Mark Drayton authored
      Sometimes when post-processing output from `perf script` one does not
      want to demangle C++ symbol names. Add an option to allow this.
      
      Also add --[no-]demangle-kernel to be consistent with top/report/probe.
      Signed-off-by: Mark Drayton <mbd@fb.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1440616695-32340-1-git-send-email-scientist@fb.com
      Signed-off-by: Yannick Brosseau <scientist@fb.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • R
      nd_blk: change aperture mapping from WC to WB · 67a3e8fe
      Ross Zwisler authored
      This should result in a pretty sizeable performance gain for reads.  For
      rough comparison I did some simple read testing using PMEM to compare
      reads of write combining (WC) mappings vs write-back (WB).  This was
      done on a random lab machine.
      
      PMEM reads from a write combining mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
      	100000+0 records in
      	100000+0 records out
      	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s
      
      PMEM reads from a write-back mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
      	1000000+0 records in
      	1000000+0 records out
      	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s
      
      To be able to safely support a write-back aperture I needed to add
      support for the "read flush" _DSM flag, as outlined in the DSM spec:
      
      http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
      
      This flag tells the ND BLK driver that it needs to flush the cache lines
      associated with the aperture after the aperture is moved but before any
      new data is read.  This ensures that any stale cache lines from the
      previous contents of the aperture will be discarded from the processor
      cache, and the new data will be read properly from the DIMM.  We know
      that the cache lines are clean and will be discarded without any
      writeback because either a) the previous aperture operation was a read,
      and we never modified the contents of the aperture, or b) the previous
      aperture operation was a write and we must have written back the dirtied
      contents of the aperture to the DIMM before the I/O was completed.
      
      In order to add support for the "read flush" flag I needed to add a
      generic routine to invalidate cache lines, mmio_flush_range().  This is
      protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
      only supported on x86.
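
The shape of such a range flush can be sketched as a walk over the cache lines that cover the range. This is not the kernel's mmio_flush_range(): `flush_range_sketch()`, `count_line_sketch()` and the 64-byte line size are assumptions for illustration, and the per-line flush is passed in as a callback so the sketch stays portable and never touches the addresses itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SKETCH 64u  /* assumed cache line size in bytes */

/* Callback invoked once per cache line; on real hardware this would
 * be where a line-invalidating instruction is issued. */
typedef void (*flush_line_fn)(const void *line, void *ctx);

/* Example callback: only counts how many lines it was asked to flush. */
static void count_line_sketch(const void *line, void *ctx)
{
    (void)line;
    (*(size_t *)ctx)++;
}

/*
 * Call flush() for every cache line overlapping [addr, addr + size).
 * The start is rounded down to a line boundary so a range that begins
 * mid-line still has its first line flushed. Returns the line count.
 * Wrap-around at the top of the address space is not handled.
 */
static size_t flush_range_sketch(const void *addr, size_t size,
                                 flush_line_fn flush, void *ctx)
{
    uintptr_t start, end, p;
    size_t lines = 0;

    assert(flush != NULL);
    if (size == 0)
        return 0;

    start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_SKETCH - 1);
    end = (uintptr_t)addr + size;
    for (p = start; p < end; p += CACHE_LINE_SKETCH) {
        flush((const void *)p, ctx);
        lines++;
    }
    return lines;
}
```

In the scenario described above, a driver would run a walk like this over the aperture after moving it and before reading new data, so that no stale lines from the previous aperture contents are served from the cache.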
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>