1. 20 May, 2010 2 commits
  2. 19 May, 2010 16 commits
    • perf, x86: P4 PMU -- add missing bit in CCCR mask · ce7f1545
      Cyrill Gorcunov committed
      Should be there for the sake of RAW events.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.354345151@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf, x86: P4_pmu_schedule_events -- use smp_processor_id instead of raw_ · 9d36dfcf
      Cyrill Gorcunov committed
      This snippet somehow escaped the commit:
      
       | commit 137351e0
       | Author: Cyrill Gorcunov <gorcunov@openvz.org>
       | Date:   Sat May 8 15:25:52 2010 +0400
       |
       |    x86, perf: P4 PMU -- protect sensible procedures from preemption
      
      so bring it back. It helps to catch preemption
      issues (if there are any; rule of thumb: don't use
      raw_ if you can avoid it).
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.167259349@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf, x86: P4 PMU -- do a real check for ESCR address being in hash · 623aab89
      Cyrill Gorcunov committed
      To prevent clashes in future code modifications,
      do a real check for the ESCR address being in the hash.
      At the moment the callers are known to pass sane values,
      but it is better to be on the safe side.
      
      And comment fix.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.004503600@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: remove xstrndup, xmalloc, xzalloc · 151f85a4
      Arnaldo Carvalho de Melo committed
      All the functions that call these can handle the equivalent,
      non-panicking wrapped routines.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Don't call die() · 8a7ddad8
      Arnaldo Carvalho de Melo committed
      Functions that were calling xzalloc() also returned -1 when they could fail
      for other reasons, and the callers cope with failures, so stop using die()
      and xzalloc().
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Fix some error exit paths · b448c4b6
      Arnaldo Carvalho de Melo committed
      That could leave file descriptors open and leak memory. Also stop using
      xmalloc, use malloc and handle results just like other error cases in
      the same routine that used it.
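
As a rough illustration of the pattern this message describes (plain malloc() with a checked result, and exit paths that release whatever was acquired), here is a minimal sketch. `read_head()` and its parameters are hypothetical names, not functions from tools/perf.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to `len` bytes from `path` into a freshly
 * allocated, NUL-terminated buffer. Returns 0 on success or a negative
 * errno value; the caller owns *out on success.
 */
static int read_head(const char *path, size_t len, char **out)
{
	char *buf = NULL;
	ssize_t n;
	int ret = 0;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;		/* nothing acquired yet */

	buf = malloc(len + 1);		/* malloc, not a die()-ing xmalloc */
	if (buf == NULL) {
		ret = -ENOMEM;
		goto out_close;
	}

	n = read(fd, buf, len);
	if (n < 0) {
		ret = -errno;
		goto out_free;
	}

	buf[n] = '\0';
	*out = buf;
	buf = NULL;			/* ownership moved to the caller */

out_free:
	free(buf);			/* free(NULL) is a no-op */
out_close:
	close(fd);			/* reached on every path after open() */
	return ret;
}
```

Every path after a successful open() passes through `out_close`, so the descriptor cannot be left open, and the buffer is freed unless ownership was handed to the caller.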
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Remove some unused functions · a41794cd
      Arnaldo Carvalho de Melo committed
      Without the bloated cplus_demangle from binutils, i.e. building with:
      
      $ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install
      
      Before:
      
         text	   data	    bss	    dec	    hex	filename
       471851	  29280	4025056	4526187	 45106b	/home/acme/bin/perf
      
      After:
      
      [acme@doppio linux-2.6-tip]$ size ~/bin/perf
         text	   data	    bss	    dec	    hex	filename
       446886	  29232	4008576	4484694	 446e56	/home/acme/bin/perf
      
      So it's a 5.3% size reduction in code, but the interesting part is in the git
      diff --stat output:
      
       19 files changed, 20 insertions(+), 1909 deletions(-)
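
The 5.3% figure can be checked against the text column of the two size listings above:

```latex
\[
\frac{471851 - 446886}{471851} = \frac{24965}{471851} \approx 0.0529 \approx 5.3\%
\]
```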
      
      If we ever need some of the things we got from git but weren't using, we just
      have to go to the git repo and get fresh, up-to-date source code bits.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: add perf stat -B to pretty print large numbers · 5af52b51
      Stephane Eranian committed
      It is hard to read very large numbers so provide an option to perf stat
      to separate thousands using a separator. The patch leverages the locale
      support of stdio. You need to set your LC_NUMERIC appropriately, for
      instance LC_NUMERIC=en_US.UTF8. You need to pass -B to activate this
      feature. This way existing scripts parsing the output do not need to be
      changed. Here is an example.
      
      $ perf stat noploop 2
      noploop for 2 seconds
      
       Performance counter stats for 'noploop 2':
      
              1998.347031  task-clock-msecs         #      0.998 CPUs
                       61  context-switches         #      0.000 M/sec
                        0  CPU-migrations           #      0.000 M/sec
                      118  page-faults              #      0.000 M/sec
            4,138,410,900  cycles                   #   2070.917 M/sec  (scaled from 70.01%)
            2,062,650,268  instructions             #      0.498 IPC    (scaled from 70.01%)
            2,057,653,466  branches                 #   1029.678 M/sec  (scaled from 70.01%)
                   40,267  branch-misses            #      0.002 %      (scaled from 30.04%)
            2,055,961,348  cache-references         #   1028.831 M/sec  (scaled from 30.03%)
                   53,725  cache-misses             #      0.027 M/sec  (scaled from 30.02%)
      
              2.001393933  seconds time elapsed
      
      $ perf stat -B  noploop 2
      noploop for 2 seconds
      
       Performance counter stats for 'noploop 2':
      
              1998.297883  task-clock-msecs         #      0.998 CPUs
                       59  context-switches         #      0.000 M/sec
                        0  CPU-migrations           #      0.000 M/sec
                      119  page-faults              #      0.000 M/sec
            4,131,380,160  cycles                   #   2067.450 M/sec  (scaled from 70.01%)
            2,059,096,507  instructions             #      0.498 IPC    (scaled from 70.01%)
            2,054,681,303  branches                 #   1028.216 M/sec  (scaled from 70.01%)
                   25,650  branch-misses            #      0.001 %      (scaled from 30.05%)
            2,056,283,014  cache-references         #   1029.017 M/sec  (scaled from 30.03%)
                   47,097  cache-misses             #      0.024 M/sec  (scaled from 30.02%)
      
              2.001391016  seconds time elapsed
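
The mechanism the message refers to, the locale support of stdio, can be sketched as follows. This is not the perf stat code: `format_grouped()` and `use_env_numeric_locale()` are hypothetical helpers, and the sketch relies on the `'` printf flag, a POSIX extension that glibc implements. The separator comes from the LC_NUMERIC category of the active locale, so in the default "C" locale no grouping is applied.

```c
#include <locale.h>
#include <stdio.h>

/*
 * Hypothetical helper: format `value` into `buf`, asking printf to apply
 * the thousands grouping of the current LC_NUMERIC locale via the
 * apostrophe flag.
 */
static int format_grouped(char *buf, size_t len, unsigned long long value)
{
	return snprintf(buf, len, "%'llu", value);
}

/* Hypothetical helper: adopt LC_NUMERIC from the environment. */
static int use_env_numeric_locale(void)
{
	return setlocale(LC_NUMERIC, "") != NULL;
}
```

If `use_env_numeric_locale()` is called first with LC_NUMERIC=en_US.UTF8 in the environment, and that locale is installed, the same `format_grouped()` call would be expected to insert the locale's thousands separator.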
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bf28fe8.914ed80a.01ca.fffff5f5@mx.google.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf, sparc: Implement group scheduling transactional APIs · a13c3afd
      Lin Ming committed
      Convert to the transactional PMU API and remove the duplication of
      group_sched_in().
      
      [cross build only]
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Acked-by: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1272002193.5707.65.camel@minggr.sh.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Optimize perf_output_*() by avoiding local_xchg() · 6d1acfd5
      Peter Zijlstra committed
      Since the x86 XCHG instruction implies LOCK, avoid its use by
      using a sequence count instead.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Optimize the hotpath by converting the perf output buffer to local_t · fa588151
      Peter Zijlstra committed
      Since there is now only a single writer, we can use
      local_t instead and avoid all these pesky LOCK instructions.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Optimize the perf_output() path by removing IRQ-disables · ef60777c
      Peter Zijlstra committed
      Since we can now assume there is only a single writer
      to each buffer, we can remove the per-cpu lock and
      use a simple nest count to the same effect.
      
      This removes the need to disable IRQs.
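
A user-space sketch of the nest-count idea follows, assuming a single writer per buffer as the message states. The struct and function names are hypothetical, and plain integers stand in for the kernel's per-CPU types. The real code also has to cope with interrupts arriving between the individual steps, which this sketch omits.

```c
/* Hypothetical single-writer buffer state. */
struct out_buf {
	unsigned int nest;		/* depth of nested writers on this CPU */
	unsigned long head;		/* write position, advanced by writers */
	unsigned long user_head;	/* position published to the reader */
};

/* Enter a write section; nested (e.g. interrupting) writers just count up. */
static void out_buf_get(struct out_buf *b)
{
	b->nest++;
}

/* Reserve `size` bytes and return the offset where the record starts. */
static unsigned long out_buf_reserve(struct out_buf *b, unsigned long size)
{
	unsigned long offset = b->head;

	b->head += size;
	return offset;
}

/* Leave a write section; only the outermost writer publishes the head. */
static void out_buf_put(struct out_buf *b)
{
	if (--b->nest == 0)
		b->user_head = b->head;
}
```

Because only the outermost `out_buf_put()` publishes, a nested writer never exposes a head value while an outer record is still being written.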
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Disallow mmap() on per-task inherited events · c7920614
      Peter Zijlstra committed
      Since we have had working per-task-per-cpu events for
      a while now, disallow mmap() on per-task inherited
      events. Those things were a performance problem
      anyway, and doing away with it allows us to optimize
      the buffer somewhat by assuming there is only a
      single writer.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Optimize buffer placement by allocating buffers NUMA aware · a19d35c1
      Peter Zijlstra committed
      Ensure cpu bound buffers live on the right NUMA node.
      Suggested-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1274114880.5605.5236.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Fix errors path in perf_output_begin() · 00d1d0b0
      Stephane Eranian committed
      In case the sampling buffer has no "payload" pages,
      nr_pages is 0. The problem is that the error path in
      perf_output_begin() skips to a label which assumes
      perf_output_lock() has been issued which is not the
      case. That triggers a WARN_ON() in
      perf_output_unlock().
      
      This patch fixes the problem by skipping
      perf_output_unlock() in case data->nr_pages is 0.
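
The shape of the fix can be sketched with two exit labels, so that the early failure never reaches the unlock. All names here are hypothetical stand-ins rather than the kernel's definitions.

```c
#include <errno.h>

struct ring {
	unsigned int nr_pages;	/* 0 means no payload pages */
	unsigned int locked;	/* stand-in for the output lock state */
	unsigned long head;
	unsigned long size;	/* capacity in bytes */
};

static void ring_lock(struct ring *r)   { r->locked++; }
static void ring_unlock(struct ring *r) { r->locked--; }

/* Returns 0 with the lock held on success, negative errno otherwise. */
static int ring_begin(struct ring *r, unsigned long len)
{
	if (r->nr_pages == 0)
		goto out;		/* lock not taken: skip the unlock */

	ring_lock(r);
	if (len > r->size - r->head)
		goto fail;		/* lock taken: must unlock */

	r->head += len;
	return 0;

fail:
	ring_unlock(r);
out:
	return -ENOSPC;
}
```

With a single shared label, the `nr_pages == 0` case would call the unlock without a matching lock, which is the imbalance the message describes.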
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4bf13674.014fd80a.6c82.ffffb20c@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf/ftrace: Optimize perf/tracepoint interaction for single events · 4f41c013
      Peter Zijlstra committed
      When we've got but a single event per tracepoint
      there is no reason to try and multiplex it so don't.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 18 May, 2010 11 commits
  4. 17 May, 2010 5 commits
    • perf tui: Add workaround for slang < 2.1.4 · dc4ff193
      Arnaldo Carvalho de Melo committed
      Older versions of the slang library didn't use the 'const' specifier,
      causing problems with modern compilers of this kind:
      
      util/newt.c:252: error: passing argument 1 of ‘SLsmg_printf’ discards
      qualifiers from pointer target type
      
      Fix it by using some wrappers that, when needed, cast the affected
      parameters back to plain (char *).
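
A sketch of the wrapper idea follows. `legacy_puts()` is a hypothetical stand-in for an older library entry point declared without const; it is not a slang function, and the version check that would select the wrapper is omitted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical legacy API: takes char * although it never modifies it. */
static size_t legacy_puts(char *s)
{
	return strlen(s);
}

/*
 * Wrapper with the const-correct signature callers want. The cast is
 * only safe because the legacy function is known not to write through
 * the pointer.
 */
static size_t puts_const(const char *s)
{
	return legacy_puts((char *)s);
}
```

Callers keep passing `const char *` and the cast lives in one place, so it can be dropped once the old library version no longer needs supporting.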
      Reported-by: Lin Ming <ming.m.lin@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100517145421.GD29052@ghostprotocols.net>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Fix bug mismatch with -c option definition · 3de29cab
      Stephane Eranian committed
      The -c option defines the user requested sampling period. It was implemented
      using an unsigned int variable but the type of the option was OPT_LONG. Thus,
      the option parser was overwriting memory belonging to other variables, namely
      the mmap_pages leading to a zero page sampling buffer. The bug was exposed only
      when compiling at -O0, probably because the compiler was padding variables at
      higher optimization levels.
      
      This patch fixes this problem by declaring user_interval as u64. This also
      avoids wrap-around issues for large period on 32-bit systems.
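
The failure mode can be reproduced in a well-defined way by laying the two variables out in a struct and copying a long-sized value over the first one. The struct and helper are hypothetical (the message describes separate variables, whose placement is up to the compiler), and the effect only shows where long is wider than unsigned int.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the option target followed by its neighbour. */
struct record_opts {
	unsigned int user_interval;
	unsigned int mmap_pages;
};

/* The copy below must stay inside the struct for this sketch to be valid. */
_Static_assert(sizeof(long) <= sizeof(struct record_opts),
	       "long must fit inside struct record_opts");

/*
 * Mimic a parser that believes the target is a long: it writes
 * sizeof(long) bytes starting at user_interval. Where long is 8 bytes
 * and unsigned int is 4, the extra bytes land in mmap_pages.
 */
static void store_as_long(struct record_opts *o, long value)
{
	unsigned char *dst = (unsigned char *)o +
			     offsetof(struct record_opts, user_interval);

	memcpy(dst, &value, sizeof(value));
}
```

On a little-endian LP64 system, storing 1000 this way leaves user_interval at 1000 and zeroes mmap_pages, which matches the zero-page buffer described above. Declaring the target as a 64-bit type makes the storage match what the parser writes.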
      
      Committer note:
      
      Made it use OPT_U64(user_interval) after implementing OPT_U64 in the
      previous patch.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4bf11ae9.e88cd80a.06b0.ffffa8e3@mx.google.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf options: Introduce OPT_U64 · 6ba85cea
      Arnaldo Carvalho de Melo committed
      We have things like user_interval (-c/--count) in 'perf record' that
      need this.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tui: Add help window to show key associations · a9a4ab74
      Arnaldo Carvalho de Melo committed
      Suggested-by: Ingo Molnar <mingo@elte.hu>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tui: Make <- exit menus too · a308f3a8
      Arnaldo Carvalho de Melo committed
      In fact it is now added to the hot key list when newt_form__new is used,
      allowing us to remove the explicit assignment in all its users.
      
      The visible change is that <- will exit the menu that pops up when -> is
      pressed (and Enter when callchains are not being used).
      Suggested-by: Ingo Molnar <mingo@elte.hu>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 16 May, 2010 4 commits
  6. 15 May, 2010 2 commits