1. 24 10月, 2013 1 次提交
    • J
      perf script python: Fix mem leak due to missing Py_DECREFs on dict entries · c0268e8d
      Joseph Schuchart 提交于
      We are using the Python scripting interface in perf to extract kernel
      events relevant for performance analysis of HPC codes. We noticed that
      the "perf script" call allocates a significant amount of memory (in the
      order of several 100 MiB) during it's run, e.g. 125 MiB for a 25 MiB
      input file:
      
        $> perf record -o perf.data -a -R -g fp \
             -e power:cpu_frequency -e sched:sched_switch \
             -e sched:sched_migrate_task -e sched:sched_process_exit \
             -e sched:sched_process_fork -e sched:sched_process_exec \
             -e cycles  -m 4096 --freq 4000
        $> /usr/bin/time perf script -i perf.data -s dummy_script.py
        0.84user 0.13system 0:01.92elapsed 51%CPU (0avgtext+0avgdata
        125532maxresident)k
        73072inputs+0outputs (57major+33086minor)pagefaults 0swaps
      
      Upon further investigation using the valgrind massif tool, we noticed
      that Python objects that are created in trace-event-python.c via
      PyString_FromString*() (and their Integer and Long counterparts) are
      never free'd.
      
      The reason for this seem to be missing Py_DECREF calls on the objects
      that are returned by these functions and stored in the Python
      dictionaries. The Python dictionaries do not steal references (as
      opposed to Python tuples and lists) but instead add their own reference.
      
      Hence, the reference that is returned by these object creation functions
      is never released and the memory is leaked. (see [1,2])
      
      The attached patch fixes this by wrapping all relevant calls to
      PyDict_SetItemString() and decrementing the reference counter
      immediately after the Python function call.
      
      This reduces the allocated memory to a reasonable amount:
      
        $> /usr/bin/time perf script -i perf.data -s dummy_script.py
        0.73user 0.05system 0:00.79elapsed 99%CPU (0avgtext+0avgdata
        49132maxresident)k
        0inputs+0outputs (0major+14045minor)pagefaults 0swaps
      
      For comparison, with a 120 MiB input file the memory consumption
      reported by time drops from almost 600 MiB to 146 MiB.
      
      The patch has been tested using Linux 3.8.2 with Python 2.7.4 and Linux
      3.11.6 with Python 2.7.5.
      
      Please let me know if you need any further information.
      
      [1] http://docs.python.org/2/c-api/tuple.html#PyTuple_SetItem
      [2] http://docs.python.org/2/c-api/dict.html#PyDict_SetItemStringSigned-off-by: NJoseph Schuchart <joseph.schuchart@tu-dresden.de>
      Reviewed-by: NTom Zanussi <tom.zanussi@linux.intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lkml.kernel.org/r/1381468543-25334-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c0268e8d
  2. 22 7月, 2013 1 次提交
  3. 25 1月, 2013 1 次提交
  4. 08 10月, 2012 1 次提交
  5. 11 9月, 2012 1 次提交
    • I
      perf tools: Use __maybe_used for unused variables · 1d037ca1
      Irina Tirdea 提交于
      perf defines both __used and __unused variables to use for marking
      unused variables. The variable __used is defined to
      __attribute__((__unused__)), which contradicts the kernel definition to
      __attribute__((__used__)) for new gcc versions. On Android, __used is
      also defined in system headers and this leads to warnings like: warning:
      '__used__' attribute ignored
      
      __unused is not defined in the kernel and is not a standard definition.
      If __unused is included everywhere instead of __used, this leads to
      conflicts with glibc headers, since glibc has a variables with this name
      in its headers.
      
      The best approach is to use __maybe_unused, the definition used in the
      kernel for __attribute__((unused)). In this way there is only one
      definition in perf sources (instead of 2 definitions that point to the
      same thing: __used and __unused) and it works on both Linux and Android.
      This patch simply replaces all instances of __used and __unused with
      __maybe_unused.
      Signed-off-by: NIrina Tirdea <irina.tirdea@intel.com>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
      [ committer note: fixed up conflict with a116e05d in builtin-sched.c ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1d037ca1
  6. 10 8月, 2012 1 次提交
  7. 08 8月, 2012 5 次提交
  8. 30 6月, 2012 1 次提交
  9. 28 6月, 2012 1 次提交
    • A
      perf tools: Stop using a global trace events description list · da378962
      Arnaldo Carvalho de Melo 提交于
      The pevent thing is per perf.data file, so I made it stop being static
      and become a perf_session member, so tools processing perf.data files
      use perf_session and _there_ we read the trace events description into
      session->pevent and then change everywhere to stop using that single
      global pevent variable and use the per session one.
      
      Note that it _doesn't_ fall backs to trace__event_id, as we're not
      interested at all in what is present in the
      /sys/kernel/debug/tracing/events in the workstation doing the analysis,
      just in what is in the perf.data file.
      
      This patch also introduces perf_session__set_tracepoints_handlers that
      is the perf perf.data/session way to associate handlers to tracepoint
      events by resolving their IDs using the events descriptions stored in a
      perf.data file. Make 'perf sched' use it.
      Reported-by: NDmitry Antipov <dmitry.antipov@linaro.org>
      Tested-by: NDmitry Antipov <dmitry.antipov@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linaro-dev@lists.linaro.org
      Cc: patches@linaro.org
      Link: http://lkml.kernel.org/r/20120625232016.GA28525@infradead.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      da378962
  10. 25 4月, 2012 1 次提交
    • S
      perf: Have perf use the new libtraceevent.a library · aaf045f7
      Steven Rostedt 提交于
      The event parsing code in perf was originally copied from trace-cmd
      but never was kept up-to-date with the changes that was done there.
      The trace-cmd libtraceevent.a code is much more mature than what is
      currently in perf.
      
      This updates the code to use wrappers to handle the calls to the
      new event parsing code. The new code requires a handle to be pass
      around, which removes the global event variables and allows
      more than one event structure to be read from different files
      (and different machines).
      
      But perf still has the old global events and the code throughout
      perf does not yet have a nice way to pass around a handle.
      A global 'pevent' has been made for perf and the old calls have
      been created as wrappers to the new event parsing code that uses
      the global pevent.
      
      With this change, perf can later incorporate the pevent handle into
      the perf structures and allow more than one file to be read and
      compared, that contains different events.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      aaf045f7
  11. 31 1月, 2012 1 次提交
  12. 28 11月, 2011 1 次提交
  13. 24 3月, 2011 1 次提交
    • A
      perf session: Pass evsel in event_ops->sample() · 9e69c210
      Arnaldo Carvalho de Melo 提交于
      Resolving the sample->id to an evsel since the most advanced tools,
      report and annotate, and the others will too when they evolve to
      properly support multi-event perf.data files.
      
      Good also because it does an extra validation, checking that the ID is
      valid when present. When that is not the case, the overhead is just a
      branch + function call (perf_evlist__id2evsel).
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9e69c210
  14. 15 3月, 2011 1 次提交
  15. 07 2月, 2011 1 次提交
    • K
      perf tool: Fix gcc 4.6.0 issues · fb7d0b3c
      Kyle McMartin 提交于
      GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
      due to the -Werror=unused-but-set-variable flag.
      
      I've gone through and annotated some of the assignments that had side
      effects (ie: return value from a function) with the __used annotation,
      and in some cases, just removed unused code.
      
      In a few cases, we were assigning something useful, but not using it in
      later parts of the function.
      
      kyle@dreadnought:~/src% gcc --version
      gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)
      
      Cc: Ingo Molnar <mingo@redhat.com>
      LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org>
      Signed-off-by: NKyle McMartin <kyle@redhat.com>
      [ committer note: Fixed up the annotation fixes, as that code moved recently ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fb7d0b3c
  16. 17 11月, 2010 1 次提交
  17. 01 6月, 2010 1 次提交
  18. 11 5月, 2010 1 次提交
  19. 03 4月, 2010 1 次提交
    • T
      perf/scripts: Tuple was set from long in both branches in python_process_event() · b1dcc03c
      Tom Zanussi 提交于
      This is a fix to the signed/unsigned field handling in the
      Python scripting engine, based on a patch from Roel Kluin.
      
      Basically, Python wants to use a PyInt (which is internally a
      long) if it can i.e. if the value will fit into that type.  If
      not, it stores it into a PyLong, which isn't actually a long,
      but an arbitrary-precision integer variable.
      
      The code below is similar to to what Python does internally, and
      it seems to work as expected on the x86 and x86_64 sytems I
      tested it on.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Roel Kluin <roel.kluin@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: rostedt@goodmis.org
      LKML-Reference: <1270184305.6422.10.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b1dcc03c
  20. 25 2月, 2010 2 次提交