1. 26 9月, 2014 2 次提交
    • A
      perf evlist: Monitor POLLERR and POLLHUP events too · 8179672c
      Arnaldo Carvalho de Melo 提交于
      We want to know when the fd went away, like when a monitored thread
      exits.
      
      If we do not monitor such events, then the tools will wait forever on
      events from a vanished thread, like when running:
      
       $ sleep 5s &
       $ perf record -p `pidof sleep`
      
      This builds upon the kernel patch by Jiri Olsa that actually makes a
      poll on those file descriptors to return POLLHUP.
      
      It is also needed to change the tools to use
      perf_evlist__filter_pollfd() to check if there are remainings fds to
      monitor or if all are gone, in which case they will exit the
      poll/mmap/read loop.
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-a4fslwspov0bs69nj825hqpq@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8179672c
    • A
      perf evlist: Introduce perf_evlist__filter_pollfd method · 1ddec7f0
      Arnaldo Carvalho de Melo 提交于
      To remove all entries in evlist->pollfd[] that have revents matching at
      least one of the bits in the specified mask.
      
      It'll adjust evlist->nr_fds to the number of unfiltered fds and will
      return this value, as a convenience and to avoid requiring direct access
      to internal state of perf_evlist objects.
      
      This will be used after polling the evlist fds so that we remove fds
      that were closed by the kernel.
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-y2sca7z3wicvvy40a50lozwm@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1ddec7f0
  2. 15 8月, 2014 1 次提交
  3. 14 8月, 2014 3 次提交
  4. 31 7月, 2014 1 次提交
    • A
      perf evlist: Don't run workload if not told to · 5f1c4225
      Arnaldo Carvalho de Melo 提交于
      The perf_evlist__prepare_workload() method works by forking and then
      waiting on a fd that must be written to to allow the workload to be
      exec()ed.
      
      But if the tool calling it fails to, say, set up the events with which
      it wants to sample the workload for, it will not call
      perf_evlist__start_workload(), but even in this case the workload ended
      up running:
      
        [acme@zoo linux]$ trace /bin/echo workload ends up running, it should not...
        Couldn't mmap the events: Operation not permitted
        workload ends up running, it should not...
        [acme@zoo linux]$
      
      So check if at least one byte was written before letting exec() be
      called.
      
      Now the expected behaviour:
      
        [acme@zoo linux]$ trace /bin/echo workload ends up running, it should not...
        Couldn't mmap the events: Operation not permitted
        [acme@zoo linux]$
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-oh1ixo8m74rf295a05gfjw8b@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5f1c4225
  5. 17 7月, 2014 1 次提交
  6. 20 6月, 2014 1 次提交
  7. 21 1月, 2014 1 次提交
    • S
      perf stat: Fix memory corruption of xyarray when cpumask is used · 8ad9219e
      Stephane Eranian 提交于
      This patch fixes a memory corruption problem with the xyarray when the
      evsel fds get closed at the end of the run_perf_stat() call.
      
      It could be triggered with:
      
       # perf stat -a -e power/energy-cores/ ls
      
      When cpumask are used by events (.e.g, RAPL or uncores) then the evsel
      fds are allocated based on the actual number of CPUs monitored. That
      number can be smaller than the total number of CPUs on the system.
      
      The problem arises at the end by perf stat closes the fds twice. When
      fds are closed, their entry in the xyarray are set to -1.
      
      The first close() on the evsel is made from __run_perf_stat() and it
      uses the actual number of CPUS for the event which is how the xyarray
      was allocated for.
      
      The second is from perf_evlist_close() but that one is on the total
      number of CPUs in the system, so it assume the xyarray was allocated to
      cover it. However it was not, and some writes corrupt memory.
      
      The fix is in perf_evlist_close() is to first try with the evsel->cpus
      if present, if not use the evlist->cpus. That fixes the problem.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1389972846-6566-3-git-send-email-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8ad9219e
  8. 13 1月, 2014 6 次提交
  9. 28 12月, 2013 1 次提交
  10. 17 12月, 2013 1 次提交
    • B
      tools/: Convert to new topic libraries · 553873e1
      Borislav Petkov 提交于
      Move debugfs.* to api/fs/. We have a common tools/lib/api/ place where
      the Makefile lives and then we place the headers in subdirs.
      
      For example, all the fs-related stuff goes to tools/lib/api/fs/ from
      which we get libapikfs.a (acme got almost the naming he wanted :-)) and
      we link it into the tools which need it - in this case perf and
      tools/vm/page-types.
      
      acme:
      
      "Looking at the implementation, I think some tools can even link
      directly to the .o files, avoiding the .a file altogether.
      
      But that is just an optimization/finer granularity tools/lib/
      cherrypicking that toolers can make use of."
      
      Fixup documentation cleaning target while at it.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stanislav Fomichev <stfomichev@yandex-team.ru>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1386605664-24041-2-git-send-email-bp@alien8.deSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      553873e1
  11. 13 12月, 2013 2 次提交
  12. 10 12月, 2013 3 次提交
  13. 05 12月, 2013 1 次提交
  14. 28 11月, 2013 1 次提交
  15. 15 11月, 2013 2 次提交
  16. 13 11月, 2013 4 次提交
  17. 12 11月, 2013 1 次提交
    • A
      perf evsel: Remove idx parm from constructor · ef503831
      Arnaldo Carvalho de Melo 提交于
      Most uses of the evsel constructor are followed by a call to
      perf_evlist__add with an idex of evlist->nr_entries, so make rename
      the current constructor to perf_evsel__new_idx and remove the need
      for passing the constructor for the common case.
      
      We still need the new_idx variant because the way groups are handled,
      with evsel->nr_members holding the number of entries in an evlist,
      partitioning the evlist into sublists inside a single linked list.
      
      This asks for a clarifying refactoring, but for now simplify the non
      parser cases, so that tool writers don't have to bother with evsel idx
      setting.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-zy9tskx6jqm2rmw7468zze2a@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ef503831
  18. 07 11月, 2013 1 次提交
  19. 04 11月, 2013 1 次提交
  20. 29 10月, 2013 1 次提交
  21. 23 10月, 2013 1 次提交
  22. 21 10月, 2013 3 次提交
  23. 18 10月, 2013 1 次提交
    • A
      perf trace: Improve messages related to /proc/sys/kernel/perf_event_paranoid · a8f23d8f
      Arnaldo Carvalho de Melo 提交于
      kernel/events/core.c has:
      
        /*
         * perf event paranoia level:
         *  -1 - not paranoid at all
         *   0 - disallow raw tracepoint access for unpriv
         *   1 - disallow cpu events for unpriv
         *   2 - disallow kernel profiling for unpriv
         */
        int sysctl_perf_event_paranoid __read_mostly = 1;
      
      So, with the default being 1, a non-root user can trace his stuff:
      
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        1
        [acme@zoo ~]$ yes > /dev/null &
        [1] 15338
        [acme@zoo ~]$ trace -p 15338 | head -5
             0.005 ( 0.005 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.045 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.085 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.125 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.165 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --duration 1 sleep 1
          1002.148 (1001.218 ms): nanosleep(rqtp: 0x7fff46c79250                           ) = 0
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -- usleep 1 | tail -5
             0.905 ( 0.002 ms): brk(                                                     ) = 0x1c82000
             0.910 ( 0.003 ms): brk(brk: 0x1ca3000                                       ) = 0x1ca3000
             0.913 ( 0.001 ms): brk(                                                     ) = 0x1ca3000
             0.990 ( 0.059 ms): nanosleep(rqtp: 0x7fffe31a3280                           ) = 0
             0.995 ( 0.000 ms): exit_group(
        [acme@zoo ~]$
      
      But can't do system wide tracing:
      
        [acme@zoo ~]$ trace
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
        [acme@zoo ~]$ trace --cpu 0
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
      If the paranoid level is >= 2, i.e. turn this perf stuff off for !root users:
      
        [acme@zoo ~]$ sudo sh -c 'echo 2 > /proc/sys/kernel/perf_event_paranoid'
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        2
        [acme@zoo ~]$
        [acme@zoo ~]$ trace usleep 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --cpu 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
      
      If the user manages to get what he/she wants, convincing root not
      to be paranoid at all...
      
        [root@zoo ~]# echo -1 > /proc/sys/kernel/perf_event_paranoid
        [root@zoo ~]# cat /proc/sys/kernel/perf_event_paranoid
        -1
        [root@zoo ~]#
      
        [acme@zoo ~]$ ps -eo user,pid,comm | grep Xorg
        root       729 Xorg
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -a --duration 0.001 -e \!select,ioctl,writev | grep Xorg  | head -5
            23.143 ( 0.003 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.152 ( 0.004 ms): Xorg/729 read(fd: 31, buf: 0x2544af0, count: 4096     ) = 8
            23.161 ( 0.002 ms): Xorg/729 read(fd: 31, buf: 0x2544af0, count: 4096     ) = -1 EAGAIN Resource temporarily unavailable
            23.175 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.235 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
        [acme@zoo ~]$
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-di28olfwd28rvkox7v3hqhu1@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a8f23d8f