1. 26 9月, 2014 5 次提交
    • A
      perf evlist: Refcount mmaps · 82396986
      Arnaldo Carvalho de Melo 提交于
      We need to know how many fds are using a perf mmap via
      PERF_EVENT_IOC_SET_OUTPUT, so that we can know when to ditch an mmap,
      refcount it.
      
      v2: Automatically unmap it when the refcount hits one, which will happen
      when all fds are filtered by perf_evlist__filter_pollfd(), in later
      patches.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20140908153824.GG2773@kernel.org
      Link: http://lkml.kernel.org/n/tip-cpv7v2lw0g74ucmxa39xdpms@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      82396986
    • A
      tools lib api: Adopt fdarray class from perf's evlist · 1b85337d
      Arnaldo Carvalho de Melo 提交于
      The extensible file description array that grew in the perf_evlist class
      can be useful for other tools, as it is not something that only evlists
      need, so move it to tools/lib/api/fd to ease sharing it.
      
      v2: Don't use {} like in:
      
       libapi_dirs:
      	$(QUIET_MKDIR)mkdir -p $(OUTPUT){fs,fd}/
      
      in Makefiles, as it will not work in some systems, as in ubuntu13.10.
      
      v3: Add fd/*.[ch] to LIBAPIKFS_SOURCES (Fix from Jiri Olsa)
      
      v4: Leave the fcntl(fd, O_NONBLOCK) in the evlist layer, remains to
          be checked if it is really needed there, but has no place in the
          fdarray class (Fix from Jiri Olsa)
      
      v5: Remove evlist details from fdarray grow/filter tests. Improve it a
          bit doing more tests about expected internal state.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-kleuni3hckbc3s0lu6yb9x40@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1b85337d
    • A
      perf evlist: Introduce poll method for common code idiom · f66a889d
      Arnaldo Carvalho de Melo 提交于
      Since we have access two evlist members in all these poll calls, provide
      a helper.
      
      This will also help to make the patch introducing the pollfd class more
      clear, as the evlist specific uses will be hiden away
      perf_evlist__poll().
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-jr9d4aop4lvy9453qahbcgp0@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f66a889d
    • A
      perf evlist: Allow growing pollfd on add method · ad6765dd
      Arnaldo Carvalho de Melo 提交于
      This way we will be able to add more file descriptors to be polled,
      like stdin or some timer fd.
      
      At this point we might as well yank the pollfd class from evlist so that
      it can be used in other places.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-o2mzsjl7taumsoc35ryol00i@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ad6765dd
    • A
      perf evlist: Introduce perf_evlist__filter_pollfd method · 1ddec7f0
      Arnaldo Carvalho de Melo 提交于
      To remove all entries in evlist->pollfd[] that have revents matching at
      least one of the bits in the specified mask.
      
      It'll adjust evlist->nr_fds to the number of unfiltered fds and will
      return this value, as a convenience and to avoid requiring direct access
      to internal state of perf_evlist objects.
      
      This will be used after polling the evlist fds so that we remove fds
      that were closed by the kernel.
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-y2sca7z3wicvvy40a50lozwm@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1ddec7f0
  2. 14 8月, 2014 2 次提交
  3. 13 1月, 2014 3 次提交
  4. 20 12月, 2013 1 次提交
  5. 13 12月, 2013 2 次提交
  6. 13 11月, 2013 1 次提交
  7. 06 11月, 2013 3 次提交
  8. 29 10月, 2013 1 次提交
  9. 18 10月, 2013 2 次提交
    • A
      perf trace: Improve messages related to /proc/sys/kernel/perf_event_paranoid · a8f23d8f
      Arnaldo Carvalho de Melo 提交于
      kernel/events/core.c has:
      
        /*
         * perf event paranoia level:
         *  -1 - not paranoid at all
         *   0 - disallow raw tracepoint access for unpriv
         *   1 - disallow cpu events for unpriv
         *   2 - disallow kernel profiling for unpriv
         */
        int sysctl_perf_event_paranoid __read_mostly = 1;
      
      So, with the default being 1, a non-root user can trace his stuff:
      
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        1
        [acme@zoo ~]$ yes > /dev/null &
        [1] 15338
        [acme@zoo ~]$ trace -p 15338 | head -5
             0.005 ( 0.005 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.045 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.085 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.125 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.165 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --duration 1 sleep 1
          1002.148 (1001.218 ms): nanosleep(rqtp: 0x7fff46c79250                           ) = 0
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -- usleep 1 | tail -5
             0.905 ( 0.002 ms): brk(                                                     ) = 0x1c82000
             0.910 ( 0.003 ms): brk(brk: 0x1ca3000                                       ) = 0x1ca3000
             0.913 ( 0.001 ms): brk(                                                     ) = 0x1ca3000
             0.990 ( 0.059 ms): nanosleep(rqtp: 0x7fffe31a3280                           ) = 0
             0.995 ( 0.000 ms): exit_group(
        [acme@zoo ~]$
      
      But can't do system wide tracing:
      
        [acme@zoo ~]$ trace
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
        [acme@zoo ~]$ trace --cpu 0
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
      If the paranoid level is >= 2, i.e. turn this perf stuff off for !root users:
      
        [acme@zoo ~]$ sudo sh -c 'echo 2 > /proc/sys/kernel/perf_event_paranoid'
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        2
        [acme@zoo ~]$
        [acme@zoo ~]$ trace usleep 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --cpu 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
      
      If the user manages to get what he/she wants, convincing root not
      to be paranoid at all...
      
        [root@zoo ~]# echo -1 > /proc/sys/kernel/perf_event_paranoid
        [root@zoo ~]# cat /proc/sys/kernel/perf_event_paranoid
        -1
        [root@zoo ~]#
      
        [acme@zoo ~]$ ps -eo user,pid,comm | grep Xorg
        root       729 Xorg
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -a --duration 0.001 -e \!select,ioctl,writev | grep Xorg  | head -5
            23.143 ( 0.003 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.152 ( 0.004 ms): Xorg/729 read(fd: 31, buf: 0x2544af03, count: 4096     ) = 8
            23.161 ( 0.002 ms): Xorg/729 read(fd: 31, buf: 0x2544af03, count: 4096     ) = -1 EAGAIN Resource temporarily unavailable
            23.175 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.235 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
        [acme@zoo ~]$
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-di28olfwd28rvkox7v3hqhu1@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a8f23d8f
    • A
      perf evlist: Introduce perf_evlist__strerror_tp method · 6ef068cb
      Arnaldo Carvalho de Melo 提交于
      Out of 'perf trace', should be used by other tools that uses
      tracepoints.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ramkumar Ramachandra <artagnon@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-lyvtxhchz4ga8fwht15x8wou@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6ef068cb
  10. 11 10月, 2013 1 次提交
    • J
      perf evlist: Fix perf_evlist__mmap_read event overflow · a65cb4b9
      Jiri Olsa 提交于
      The perf_evlist__mmap_read used 'union perf_event' as a placeholder for
      event crossing the mmap boundary.
      
      This is ok for sample shorter than ~PATH_MAX. However we could grow up
      to the maximum sample size which is 16 bits max.
      
      I hit this overflow issue when using 'perf top -G dwarf' which produces
      sample with the size around 8192 bytes.  We could configure any valid
      sample size here using: '-G dwarf,size'.
      
      Using array with sample max size instead for the event placeholder. Also
      adding another safe check for the dynamic size of the user stack.
      
      TODO: The 'struct perf_mmap' is quite big now, maybe we could use some
      lazy allocation for event_copy size.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1380721599-24285-1-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a65cb4b9
  11. 09 10月, 2013 2 次提交
  12. 03 9月, 2013 1 次提交
  13. 30 8月, 2013 2 次提交
  14. 08 8月, 2013 2 次提交
  15. 16 3月, 2013 5 次提交
  16. 07 2月, 2013 1 次提交
    • D
      perf evlist: Make event_copy local to mmaps · 0479b8b9
      David Ahern 提交于
      I am getting segfaults *after* the time sorting of perf samples where
      the event type is off the charts:
      
      (gdb) bt
      \#0  0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225
      \#1  0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520,
          file_offset=0) at util/session.c:884
      \#2  0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555
      \#3  0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645
      
      This is bizarre because the event has already been processed once --
      before it was added to the samples queue -- and the event was found to
      be sane at that time.
      
      There seem to be 2 causes:
      
      1. perf_evlist__mmap_read updates the read location even though there
      are outstanding references to events sitting in the mmap buffers via the
      ordered samples queue.
      
      2. There is a single evlist->event_copy for all evlist entries.
      event_copy is used to handle an event wrapping at the mmap buffer
      boundary.
      
      This patch addresses the second problem - making event_copy local to
      each perf_mmap. With this change my highly repeatable use case no longer
      fails.
      
      The first problem is much more complicated and will be the subject of a
      future patch.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1360098762-61827-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0479b8b9
  17. 01 2月, 2013 1 次提交
  18. 12 12月, 2012 1 次提交
  19. 03 10月, 2012 2 次提交
  20. 27 9月, 2012 2 次提交