1. 21 10月, 2013 3 次提交
  2. 18 10月, 2013 2 次提交
    • A
      perf trace: Improve messages related to /proc/sys/kernel/perf_event_paranoid · a8f23d8f
      Arnaldo Carvalho de Melo 提交于
      kernel/events/core.c has:
      
        /*
         * perf event paranoia level:
         *  -1 - not paranoid at all
         *   0 - disallow raw tracepoint access for unpriv
         *   1 - disallow cpu events for unpriv
         *   2 - disallow kernel profiling for unpriv
         */
        int sysctl_perf_event_paranoid __read_mostly = 1;
      
      So, with the default being 1, a non-root user can trace his stuff:
      
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        1
        [acme@zoo ~]$ yes > /dev/null &
        [1] 15338
        [acme@zoo ~]$ trace -p 15338 | head -5
             0.005 ( 0.005 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.045 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.085 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.125 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.165 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --duration 1 sleep 1
          1002.148 (1001.218 ms): nanosleep(rqtp: 0x7fff46c79250                           ) = 0
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -- usleep 1 | tail -5
             0.905 ( 0.002 ms): brk(                                                     ) = 0x1c82000
             0.910 ( 0.003 ms): brk(brk: 0x1ca3000                                       ) = 0x1ca3000
             0.913 ( 0.001 ms): brk(                                                     ) = 0x1ca3000
             0.990 ( 0.059 ms): nanosleep(rqtp: 0x7fffe31a3280                           ) = 0
             0.995 ( 0.000 ms): exit_group(
        [acme@zoo ~]$
      
      But can't do system wide tracing:
      
        [acme@zoo ~]$ trace
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
        [acme@zoo ~]$ trace --cpu 0
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
      If the paranoid level is >= 2, i.e. turn this perf stuff off for !root users:
      
        [acme@zoo ~]$ sudo sh -c 'echo 2 > /proc/sys/kernel/perf_event_paranoid'
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        2
        [acme@zoo ~]$
        [acme@zoo ~]$ trace usleep 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --cpu 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
      
      If the user manages to get what he/she wants, convincing root not
      to be paranoid at all...
      
        [root@zoo ~]# echo -1 > /proc/sys/kernel/perf_event_paranoid
        [root@zoo ~]# cat /proc/sys/kernel/perf_event_paranoid
        -1
        [root@zoo ~]#
      
        [acme@zoo ~]$ ps -eo user,pid,comm | grep Xorg
        root       729 Xorg
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -a --duration 0.001 -e \!select,ioctl,writev | grep Xorg  | head -5
            23.143 ( 0.003 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.152 ( 0.004 ms): Xorg/729 read(fd: 31, buf: 0x2544af03, count: 4096     ) = 8
            23.161 ( 0.002 ms): Xorg/729 read(fd: 31, buf: 0x2544af03, count: 4096     ) = -1 EAGAIN Resource temporarily unavailable
            23.175 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.235 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
        [acme@zoo ~]$
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-di28olfwd28rvkox7v3hqhu1@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a8f23d8f
    • A
      perf evlist: Introduce perf_evlist__strerror_tp method · 6ef068cb
      Arnaldo Carvalho de Melo 提交于
      Out of 'perf trace', should be used by other tools that uses
      tracepoints.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ramkumar Ramachandra <artagnon@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-lyvtxhchz4ga8fwht15x8wou@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6ef068cb
  3. 11 10月, 2013 1 次提交
    • J
      perf evlist: Fix perf_evlist__mmap_read event overflow · a65cb4b9
      Jiri Olsa 提交于
      The perf_evlist__mmap_read used 'union perf_event' as a placeholder for
      event crossing the mmap boundary.
      
      This is ok for sample shorter than ~PATH_MAX. However we could grow up
      to the maximum sample size which is 16 bits max.
      
      I hit this overflow issue when using 'perf top -G dwarf' which produces
      sample with the size around 8192 bytes.  We could configure any valid
      sample size here using: '-G dwarf,size'.
      
      Using array with sample max size instead for the event placeholder. Also
      adding another safe check for the dynamic size of the user stack.
      
      TODO: The 'struct perf_mmap' is quite big now, maybe we could use some
      lazy allocation for event_copy size.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1380721599-24285-1-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a65cb4b9
  4. 09 10月, 2013 3 次提交
  5. 10 9月, 2013 1 次提交
  6. 06 9月, 2013 1 次提交
  7. 03 9月, 2013 1 次提交
  8. 30 8月, 2013 3 次提交
  9. 17 8月, 2013 1 次提交
  10. 08 8月, 2013 5 次提交
  11. 13 7月, 2013 3 次提交
  12. 09 7月, 2013 1 次提交
    • N
      perf evlist: Enhance perf_evlist__start_workload() · bcf3145f
      Namhyung Kim 提交于
      When perf tries to start a workload, it relies on a pipe which the
      workload was blocked for reading.  After closing the pipe on the parent,
      the workload (child) can start the actual work via exec().
      
      However, if another process was forked after creating a workload, this
      mechanism cannot work since the other process (child) also inherits the
      pipe, so that closing the pipe in parent cannot unblock the workload.
      Fix it by using explicit write call can then closing it.
      
      For similar reason, the pipe fd on parent should be marked as CLOEXEC so
      that it can be closed after another child exec'ed.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1372230862-15861-13-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bcf3145f
  13. 30 5月, 2013 1 次提交
  14. 16 3月, 2013 8 次提交
  15. 28 2月, 2013 1 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
  16. 07 2月, 2013 2 次提交
    • D
      perf evlist: Make event_copy local to mmaps · 0479b8b9
      David Ahern 提交于
      I am getting segfaults *after* the time sorting of perf samples where
      the event type is off the charts:
      
      (gdb) bt
      \#0  0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225
      \#1  0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520,
          file_offset=0) at util/session.c:884
      \#2  0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555
      \#3  0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645
      
      This is bizarre because the event has already been processed once --
      before it was added to the samples queue -- and the event was found to
      be sane at that time.
      
      There seem to be 2 causes:
      
      1. perf_evlist__mmap_read updates the read location even though there
      are outstanding references to events sitting in the mmap buffers via the
      ordered samples queue.
      
      2. There is a single evlist->event_copy for all evlist entries.
      event_copy is used to handle an event wrapping at the mmap buffer
      boundary.
      
      This patch addresses the second problem - making event_copy local to
      each perf_mmap. With this change my highly repeatable use case no longer
      fails.
      
      The first problem is much more complicated and will be the subject of a
      future patch.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1360098762-61827-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0479b8b9
    • S
      perf evlist: Fix set event list leader · 74b2133d
      Stephane Eranian 提交于
      The __perf_evlist__set_leader() was setting the leader for all events in
      the list except the first. Which means it assumed the first event
      already had event->leader = event.
      
      Seems like this should be the role of the function to also do this. This
      is a requirement for an upcoming patch set.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Tested-by: NJiri Olsa <jolsa@redhat.com>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130131125437.GA3656@quadSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      74b2133d
  17. 01 2月, 2013 1 次提交
  18. 12 12月, 2012 2 次提交