1. 04 May, 2011 1 commit
  2. 03 May, 2011 1 commit
    • B
      perf: Start the restructuring · fae85b7c
      Borislav Petkov committed
      mv kernel/perf_event.c -> kernel/events/core.c. From there, all further
      sensible splitting can happen. The idea is that due to perf_event.c
      becoming pretty sizable and with the advent of the marriage with ftrace,
      splitting functionality into its logical parts should help speed up
      the unification and manage the complexity of the subsystem.
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
  3. 11 April, 2011 1 commit
  4. 05 April, 2011 1 commit
    • J
      jump label: Introduce static_branch() interface · d430d3d7
      Jason Baron committed
      Introduce:
      
      static __always_inline bool static_branch(struct jump_label_key *key);
      
      instead of the old JUMP_LABEL(key, label) macro.
      
      In this way, jump labels become really easy to use:
      
      Define:
      
              struct jump_label_key jump_key;
      
      Can be used as:
      
              if (static_branch(&jump_key))
                      do unlikely code
      
      enable/disable via:
      
              jump_label_inc(&jump_key);
              jump_label_dec(&jump_key);
      
      that's it!
      
      For the jump labels disabled case, the static_branch() becomes an
      atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
      atomic_dec() operations. We show testing results for this change below.
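
The fallback described above can be sketched in userspace with C11 atomics. This is only an illustration of the "jump labels disabled" case, not the kernel code: the struct layout and the `enabled` field name are assumptions, and the assertion in `jump_label_dec()` is added here for demonstration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for struct jump_label_key; the field name is an assumption. */
struct jump_label_key {
	atomic_int enabled;
};

/* With jump labels disabled, the branch test is just an atomic read. */
static inline bool static_branch(struct jump_label_key *key)
{
	return atomic_load(&key->enabled) > 0;
}

/* Enabling is a plain atomic increment of the reference count. */
static inline void jump_label_inc(struct jump_label_key *key)
{
	atomic_fetch_add(&key->enabled, 1);
}

/* Disabling is a plain atomic decrement; the count must not go negative. */
static inline void jump_label_dec(struct jump_label_key *key)
{
	int old = atomic_fetch_sub(&key->enabled, 1);

	assert(old > 0);
	(void)old;
}
```

Because the key is a counter rather than a flag, several independent users can enable the same branch and it stays on until the last one calls `jump_label_dec()`.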
      
      Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.
      
      Since we now require a 'struct jump_label_key *key', we can store a pointer into
      the jump table addresses. In this way, we can enable/disable jump labels, in
      basically constant time. This change allows us to completely remove the previous
      hashtable scheme. Thanks to Peter Zijlstra for this re-write.
      
      Testing:
      
      I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
      configurations, where tracepoints were disabled.
      
      jump label configured in
      avg: 815.6
      
      jump label *not* configured in (using atomic reads)
      avg: 800.1
      
      jump label *not* configured in (regular reads)
      avg: 803.4
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20110316212947.GA8792@redhat.com>
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Suggested-by: H. Peter Anvin <hpa@linux.intel.com>
      Tested-by: David Daney <ddaney@caviumnetworks.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  5. 31 March, 2011 2 commits
  6. 24 March, 2011 1 commit
  7. 23 March, 2011 1 commit
    • S
      perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx() · 68cacd29
      Stephane Eranian committed
      This patch solves a stale pointer problem in
      update_cgrp_time_from_cpuctx(). The cpuctx->cgrp
      was not cleared on all possible event exit paths,
      including:
      
         close()
           perf_release()
             perf_release_kernel()
               list_del_event()
      
      This patch fixes list_del_event() to clear cpuctx->cgrp
      when there are no cgroup events left in the context.
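
A minimal userspace sketch of that rule, assuming a per-context count of attached cgroup events. The `_stub` types and the `nr_cgroups` field are hypothetical stand-ins used for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a perf cgroup. */
struct cgroup_stub {
	int id;
};

/* Hypothetical stand-in for the per-CPU context. */
struct cpu_context_stub {
	int nr_cgroups;              /* cgroup events still attached */
	struct cgroup_stub *cgrp;    /* cgroup currently being timed, or NULL */
};

/* Remove one event; drop the cached cgroup once no cgroup events remain. */
static void list_del_event_stub(struct cpu_context_stub *cpuctx, int is_cgroup_event)
{
	if (!is_cgroup_event)
		return;

	assert(cpuctx->nr_cgroups > 0);
	if (--cpuctx->nr_cgroups == 0)
		cpuctx->cgrp = NULL; /* nothing left that may dereference it */
}
```

Clearing the pointer at the point where the last cgroup event is removed covers every exit path that ends in this function, which is why the fix is placed there rather than in each caller.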
      
      [ This second version makes the code compile when
        CONFIG_CGROUP_PERF is not enabled. We unconditionally define
        perf_cpu_context->cgrp. ]
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: perfmon2-devel@lists.sf.net
      Cc: paulus@samba.org
      Cc: davem@davemloft.net
      LKML-Reference: <20110323150306.GA1580@quad>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 16 March, 2011 3 commits
  9. 04 March, 2011 5 commits
  10. 23 February, 2011 2 commits
  11. 16 February, 2011 4 commits
    • P
      perf: Optimize hrtimer events · ba3dd36c
      Peter Zijlstra committed
      There is no need to re-initialize the hrtimer every time we start it,
      so don't do that (shaves a few cycles). Also, since we know hrtimers
      run at a fixed rate (nanoseconds) we can pre-compute the desired
      frequency at which they tick. This avoids us having to go through the
      whole adaptive frequency feedback logic (shaves another few cycles).
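
The pre-computation can be illustrated as a one-time conversion from a requested frequency to a fixed period in nanoseconds. This is a sketch under assumptions: the function name is hypothetical, and the clamp to a non-zero period is added here for safety rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert a requested sampling frequency (Hz) into a fixed timer period (ns). */
static uint64_t hrtimer_period_ns(uint64_t sample_freq)
{
	uint64_t period;

	assert(sample_freq > 0);
	period = NSEC_PER_SEC / sample_freq;

	/* Assumption of this sketch: never hand back a zero period. */
	return period ? period : 1;
}
```

Since the timer's time base is fixed, this value is computed once at event setup and no per-sample feedback loop is needed to converge on it.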
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1297448589.5226.47.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • P
      perf: Optimize throttling code · 163ec435
      Peter Zijlstra committed
      By pre-computing the maximum number of samples per tick we can avoid a
      multiplication and a conditional since MAX_INTERRUPTS >
      max_samples_per_tick.
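
A rough sketch of the idea: divide once when the rate limit changes, then compare on every sample. The helper names and the exact comparison are assumptions made for illustration; only the notion of a pre-computed `max_samples_per_tick` comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division. */
static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	assert(d > 0);
	return (n + d - 1) / d;
}

/* Done once, whenever the allowed sample rate changes. */
static uint32_t compute_max_samples_per_tick(uint32_t sample_rate, uint32_t hz)
{
	return div_round_up(sample_rate, hz);
}

/* Done on every sample: a single comparison, no multiplication. */
static int should_throttle(uint32_t interrupts_this_tick,
			   uint32_t max_samples_per_tick)
{
	return interrupts_this_tick >= max_samples_per_tick;
}
```

For example, a limit of 100000 samples per second with 1000 ticks per second allows 100 samples per tick.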
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • S
      perf: Add cgroup support · e5d1367f
      Stephane Eranian committed
      This kernel patch adds the ability to filter monitoring based on
      container groups (cgroups). This is for use in per-cpu mode only.
      
      The cgroup to monitor is passed as a file descriptor in the pid
      argument to the syscall. The file descriptor must be opened to
      the cgroup name in the cgroup filesystem. For instance, if the
      cgroup name is foo and cgroupfs is mounted in /cgroup, then the
      file descriptor is opened to /cgroup/foo. Cgroup mode is
      activated by passing PERF_FLAG_PID_CGROUP in the flags argument
      to the syscall.
      
      For instance to measure in cgroup foo on CPU1 assuming
      cgroupfs is mounted under /cgroup:
      
      struct perf_event_attr attr;
      int cgroup_fd, fd;
      
      cgroup_fd = open("/cgroup/foo", O_RDONLY);
      fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP);
      close(cgroup_fd);
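
A fuller, compilable variant of the snippet above, as a sketch. It assumes a Linux system with the perf UAPI header installed; since glibc provides no `perf_event_open()` wrapper, the call goes through `syscall(2)`. The helper name `open_cgroup_cycles()` and the choice of a CPU-cycles counter are illustrative assumptions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thin wrapper around the raw system call. */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
			    int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Count CPU cycles for tasks of one cgroup on one CPU.
 * Returns an event file descriptor, or -1 on error (errno is set).
 */
static int open_cgroup_cycles(const char *cgroup_dir, int cpu)
{
	struct perf_event_attr attr;
	int cgroup_fd, fd;

	assert(cgroup_dir != NULL);

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;

	/* The cgroup is named by an fd opened on its directory in cgroupfs. */
	cgroup_fd = open(cgroup_dir, O_RDONLY);
	if (cgroup_fd < 0)
		return -1;

	/* In cgroup mode the fd is passed in the pid argument. */
	fd = (int)perf_event_open(&attr, cgroup_fd, cpu, -1, PERF_FLAG_PID_CGROUP);

	/* The cgroup fd is closed right after the call, as in the snippet above. */
	close(cgroup_fd);
	return fd;
}
```

With cgroupfs mounted under /cgroup, `open_cgroup_cycles("/cgroup/foo", 1)` would correspond to the example in the message; the call typically needs elevated privileges.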
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [ added perf_cgroup_{exit,attach} ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4d590250.114ddf0a.689e.4482@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • P
      perf: Fix throttle logic · 4fe757dd
      Peter Zijlstra committed
      It was possible to call pmu::start() on an already running event. In
      particular this led to some wreckage as the hrtimer events would
      re-initialize active timers.
      
      This was due to throttled events being activated again by scheduling.
      Scheduling in a context would add and force start events, resulting in
      running events with a possible throttle status. The next tick to hit
      that task will then try to unthrottle the event and call ->start() on
      an already running event.
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 03 February, 2011 2 commits
  13. 28 January, 2011 1 commit
    • E
      perf: Fix alloc_callchain_buffers() · 88d4f0db
      Eric Dumazet committed
      Commit 927c7a9e ("perf: Fix race in callchains") introduced
      a mismatch in the sizing of struct callchain_cpus_entries.
      
      nr_cpu_ids must be used instead of num_possible_cpus(), or we
      might get out-of-bounds memory accesses on some machines.
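
The difference matters when CPU ids are sparse. The sketch below models the possible-CPU set as a 64-bit mask; the helper names are hypothetical. It shows that a table indexed by CPU id needs "highest id + 1" slots, not "number of possible CPUs" slots.

```c
#include <assert.h>
#include <stdint.h>

/* Number of possible CPUs: the count of set bits in the mask. */
static unsigned int num_possible(uint64_t possible_mask)
{
	unsigned int n = 0;

	for (; possible_mask; possible_mask &= possible_mask - 1)
		n++;
	return n;
}

/* Highest possible CPU id plus one (0 for an empty mask). */
static unsigned int nr_ids(uint64_t possible_mask)
{
	unsigned int n = 0;

	for (; possible_mask; possible_mask >>= 1)
		n++;
	return n;
}

/* Slots needed for a table that is indexed directly by CPU id. */
static unsigned int cpu_table_slots(uint64_t possible_mask)
{
	unsigned int slots = nr_ids(possible_mask);

	assert(slots >= num_possible(possible_mask));
	return slots;
}
```

With possible CPUs 0 and 2 (mask 0x5) there are two possible CPUs but three ids, so a two-entry table would be overrun when indexed with CPU 2.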
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: stable@kernel.org
      LKML-Reference: <1295980851.3588.351.camel@edumazet-laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 22 January, 2011 1 commit
    • O
      perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/ · 806839b2
      Oleg Nesterov committed
      In theory, almost every user of task->child->perf_event_ctxp[]
      is wrong. find_get_context() can install the new context at any
      moment, we need read_barrier_depends().
      
      dbe08d82 "perf: Fix
      find_get_context() vs perf_event_exit_task() race" added
      rcu_dereference() into perf_event_exit_task_context() to set
      a precedent, but this makes __rcu_dereference_check() unhappy.
      Use rcu_dereference_raw() to shut up the warning.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: acme@redhat.com
      Cc: paulus@samba.org
      Cc: stern@rowland.harvard.edu
      Cc: a.p.zijlstra@chello.nl
      Cc: fweisbec@gmail.com
      Cc: roland@redhat.com
      Cc: prasad@linux.vnet.ibm.com
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <20110121174547.GA8796@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 21 January, 2011 1 commit
    • P
      perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat · 547e9fd7
      Peter Zijlstra committed
      Lockdep spotted:
      
      	loop_1b_instruc/1899 is trying to acquire lock:
      	 (event_mutex){+.+.+.}, at: [<ffffffff810e1908>] perf_trace_init+0x3b/0x2f7
      
      	but task is already holding lock:
      	 (&ctx->mutex){+.+.+.}, at: [<ffffffff810eb45b>] perf_event_init_context+0xc0/0x218
      
      	which lock already depends on the new lock.
      
      	the existing dependency chain (in reverse order) is:
      
      	-> #3 (&ctx->mutex){+.+.+.}:
      	-> #2 (cpu_hotplug.lock){+.+.+.}:
      	-> #1 (module_mutex){+.+...}:
      	-> #0 (event_mutex){+.+.+.}:
      
      But because the deadlock would be cpuhotplug (cpu-event) vs fork
      (task-event) it cannot, in fact, happen. We can annotate this by giving the
      perf_event_context used for the cpuctx a different lock class from those
      used by tasks.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  16. 20 January, 2011 2 commits
    • O
      perf: Fix perf_event_init_task()/perf_event_free_task() interaction · 8550d7cb
      Oleg Nesterov committed
      perf_event_init_task() should clear child->perf_event_ctxp[]
      before anything else. Otherwise, if
      perf_event_init_context(perf_hw_context) fails,
      perf_event_free_task() can free perf_event_ctxp[perf_sw_context]
      copied from parent->perf_event_ctxp[] by dup_task_struct().
      
      Also move the initialization of perf_event_mutex and
      perf_event_list from perf_event_init_context() to
      perf_event_init_task().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      LKML-Reference: <20110119182228.GC12183@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • O
      perf: Fix find_get_context() vs perf_event_exit_task() race · dbe08d82
      Oleg Nesterov committed
      find_get_context() must not install the new perf_event_context
      if the task has already passed perf_event_exit_task().
      
      If nothing else, this means a memory leak. Initially
      ctx->refcount == 2, it is supposed that
      perf_event_exit_task_context() should participate and do the
      necessary put_ctx().
      
      find_lively_task_by_vpid() checks PF_EXITING but this buys
      nothing, by the time we call find_get_context() this task can be
      already dead. To the point, cmpxchg() can succeed when the task
      has already done the last schedule().
      
      Change find_get_context() to populate task->perf_event_ctxp[]
      under task->perf_event_mutex, this way we can trust PF_EXITING
      because perf_event_exit_task() takes the same mutex.
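
The locking rule can be sketched in userspace with a pthread mutex. This is a simplified model, not the kernel code: the `_stub` types are hypothetical, `exiting` stands in for PF_EXITING, and the sketch sets that flag inside the exit helper, which is a simplification made here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for perf_event_context. */
struct ctx_stub {
	int refcount;
};

/* Hypothetical stand-in for the relevant task fields. */
struct task_stub {
	pthread_mutex_t perf_event_mutex;
	bool exiting;            /* stands in for PF_EXITING */
	struct ctx_stub *ctx;    /* stands in for perf_event_ctxp[] */
};

/* Install ctx unless the task is exiting or already has a context. */
static bool install_ctx(struct task_stub *task, struct ctx_stub *ctx)
{
	bool installed = false;

	assert(ctx != NULL);

	pthread_mutex_lock(&task->perf_event_mutex);
	if (!task->exiting && task->ctx == NULL) {
		task->ctx = ctx;
		installed = true;
	}
	pthread_mutex_unlock(&task->perf_event_mutex);
	return installed;
}

/* Exit path: takes the same mutex, so it cannot interleave with install_ctx(). */
static struct ctx_stub *exit_task(struct task_stub *task)
{
	struct ctx_stub *old;

	pthread_mutex_lock(&task->perf_event_mutex);
	task->exiting = true;
	old = task->ctx;
	task->ctx = NULL;
	pthread_mutex_unlock(&task->perf_event_mutex);
	return old;
}
```

Because both paths hold the same mutex, an installer either runs before the exit path (and the exit path sees and releases the context) or after it (and sees the exiting flag and backs off), so no context is left installed on a dead task.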
      
      Also, change perf_event_exit_task_context() to use
      rcu_dereference(). Probably this is not strictly needed, but
      with or without this change find_get_context() can race with
      setup_new_exec()->perf_event_exit_task(), rcu_dereference()
      looks better.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      LKML-Reference: <20110119182207.GB12183@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 19 January, 2011 2 commits
    • O
      perf: Validate cpu early in perf_event_alloc() · 66832eb4
      Oleg Nesterov committed
      Starting from perf_event_alloc()->perf_init_event(), the kernel
      assumes that event->cpu is either -1 or a valid CPU number.
      
      Change perf_event_alloc() to validate this argument early. This
      also means we can remove the similar check in
      find_get_context().
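
A minimal sketch of such an early check, following only the rule stated above (-1 or a valid CPU number). The function name is hypothetical and `nr_cpu_ids` is passed in as a parameter for the sake of a self-contained example; any additional conditions the kernel applies are not modelled.

```c
#include <assert.h>
#include <errno.h>

/* Accept cpu == -1 or an id in [0, nr_cpu_ids); reject everything else. */
static int validate_event_cpu(int cpu, unsigned int nr_cpu_ids)
{
	assert(nr_cpu_ids > 0);

	if (cpu == -1)
		return 0;
	if (cpu < 0 || (unsigned int)cpu >= nr_cpu_ids)
		return -EINVAL;
	return 0;
}
```

Doing this once at allocation time lets every later consumer of the value index per-CPU data without repeating the range check.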
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: gregkh@suse.de
      Cc: stable@kernel.org
      LKML-Reference: <20110118161032.GC693@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • O
      perf: find_get_context: fix the per-cpu-counter check · 22a4ec72
      Oleg Nesterov committed
      If task == NULL, find_get_context() should always check that cpu
      is correct.
      
      Afaics, the bug was introduced by 38a81da2 "perf events: Clean
      up pid passing", but even before that commit "&& cpu != -1" was
      not exactly right, -ESRCH from find_task_by_vpid() is not
      accurate.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: gregkh@suse.de
      Cc: stable@kernel.org
      LKML-Reference: <20110118161008.GB693@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  18. 18 January, 2011 1 commit
    • P
      perf: Fix contexted inheritance · c5ed5145
      Peter Zijlstra committed
      Linus reported that the RCU lockdep annotation bits triggered for this
      rcu_dereference() because we're not holding rcu_read_lock().
      
      Going over the code I cannot convince myself it's correct:
      
       - holding a ref on the parent_ctx, doesn't avoid it being uncloned
         concurrently (as the comment says), so we can race with a free.
      
       - holding parent_ctx->mutex doesn't avoid the above free from taking
         place either, it would at best avoid parent_ctx from being freed.
      
      I.e. the warning is correct. To fix the bug, serialize against the
      unclone_ctx() call by extending the reach of the parent_ctx->lock.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. 07 January, 2011 3 commits
  20. 16 December, 2010 3 commits
  21. 09 December, 2010 2 commits