Commits · 4ec8363dfc1451f8c8f86825731fe712798ada02 · xiphi1978 / linux

01 Jul, 2011 1 commit

perf_events: Fix perf buffer watermark setting · 4ec8363d

Authored by Vince Weaver on Jun 01, 2011

Since 2.6.36 (specifically commit d57e34fd ("perf: Simplify the
ring-buffer logic: make perf_buffer_alloc() do everything needed")),
the perf_buffer_init_code() has been mis-setting the buffer watermark
if perf_event_attr.wakeup_events has a non-zero value.

This is because perf_event_attr.wakeup_events is a union with
perf_event_attr.wakeup_watermark.

This commit re-enables the check for perf_event_attr.watermark being
set before continuing with setting a non-default watermark.

This bug is most noticeable when you are trying to use PERF_IOC_REFRESH
with a value larger than one and perf_event_attr.wakeup_events is set to
one.  In this case the buffer watermark will be set to 1 and you will
get extraneous POLL_IN overflows rather than POLL_HUP as expected.
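The union aliasing described above can be sketched in user space. The struct `attr_sketch` and both `pick_watermark_*()` helpers are hypothetical stand-ins, not the kernel's definitions; only the field names `watermark`, `wakeup_events` and `wakeup_watermark` come from the commit message, and the default of half the buffer size is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of perf_event_attr. */
struct attr_sketch {
    unsigned int watermark;          /* 1: the union holds wakeup_watermark */
    union {
        uint32_t wakeup_events;      /* wake up every n events */
        uint32_t wakeup_watermark;   /* bytes before wakeup */
    };
};

/* Buggy variant: reads the union without checking attr->watermark,
 * so a wakeup_events value is misread as a byte watermark. */
long pick_watermark_buggy(const struct attr_sketch *attr, long buf_size)
{
    long wm = attr->wakeup_watermark;

    return wm ? wm : buf_size / 2;   /* assumed default: half the buffer */
}

/* Fixed variant: trusts wakeup_watermark only when watermark is set. */
long pick_watermark_fixed(const struct attr_sketch *attr, long buf_size)
{
    long wm = attr->watermark ? (long)attr->wakeup_watermark : 0;

    return wm ? wm : buf_size / 2;   /* assumed default: half the buffer */
}
```

With `wakeup_events = 1` and `watermark = 0`, the buggy variant yields a watermark of 1 byte, which matches the symptom reported in the commit message, while the fixed variant falls back to the default.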

[ avoid using attr.wakeup_events when attr.watermark is set ]
Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1106011506390.5384@cl320.eecs.utk.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>

09 Jun, 2011 1 commit

perf: Split up buffer handling from core code · 76369139

Authored by Frederic Weisbecker on May 19, 2011

And create the internal perf events header.

v2: Keep an internal inlined perf_output_copy()
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1305827704-5607-1-git-send-email-fweisbec@gmail.com
[ v3: use clearer 'ring_buffer' and 'rb' naming ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

07 Jun, 2011 1 commit

perf, core: Fix initial task_ctx/event installation · b58f6b0d

Authored by Peter Zijlstra on Jun 07, 2011

A lost Quilt refresh of 2c29ef0f (perf: Simplify and fix
__perf_install_in_context()) is causing grief and lockups,
reported by Jiri Olsa.

When installing an event in a task context, there's a number of
issues:

 - there might not be an existing task context, in which case
   we should install the now current context;

 - there might already be a context, not the current one, in
   which case we should de-schedule the old and install the new;

these cases were dealt with in the lost refresh, however there is one
further case that was found in testing:

 - there might already be a context, the current one, in which
   case we should still de-schedule, and should take care
   to re-install it (note that task_ctx_sched_out() clears
   cpuctx->task_ctx).
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1307399008.2497.971.camel@laptop
Signed-off-by: Ingo Molnar <mingo@elte.hu>

31 May, 2011 1 commit

perf, cgroups: Fix up for new API · 74c355fb

Authored by Peter Zijlstra on May 30, 2011

Ben changed the cgroup API in commit f780bdb7 (cgroups: add
per-thread subsystem callbacks) in an incompatible way, but
forgot to convert the perf cgroup bits.

Avoid compile warnings and runtime splats and convert perf too ;-)
Acked-by: Ben Blum <bblum@andrew.cmu.edu>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1306767651.1200.2990.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>

29 May, 2011 9 commits

perf: De-schedule a task context when removing the last event · 64ce3126

Authored by Peter Zijlstra on Apr 09, 2011

Since perf_install_in_context() will now install a context when we
add the first event, we can de-schedule the context when the last
event is removed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192142.090431763@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Change close() semantics for group events · e03a9a55

Authored by Peter Zijlstra on Apr 09, 2011

In order to always call list_del_event() on the correct cpu if the
event is part of an active context and avoid having to do two IPIs,
change the close() semantics slightly.

The current perf_event_disable() call would disable a whole group if
the event that's being closed is the group leader, whereas the new
code keeps the group siblings enabled.

People should not rely on this behaviour and I don't think they do,
but in case we find they do, the fix is easy and we have to take the
double IPI cost.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vince Weaver <vweaver1@eecs.utk.edu>
Link: http://lkml.kernel.org/r/20110409192142.038377551@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Collect the schedule-in rules in one function · dce5855b

Authored by Peter Zijlstra on Apr 09, 2011

This was scattered out - refactor it into a single function.
No change in functionality.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192141.979862055@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Change and simplify ctx::is_active semantics · db24d33e

Authored by Peter Zijlstra on Apr 09, 2011

Instead of tracking if a context is active or not, track which events
of the context are active. By making it a bitmask of
EVENT_PINNED|EVENT_FLEXIBLE we can simplify some of the scheduling
routines since it can avoid adding events that are already active.
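A minimal sketch of the bitmask idea follows. The flag names come from the commit message; the numeric values, the struct `ctx_sketch` and both helper functions are illustrative assumptions rather than the kernel's code.

```c
/* Flag names as in the commit message; values chosen for illustration. */
enum event_type {
    EVENT_FLEXIBLE = 0x1,
    EVENT_PINNED   = 0x2,
    EVENT_ALL      = EVENT_FLEXIBLE | EVENT_PINNED,
};

/* Hypothetical context: is_active is a bitmask, not a boolean. */
struct ctx_sketch {
    int is_active;   /* event types currently scheduled in */
};

/* Schedule in the requested types. Returns only the types that still
 * needed work, so already-active events are never added twice. */
int ctx_sched_in_sketch(struct ctx_sketch *ctx, int event_type)
{
    int todo = event_type & ~ctx->is_active;

    ctx->is_active |= event_type;
    return todo;
}

/* Schedule out the requested types. Returns the types that were
 * actually active and therefore had to be removed. */
int ctx_sched_out_sketch(struct ctx_sketch *ctx, int event_type)
{
    int done = event_type & ctx->is_active;

    ctx->is_active &= ~event_type;
    return done;
}
```

Requesting EVENT_ALL on a context whose pinned events are already active returns only EVENT_FLEXIBLE, which is the "avoid adding events that are already active" simplification in miniature.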
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192141.930282378@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Simplify and fix __perf_install_in_context() · 2c29ef0f

Authored by Peter Zijlstra on Apr 09, 2011

Currently __perf_install_in_context() will try and schedule in the
event irrespective of our event scheduling rules, that is, we try to
schedule CPU-pinned, TASK-pinned, CPU-flexible, TASK-flexible, but
when creating a new event we simply try and schedule it on top of
whatever is already on the PMU, this can lead to errors for pinned
events.

Therefore, simplify things and simply schedule everything out, add the
event to the corresponding context and schedule everything back in.

This also nicely handles the case where with
__ARCH_WANT_INTERRUPTS_ON_CTXSW the IPI can come right in the middle
of schedule, before we managed to call perf_event_task_sched_in().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192141.870894224@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Remove task_ctx_sched_in() · 04dc2dbb

Authored by Peter Zijlstra on Apr 09, 2011

Make task_ctx_sched_*() imply EVENT_ALL, since anything less will not
actually have scheduled the task in/out at all.

Since there's no site that schedules all of a task in (due to the
interleave with flexible cpuctx) we can remove this function.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192141.817893268@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Optimize event scheduling locking · facc4307

Authored by Peter Zijlstra on Apr 09, 2011

Currently we only hold one ctx->lock at a time, which results in us
flipping back and forth between cpuctx->ctx.lock and task_ctx->lock.

Avoid this and gain large atomic regions by holding both locks. We
nest the task lock inside the cpu lock, since with task scheduling we
might have to change task ctx while holding the cpu ctx lock.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192141.769881865@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Clean up 'ctx' reference counting · 9137fb28

Authored by Peter Zijlstra on Apr 09, 2011

Small cleanup to how we refcount in find_get_context(), this also
allows us to use put_ctx() to free things instead of using kfree().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192141.719340481@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Optimize ctx_sched_out() · 075e0b00

Authored by Peter Zijlstra on Apr 09, 2011

Oleg noted that ctx_sched_out() disables the PMU even though it might
not actually do something, avoid needless PMU-disabling.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110409192141.665385503@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>

28 May, 2011 1 commit

perf: Fix SIGIO handling · f506b3dc

Authored by Peter Zijlstra on May 26, 2011

Vince noticed that unless we mmap() a buffer, SIGIO gets lost. So
explicitly push the wakeup (including signals) when requested.
Reported-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/n/tip-2euus3f3x3dyvdk52cjxw8zu@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

04 May, 2011 1 commit

perf events: Clean up definitions and initializers, update copyrights · e7e7ee2e

Authored by Ingo Molnar on May 04, 2011

Fix a few inconsistent style bits that were added over the past few
months.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-yv4hwf9yhnzoada8pcpb3a97@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

03 May, 2011 1 commit

perf: Start the restructuring · fae85b7c

Authored by Borislav Petkov on Oct 26, 2010

mv kernel/perf_event.c -> kernel/events/core.c. From there, all further
sensible splitting can happen. The idea is that, with perf_event.c
becoming pretty sizable and with the advent of the marriage with ftrace,
splitting functionality into its logical parts should help speed up
the unification and manage the complexity of the subsystem.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

11 Apr, 2011 1 commit

perf_event: Fix cgrp event scheduling bug in perf_enable_on_exec() · e566b76e

Authored by Stephane Eranian on Apr 06, 2011

There is a bug in perf_event_enable_on_exec() when cgroup events are
active on a CPU: the cgroup events may be scheduled twice causing event
state corruptions which eventually may lead to kernel panics.

The reason is that the function needs to first schedule out the cgroup
events, just like for the per-thread events. The cgroup events are
scheduled back in automatically from the perf_event_context_sched_in()
function.

The patch also adds a WARN_ON_ONCE() in perf_cgroup_switch() to catch any
bogus state.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110406005454.GA1062@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>

05 Apr, 2011 1 commit

jump label: Introduce static_branch() interface · d430d3d7

Authored by Jason Baron on Mar 16, 2011

Introduce:

static __always_inline bool static_branch(struct jump_label_key *key);

instead of the old JUMP_LABEL(key, label) macro.

In this way, jump labels become really easy to use:

Define:

        struct jump_label_key jump_key;

Can be used as:

        if (static_branch(&jump_key))
                do unlikely code

enable/disable via:

        jump_label_inc(&jump_key);
        jump_label_dec(&jump_key);

that's it!

For the jump labels disabled case, the static_branch() becomes an
atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
atomic_dec() operations. We show testing results for this change below.
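The disabled-jump-label fallback described above can be approximated in user space with C11 atomics. This is only a sketch: the function and type names follow the commit message, but the bodies are assumptions and stand in for the kernel's atomic_read()/atomic_inc()/atomic_dec().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for the key type named in the commit message. */
struct jump_label_key {
    atomic_int enabled;   /* number of users that enabled the branch */
};

/* Fallback static_branch(): just an atomic read of the counter. */
static inline bool static_branch(struct jump_label_key *key)
{
    return atomic_load(&key->enabled) != 0;
}

/* Fallback enable: a plain atomic increment. */
static inline void jump_label_inc(struct jump_label_key *key)
{
    atomic_fetch_add(&key->enabled, 1);
}

/* Fallback disable: a plain atomic decrement. */
static inline void jump_label_dec(struct jump_label_key *key)
{
    atomic_fetch_sub(&key->enabled, 1);
}
```

Because the key is a counter rather than a flag, two independent users can enable the same branch and it stays taken until both have called jump_label_dec().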

Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.

Since we now require a 'struct jump_label_key *key', we can store a pointer into
the jump table addresses. In this way, we can enable/disable jump labels, in
basically constant time. This change allows us to completely remove the previous
hashtable scheme. Thanks to Peter Zijlstra for this re-write.

Testing:

I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
configurations, where tracepoints were disabled.

jump label configured in
avg: 815.6

jump label *not* configured in (using atomic reads)
avg: 800.1

jump label *not* configured in (regular reads)
avg: 803.4
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110316212947.GA8792@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Suggested-by: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

31 Mar, 2011 2 commits

perf: Fix task_struct reference leak · fd1edb3a

Authored by Peter Zijlstra on Mar 28, 2011

sys_perf_event_open() had an imbalance in the number of task refs it
took, causing a memory leak.

Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org # .37+
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Rebase max unprivileged mlock threshold on top of page size · 20443384

Authored by Frederic Weisbecker on Mar 31, 2011

Ensure we allow 512 kiB + 1 page for user control without
assuming a 4096-byte page size.
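As a worked illustration of "512 kiB + 1 page" for different page sizes, here is a small sketch. The helper name `mlock_threshold_kb()` is hypothetical, and whether the kernel computes the value in exactly this form is an assumption.

```c
/* Unprivileged mlock allowance in kiB: 512 kiB of data pages plus one
 * page (converted to kiB) for the user control page. */
long mlock_threshold_kb(long page_size_bytes)
{
    return 512 + page_size_bytes / 1024;
}
```

With 4 kiB pages this gives 516 kiB; with 64 kiB pages it gives 576 kiB. In user space the page size could be obtained with sysconf(_SC_PAGESIZE).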
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: <stable@kernel.org>
LKML-Reference: <1301535209-9679-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

24 Mar, 2011 1 commit

perf: Better fit max unprivileged mlock pages for tools needs · 880f5731

Authored by Frederic Weisbecker on Mar 23, 2011

The maximum amount of locked memory that an unprivileged user
can reserve is 512 kB = 128 pages by default, scaled to the
number of online CPUs, which fits well with the tools that use
128 data pages by default.

However, tools actually use 129 pages, because they need one more
for the user control page. Thus the default mlock threshold is
not sufficient for the tools' default needs and we always end up
evaluating the constant mlock rlimit policy, which does not
scale with the number of online CPUs.

Hence, on systems that have more than 16 CPUs, we exceed the
rlimit threshold and fail to mmap:

	$ perf record ls
	Error: failed to mmap with 1 (Operation not permitted)

Just increase the max unprivileged mlock threshold by one page
so that it supports perf tools well even beyond 16 CPUs.
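The arithmetic behind the 16-CPU limit can be sketched as follows. The helper `perf_mmap_would_fail()` is hypothetical, and the rlimit of 16 pages used in the usage note (64 kB with 4 kB pages) is an assumption inferred from the "more than 16 CPUs" statement, not a value given in the commit message.

```c
#include <stdbool.h>

/* Returns true when per-CPU buffers of `pages_per_cpu` pages on `ncpus`
 * CPUs spill past both the per-user allowance and the mlock rlimit.
 * All quantities are in pages. */
bool perf_mmap_would_fail(long ncpus, long pages_per_cpu,
                          long free_pages_per_cpu, long rlimit_pages)
{
    long wanted  = ncpus * pages_per_cpu;
    long allowed = ncpus * free_pages_per_cpu;   /* scales with CPUs */
    long extra   = wanted > allowed ? wanted - allowed : 0;

    return extra > rlimit_pages;                 /* constant, no scaling */
}
```

With 129 pages wanted and 128 allowed per CPU, each CPU contributes one page of overflow, so 16 CPUs still fit under a 16-page rlimit and 17 do not. Raising the allowance to 129 pages per CPU removes the overflow at any CPU count.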
Reported-by: Han Pingtian <phan@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stable <stable@kernel.org>
LKML-Reference: <1300904979-5508-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

23 Mar, 2011 1 commit

perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx() · 68cacd29

Authored by Stephane Eranian on Mar 23, 2011

This patch solves a stale pointer problem in
update_cgrp_time_from_cpuctx(). The cpuctx->cgrp
was not cleared on all possible event exit paths,
including:

   close()
     perf_release()
       perf_release_kernel()
         list_del_event()

This patch fixes list_del_event() to clear cpuctx->cgrp
when there are no cgroup events left in the context.

[ This second version makes the code compile when
  CONFIG_CGROUP_PERF is not enabled. We unconditionally define
  perf_cpu_context->cgrp. ]
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: perfmon2-devel@lists.sf.net
Cc: paulus@samba.org
Cc: davem@davemloft.net
LKML-Reference: <20110323150306.GA1580@quad>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

16 Mar, 2011 3 commits

perf: Fix tear-down of inherited group events · 38b435b1

Authored by Peter Zijlstra on Mar 15, 2011

When destroying inherited events, we need to destroy groups too,
otherwise the event iteration in perf_event_exit_task_context() will
miss group siblings and we leak events with all the consequences.
Reported-and-tested-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org> # .35+
LKML-Reference: <1300196470.2203.61.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Handle stopped state with tracepoints · a0f7d0f7

Authored by Frederic Weisbecker on Mar 07, 2011

We toggle the state from start and stop callbacks but actually
don't check it when the event triggers. Do it so that
these callbacks actually work.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299529629-18280-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Fix the software events state check · 91b2f482

Authored by Frederic Weisbecker on Mar 07, 2011

Fix the mistakenly inverted check of events state.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: <stable@kernel.org>
LKML-Reference: <1299529629-18280-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

04 Mar, 2011 5 commits

perf: Fix cgroup vs jump_label problem · 08309379

Authored by Peter Zijlstra on Mar 03, 2011

Li Zefan reported that the jump label code sleeps and we're calling it
under a spinlock, *fail* ;-)
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf cgroup: Clean up perf_cgroup_create() · 1b15d055

Authored by Li Zefan on Mar 03, 2011

- Use kzalloc() to replace kmalloc() + memset().

- Remove redundant initialization, since alloc_percpu() returns
  zero-filled percpu memory.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4D6F347E.2010806@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf cgroup: Fix unmatched call to perf_detach_cgroup() · f75e18cb

Authored by Li Zefan on Mar 03, 2011

In the failure path, we call perf_detach_cgroup(), but we didn't
call perf_get_cgroup() prior to it.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4D6F346E.9070606@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf cgroup: Fix leak of file reference count · 3db272c0

Authored by Li Zefan on Mar 03, 2011

In perf_cgroup_connect(), fput_light() is missing in a failure path.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4D6F3461.6060406@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Fix the missing event initialization when pmu is found in idr · 940c5b29

Authored by Lin Ming on Feb 27, 2011

Currently, the event is not initialized if the pmu is found in the idr.
This has not caused a bug so far only because no pmu is associated with
an idr id yet.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1298812411.2699.9.camel@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

23 Feb, 2011 2 commits

perf: Simplify task_clock_event_read() · 768a06e2

Authored by Peter Zijlstra on Feb 22, 2011

There is no point in us having different code paths for nmi and !nmi
here, so remove the !nmi one.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf_events: Fix rcu and locking issues with cgroup support · 3f7cce3c

Authored by Stephane Eranian on Feb 18, 2011

This patch ensures that we do not end up calling
perf_cgroup_from_task() when there is no cgroup event.
This avoids potential RCU and locking issues.

The change in perf_cgroup_set_timestamp() ensures we
check against ctx->nr_cgroups. It also avoids calling
perf_clock() twice in a row. It also ensures we do need
to grab ctx->lock before calling the function.

We drop update_cgrp_time() from task_clock_event_read()
because it is not needed. This also avoids having to
deal with perf_cgroup_from_task().

Thanks to Peter Zijlstra for his help on this.
Signed-off-by: Stephane Eranian <eranian@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d5e76b8.815bdf0a.7ac3.774f@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

16 Feb, 2011 4 commits

perf: Optimize hrtimer events · ba3dd36c

Authored by Peter Zijlstra on Feb 15, 2011

There is no need to re-initialize the hrtimer every time we start it,
so don't do that (shaves a few cycles). Also, since we know hrtimers
run at a fixed rate (nanoseconds) we can pre-compute the desired
frequency at which they tick. This avoids us having to go through the
whole adaptive frequency feedback logic (shaves another few cycles).
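The pre-computation mentioned above amounts to converting a requested frequency into a fixed nanosecond period once, up front. A minimal sketch, assuming a hypothetical helper `hrtimer_period_ns()` and a clamp to at least 1 ns (both are illustrative choices, not taken from the patch):

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert a requested sampling frequency in Hz into a fixed timer
 * period in nanoseconds. Returns 0 when no frequency was requested,
 * and never returns a period shorter than 1 ns otherwise. */
uint64_t hrtimer_period_ns(uint64_t sample_freq_hz)
{
    uint64_t period;

    if (sample_freq_hz == 0)
        return 0;   /* nothing to pre-compute */

    period = NSEC_PER_SEC / sample_freq_hz;
    return period ? period : 1;
}
```

A request for 1000 Hz becomes a fixed period of 1,000,000 ns, so no feedback loop is needed to converge on the right period.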
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1297448589.5226.47.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Optimize throttling code · 163ec435

Authored by Peter Zijlstra on Feb 16, 2011

By pre-computing the maximum number of samples per tick we can avoid a
multiplication and a conditional since MAX_INTERRUPTS >
max_samples_per_tick.
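One way to picture the pre-computation: derive the per-tick sample budget once from the per-second sample rate and the tick frequency, rounding up. The function below is a sketch under that assumption; the name `max_samples_per_tick` comes from the commit message, while the formula and parameters are illustrative.

```c
/* Round-up integer division. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Samples allowed per timer tick for a given per-second sample rate
 * and tick frequency (hz). Meant to be computed once when the rate
 * changes, not on every interrupt. */
unsigned int max_samples_per_tick(unsigned int sample_rate, unsigned int hz)
{
    return DIV_ROUND_UP(sample_rate, hz);
}
```

For example, a rate of 100000 samples per second at 1000 ticks per second allows 100 samples per tick; at 300 ticks per second the budget rounds up to 334.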
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Add cgroup support · e5d1367f

Authored by Stephane Eranian on Feb 14, 2011

This kernel patch adds the ability to filter monitoring based on
container groups (cgroups). This is for use in per-cpu mode only.

The cgroup to monitor is passed as a file descriptor in the pid
argument to the syscall. The file descriptor must be opened to
the cgroup name in the cgroup filesystem. For instance, if the
cgroup name is foo and cgroupfs is mounted in /cgroup, then the
file descriptor is opened to /cgroup/foo. Cgroup mode is
activated by passing PERF_FLAG_PID_CGROUP in the flags argument
to the syscall.

For instance to measure in cgroup foo on CPU1 assuming
cgroupfs is mounted under /cgroup:

struct perf_event_attr attr;
int cgroup_fd, fd;

cgroup_fd = open("/cgroup/foo", O_RDONLY);
fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP);
close(cgroup_fd);
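For readers who want something compilable, the snippet above can be fleshed out roughly as follows. This is a sketch, not the author's code: the helper name `open_cgroup_cycles()` and the choice of a CPU-cycles hardware counter are assumptions, and the syscall wrapper is needed because glibc does not export perf_event_open().

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Thin wrapper around the raw system call. */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                            int cpu, int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Open a cycles counter for one cgroup on one CPU. Returns the event
 * fd, or -1 if the cgroup directory or the event cannot be opened. */
int open_cgroup_cycles(const char *cgroup_path, int cpu)
{
    struct perf_event_attr attr;
    int cgroup_fd, fd;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;            /* assumed event type */
    attr.config = PERF_COUNT_HW_CPU_CYCLES;    /* assumed event */

    cgroup_fd = open(cgroup_path, O_RDONLY);
    if (cgroup_fd < 0)
        return -1;

    /* The cgroup fd goes in the pid slot; -1 means no group leader. */
    fd = (int)perf_event_open(&attr, cgroup_fd, cpu, -1,
                              PERF_FLAG_PID_CGROUP);
    close(cgroup_fd);   /* closed right after the call, as above */
    return fd;
}
```

A call such as open_cgroup_cycles("/cgroup/foo", 1) mirrors the example in the commit message; it will typically need sufficient privileges and a mounted cgroup filesystem to succeed.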
Signed-off-by: Stephane Eranian <eranian@google.com>
[ added perf_cgroup_{exit,attach} ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d590250.114ddf0a.689e.4482@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Fix throttle logic · 4fe757dd

Authored by Peter Zijlstra on Feb 15, 2011

It was possible to call pmu::start() on an already running event. In
particular this led to some wreckage as the hrtimer events would
re-initialize active timers.

This was due to throttled events being activated again by scheduling.
Scheduling in a context would add and force start events, resulting in
running events with a possible throttle status. The next tick to hit
that task will then try to unthrottle the event and call ->start() on
an already running event.
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

03 Feb, 2011 2 commits

perf: Fix reading in perf_event_read() · 542e72fc

Authored by Peter Zijlstra on Jan 26, 2011

It is quite possible for the event to have been disabled between
perf_event_read() sending the IPI and the CPU servicing the IPI and
calling __perf_event_read(), hence revalidate the state.
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Cure task_oncpu_function_call() races · fe4b04fa

Authored by Peter Zijlstra on Feb 02, 2011

Oleg reported that on architectures with
__ARCH_WANT_INTERRUPTS_ON_CTXSW the IPI from
task_oncpu_function_call() can land before perf_event_task_sched_in()
and cause interesting situations for eg. perf_install_in_context().

This patch reworks the task_oncpu_function_call() interface to give a
more usable primitive as well as rework all its users to hopefully be
more obvious as well as remove the races.

While looking at the code I also found a number of races against
perf_event_task_sched_out() which can flip contexts between tasks so
plug those too.
Reported-and-reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

28 Jan, 2011 1 commit

perf: Fix alloc_callchain_buffers() · 88d4f0db

Authored by Eric Dumazet on Jan 25, 2011

Commit 927c7a9e ("perf: Fix race in callchains") introduced
a mismatch in the sizing of struct callchain_cpus_entries.

nr_cpu_ids must be used instead of num_possible_cpus(), or we
might get out-of-bounds memory accesses on some machines.
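The difference matters when possible CPU ids are sparse: an array indexed by CPU id needs room for the highest id plus one, not just for the number of CPUs. The helper below is a hypothetical user-space illustration of that sizing rule, not kernel code.

```c
#include <stddef.h>

/* Given a list of possible CPU ids, return the highest id plus one,
 * which is the size an array must have to be safely indexed by CPU id.
 * The plain count `n` plays the role of num_possible_cpus(). */
size_t nr_cpu_ids_sketch(const int *cpu_ids, size_t n)
{
    int max_id = -1;

    for (size_t i = 0; i < n; i++)
        if (cpu_ids[i] > max_id)
            max_id = cpu_ids[i];

    return (size_t)(max_id + 1);
}
```

With possible ids {0, 2, 5} there are only 3 CPUs, yet an array indexed by CPU id needs 6 slots; sizing it by the count would make the access for CPU 5 run off the end.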
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Stephane Eranian <eranian@google.com>
CC: stable@kernel.org
LKML-Reference: <1295980851.3588.351.camel@edumazet-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

22 Jan, 2011 1 commit

perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/ · 806839b2

Authored by Oleg Nesterov on Jan 21, 2011

In theory, almost every user of task->child->perf_event_ctxp[]
is wrong. find_get_context() can install the new context at any
moment, we need read_barrier_depends().

dbe08d82 "perf: Fix find_get_context() vs perf_event_exit_task()
race" added rcu_dereference() into perf_event_exit_task_context()
to make the precedent, but this makes __rcu_dereference_check()
unhappy. Use rcu_dereference_raw() to shut up the warning.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: acme@redhat.com
Cc: paulus@samba.org
Cc: stern@rowland.harvard.edu
Cc: a.p.zijlstra@chello.nl
Cc: fweisbec@gmail.com
Cc: roland@redhat.com
Cc: prasad@linux.vnet.ibm.com
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
LKML-Reference: <20110121174547.GA8796@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
