Commits · 4c635a4e04700a371ef7e4d4bb33ed88747e801e · openeuler / raspberrypi-kernel

01 Dec, 2010 13 commits

perf tools: fix event parsing of comma-separated tracepoint events · 4c635a4e

Authored by Corey Ashford on Nov 30, 2010

There are a number of issues that prevent multiple tracepoint events from
being specified in a single -e/--event switch, separated by commas.

For example, perf stat -e irq:irq_handler_entry,irq:irq_handler_exit ...  fails
because the tracepoint event parsing code doesn't recognize the comma separator
properly.

This patch corrects those issues.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Michael Ellerman <michaele@au1.ibm.com>
LKML-Reference: <1291156021-17711-1-git-send-email-cjashfor@linux.vnet.ibm.com>
Signed-off-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

4c635a4e
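
The idea behind the commit above can be illustrated with a minimal sketch. This is not perf's parser; the helper name `count_event_specs` is hypothetical, and it only shows how an -e argument can be walked one comma-separated specifier at a time.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not perf's parser: count the comma-separated
 * event specifiers in an -e argument such as
 * "irq:irq_handler_entry,irq:irq_handler_exit".
 * Empty specifiers (for example from "a,,b") are skipped.
 */
size_t count_event_specs(const char *arg)
{
	size_t count = 0;

	while (*arg != '\0') {
		/* Length of the current specifier, up to ',' or end of string. */
		size_t len = strcspn(arg, ",");

		if (len > 0)
			count++;
		arg += len;
		if (*arg == ',')
			arg++;	/* step over the separator */
	}
	return count;
}
```

A real parser would hand each specifier to the tracepoint lookup code instead of just counting it.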

perf packaging: add memcpy to perf MANIFEST · 3e8e24f2

Authored by Don Zickus on Nov 30, 2010

There seems to be a new dependency on arch/*/lib/memcpy*.S when compiling
the perf tool.  Make sure that file is included in the MANIFEST when
creating the tarball.

Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1291155133-3499-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

3e8e24f2

perf debug: Simplify trace_event · 5b1c1444

Authored by Arnaldo Carvalho de Melo on Nov 30, 2010

No need to check that many times if debug_trace is on.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

5b1c1444

perf session: Allocate chunks of sample objects · 5c891f38

Authored by Thomas Gleixner on Nov 30, 2010

The ordered sample code allocates individual struct sample_queue reference
objects, which are 48 bytes on 64bit and 20 bytes on 32bit. That's
silly. Allocate ~64k sized chunks and hand them out.

Performance gain: ~ 15%

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.398713983@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

5c891f38
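
A rough sketch of the chunk allocation described in the commit above, under stated assumptions: `struct sample_ref`, `struct ref_pool` and `ref_pool_get()` are hypothetical names, not the ones used in perf, and the bookkeeping needed to free exhausted chunks is left out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for perf's struct sample_queue. */
struct sample_ref {
	unsigned long long timestamp;
	void *event;
};

#define CHUNK_BYTES	(64 * 1024)
#define REFS_PER_CHUNK	(CHUNK_BYTES / sizeof(struct sample_ref))

struct ref_pool {
	struct sample_ref *chunk;	/* current chunk, NULL before first use */
	size_t used;			/* objects handed out from the current chunk */
};

/*
 * Hand out one object, calling malloc() only once per 64 KiB chunk.
 * Exhausted chunks are not tracked here; a real implementation would
 * keep them on a list so they can be freed later.
 */
struct sample_ref *ref_pool_get(struct ref_pool *pool)
{
	if (pool->chunk == NULL || pool->used == REFS_PER_CHUNK) {
		pool->chunk = malloc(CHUNK_BYTES);
		if (pool->chunk == NULL)
			return NULL;
		pool->used = 0;
	}
	return &pool->chunk[pool->used++];
}
```

Consecutive calls return adjacent objects from the same chunk, so the per-object malloc() overhead disappears.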

perf session: Cache sample objects · 020bb75a

Authored by Thomas Gleixner on Nov 30, 2010

When the sample queue is flushed we free the sample reference objects, even
though we need to malloc new objects when we process further. Stop the
malloc/free orgy and cache the already allocated objects for reuse. Only
allocate when the cache is empty.

Performance gain: ~ 10%

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.338488630@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

020bb75a
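
The caching idea in the commit above can be sketched as a simple free list. The type and function names below are hypothetical and the structure is reduced to the minimum; perf's actual code is not shown in this log.

```c
#include <stdlib.h>

/* Hypothetical cached object; 'next' is only used while it is cached. */
struct cached_ref {
	struct cached_ref *next;
	unsigned long long timestamp;
};

struct ref_cache {
	struct cached_ref *free_list;
};

/* Take an object from the cache; fall back to malloc() when it is empty. */
struct cached_ref *ref_cache_get(struct ref_cache *cache)
{
	struct cached_ref *ref = cache->free_list;

	if (ref != NULL) {
		cache->free_list = ref->next;
		return ref;
	}
	return malloc(sizeof(*ref));
}

/* Instead of calling free(), park the object on the cache for reuse. */
void ref_cache_put(struct ref_cache *cache, struct cached_ref *ref)
{
	ref->next = cache->free_list;
	cache->free_list = ref;
}
```

An object that is put back is the first one handed out again, so a flush followed by further processing does not touch the allocator.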

perf session: Keep file mmaped instead of malloc/memcpy · fe174207

Authored by Thomas Gleixner on Nov 30, 2010

Profiling perf with perf revealed that a large part of the processing time is
spent in malloc/memcpy/free in the sample ordering code. That code copies the
data from the mmap into malloc'ed memory. That's silly. We can keep the mmap
and just store the pointer in the queuing data structure. For 64 bit this is
not a problem as we map the whole file anyway. On 32bit we keep 8 maps around
and unmap the oldest before mmaping the next chunk of the file.

Performance gain: 2.95s -> 1.23s (factor 2.4)

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.278787719@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

fe174207

perf session: Use sensible mmap size · 55b44629

Authored by Thomas Gleixner on Nov 30, 2010

On 64bit we can map the whole file in one go, on 32bit we can at least map
32MB and not map/unmap tiny chunks of the file.

Base the progress bar on 1/16 of the data size.

Preparatory patch to get rid of the malloc/memcpy/free of trace data.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.213687773@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

55b44629

perf session: Simplify termination checks · d6513281

Authored by Thomas Gleixner on Nov 30, 2010

No need to check twice.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.152886642@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

d6513281

perf session: Move ui_progress_update in __perf_session__process_events() · 85b99952

Authored by Thomas Gleixner on Nov 30, 2010

The progress bar is changed when the file offset changes. This happens only
when the next mmap is done. No need to call ui_progress_update() for every
event.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.094836523@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

85b99952

perf session: Cleanup __perf_session__process_events() · 0331ee0c

Authored by Thomas Gleixner on Nov 30, 2010

Replace the pseudo C++ self argument with session and give the mmap related
variables a sensible name. shift is a complete misnomer - it took me several
rounds of cursing to figure out that it's not a shift value.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.029687218@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

0331ee0c

perf session: Use appropriate pointer type instead of silly typecasting · 28990f75

Authored by Thomas Gleixner on Nov 30, 2010

There is no reason to use a struct sample_event pointer in struct sample_queue
and type cast it when flushing the queue.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163819.969462809@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

28990f75

perf session: Fix list sort algorithm · a1225dec

Authored by Thomas Gleixner on Nov 30, 2010

The homebrewed sort algorithm fails to sort in time order. One of the problem
spots is that it fails to deal with equal timestamps correctly.

My first gut reaction was to replace the fancy list with an rbtree, but the
performance is 3 times worse.

Rewrite it so it works.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163819.908482530@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

a1225dec
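
To illustrate the equal-timestamp problem mentioned in the commit above, here is a minimal sketch of a time-ordered list insert that keeps entries with the same timestamp in arrival order. It is an assumption-laden illustration with hypothetical names, not the rewritten algorithm from the patch, which is not shown in this log.

```c
#include <stddef.h>

struct ts_node {
	unsigned long long timestamp;
	int seq;		/* arrival order, for illustration only */
	struct ts_node *next;
};

/*
 * Insert 'node' into a list sorted by ascending timestamp. Walking past
 * entries with an equal timestamp (<=, not <) keeps equal-timestamp
 * entries in arrival order.
 */
void ts_list_insert(struct ts_node **head, struct ts_node *node)
{
	struct ts_node **link = head;

	while (*link != NULL && (*link)->timestamp <= node->timestamp)
		link = &(*link)->next;

	node->next = *link;
	*link = node;
}
```

Using `<` instead of `<=` in the loop condition would place a later event ahead of an earlier one with the same timestamp.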

perf events: Precalculate the header space for PERF_SAMPLE_ fields · c320c7b7

Authored by Arnaldo Carvalho de Melo on Oct 20, 2010

PERF_SAMPLE_{CALLCHAIN,RAW} have variable lengths per sample, but the others
can be precalculated, slightly reducing the per-sample cost.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

c320c7b7

27 Nov, 2010 5 commits

perf tools: Fix lost and unknown events handling · 068ffaa8

Authored by Arnaldo Carvalho de Melo on Nov 27, 2010

Fix it by explaining what can be happening and giving the number of processed
and lost events.

Also holler if unknown events were found. That can happen when a perf.data
file collected with a newer tool, where newer events got added, is reported
using an older perf tool. Otherwise it is a bug, so ask for a report to be
made.

Works on both --tui and --stdio.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

068ffaa8

perf trace: Handle DT_UNKNOWN on filesystems that don't support d_type · 008f29d3

Authored by Shawn Bohrer on Nov 21, 2010

Some filesystems like xfs and reiserfs will return DT_UNKNOWN for the
d_type.  Handle this case by calling stat() to determine the type.

Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290355779-3276-1-git-send-email-sbohrer@rgmadvisors.com>
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

008f29d3
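
The fallback described in the commit above can be sketched as follows. The helper name `entry_is_dir` is hypothetical, and the snippet only shows the general pattern of consulting stat() when readdir() reports DT_UNKNOWN; it is not the code from the patch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for DT_DIR and DT_UNKNOWN in <dirent.h> */
#endif
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Return 1 if the entry 'de' found in directory 'dir_path' is a
 * directory, 0 otherwise (including when stat() fails).
 */
int entry_is_dir(const char *dir_path, const struct dirent *de)
{
	char path[PATH_MAX];
	struct stat st;

	if (de->d_type == DT_DIR)
		return 1;
	if (de->d_type != DT_UNKNOWN)
		return 0;

	/* The filesystem did not report a type: ask stat() instead. */
	snprintf(path, sizeof(path), "%s/%s", dir_path, de->d_name);
	if (stat(path, &st) != 0)
		return 0;
	return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

The extra stat() call is only paid for on filesystems that do not fill in d_type.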

perf symbols: Correct final kernel map guesses · 9d1faba5

Authored by Ian Munsie on Nov 25, 2010

If a 32bit userspace perf is running on a 64bit kernel, the end of the final
map in the kernel would incorrectly be set to 2^32-1 rather than 2^64-1.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290658375-10342-1-git-send-email-imunsie@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

9d1faba5
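
The width issue behind the commit above comes down to which integer type the "all ones" value is built from. The sketch below uses a hypothetical `struct addr_map` and helper; the exact expression used in the patch is not shown in this log.

```c
#include <stdint.h>

/* Hypothetical map type; perf's own struct map has more fields. */
struct addr_map {
	uint64_t start;
	uint64_t end;
};

/*
 * Extend the final map to the top of the 64-bit address space.
 * UINT64_MAX is 2^64-1 regardless of the width of 'unsigned long',
 * whereas ~0UL would be only 2^32-1 in a 32-bit build.
 */
void fixup_last_map_end(struct addr_map *last)
{
	last->end = UINT64_MAX;
}
```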

perf events: Default to using event__process_lost · 37982ba0

Authored by Arnaldo Carvalho de Melo on Nov 26, 2010

Tool developers have to fill in a 'perf_event_ops' method table to
specify how to handle each event. So far the ones that were not
explicitly specified would get a stub that would just discard the
event.

Change that so that tool developers can get the lost event details and
the total number of such events at the end of 'perf report -D' output.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

37982ba0

perf record: Add option to disable collecting build-ids · baa2f6ce

Authored by Arnaldo Carvalho de Melo on Nov 26, 2010

Collecting build-ids for long running sessions may take a long time
because it needs to traverse the whole just collected perf.data stream
of events, marking the DSOs that had hits and then looking for the
.note.gnu.build-id ELF section.

For things like the 'trace' tool that records and right away consumes
the data on systems where it's unlikely that the DSOs being monitored
will change while 'trace' runs, it is desirable to remove build id
collection, so add a -B/--no-buildid option to perf record to allow such
use case.

Longer term we'll avoid all this if we, at DSO load time, in the kernel,
take advantage of this slow code path to collect the build-id and stash
it somewhere, so that we can insert it in the PERF_RECORD_MMAP event.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

baa2f6ce

26 Nov, 2010 16 commits

perf, x86: P4 PMU - describe config format · af86da53

Authored by Cyrill Gorcunov on Nov 26, 2010

Add a description of .config for the sake of RAW events.
At least this should shed some light for those who
will be reading this code.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

af86da53

perf, arch: Cleanup perf-pmu init vs lockup-detector · 004417a6

Authored by Peter Zijlstra on Nov 25, 2010

The perf hardware pmu got initialized at various points in the boot,
some before early_initcall() some after (notably arch_initcall).

The problem is that the NMI lockup detector is run from early_initcall()
and expects the hardware pmu to be present.

Sanitize this by moving all architecture hardware pmu implementations to
initialize at early_initcall() and move the lockup detector to an explicit
initcall right after that.

Cc: paulus <paulus@samba.org>
Cc: davem <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290707759.2145.119.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

004417a6

x86: Set cpu masks before calling CPU_STARTING notifiers · 5ef428c4

Authored by Andi Kleen on Nov 18, 2010

When booting up a CPU set the various topology masks before
calling the CPU_STARTING notifier. This way the notifier
can actually use the masks.

This is needed for a perf change.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290077254-12165-2-git-send-email-andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5ef428c4

perf: Ignore non-sampling overflows · 96398826

Authored by Peter Zijlstra on Nov 24, 2010

Some arch implementations call perf_event_overflow() by 'accident',
ignore this.

Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

96398826

perf: Don't bother to init the hrtimer for no SW sampling counters · 5d508e82

Authored by Franck Bui-Huu on Nov 23, 2010

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290525705-6265-3-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5d508e82

perf: Limit event refresh to sampling event · 2e939d1d

Authored by Franck Bui-Huu on Nov 23, 2010

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290525705-6265-2-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

2e939d1d

perf: Introduce is_sampling_event() · 6c7e550f

Authored by Franck Bui-Huu on Nov 23, 2010

and use it when appropriate.

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290525705-6265-1-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6c7e550f

scripts/tags.sh: Add magic for trace-events · 35d3778a

Authored by Peter Zijlstra on Nov 24, 2010

Make tags find the trace-event definitions.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
LKML-Reference: <1290591835.2072.438.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

35d3778a

Merge branch 'perf/urgent' into perf/core · 6c869e77

Authored by Ingo Molnar on Nov 26, 2010

Conflicts:
	arch/x86/kernel/apic/hw_nmi.c

Merge reason: Resolve conflict, queue up dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

6c869e77

Merge commit 'v2.6.37-rc3' into perf/core · e4e91ac4

Authored by Ingo Molnar on Nov 26, 2010

Merge reason: Pick up latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

e4e91ac4

perf: Fix the software context switch counter · ee6dcfa4

Authored by Peter Zijlstra on Nov 26, 2010

Stephane noticed that because the perf_sw_event() call is inside the
perf_event_task_sched_out() call it won't get called unless we
have a per-task counter.

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ee6dcfa4

perf, x86: Fixup Kconfig deps · cc2067a5

Authored by Peter Zijlstra on Nov 16, 2010

This leads to a Kconfig dep inversion, x86 selects PERF_EVENT (due to
a hw_breakpoint dep) but doesn't unconditionally provide
HAVE_PERF_EVENT.

(This can cause build failures on M386/M486 kernel .config's.)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222055.982965150@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

cc2067a5

x86, perf, nmi: Disable perf if counters are not accessible · 33c6d6a7

Authored by Don Zickus on Nov 22, 2010

In kvm virt guests, the perf counters are not emulated.  Instead they
return zero on a rdmsrl. The perf nmi handler uses the fact that crossing
a zero means the counter overflowed (for those counters that do not have
specific interrupt bits). Therefore on kvm guests, perf will swallow all
NMIs thinking the counters overflowed.

This causes problems for subsystems like kgdb which need NMIs to do their
magic. This problem was discovered by running kgdb tests.

The solution is to write garbage into a perf counter during the
initialization and hopefully reading back the same number.  On kvm
guests, the value will be read back as zero and we disable perf as
a result.

Reported-by: Jason Wessel <jason.wessel@windriver.com>
Patch-inspired-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1290462923-30734-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

33c6d6a7

perf: Fix inherit vs. context rotation bug · dddd3379

Authored by Thomas Gleixner on Nov 24, 2010

It was found that sometimes children of tasks with inherited events had
one extra event. Eventually it turned out to be due to the list rotation
not being exclusive with the list iteration in the inheritance code.

Cure this by temporarily disabling the rotation while we inherit the events.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

dddd3379

perf bench: Add feature that measures the performance of the... · ea7872b9

Authored by Hitoshi Mitake on Nov 25, 2010

perf bench: Add feature that measures the performance of the arch/x86/lib/memcpy_64.S memcpy routines via 'perf bench mem'

This patch ports arch/x86/lib/memcpy_64.S to perf bench mem
memcpy for benchmarking memcpy() in userland in a tricky and
dirty way.

util/include/asm/cpufeature.h, util/include/asm/dwarf2.h, and
util/include/linux/linkage.h are mostly dummy files with small
wrappers, so that we are able to include memcpy_64.S
unmodified.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: h.mitake@gmail.com
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ma Ling <ling.ma@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1290668693-27068-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ea7872b9

perf bench: Print both of prefaulted and no prefaulted results by default · 49ce8fc6

Authored by Hitoshi Mitake on Nov 25, 2010

After applying this patch, perf bench mem memcpy prints
both the prefaulted and the non-prefaulted score of memcpy().

New options --no-prefault and --only-prefault are added
to print a single result, mainly for scripting usage.

Usage example:

 | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB
 | # Running mem/memcpy benchmark...
 | # Copying 500MB Bytes ...
 |
 |      634.969014 MB/Sec
 |        4.828062 GB/Sec (with prefault)
 | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --only-prefault
 | # Running mem/memcpy benchmark...
 | # Copying 500MB Bytes ...
 |
 |        4.705192 GB/Sec (with prefault)
 | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --no-prefault
 | # Running mem/memcpy benchmark...
 | # Copying 500MB Bytes ...
 |
 |      642.725568 MB/Sec

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: h.mitake@gmail.com
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ma Ling <ling.ma@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1290668693-27068-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

49ce8fc6

24 Nov, 2010 1 commit

perf symbols: Remove incorrect open-coded container_of() · 02a9d037

Authored by Rabin Vincent on Nov 23, 2010

At least on ARM, padding is inserted between rb_node and sym in struct
symbol_name_rb_node, causing "((void *)sym) - sizeof(struct rb_node)" to
point inside rb_node rather than to the symbol_name_rb_node.  Fix this
by converting the code to use container_of().

Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20101123163106.GA25677@debian>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

02a9d037
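
Why the open-coded subtraction in the commit above breaks can be shown with a small sketch. The struct names are hypothetical stand-ins for rb_node, symbol and symbol_name_rb_node, chosen so that the compiler inserts padding between the two members; the macro is a simplified userspace version of the kernel's container_of().

```c
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical types: 'double' forces alignment padding after 'node'. */
struct small_node {
	char color;
};

struct padded_sym {
	double value;
};

struct name_node {
	struct small_node node;
	struct padded_sym sym;
};

/* Recover the enclosing name_node from a pointer to its 'sym' member. */
struct name_node *name_node_of(struct padded_sym *sym)
{
	return container_of(sym, struct name_node, sym);
}
```

Subtracting `sizeof(struct small_node)` from the `sym` pointer would land inside the padding rather than at the start of the enclosing structure, whereas offsetof() accounts for the padding.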

23 Nov, 2010 1 commit

perf record: Handle restrictive permissions in /proc/{kallsyms,modules} · c1a3a4b9

Authored by Arnaldo Carvalho de Melo on Nov 22, 2010

The 59365d13 commit, even being reverted by 33e0d57f, showed a non robust
behavior in 'perf record': it really should just warn the user that some
functionality will not be available.

The new behavior then becomes:

	[acme@felicio linux]$ ls -la /proc/{kallsyms,modules}
	-r-------- 1 root root 0 Nov 22 12:19 /proc/kallsyms
	-r-------- 1 root root 0 Nov 22 12:19 /proc/modules
	[acme@felicio linux]$ perf record ls -R > /dev/null
	Couldn't record kernel reference relocation symbol
	Symbol resolution may be skewed if relocation was used (e.g. kexec).
	Check /proc/kallsyms permission or run as root.
	[ perf record: Woken up 1 times to write data ]
	[ perf record: Captured and wrote 0.004 MB perf.data (~161 samples) ]
	[acme@felicio linux]$ perf report --stdio
	[kernel.kallsyms] with build id 77b05e00e64e4de1c9347d83879779b540d69f00 not found, continuing without symbols
	# Events: 98  cycles
	#
	# Overhead  Command    Shared Object                Symbol
	# ........  .......  ...............  ....................
	#
	    48.26%       ls  [kernel]         [k] ffffffff8102b92b
	    22.49%       ls  libc-2.12.90.so  [.] __strlen_sse2
	     8.35%       ls  libc-2.12.90.so  [.] __GI___strcoll_l
	     8.17%       ls  ls               [.]            11580
	     3.35%       ls  libc-2.12.90.so  [.] _IO_new_file_xsputn
	     3.33%       ls  libc-2.12.90.so  [.] _int_malloc
	     1.88%       ls  libc-2.12.90.so  [.] _int_free
	     0.84%       ls  libc-2.12.90.so  [.] malloc_consolidate
	     0.84%       ls  libc-2.12.90.so  [.] __readdir64
	     0.83%       ls  ls               [.] strlen@plt
	     0.83%       ls  libc-2.12.90.so  [.] __GI_fwrite_unlocked
	     0.83%       ls  libc-2.12.90.so  [.] __memcpy_sse2

	#
	# (For a higher level overview, try: perf report --sort comm,dso)
	#
	[acme@felicio linux]$

It still has the build-ids for DSOs in the maps with hits:

	[acme@felicio linux]$ perf buildid-list
	77b05e00e64e4de1c9347d83879779b540d69f00 [kernel.kallsyms]
	09c4a431a4a8b648fcfc2c2bdda70f56050ddff1 /bin/ls
	af75ea9ad951d25e0f038901a11b3846dccb29a4 /lib64/libc-2.12.90.so
	[acme@felicio linux]$

That can be used in another machine to resolve kernel symbols.

Cc: Eugene Teo <eugeneteo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Marcus Meissner <meissner@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

c1a3a4b9

22 Nov, 2010 1 commit

Linux 2.6.37-rc3 · 3561d43f

Authored by Linus Torvalds on Nov 21, 2010

3561d43f

20 Nov, 2010 3 commits

perf stat: Change and clean up sys_perf_event_open error handling · d9cf837e

Authored by Corey Ashford on Nov 19, 2010

This patch makes several changes to "perf stat":

- "perf stat" will no longer go ahead and run the application when one or
more of the specified events could not be opened.
- Use error() and die() instead of pr_err() so that the output is more
consistent with "perf top" and "perf record".
- Handle permission errors in a more robust way, and in a similar way to
"perf record" and "perf top".

In addition, the sys_perf_event_open() error handling of "perf top" and "perf
record" is made more consistent and adds the following phrase when an event
doesn't open (with something other than an access or permission error):

"/bin/dmesg may provide additional information."

This is added because kernel code doesn't have a good way of expressing
detailed errors to user space, so its only avenue is to use printk's.  However,
many users may not think of looking at dmesg to find out why an event is being
rejected.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <fweisbec@gmail.com>
Cc: Ian Munsie <ianmunsi@au1.ibm.com>
Cc: Michael Ellerman <michaele@au1.ibm.com>
LKML-Reference: <1290217044-26293-1-git-send-email-cjashfor@linux.vnet.ibm.com>
Signed-off-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

d9cf837e

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · b86db474

Authored by Linus Torvalds on Nov 19, 2010

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard
  fs: Do not dispatch FITRIM through separate super_operation
  ext4: ext4_fill_super shouldn't return 0 on corruption
  jbd2: fix /proc/fs/jbd2/<dev> when using an external journal
  ext4: missing unlock in ext4_clear_request_list()
  ext4: fix setting random pages PageUptodate

b86db474

ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard · e681c047

Authored by Lukas Czerner on Nov 19, 2010

A filesystem independent ioctl was rejected as not common enough to be in
the core vfs ioctl. Since we still need access to this functionality, this
commit adds the ext4 specific ioctl EXT4_IOC_TRIM to dispatch
ext4_trim_fs().

It takes an fstrim_range structure as an argument. fstrim_range is defined in
include/linux/fs.h and its definition is as follows.

struct fstrim_range {
	__u64 start;
	__u64 len;
	__u64 minlen;
};

start	- first Byte to trim
len	- number of Bytes to trim from start
minlen	- minimum extent length to trim, free extents shorter than this
  number of Bytes will be ignored. This will be rounded up to fs
  block size.

After the FITRIM is done, the number of actually discarded Bytes is stored
in fstrim_range.len to give the user better insight on how much storage
space has been really released for wear-leveling.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e681c047
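
As a rough userspace illustration of the fstrim_range fields described in the commit above, the sketch below issues a trim request. It assumes the generic FITRIM request from <linux/fs.h> rather than the EXT4_IOC_TRIM name used in the commit message, and the wrapper name `trim_range` is hypothetical.

```c
#include <linux/fs.h>		/* struct fstrim_range, FITRIM */
#include <stdint.h>
#include <sys/ioctl.h>

/*
 * Ask the filesystem behind 'fd' to discard free extents in the byte
 * range [start, start + len). On success, returns 0 and stores the number
 * of bytes actually discarded in *trimmed; returns -1 on failure.
 */
int trim_range(int fd, uint64_t start, uint64_t len, uint64_t minlen,
	       uint64_t *trimmed)
{
	struct fstrim_range range = {
		.start = start,
		.len = len,
		.minlen = minlen,
	};

	if (ioctl(fd, FITRIM, &range) != 0)
		return -1;

	*trimmed = range.len;	/* updated by the kernel on return */
	return 0;
}
```

A caller would typically open the mount point, pass `start = 0` and `len = UINT64_MAX` to cover the whole filesystem, and report `*trimmed` to the user.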