提交 · b9cacc7bf193df16532bfa7d7ca77fe50fc3c2e6 · openeuler / raspberrypi-kernel

06 4月, 2009 40 次提交

perf_counter: more elaborate write API · b9cacc7b

由 Peter Zijlstra 提交于 3月 25, 2009

Provide a begin, copy, end interface to the output buffer.

begin() reserves the space,
 copy() copies the data over, considering page boundaries,
  end() finalizes the event and does the wakeup.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Orig-LKML-Reference: <20090325113316.740550870@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b9cacc7b

perf_counter: fix perf_poll() · c7138f37

由 Peter Zijlstra 提交于 3月 24, 2009

Impact: fix kerneltop 100% CPU usage

Only return a poll event when there's actually been one, poll_wait()
doesn't actually wait for the waitq you pass it, it only enqueues
you on it.

Only once all FDs have been iterated and none of thm returned a
poll-event will it schedule().

Also make it return POLL_HUP when there's not mmap() area to read from.

Further, fix a silly bug in the write code.
Reported-by: NMike Galbraith <efault@gmx.de>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Orig-LKML-Reference: <1237897096.24918.181.camel@twins>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c7138f37

perf_counter: update documentation · f66c6b20

由 Paul Mackerras 提交于 3月 23, 2009

Impact: documentation fix

This updates the perfcounter documentation to reflect recent changes.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

f66c6b20

perf_counter tools: remove glib dependency and fix bugs in kerneltop.c, fix poll() · 0fd112e4

由 Peter Zijlstra 提交于 3月 24, 2009

Paul Mackerras wrote:

> I noticed the poll stuff is bogus - we have a 2D array of struct
> pollfds (MAX_NR_CPUS x MAX_COUNTERS), we fill in a sub-array (with the
> rest being uninitialized, since the array is on the stack) and then
> pass the first nr_cpus elements to poll.  Not what we really meant, I
> suspect. :)  Not even if we only have one counter, since it's the
> counter dimension that varies fastest.

This should fix the most obvious poll fubar.. not enough to fix the
full problem though..
Reported-by: NPaul Mackerras <paulus@samba.org>
Reported-by: NMike Galbraith <efault@gmx.de>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Orig-LKML-Reference: <18888.29986.340328.540512@cargo.ozlabs.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0fd112e4

perf_counter tools: remove glib dependency and fix bugs in kerneltop.c · cbe46555

由 Paul Mackerras 提交于 3月 24, 2009

The glib dependency in kerneltop.c is only for a little bit of list
manipulation, and I find it inconvenient.  This adds a 'next' field to
struct source_line, which lets us link them together into a list.  The
code to do the linking ourselves turns out to be no longer or more
difficult than using glib.

This also fixes a few other problems:

- We need to #include <limits.h> to get PATH_MAX on powerpc.

- We need to #include <linux/types.h> rather than have our own
  definitions of __u64 and __s64; on powerpc the installed headers
  define them to be unsigned long and long respectively, and if we
  have our own, different definition here that causes a compile error.

- This takes out the x86 setting of errno from -ret in
  sys_perf_counter_open.  My experiments on x86 indicate that the
  glibc syscall() does this for us already.

- We had two CPU migration counters in the default set, which seems
  unnecessary; I changed one of them to a context switch counter.

- In perfstat mode we were printing CPU cycles and instructions as
  milliseconds, and the cpu clock and task clock counters as events.
  This fixes that.

- In perfstat mode we were still printing a blank line after the first
  counter, which was a holdover from when a task clock counter was
  automatically included as the first counter.  This removes the blank
  line.

- On a test machine here, parse_symbols() and parse_vmlinux() were
  taking long enough (almost 0.5 seconds) for the mmap buffer to
  overflow before we got to the first mmap_read() call, so this moves
  them before we open all the counters.

- The error message if sys_perf_counter_open fails needs to use errno,
  not -fd[i][counter].
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NMike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Orig-LKML-Reference: <18888.29986.340328.540512@cargo.ozlabs.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cbe46555

perf_counter tools: increase cpu-cycles again · 81cdbe05

由 Ingo Molnar 提交于 3月 23, 2009

Commit b7368fdd7d decreased the CPU cycles interval 100-fold, but
this is causig kerneltop failures on my Nehalem box:

 aldebaran:/home/mingo/linux/linux/Documentation/perf_counter>
 ./kerneltop
 KernelTop refresh period: 2 seconds
 ERROR: failed to keep up with mmap data

10,000 cycles is way too short.

What we should do instead on mostly-idle systems is some sort of
read/poll timeout, so that we display something every 2 seconds
for sure.

Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

81cdbe05

perf_counter tools: fix build warning in kerneltop.c · 193e8df1

由 Ingo Molnar 提交于 3月 23, 2009

Fix:

 kerneltop.c: In function ‘record_ip’:
 kerneltop.c:1005: warning: format ‘%016llx’ expects type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’

Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090323172417.677932499@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

193e8df1

perf_counter tools: tidy up in-kernel dependencies · 383c5f8c

由 Ingo Molnar 提交于 3月 23, 2009

Remove now unified perfstat.c and perf_counter.h, and link to the
in-kernel perf_counter.h.

Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090323172417.677932499@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

383c5f8c

perf_counter tools: use mmap() output · bcbcb37c

由 Peter Zijlstra 提交于 3月 23, 2009

update kerneltop to use the mmap() output to gather overflow information
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.677932499@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bcbcb37c

perf_counter tools: update to new syscall ABI · 803d4f39

由 Peter Zijlstra 提交于 3月 23, 2009

update the kerneltop userspace to work with the latest syscall ABI
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.559643732@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

803d4f39

perf_counter: new output ABI - part 1 · 7b732a75

由 Peter Zijlstra 提交于 3月 23, 2009

Impact: Rework the perfcounter output ABI

use sys_read() only for instant data and provide mmap() output for all
async overflow data.

The first mmap() determines the size of the output buffer. The mmap()
size must be a PAGE_SIZE multiple of 1+pages, where pages must be a
power of 2 or 0. Further mmap()s of the same fd must have the same
size. Once all maps are gone, you can again mmap() with a new size.

In case of 0 extra pages there is no data output and the first page
only contains meta data.

When there are data pages, a poll() event will be generated for each
full page of data. Furthermore, the output is circular. This means
that although 1 page is a valid configuration, its useless, since
we'll start overwriting it the instant we report a full page.

Future work will focus on the output format (currently maintained)
where we'll likey want each entry denoted by a header which includes a
type and length.

Further future work will allow to splice() the fd, also containing the
async overflow data -- splice() would be mutually exclusive with
mmap() of the data.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.470536358@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7b732a75

mutex: drop "inline" from mutex_lock() inside kernel/mutex.c · b09d2501

由 H. Peter Anvin 提交于 4月 01, 2009

Impact: build fix

mutex_lock() is was defined inline in kernel/mutex.c, but wasn't
declared so not in <linux/mutex.h>.  This didn't cause a problem until
checkin 3a2d367d9aabac486ac4444c6c7ec7a1dab16267 added the
atomic_dec_and_mutex_lock() inline in between declaration and
definion.

This broke building with CONFIG_ALLOW_WARNINGS=n, e.g. make
allnoconfig.

Either from the source code nor the allnoconfig binary output I cannot
find any internal references to mutex_lock() in kernel/mutex.c, so
presumably this "inline" is now-useless legacy.

Cc: Eric Paris <eparis@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <tip-3a2d367d9aabac486ac4444c6c7ec7a1dab16267@git.kernel.org>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

b09d2501

mutex: add atomic_dec_and_mutex_lock() · 9ab772cd

由 Eric Paris 提交于 3月 23, 2009

Much like the atomic_dec_and_lock() function in which we take an hold a
spin_lock if we drop the atomic to 0 this function takes and holds the
mutex if we dec the atomic to 0.
Signed-off-by: NEric Paris <eparis@redhat.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.410913479@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9ab772cd

perf_counter: add an mmap method to allow userspace to read hardware counters · 37d81828

由 Paul Mackerras 提交于 3月 23, 2009

Impact: new feature giving performance improvement

This adds the ability for userspace to do an mmap on a hardware counter
fd and get access to a read-only page that contains the information
needed to translate a hardware counter value to the full 64-bit
counter value that would be returned by a read on the fd.  This is
useful on architectures that allow user programs to read the hardware
counters, such as PowerPC.

The mmap will only succeed if the counter is a hardware counter
monitoring the current process.

On my quad 2.5GHz PowerPC 970MP machine, userspace can read a counter
and translate it to the full 64-bit value in about 30ns using the
mmapped page, compared to about 830ns for the read syscall on the
counter, so this does give a significant performance improvement.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090323172417.297057964@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

37d81828

perf_counter: avoid recursion · 96f6d444

由 Peter Zijlstra 提交于 3月 23, 2009

Tracepoint events like lock_acquire and software counters like
pagefaults can recurse into the perf counter code again, avoid that.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.152096433@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

96f6d444

perf_counter: remove the event config bitfields · f4a2deb4

由 Peter Zijlstra 提交于 3月 23, 2009

Since the bitfields turned into a bit of a mess, remove them and rely on
good old masks.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.059499915@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f4a2deb4

perf_counter tools: when no command is feed to perfstat, display help and exit · af9522cf