Commits · 429eb051011a580beae2dc9f8caed5dade9591dc · openanolis / cloud-kernel

22 Jul, 2013 1 commit

perf bench: Fix memcpy benchmark for large sizes · a198996c

Authored by Andi Kleen on Jul 18, 2013

The glibc calloc() function has an optimization to not explicitly
memset() very large calloc allocations that just came from mmap(),
because they are known to be zero.

This could result in the perf memcpy benchmark reading only from
the zero page, which gives unrealistic results.

Always call memset explicitly on the source area to avoid this problem.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-pzz2qrdq9eymxda0y8yxdn33@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
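
The effect described in this commit can be sketched as follows. This is a minimal illustration, not the code from the patch: alloc_source() is a hypothetical helper, and the non-zero fill byte is an assumption chosen so that a compiler cannot discard the store as redundant after calloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate a source buffer for a copy benchmark (hypothetical helper).
 * For large sizes calloc() may return untouched, zero-filled mmap() pages,
 * so write to the whole area explicitly before measuring.
 */
static unsigned char *alloc_source(size_t len)
{
	unsigned char *src = calloc(1, len);

	if (src == NULL)
		return NULL;

	/* Touch every byte so each page gets its own backing memory. */
	memset(src, 0xa5, len);
	return src;
}
```

A caller would pass the returned buffer as the memcpy() source, so the copy reads per-page data rather than the shared zero page.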

09 Jul, 2013 1 commit

perf bench: Fix memory allocation fail check in mem{set,cpy} workloads · 13966721

Authored by Kirill A. Shutemov on Jun 06, 2013

The addresses of the allocated memory areas are saved to '*src' and '*dst', so
we need to check those for NULL, not 'src' and 'dst'.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1370518503-4230-1-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
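
The distinction between the out-parameters and the values they point to can be sketched like this. The helper name alloc_mem() and its return convention are assumptions made for illustration; the patch itself is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Allocate both benchmark buffers and hand them back through
 * out-parameters (hypothetical helper). Returns 0 on success, -1 on failure.
 */
static int alloc_mem(void **src, void **dst, size_t len)
{
	*src = calloc(1, len);
	*dst = calloc(1, len);

	/*
	 * 'src' and 'dst' are the caller's slots and are never NULL here;
	 * the allocation results live in '*src' and '*dst'.
	 */
	if (*src == NULL || *dst == NULL) {
		free(*src);
		free(*dst);
		*src = *dst = NULL;
		return -1;
	}
	return 0;
}
```

Checking 'src' instead of '*src' would always pass, because the caller supplies a valid address for the slot even when the allocation itself failed.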

14 Mar, 2013 1 commit

perf tools: Fix LIBNUMA build with glibc 2.12 and older. · d1398ccf

Authored by Vinson Lee on Mar 13, 2013

The tokens MADV_HUGEPAGE and MADV_NOHUGEPAGE are not available with
glibc 2.12 and older. Define these tokens if they are not already
defined.

This patch fixes these build errors with older versions of glibc.

    CC bench/numa.o
    bench/numa.c: In function ‘alloc_data’:
    bench/numa.c:334: error: ‘MADV_HUGEPAGE’ undeclared (first use in this function)
    bench/numa.c:334: error: (Each undeclared identifier is reported only once
    bench/numa.c:334: error: for each function it appears in.)
    bench/numa.c:341: error: ‘MADV_NOHUGEPAGE’ undeclared (first use in this function)
    make: *** [bench/numa.o] Error 1

Signed-off-by: Vinson Lee <vlee@twitter.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Irina Tirdea <irina.tirdea@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1363214064-4671-2-git-send-email-vlee@twitter.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
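
A minimal sketch of the guard pattern this commit describes. The numeric values 14 and 15 are the ones found in the generic Linux headers and are stated here as an assumption (some architectures may differ); thp_advice() is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <sys/mman.h>

/* Fall back to fixed values when the C library headers predate THP. */
#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE   14
#endif
#ifndef MADV_NOHUGEPAGE
#define MADV_NOHUGEPAGE 15
#endif

/* Pick the madvise() advice for a "THP on or off" switch (hypothetical). */
static int thp_advice(int want_thp)
{
	return want_thp ? MADV_HUGEPAGE : MADV_NOHUGEPAGE;
}
```

With the guards in place, the same source builds against both old and new headers, and newer headers keep their own definitions.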

30 Jan, 2013 1 commit

perf: Add 'perf bench numa mem' NUMA performance measurement suite · 1c13f3c9

Authored by Ingo Molnar on Dec 06, 2012

Add a suite of NUMA performance benchmarks.

The goal was to simulate the behavior and access patterns of real NUMA
workloads, via a wide range of parameters, so this tool goes well
beyond simple bzero() measurements that most NUMA micro-benchmarks use:

 - It processes the data and creates a chain of data dependencies,
   like a real workload would. Neither the compiler, nor the
   kernel (via KSM and other optimizations) nor the CPU can
   eliminate parts of the workload.

 - It randomizes the initial state and also randomizes the target
   addresses of the processing - it's not a simple forward scan
   of addresses.

 - It provides flexible options to set process, thread and memory
   relationship information: -G sets "global" memory shared between
   all test processes, -P sets "process" memory shared by all
   threads of a process and -T sets "thread" private memory.

 - There's a NUMA convergence monitoring and convergence latency
   measurement option via -c and -m.

 - Micro-sleeps and synchronization can be injected to provoke lock
   contention and scheduling, via the -u and -S options. This simulates
   IO and contention.

 - The -x option instructs the workload to 'perturb' itself artificially
   every N seconds, by moving to the first and last CPU of the system
   periodically. This way the stability of convergence equilibrium and
   the number of steps taken for the scheduler to reach equilibrium again
   can be measured.

 - The amount of work can be specified via the -l loop count, and/or
   via a -s seconds-timeout value.

 - CPU and node memory binding options, to test hard binding scenarios.
   THP can be turned on and off via madvise() calls.

 - Live reporting of convergence progress in an 'at a glance' output format.
   Printing of convergence and deconvergence events.

The 'perf bench numa mem -a' option will start an array of about 30
individual tests that will each output such measurements:

 # Running  5x5-bw-thread, "perf bench numa mem -p 5 -t 5 -P 512 -s 20 -zZ0q --thp  1"
  5x5-bw-thread,                         20.276, secs,           runtime-max/thread
  5x5-bw-thread,                         20.004, secs,           runtime-min/thread
  5x5-bw-thread,                         20.155, secs,           runtime-avg/thread
  5x5-bw-thread,                          0.671, %,              spread-runtime/thread
  5x5-bw-thread,                         21.153, GB,             data/thread
  5x5-bw-thread,                        528.818, GB,             data-total
  5x5-bw-thread,                          0.959, nsecs,          runtime/byte/thread
  5x5-bw-thread,                          1.043, GB/sec,         thread-speed
  5x5-bw-thread,                         26.081, GB/sec,         total-speed

See the help text and the code for more details.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

11 Sep, 2012 1 commit

perf tools: Use __maybe_used for unused variables · 1d037ca1

Authored by Irina Tirdea on Sep 11, 2012

perf defines both __used and __unused variables to use for marking
unused variables. The variable __used is defined to
__attribute__((__unused__)), which contradicts the kernel definition to
__attribute__((__used__)) for new gcc versions. On Android, __used is
also defined in system headers and this leads to warnings like: warning:
'__used__' attribute ignored

__unused is not defined in the kernel and is not a standard definition.
If __unused is included everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has variables with this name
in its headers.

The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.

Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
[ committer note: fixed up conflict with a116e05d in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
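
As a small illustration of the annotation this commit settles on, the sketch below defines __maybe_unused the way the commit text describes (as __attribute__((unused)), a GCC/Clang extension). The function cmd_example() and its parameters are hypothetical.

```c
#include <assert.h>

/* One definition for "may be unused", spelled as in the kernel. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/* Hypothetical callback: the signature is fixed, but only argc is needed. */
static int cmd_example(int argc, const char **argv __maybe_unused,
		       const char *prefix __maybe_unused)
{
	return argc;
}
```

The annotation silences unused-parameter warnings without claiming that the symbol is used, which is what distinguishes it from __attribute__((used)).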

09 Sep, 2012 1 commit

perf bench: fix assert when NDEBUG is defined · 8bf98b89

Authored by Irina Tirdea on Sep 08, 2012

When NDEBUG is defined, the assert macro will be expanded to nothing.
Some assert calls used in perf are also including some functionality
(e.g. system calls), not only validity checks. Therefore, if NDEBUG is
defined, this functionality will be removed along with the assert.  Perf
also defines BUG_ON based on assert, so it has the same problem.

Define BUG_ON so that the condition will be executed when NDEBUG is
defined.  Replace the assert statements that have these side effects
with BUG_ON.

For defining BUG_ON, use "if (cond) {}" instead of "if (cond) ;" because
in the latter case build fails with "error: suggest braces around empty
body in an ‘if’ statement [-Werror=empty-body]"

Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347082551-2394-1-git-send-email-irina.tirdea@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
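
The difference between assert() and a BUG_ON() that always evaluates its condition can be sketched as below. The macro follows the "if (cond) {}" form named in the commit message; do_work() and run_once() are hypothetical stand-ins for a call with a side effect.

```c
#include <assert.h>

/*
 * With NDEBUG, assert(expr) expands to nothing, so 'expr' never runs.
 * This BUG_ON always evaluates 'cond'; only the check itself goes away.
 */
#ifdef NDEBUG
#define BUG_ON(cond) do { if (cond) {} } while (0)
#else
#define BUG_ON(cond) assert(!(cond))
#endif

static int calls;

/* Stand-in for a call with a side effect, e.g. a system call. */
static int do_work(void)
{
	calls++;
	return 0;	/* 0 means success */
}

static int run_once(void)
{
	BUG_ON(do_work() != 0);	/* do_work() runs in both build modes */
	return calls;
}
```

Had run_once() used assert(do_work() == 0) instead, a build with -DNDEBUG would silently skip the call.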

03 Jul, 2012 1 commit

perf bench: Fix confused variable namings and descriptions in mem subsystem · 17d7a112

Authored by Hitoshi Mitake on Jul 02, 2012

As Namhyung Kim pointed out, the words "cycle" and "clock" are used confusingly
in the names and descriptions in mem-memset.c and mem-memcpy.c.

With the option "-c" (or "--clock", now renamed to "--cycle"), the mem subsystem
measures the cost of memset() and memcpy() with the cpu-cycles event.

But the current mem subsystem source code contains many confusing variable
names and descriptions using "clock" (e.g. the variable use_clock). This is
bad style because there is another software event named "cpu-clock". This
patch replaces the wrong uses of "clock" with "cycle".

v2: modified Documentation/perf-bench.txt for the descriptions of
--cycle option

Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1341236777-18457-1-git-send-email-h.mitake@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

28 Jun, 2012 1 commit

perf bench: Documentation update · 08942f6d

Authored by Namhyung Kim on Jun 20, 2012

The current perf-bench documentation has a couple of typos and lacks any
description of the mem subsystem. Fix it.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1340172486-17805-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

07 Feb, 2012 2 commits

perf tool: Fix perf stack to non executable on x86_64 · e89cef13

Authored by Jiri Olsa on Feb 01, 2012

By adding the following objects:
  bench/mem-memset-x86-64-asm.o
  bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with an executable stack.

The reason was that the above objects are built from assembler sources and are
missing the GNU-stack note section. In such a case the linker assumes that the
final binary should not be restricted at all and marks the stack as RWX.

Add a ".note.GNU-stack" section definition to the mentioned objects, with all
flags disabled, thus omitting those objects from the linker's stack flags decision.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams <williams@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
[ committer note: Remaining bits after what was already added to perf/urgent ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Fix perf stack to non executable on x86_64 · 7a0153ee

Authored by Jiri Olsa on Feb 06, 2012

By adding the following object:
  bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with an executable stack.

The reason was that the above object is built from assembler source and is
missing the GNU-stack note section. In such a case the linker assumes that the
final binary should not be restricted at all and marks the stack as RWX.

Add a ".note.GNU-stack" section definition to the mentioned object, with all
flags disabled, thus omitting this object from the linker's stack flags decision.

Problem introduced in:

  $ git describe ea7872b9
  v2.6.37-rc2-19-gea7872b9

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams <williams@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
[ committer note: Backported fix to perf/urgent (3.3-rc2+) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

31 Jan, 2012 1 commit

perf tools: Remove unnecessary ctype.h inclusion · d30d4a08

Authored by Namhyung Kim on Jan 29, 2012

There are unnecessary #include <ctype.h> lines out there, and they might cause
a nasty build failure in some environments. As we already have most of the
ctype macros in util.h, just get rid of them.

The few exceptions are util/symbol.c, which needs the isupper() macro that
util.h doesn't provide, and the perl scripting support code, which includes
ctype.h internally.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1327827356-8786-4-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

25 Jan, 2012 4 commits

perf bench: Allow passing an iteration count to "bench mem mem{cpy,set}" · e3e877e7

Authored by Jan Beulich on Jan 18, 2012

"perf stat ... perf bench mem mem..." is pretty meaningless when using
small block sizes (as the overhead of the invocation of each test run
basically hides the actual test result in the noise). Repeating the
actually interesting function's invocation a number of times allows the
results to become meaningful.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4F16D767020000780006D738@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf bench: Also allow measuring memset() · be3de80d

Authored by Jan Beulich on Jan 24, 2012

This simply clones the respective memcpy() implementation.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/4F16D743020000780006D735@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf bench: Also allow measuring alternative memcpy implementations · 800eb014

Authored by Jan Beulich on Jan 18, 2012

Intended to be able to support the current selection of the preferred
memcpy() implementation, this patch adds the ability to also measure the
two alternative implementations, again by way of using some
pre-processor replacement.

While on my Westmere system this proves that the movsb based variant is
worse than the movsq based one (since the ERMS feature isn't there), it
also shows that here for the default as well as small sizes the unrolled
variant outperforms the movsq one.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4F16D728020000780006D732@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf bench: Make "default" memcpy() selection actually use glibc's implementation · 9ea81197

Authored by Jan Beulich on Jan 18, 2012

Since arch/x86/lib/memcpy_64.S implements not only __memcpy, but also
memcpy, without further precautions this function will get chosen by the
static linker for resolving all references, and hence the "default"
measurement didn't really measure anything else than the
"x86-64-unrolled" one.

Fix this by renaming (through the pre-processor) the conflicting symbol.

On my Westmere system, the glibc variant turns out to require about 4%
fewer instructions, but 15% more cycles for the default 1MB block size
measured.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4F16D6FD020000780006D72F@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

07 Feb, 2011 1 commit

perf tool: Fix gcc 4.6.0 issues · fb7d0b3c

Authored by Kyle McMartin on Jan 24, 2011

GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
due to the -Werror=unused-but-set-variable flag.

I've gone through and annotated some of the assignments that had side
effects (ie: return value from a function) with the __used annotation,
and in some cases, just removed unused code.

In a few cases, we were assigning something useful, but not using it in
later parts of the function.

kyle@dreadnought:~/src% gcc --version
gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)

Cc: Ingo Molnar <mingo@redhat.com>
LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
[ committer note: Fixed up the annotation fixes, as that code moved recently ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

26 Nov, 2010 2 commits

perf bench: Add feature that measures the performance of the arch/x86/lib/memcpy_64.S memcpy routines via 'perf bench mem' · ea7872b9

Authored by Hitoshi Mitake on Nov 25, 2010

This patch ports arch/x86/lib/memcpy_64.S to perf bench mem
memcpy for benchmarking memcpy() in userland in a tricky and
dirty way.

util/include/asm/cpufeature.h, util/include/asm/dwarf2.h, and
util/include/linux/linkage.h are mostly dummy files with small
wrappers, so that we are able to include memcpy_64.S
unmodified.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: h.mitake@gmail.com
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ma Ling <ling.ma@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1290668693-27068-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf bench: Print both of prefaulted and no prefaulted results by default · 49ce8fc6

Authored by Hitoshi Mitake on Nov 25, 2010

After applying this patch, perf bench mem memcpy prints
both the prefaulted and non-prefaulted scores of memcpy().

New options --no-prefault and --only-prefault are added
to print a single result, mainly for scripting usage.

Usage example:

 | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB
 | # Running mem/memcpy benchmark...
 | # Copying 500MB Bytes ...
 |
 |      634.969014 MB/Sec
 |        4.828062 GB/Sec (with prefault)
 | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --only-prefault
 | # Running mem/memcpy benchmark...
 | # Copying 500MB Bytes ...
 |
 |        4.705192 GB/Sec (with prefault)
 | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --no-prefault
 | # Running mem/memcpy benchmark...
 | # Copying 500MB Bytes ...
 |
 |      642.725568 MB/Sec

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: h.mitake@gmail.com
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ma Ling <ling.ma@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1290668693-27068-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

18 May, 2010 1 commit

perf options: Check v type in OPT_U?INTEGER · 1967936d

Authored by Arnaldo Carvalho de Melo on May 17, 2010

To avoid problems like the one fixed by Stephane Eranian in 3de29cab, now
we'll get this instead:

	bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
	bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’

Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
hackers should be already used to this.

With it in place, some problems were found and fixed by changing the affected
variables to sensible types or by changing some OPT_INTEGER to OPT_UINTEGER.

The next csets will go through converting each of the remaining OPT_ so that
review can be made easier by grouping changes per type per patch.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
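
The "negative width in bit-field" error comes from a compile-time check of this general shape. The sketch below is an approximation under stated assumptions: it relies on GCC/Clang extensions (__typeof__, __builtin_types_compatible_p, a struct with only an unnamed bit-field), and struct uint_option and OPT_UINT() are hypothetical names, not perf's option API.

```c
#include <assert.h>

/* Evaluates to 0, or breaks the build with a negative bit-field width. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))

/* Yield 'ptr' unchanged, but only if it has the expected pointer type. */
#define check_vtype(ptr, type) \
	((ptr) + BUILD_BUG_ON_ZERO(!__builtin_types_compatible_p(__typeof__(ptr), type)))

/* Hypothetical option record holding a pointer to an unsigned value. */
struct uint_option {
	const char   *name;
	unsigned int *value;
};

#define OPT_UINT(n, v) { .name = (n), .value = check_vtype(v, unsigned int *) }

static unsigned int nr_groups = 10;
static struct uint_option group_opt = OPT_UINT("group", &nr_groups);

/*
 * Passing the address of an 'int' instead would fail to compile with
 * "negative width in bit-field", which is the cryptic error quoted above.
 */
```

The check costs nothing at run time: when the types match, the macro adds zero to the pointer.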

14 Apr, 2010 1 commit

perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR() · c0555642

Authored by Ian Munsie on Apr 13, 2010

Parsing an option from the command line with OPT_BOOLEAN on a
bool data type would not work on a big-endian machine due to the
manner in which the boolean was being cast into an int and
incremented. For example, running 'perf probe --list' on a
PowerPC machine would fail to properly set the list_events bool
and would therefore print out the usage information and
terminate.

This patch makes OPT_BOOLEAN work as expected with a bool
datatype. For cases where the original OPT_BOOLEAN was
intentionally being used to increment an int each time it was
passed in on the command line, this patch introduces OPT_INCR
with the old behaviour of OPT_BOOLEAN (the verbose variable is
currently the only such example of this).

I have reviewed every use of OPT_BOOLEAN to verify that a true
C99 bool was passed. Where integers were used, I verified that
they were only being used for boolean logic and changed them to
bools to ensure that they would not be mistakenly used as ints.
The major exception was the verbose variable which now uses
OPT_INCR instead of OPT_BOOLEAN.

Signed-off-by: Ian Munsie <imunsie@au.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: <stable@kernel.org> # NOTE: wont apply to .3[34].x cleanly, please backport
Cc: Git development list <git@vger.kernel.org>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: WANG Cong <amwang@redhat.com>
Cc: Thiago Farina <tfransosi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1271147857-11604-1-git-send-email-imunsie@au.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
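
The byte-order problem can be illustrated without relying on undefined behaviour by simulating the option storage as a byte array. This is a sketch under that assumption, not perf's parse-options code, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Old behaviour (simulated): treat the storage as an int and increment it. */
static bool set_via_int(void)
{
	unsigned char storage[sizeof(int)] = { 0 };	/* a bool lives in byte 0 */
	int as_int;

	memcpy(&as_int, storage, sizeof(as_int));
	as_int++;
	memcpy(storage, &as_int, sizeof(as_int));

	/* Byte 0 is the low-order byte only on little-endian hosts. */
	return storage[0] != 0;
}

/* Fixed behaviour: store through a pointer of the real type. */
static bool set_via_bool(void)
{
	bool flag = false;
	bool *value = &flag;

	*value = true;
	return flag;
}

/* Report whether the host stores the low-order byte first. */
static bool host_is_little_endian(void)
{
	const unsigned int one = 1;
	unsigned char first;

	memcpy(&first, &one, 1);
	return first == 1;
}
```

On a big-endian host the increment lands in the last byte of the int, so the byte that the bool occupies stays zero, which matches the 'perf probe --list' symptom described above.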

08 Apr, 2010 1 commit

perf bench: fix spello · f0e9c4fc

Authored by Randy Dunlap on Mar 31, 2010

Fix spello in user message.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20100331113056.2c7df509.randy.dunlap@oracle.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>

03 Apr, 2010 1 commit

perf tools: Move the prototypes in util/string.h to util.h · e206d556

Authored by Arnaldo Carvalho de Melo on Apr 03, 2010

So that we avoid conflict with libc's string.h header.

Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

14 Dec, 2009 2 commits

perf sched: Fix build failure on sparc · 2cd9046c

Authored by David Miller on Dec 13, 2009

Here, tvec->tv_usec is "unsigned int" not "unsigned long".

Since the type is different on every platform, it's probably
best to just use long printf formats and cast.

Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091213.235622.53363059.davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
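
The "use long formats and cast" approach can be sketched as follows. format_timeval() is a hypothetical helper, and the exact format string is an assumption rather than the one used in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Format a timeval as "sec.usec" without assuming the width of its fields. */
static int format_timeval(char *buf, size_t len, const struct timeval *tv)
{
	/* tv_sec and tv_usec differ in type across platforms, so cast both. */
	return snprintf(buf, len, "%lu.%06lu",
			(unsigned long)tv->tv_sec,
			(unsigned long)tv->tv_usec);
}
```

Because the arguments are cast to the type the format string names, the same source compiles without format warnings whether tv_usec is an int or a long on the target.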

perf bench: Add "all" pseudo subsystem and "all" pseudo suite · 2044279d

Authored by Hitoshi Mitake on Dec 13, 2009

This patch adds a new "all" pseudo subsystem and an "all" pseudo
suite. These run all suites of all subsystems, or all suites of
one subsystem.

(This patch also contains a few trivial comment fixes for
bench/* and output style fixes. I judged that there is no
need to split them into individual patches.)

Example of use:

| % ./perf bench sched all                      # Test all suites of sched subsystem
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
|      Total time: 0.414 [sec]
|
| # Running sched/pipe benchmark...
| # Extecuted 1000000 pipe operations between two tasks
|
|      Total time: 10.999 [sec]
|
|       10.999317 usecs/op
|           90914 ops/sec
|
| % ./perf bench all                            # Test all suites of all subsystems
| # Running sched/messaging benchmark...
| # 20 sender and receiver processes per group
| # 10 groups == 400 processes run
|
|      Total time: 0.420 [sec]
|
| # Running sched/pipe benchmark...
| # Extecuted 1000000 pipe operations between two tasks
|
|      Total time: 11.741 [sec]
|
|       11.741346 usecs/op
|           85169 ops/sec
|
| # Running mem/memcpy benchmark...
| # Copying 1MB Bytes from 0x7ff33e920010 to 0x7ff3401ae010 ...
|
|      808.407437 MB/Sec

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1260691319-4683-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

24 Nov, 2009 1 commit

perf tools: Introduce zalloc() for the common calloc(1, N) case · 36479484

Authored by Arnaldo Carvalho de Melo on Nov 24, 2009

This way we type fewer characters and it looks more like the
kzalloc kernel counterpart.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259071517-3242-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
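
A wrapper of the kind the title describes might look like the sketch below. Only the calloc(1, N) equivalence comes from the commit; struct bench_result and bench_result__new() are hypothetical and exist only to show a call site.

```c
#include <assert.h>
#include <stdlib.h>

/* Zeroed allocation of 'size' bytes, spelled like the kernel's kzalloc(). */
static inline void *zalloc(size_t size)
{
	return calloc(1, size);
}

struct bench_result {	/* hypothetical payload type */
	double secs;
	long   ops;
};

static struct bench_result *bench_result__new(void)
{
	return zalloc(sizeof(struct bench_result));
}
```

The caller still has to check for NULL and release the memory with free(), exactly as with calloc().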

22 Nov, 2009 1 commit

perf bench: Make the mem/memcpy tests more user-friendly · 12eac0bf

Authored by Hitoshi Mitake on Nov 20, 2009

mem-memcpy.c uses perf event system calls to obtain CPU clocks,
and it suddenly dies with BUG_ON() when running on a Linux
that doesn't support perf events.

calloc() can also fail easily when too large a length is passed,
and a calloc() failure causes sudden death with assert().

These behaviours are not friendly, so I fixed the error handling.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1258688237-3797-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
[ v2: improved a few small details ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

19 Nov, 2009 1 commit

perf bench: Add memcpy() benchmark · 827f3b49

Authored by Hitoshi Mitake on Nov 18, 2009

'perf bench mem memcpy' is a benchmark suite for measuring memcpy()
performance.

Example on an Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz:

| % perf bench mem memcpy -l 1GB
| # Running mem/memcpy benchmark...
| # Copying 1MB Bytes from 0xb7d98008 to 0xb7e99008 ...
|
|     726.216412 MB/Sec

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1258471212-30281-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
[ v2: updated changelog, clarified history of builtin-bench.c ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

11 Nov, 2009 2 commits

perf bench: Improve sched-message.c with more comfortable output · c5659b74

Authored by Hitoshi Mitake on Nov 11, 2009

This patch improves sched-message.c with more comfortable output.

The changes are comment-style descriptions and
formatting of numerical values and their units.

Example:

 | % perf bench sched messaging
 | # Running sched/messaging benchmark...
 | # 20 sender and receiver processes per group
 | # 10 groups == 400 processes run
 |
 |      Total time: 1.490 [sec]

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1257865442-20252-4-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf bench: Improve sched-pipe.c with more comfortable output · ff676b19

Authored by Hitoshi Mitake on Nov 11, 2009

This patch improves sched-pipe.c with more comfortable output.

The changes are comment-style descriptions and
formatting of numerical values and their units.

Example:

 | % ./perf bench sched pipe
 | # Running sched/pipe benchmark...
 | # Extecuted 1000000 pipe operations between two tasks
 |
 |      Total time:5.822 [sec]
 |
 |        5.822553 usecs/op
 |          171745 ops/sec

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1257865442-20252-3-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

10 Nov, 2009 4 commits

perf bench: Clean up bench/bench.h · 606bc1e1

Authored by Ingo Molnar on Nov 10, 2009

Clean up initializers in bench.h:

  - No need to break the line for function prototypes, they are more
    readable in a single line. (even if checkpatch complains about it)

  - We try to align definitions / structure fields vertically,
    to make it all a bit more readable.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1257853855-28934-2-git-send-email-mitake@dcl.info.waseda.ac.jp>

606bc1e1

perf bench: Modify builtin-pipe.c for processing common options · 158ba827

Authored by Hitoshi Mitake on Nov 10, 2009

This patch modifies builtin-pipe.c to process common
options. The first option added is "--format", which lets
users of perf bench specify the output style.

Usage example:

 % ./perf bench sched pipe		# no style specified
 (executing 1000000 pipe operations between two tasks)

         Total time:5.855 sec
                 5.855061 usecs/op
                 170792 ops/sec

 % ./perf bench --format=simple sched pipe # simple style specified
 5.988
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1257808802-9420-5-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

158ba827
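
The "--format" handling described in commit 158ba827 above could look roughly like the following. This is a minimal sketch, not the code from the patch: the enum, both function names, and the accepted string "default" are assumptions made for illustration, and the layout of the default report is only approximated from the usage example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names; not taken from bench.h or builtin-bench.c. */
enum out_format { FORMAT_DEFAULT, FORMAT_SIMPLE, FORMAT_UNKNOWN };

/* Map the argument of --format to an output style (NULL: option absent). */
enum out_format format_from_str(const char *s)
{
	if (s == NULL || strcmp(s, "default") == 0)	/* "default" is assumed */
		return FORMAT_DEFAULT;
	if (strcmp(s, "simple") == 0)
		return FORMAT_SIMPLE;
	return FORMAT_UNKNOWN;
}

/*
 * Write the report for one run into buf.
 * Returns the snprintf() result, or -1 for an unknown style or bad input.
 */
int format_result(char *buf, size_t len, enum out_format fmt,
		  double total_sec, unsigned long ops)
{
	if (total_sec <= 0.0 || ops == 0)
		return -1;

	switch (fmt) {
	case FORMAT_DEFAULT:
		/* Human-readable: total time plus the two derived rates. */
		return snprintf(buf, len,
				"Total time:%.3f sec\n%f usecs/op\n%lu ops/sec\n",
				total_sec,
				total_sec * 1e6 / (double)ops,
				(unsigned long)((double)ops / total_sec));
	case FORMAT_SIMPLE:
		/* Machine-friendly: the total time in seconds only. */
		return snprintf(buf, len, "%.3f\n", total_sec);
	default:
		return -1;
	}
}
```

A caller would pass the option string through format_from_str() once and hand the result to every benchmark, so that each one only has to switch on the style when printing.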

perf bench: Modify bench/bench-messaging.c to adopt unified output formatting · cced06c6

Authored by Hitoshi Mitake on Nov 10, 2009

This patch modifies bench/bench-messaging.c to adopt
unified output formatting: --format option.

Usage example:

 % ./perf bench sched messaging              # no style specified
 (20 sender and receiver processes per group)
 (10 groups == 400 processes run)

        Total time:1.431 sec

 % ./perf bench --format=simple sched messaging # simple style specified
 1.431
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1257808802-9420-4-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

cced06c6

perf bench: Add format constants to bench.h for unified output formatting · 242aa14a

Authored by Hitoshi Mitake on Nov 10, 2009

This patch adds some constants and an extern declaration to
bench.h. These are used for unified output formatting
of 'perf bench'.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1257808802-9420-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

242aa14a

09 Nov, 2009 1 commit

perf bench: Fix bench/sched-pipe.c to wait for child process · 5ff0cfc6

Authored by Hitoshi Mitake on Nov 09, 2009

Ingo reported this small 'perf bench sched pipe' output problem:

 | $ ./perf bench sched pipe
 | (executing 1000000 pipe operations between two tasks)
 |
 |	Total time:4.898 sec
 | $		4.898586 usecs/op
 |		204140 ops/sec
 |
 | the shell prompt came back before the usecs/op and ops/sec line
 | was printed. Process teardown race, lack of wait() or so?

This was caused by the parent process not calling waitpid(),
so I added the call.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257737465-7546-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5ff0cfc6
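
The race fixed by commit 5ff0cfc6 above is the usual fork() pattern where the parent exits before the child has finished printing. The following is a minimal sketch of the waitpid() remedy, not the code from bench/sched-pipe.c: the function name and the printed text are made up for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that prints its part of a report, then wait for it.
 * Returns the child's exit status, or -1 on error.
 * Function name and message are illustrative only.
 */
int run_child_and_wait(void)
{
	pid_t pid;
	int status;

	fflush(stdout);	/* do not let the child inherit buffered output */
	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: print, flush, and leave without running atexit hooks. */
		printf("child: report done\n");
		fflush(stdout);
		_exit(0);
	}

	/*
	 * Parent: without this call the parent may return (and the shell
	 * prompt may reappear) before the child has printed anything.
	 */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Because the parent blocks in waitpid() until the child has exited, everything the child prints reaches the terminal before control returns to the caller.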

08 Nov, 2009 3 commits

perf bench: Add sched-pipe.c: Benchmark for pipe() system call · c7d9300f

Authored by Hitoshi Mitake on Nov 05, 2009

This patch adds bench/sched-pipe.c.

bench/sched-pipe.c is a benchmark program
that measures the performance of the pipe() system call.
This benchmark is based on pipe-test-1m.c by Ingo Molnar:

   http://people.redhat.com/mingo/cfs-scheduler/tools/pipe-test-1m.c

Example of use:

% perf bench sched pipe
  (executing 1000000 pipe operations between two tasks)

          Total time:4.499 sec
                  4.499179 usecs/op
                  222262 ops/sec

% perf bench sched pipe -s -l 1000
0.015
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-4-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c7d9300f
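
The "pipe operations between two tasks" measured by the benchmark in commit c7d9300f above can be pictured as a ping-pong over pipes. The following is a minimal sketch under stated assumptions, not the code from bench/sched-pipe.c or pipe-test-1m.c: it assumes two pipes (one per direction), a forked child that echoes each token, and a hypothetical function name; timing is left out.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one int between parent and child over a pair of pipes.
 * Returns the number of completed round trips, or -1 on error.
 * pipe_pingpong() is an illustrative name, not a perf function.
 */
int pipe_pingpong(int loops)
{
	int to_child[2], to_parent[2];
	int token = 0, done = 0, status, i;
	pid_t pid;

	if (pipe(to_child) < 0)
		return -1;
	if (pipe(to_parent) < 0) {
		close(to_child[0]);
		close(to_child[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: echo every token straight back. */
		close(to_child[1]);
		close(to_parent[0]);
		for (i = 0; i < loops; i++) {
			if (read(to_child[0], &token, sizeof(token)) != sizeof(token))
				_exit(1);
			if (write(to_parent[1], &token, sizeof(token)) != sizeof(token))
				_exit(1);
		}
		_exit(0);
	}

	/* Parent: send a token, then block until it comes back. */
	close(to_child[0]);
	close(to_parent[1]);
	for (i = 0; i < loops; i++) {
		if (write(to_child[1], &i, sizeof(i)) != sizeof(i))
			break;
		if (read(to_parent[0], &token, sizeof(token)) != sizeof(token) ||
		    token != i)
			break;
		done++;
	}
	close(to_child[1]);
	close(to_parent[0]);

	/* Reap the child so it cannot outlive the measurement. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return done;
}
```

Each round trip forces both tasks to block and wake up in turn, which is why such a loop exercises the scheduler as much as the pipe itself. Wrapping the call in a pair of clock_gettime() readings and dividing the elapsed time by the loop count would give a usecs/op figure like the one in the usage example.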

perf bench: Add sched-messaging.c: Benchmark for scheduler and IPC mechanisms based on hackbench · e27454cc

Authored by Hitoshi Mitake on Nov 05, 2009

This patch adds bench/sched-messaging.c.

This benchmark measures performance of scheduler and IPC
mechanisms, and is based on hackbench by Rusty Russell.

Example of usage:

  % perf bench sched messaging -g 20 -l 1000 -s
  5.432                                        # in sec

  % perf bench sched messaging                 # run with default options
  (20 sender and receiver processes per group)
  (10 groups == 400 processes run)

        Total time:0.308 sec

  % perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
  (20 sender and receiver threads per group)
  (20 groups == 800 threads run)

        Total time:0.582 sec

( Rusty is the original author of hackbench.c and he said the code is
  and was under the GPLv2 so fine to be merged. )
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-3-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e27454cc

perf bench: Add new directory and header for new subcommand 'bench' · c426bba0

Authored by Hitoshi Mitake on Nov 05, 2009

This patch adds bench/ directory and bench/bench.h.

bench/ directory will contain modules for bench subcommand.
bench/bench.h is for listing prototypes of module functions.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c426bba0
