- 20 12月, 2016 1 次提交
-
-
由 Davidlohr Bueso 提交于
Obvious copy/paste typo from the requeue program. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Link: http://lkml.kernel.org/r/1481830584-30909-1-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 10月, 2016 2 次提交
-
-
由 Davidlohr Bueso 提交于
This gets rid of oddities such as: perf bench futex hash -t -4 perf: calloc: Cannot allocate memory Runtime (and many more) are equally busted, i.e. run for bogus amounts of time. Just use the abs, instead of, for example errorring out. Committer note: After the patch: $ perf bench futex hash -t -4 # Running 'futex/hash' benchmark: Run summary [PID 10178]: 4 threads, each operating on 1024 [private] futexes for 10 secs. [thread 0] futexes: 0x34f9fa0 ... 0x34faf9c [ 4702208 ops/sec ] [thread 1] futexes: 0x34fb140 ... 0x34fc13c [ 4707020 ops/sec ] [thread 2] futexes: 0x34fc2e0 ... 0x34fd2dc [ 4711526 ops/sec ] [thread 3] futexes: 0x34fd480 ... 0x34fe47c [ 4709683 ops/sec ] Averaged 4707609 operations/sec (+- 0.04%), total secs = 10 $ Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1477342613-9938-3-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Davidlohr Bueso 提交于
Sebastian noted that overhead for worker thread ops (throughput) accounting was producing 'perf' to appear in the profiles, consuming a non-trivial (i.e. 13%) amount of CPU. This is due to cacheline bouncing due to the increment of w->ops. We can easily fix this by just working on a local copy and updating the actual worker once done running, and ready to show the program summary. There is no danger of the worker being concurrent, so we can trust that no stale value is being seen by another thread. This also gets rid of the unnecessary cache alignment hack; its not worth it. Reported-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Acked-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Link: http://lkml.kernel.org/r/1477342613-9938-2-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 10月, 2016 2 次提交
-
-
It popped up in perf testing that the worker consumes some amount of CPU. It boils down to the increment of `ops` which causes cache line bouncing between the individual threads. This patch aligns the struct by 256 bytes to ensure that not a cache line is shared among CPUs. 128 byte is the x86 worst case and grep says that L1_CACHE_SHIFT is set to 8 on s390. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20161016190803.3392-1-bigeasy@linutronix.deSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Instead of having all tests perform alloc/free, do it in the code that calls the do_cycles() and do_gettimeofday() functions. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-lywj4mbdb1m9x1z9asivwuuy@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 8月, 2016 5 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Following kernel practices and better documentin Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xncwqxegjp13g2nxih3lp9mx@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xhyoyxejvorrgmwjx9k3j8k2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Following kernel practices, using linux/time64.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xdtmguafva17wp023sxojiib@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Following kernel practices. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wgfu1h1pnw8lc919o2tan58y@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Following kernel practices, using linux/time64.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-7vnv15263y50qku76p4w5xk6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 7月, 2016 4 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We can't access kernel files directly from tools/, so copy the required bits, and make sure that we detect when the original files, in the kernel, gets modified. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-z7e76274ch5j4nugv048qacb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Since these files use __maybe_unused, and that is defined in linux/compiler.h, include it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-1llbf59ut6xon6ti88jm0n9j@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We should try avoiding that perf.h header, it includes way too much stuff, making it difficult to use things like setting _GNU_SOURCE only on a small set of headers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-lb6eg9w1kzrwhv0gm3ho0h54@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Cc: David Ahern <dsahern@gmail.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-48qbfv7tqs8n8ey74lbyfjtq@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 4月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Propagate the error instead. Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-z6erjg35d1gekevwujoa0223@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 4月, 2016 1 次提交
-
-
由 Davidlohr Bueso 提交于
Given that the 'val' parameter is ignored for FUTEX_LOCK_PI, get rid of the bogus deadlock detection flag in the wrapper code and avoid the extra argument, making it resemble its unlock counterpart. And if nothing else, we already only pass 0 anyway. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Link: http://lkml.kernel.org/r/1461208447-29328-1-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 3月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-w246stf7ponfamclsai6b9zo@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 3月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
All over the tree. Cc: David Ahern <dsahern@gmail.com> cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-8nzhnokxyp8y4v7gf0j00oyb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 3月, 2016 1 次提交
-
-
由 Jakub Jelen 提交于
Comparing bits and bytes in numa benchmark assertion I hit the issue on two socket Power8 machine presenting its numa nodes as 0,1,16,17 (according to numactl). Therefore I got error (and hang of parent process): perf: bench/numa.c:296: bind_to_memnode: Assertion `!(g->p.nr_nodes > (int)sizeof(nodemask))' failed. This is obviously false positive. We can fit all the 18 nodes into bitfield of 8 bytes (long on 64b architecture). Signed-off-by: NJakub Jelen <jakuje@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jakub Jelen <jjelen@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: trivial@kernel.org Link: http://lkml.kernel.org/r/1458388687-24421-1-git-send-email-jakuje@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 09 3月, 2016 1 次提交
-
-
由 Ingo Molnar 提交于
The following upcoming upstream commit: 92b0729c ("x86/mm, x86/mce: Add memcpy_mcsafe()") Adds _ASM_EXTABLE_FAULT(), which is not available in user-space and breaks the build. We don't really need _ASM_EXTABLE_FAULT() in user-space, so simply wrap it to nothing. Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 12月, 2015 1 次提交
-
-
由 Josh Poimboeuf 提交于
Move the subcommand-related files from perf to a new library named libsubcmd.a. Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to 'exec-cmd.*' to be consistent with the naming of all the other files. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 10月, 2015 13 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To avoid this splat with gcc 4.4.7: cc1: warnings being treated as errors bench/mem-functions.c:273: error: missing initializer bench/mem-functions.c:273: error: (near initialization for ‘memcpy_functions[4].desc’) bench/mem-functions.c:366: error: missing initializer bench/mem-functions.c:366: error: (near initialization for ‘memset_functions[4].desc’) Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-0s8o6tgw1pdwvdv02llb9tkd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
So right now there's a somewhat inconsistent mess of the benchmarking code and options sometimes calling benchmarked functions 'functions', sometimes calling them 'routines'. Name them 'functions' consistently. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-14-git-send-email-mingo@kernel.org [ Updated perf-bench man page, pointed out by David Ahern ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
We have three benchmarking subsystems that specify some sort of 'number of loops' parameter - but all of them do it inconsistently: numa: -l/--nr_loops sched messaging: -l/--loops mem memset/memcpy: -i/--iterations Harmonize them to -l/--nr_loops by picking the numa variant - which is also the most likely one to have existing scripting which we don't want to break. Plus improve the parameter help texts to indicate the default value for the nr_loops variable to keep users from guessing ... Also propagate the naming to internal variables. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-13-git-send-email-mingo@kernel.org [ Let the harmonisation reach the perf-bench man page as well ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
Reorder functions a bit, so that we synchronize the layout of the memcpy() and memset() portions of the code. This improves the code, especially after we'll add an strlcpy() variant as well. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-12-git-send-email-mingo@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
- fix various typos in user visible output strings - make the output consistent (wrt. capitalization and spelling) - offer the list of routines to benchmark on '-r help'. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-11-git-send-email-mingo@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
So 'perf bench mem memcpy/memset' consistently uses 'len' and 'length' for buffer sizes - while it's really a memory buffer size. (strings have length.) Rename all affected variables. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-10-git-send-email-mingo@kernel.org [ Update perf-bench man page ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
So bench/mem-functions.c has a 'routine' name for the routines parameter string, but a 'length_str' name for the length parameter string. We also have another entity named 'routine': 'struct routine'. This is inconsistent and confusing: rename 'routine' to 'routine_str'. Also fix typos in the --routine help text. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-9-git-send-email-mingo@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
So 'perf bench mem memset/memcpy' has a CPU cycles measurement method, but calls it 'cycle' (singular) throughout the code, which makes it harder to read. Rename all related functions, variables and options to a plural 'cycles' nomenclature. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-8-git-send-email-mingo@kernel.org [ s/--cycle/--cycles/g in perf-bench man page ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
So 'perf bench mem memcpy/memset' has elaborate code to measure memcpy()/memset() performance both with freshly allocated buffers (which includes initial page fault overhead) and with preallocated buffers. But the thing is, the resulting bandwidth results are mostly meaningless, because page faults dominate so much of the cost. It might make sense to measure cache cold vs. cache hot performance, but the code does not do this. So remove this complication, and always prefault the ranges before using them. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-6-git-send-email-mingo@kernel.org [ Remove --no-prefault, --only-prefault from docs, noticed by David Ahern ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
So mem-memcpy.c started out as a simple memcpy() benchmark, then it grew memset() functionality and now I plan to add string copy benchmarks as well. This makes the file name a misnomer: rename it to the more generic mem-functions.c name. Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-5-git-send-email-mingo@kernel.org [ The "rename" was introducing __unused, wasn't removing the old file, and didn't update tools/perf/bench/Build, fix it ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-4-git-send-email-mingo@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
So few people know that the --routine option to 'perf bench memcpy/memset' exists, and would not know that it's capable of testing the kernel's memcpy/memset implementations. Furthermore, 'perf bench mem all' will not run all routines: vega:~> perf bench mem all # Running mem/memcpy benchmark... Routine default (Default memcpy() provided by glibc) # Copying 1MB Bytes ... 894.454383 MB/Sec 3.844734 GB/Sec (with prefault) # Running mem/memset benchmark... Routine default (Default memset() provided by glibc) # Copying 1MB Bytes ... 1.220703 GB/Sec 9.042245 GB/Sec (with prefault) Because misleadingly the 'all' refers to 'all sub-benchmarks', not 'all sub-benchmarks and routines'. Fix all this by making the memcpy/memset routine to default to 'all', which results in all the benchmarks being run: triton:~> perf bench mem all # Running mem/memcpy benchmark... Routine default (Default memcpy() provided by glibc) # Copying 1MB Bytes ... 1.448906 GB/Sec 4.957170 GB/Sec (with prefault) Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB Bytes ... 1.614153 GB/Sec 4.379204 GB/Sec (with prefault) Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB Bytes ... 1.570036 GB/Sec 4.264465 GB/Sec (with prefault) Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB Bytes ... 1.788576 GB/Sec 6.554111 GB/Sec (with prefault) # Running mem/memset benchmark... Routine default (Default memset() provided by glibc) # Copying 1MB Bytes ... 2.082223 GB/Sec 9.126752 GB/Sec (with prefault) Routine x86-64-unrolled (unrolled memset() in arch/x86/lib/memset_64.S) # Copying 1MB Bytes ... 5.710892 GB/Sec 8.346688 GB/Sec (with prefault) Routine x86-64-stosq (movsq-based memset() in arch/x86/lib/memset_64.S) # Copying 1MB Bytes ... 9.765625 GB/Sec 12.520032 GB/Sec (with prefault) Routine x86-64-stosb (movsb-based memset() in arch/x86/lib/memset_64.S) # Copying 1MB Bytes ... 9.668936 GB/Sec 12.682630 GB/Sec (with prefault) Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-3-git-send-email-mingo@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
- improve the readability of initializations - fix unnecessary double negations - fix ugly line breaks - fix other small details Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445241870-24854-2-git-send-email-mingo@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 21 7月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
Allows a way of measuring low level kernel implementation of FUTEX_LOCK_PI and FUTEX_UNLOCK_PI. The program comes in two flavors: (i) single futex (default), all threads contend on the same uaddr. For the sake of the benchmark, we call into kernel space even when the lock is uncontended. The kernel will set it to TID, any waters that come in and contend for the pi futex will be handled respectively by the kernel. (ii) -M option for multiple futexes, each thread deals with its own futex. This is a trivial scenario and only measures kernel handling of 0->TID transition. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/1436259353.12255.78.camel@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 5月, 2015 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We really should move the sched_getcpu() to some more suitable place, but this one-liner fixes this build problem on ancient distros like RHEL5. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vinson Lee <vlee@twitter.com> Link: http://lkml.kernel.org/n/tip-5yqg4p11f9uii6yremz3r35v@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 09 5月, 2015 2 次提交
-
-
由 Davidlohr Bueso 提交于
Wrap futex_wait around a loop and catch for EINTR. Either a spurious wakeup occurred or a signal interrupted is, either way we need to block again. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Link: http://lkml.kernel.org/r/1431110280-20231-2-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Davidlohr Bueso 提交于
The futex-wake benchmark only measures wakeups done within a single process. While this has value in its own, it does not really generate any hb->lock contention. A new benchmark 'wake-parallel' is added, by extending the futex-wake code such that we can measure parallel waker threads. The program output shows the avg per-thread latency in order to complete its share of wakeups: Run summary [PID 13474]: blocking on 512 threads (at [private] futex 0xa88668), 8 threads waking up 64 at a time. [Run 1]: Avg per-thread latency (waking 64/512 threads) in 0.6230 ms (+-15.31%) [Run 2]: Avg per-thread latency (waking 64/512 threads) in 0.5175 ms (+-29.95%) [Run 3]: Avg per-thread latency (waking 64/512 threads) in 0.7578 ms (+-18.03%) [Run 4]: Avg per-thread latency (waking 64/512 threads) in 0.8944 ms (+-12.54%) [Run 5]: Avg per-thread latency (waking 64/512 threads) in 1.1204 ms (+-23.85%) Avg per-thread latency (waking 64/512 threads) in 0.7826 ms (+-9.91%) Naturally, different combinations of numbers of blocking and waker threads will exhibit different information. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Davidlohr Bueso <dbueso@suse.de> Link: http://lkml.kernel.org/r/1431110280-20231-1-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 5月, 2015 1 次提交
-
-
由 Petr Holasek 提交于
In verbose mode perf bench numa shows also GB/s speed, system and user cpu time for each particular thread. Using of getrusage() can provide much more per process or per thread stats in future. Signed-off-by: NPetr Holasek <pholasek@redhat.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1429198699-25039-3-git-send-email-pholasek@redhat.com [ Rename 'usage' variable to not shadow util.h's usage() ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 28 4月, 2015 1 次提交
-
-
由 Petr Holasek 提交于
This patch fixes the race in the beginning of benchmark run when some threads hasn't got assigned curr_cpu yet so they don't occur in nodes-of-process stats and benchmark concludes that all remaining threads are converged already. The race can be reproduced with small amount of threads and some bigger amount of shared process memory, e.g. one process, two threads and 5GB of process memory. Signed-off-by: NPetr Holasek <pholasek@redhat.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1429198699-25039-4-git-send-email-pholasek@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-