提交 · 9d9420f1209a1facea7110d549ac695f5aeeb503 · openeuler / Kernel

10 10月, 2014 2 次提交

selftests/vm/transhuge-stress: stress test for memory compaction · 0085d61f

由 Konstantin Khlebnikov 提交于 10月 09, 2014

This tool induces memory fragmentation via sequential allocation of
transparent huge pages and splitting off everything except their last
sub-pages.  It easily generates pressure to the memory compaction code.

$ perf stat -e 'compaction:*' -e 'migrate:*' ./transhuge-stress
transhuge-stress: allocate 7858 transhuge pages, using 15716 MiB virtual memory and 61 MiB of ram
transhuge-stress: 1.653 s/loop, 0.210 ms/page,   9504.828 MiB/s	7858 succeed,    0 failed, 2439 different pages
transhuge-stress: 1.537 s/loop, 0.196 ms/page,  10226.227 MiB/s	7858 succeed,    0 failed, 2364 different pages
transhuge-stress: 1.658 s/loop, 0.211 ms/page,   9479.215 MiB/s	7858 succeed,    0 failed, 2179 different pages
transhuge-stress: 1.617 s/loop, 0.206 ms/page,   9716.992 MiB/s	7858 succeed,    0 failed, 2421 different pages
^C./transhuge-stress: Interrupt

 Performance counter stats for './transhuge-stress':

         1.744.051      compaction:mm_compaction_isolate_migratepages
             1.014      compaction:mm_compaction_isolate_freepages
         1.744.051      compaction:mm_compaction_migratepages
             1.647      compaction:mm_compaction_begin
             1.647      compaction:mm_compaction_end
         1.744.051      migrate:mm_migrate_pages
                 0      migrate:mm_numa_migrate_ratelimit

       7,964696835 seconds time elapsed
Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0085d61f

mm/balloon_compaction: add vmstat counters and kpageflags bit · 09316c09

由 Konstantin Khlebnikov 提交于 10月 09, 2014

Always mark pages with PageBalloon even if balloon compaction is disabled
and expose this mark in /proc/kpageflags as KPF_BALLOON.

Also this patch adds three counters into /proc/vmstat: "balloon_inflate",
"balloon_deflate" and "balloon_migrate".  They accumulate balloon
activity.  Current size of balloon is (balloon_inflate - balloon_deflate)
pages.

All generic balloon code now gathered under option CONFIG_MEMORY_BALLOON.
It should be selected by ballooning driver which wants use this feature.
Currently virtio-balloon is the only user.
Signed-off-by: NKonstantin Khlebnikov <k.khlebnikov@samsung.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

09316c09

09 10月, 2014 1 次提交

saner perf_atoll() · 8ba7f6c2

由 Al Viro 提交于 8月 29, 2014

That loop in there is both anti-idiomatic *and* completely pointless.
strtoll() is there for purpose; use it and compare what's left with
acceptable suffices.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8ba7f6c2

08 10月, 2014 3 次提交

tracing/kprobes: Add selftest scripts testing kprobe-tracer as startup test · 89c5497d

由 Masami Hiramatsu 提交于 10月 08, 2014

Add two selftest scripts which tests kprobe-tracer as the startup
selftest does.
These test cases are testing that the kprobe_event can accept a
kprobe event with $stack related arguments and a kretprobe event
with $retval argument.

Link: http://lkml.kernel.org/p/20141008040307.13415.45145.stgit@kbuild-f20.novalocalSigned-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

89c5497d

ktest: Don't bother with bisect good or bad on replay · d832d743

由 Steven Rostedt (Red Hat) 提交于 10月 07, 2014

If git bisect reply is being used in the bisect tests, don't bother
doing the git bisect good or git bisect bad calls. The git bisect
reply will override them anyway, and that's called immediately
after the other two. Going the git bisect (good|bad) is just a
waste of time.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

d832d743

ktest: Fix check for new kernel success on rebooting to good kernel · 995bc431

由 Steven Rostedt (Red Hat) 提交于 10月 07, 2014

The reboot function when rebooting back to a good kernel has a check
to make sure that a new kernel was indeed booted. But that check
uses a timeout value, which when calling the monitor will still
return success if the timeout is hit (no bug was found). It should
return an error to let the reboot code know that a new kernel was
not reached. Only the reboot code checks the return value of the
monitor.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

995bc431

04 10月, 2014 1 次提交

ftracetest: Add POSIX.3 standard and XFAIL result codes · 915de2ad

由 Masami Hiramatsu 提交于 9月 29, 2014

Add XFAIL and POSIX 1003.3 standard codes (UNRESOLVED/
UNTESTED/UNSUPPORTED) as result codes. These are used for the
results that test case is expected to fail or unsupported
feature (by config).

To return these result code, this introduces exit_unresolved,
exit_untested, exit_unsupported and exit_xfail functions,
which use real-time signals to notify the result code to
ftracetest.

This also set "errexit" option for the testcases, so that
the tests don't need to exit explicitly.

Note that if the test returns UNRESOLVED/UNSUPPORTED/FAIL,
its test log including executed commands is shown on console
and main logfile as below.

  ------
  # ./ftracetest samples/
  === Ftrace unit tests ===
  [1] failure-case example        [FAIL]
  execute: /home/fedora/ksrc/linux-3/tools/testing/selftests/ftrace/samples/fail.tc
  + . /home/fedora/ksrc/linux-3/tools/testing/selftests/ftrace/samples/fail.tc
  ++ cat non-exist-file
  cat: non-exist-file: No such file or directory
  [2] pass-case example   [PASS]
  [3] unresolved-case example     [UNRESOLVED]
  execute: /home/fedora/ksrc/linux-3/tools/testing/selftests/ftrace/samples/unresolved.tc
  + . /home/fedora/ksrc/linux-3/tools/testing/selftests/ftrace/samples/unresolved.tc
  ++ trap exit_unresolved INT
  ++ kill -INT 29324
  +++ exit_unresolved
  +++ kill -s 38 29265
  +++ exit 0
  [4] unsupported-case example    [UNSUPPORTED]
  execute: /home/fedora/ksrc/linux-3/tools/testing/selftests/ftrace/samples/unsupported.tc
  + . /home/fedora/ksrc/linux-3/tools/testing/selftests/ftrace/samples/unsupported.tc
  ++ exit_unsupported
  ++ kill -s 40 29265
  ++ exit 0
  [5] untested-case example       [UNTESTED]
  [6] xfail-case example  [XFAIL]

  # of passed:  1
  # of failed:  1
  # of unresolved:  1
  # of untested:  1
  # of unsupported:  1
  # of xfailed:  1
  # of undefined(test bug):  0
  ------

Link: http://lkml.kernel.org/p/20140929120211.30203.99510.stgit@kbuild-f20.novalocalAcked-by: NNamhyung Kim <namhyung@kernel.org>
Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

915de2ad

02 10月, 2014 3 次提交

perf record: Fix error message for --filter option not coming after tracepoint · 281f92f2

由 Arnaldo Carvalho de Melo 提交于 10月 01, 2014

  [root@zoo ~]# perf record --filter "common_pid != PERF_PID" -a
  -F option should follow a -e tracepoint option.

The -F option is for --freq, not --filter. Fix it up to show:

  [root@zoo ~]# perf record --filter "common_pid != PERF_PID" -a
  --filter option should follow a -e tracepoint option

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-z0yrm8stn9w3423nkov3eksg@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

281f92f2

perf tools: Fix build breakage on arm64 targets · 660d1329

由 Will Deacon 提交于 9月 30, 2014

Attempting to build the perf tool for an arm64 target results in the
following failure:

  arch/arm64/util/unwind-libunwind.c: In function 'libunwind__arch_reg_id':
  arch/arm64/util/unwind-libunwind.c:77:3: error: implicit declaration of function 'pr_err'
     pr_err("unwind: invalid reg id %d\n", regnum);
     ^
  arch/arm64/util/unwind-libunwind.c:77:3: error: nested extern declaration of 'pr_err'

This is due to commit 84f5d36f ("perf tools: Move pr_* debug macros
into debug object") moving the pr_* macros into a new header file, but
failing to update architectures other than x86.

This patch adds the missing include, and fixes the build again.
Signed-off-by: NWill Deacon <will.deacon@arm.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1412076432-22045-1-git-send-email-will.deacon@arm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

660d1329

perf symbols: Improve DSO long names lookup speed with rbtree · 4598a0a6

由 Waiman Long 提交于 9月 30, 2014

With workload that spawns and destroys many threads and processes, it
was found that perf-mem could took a long time to post-process the perf
data after the target workload had completed its operation.

The performance bottleneck was found to be the lookup and insertion of
the new DSO structures (thousands of them in this case).

In a dual-socket Ivy-Bridge E7-4890 v2 machine (30-core, 60-thread), the
perf profile below shows what perf was doing after the profiled AIM7
shared workload completed:

-     83.94%  perf  libc-2.11.3.so     [.] __strcmp_sse42
   - __strcmp_sse42
      - 99.82% map__new
           machine__process_mmap_event
           perf_session_deliver_event
           perf_session__process_event
           __perf_session__process_events
           cmd_record
           cmd_mem
           run_builtin
           main
           __libc_start_main
-     13.17%  perf  perf               [.] __dsos__findnew
     __dsos__findnew
     map__new
     machine__process_mmap_event
     perf_session_deliver_event
     perf_session__process_event
     __perf_session__process_events
     cmd_record
     cmd_mem
     run_builtin
     main
     __libc_start_main

So about 97% of CPU times were spent in the map__new() function trying
to insert new DSO entry into the DSO linked list. The whole
post-processing step took about 9 minutes.

The DSO structures are currently searched linearly. So the total
processing time will be proportional to n^2.

To overcome this performance problem, the DSO code is modified to also
put the DSO structures in a RB tree sorted by its long name in
additional to being in a simple linked list. With this change, the
processing time will become proportional to n*log(n) which will be much
quicker for large n. However, the short name will still be searched
using the old linear searching method.  With that patch in place, the
same perf-mem post-processing step took less than 30 seconds to
complete.
Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Link: http://lkml.kernel.org/r/1412098575-27863-3-git-send-email-Waiman.Long@hp.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

4598a0a6

30 9月, 2014 7 次提交

perf symbols: Encapsulate dsos list head into struct dsos · 8fa7d87f

由 Waiman Long 提交于 9月 29, 2014

This is a precursor patch to enable long name searching of DSOs using
a rbtree.

In this patch, a new dsos structure is created which contains only a
list head structure for the moment.

The new dsos structure is used, in turn, in the machine structure for
the user_dsos and kernel_dsos fields.

Only the following 3 dsos functions are modified to accept the new dsos
structure parameter instead of list_head:

 - dsos__add()
 - dsos__find()
 - __dsos__findnew()
Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Link: http://lkml.kernel.org/r/1412021249-19201-2-git-send-email-Waiman.Long@hp.com
[ Move struct dsos to dso.h to reduce the dso methods depends on machine.h ]
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

8fa7d87f

locktorture: Support rwlocks · e34191fa

由 Davidlohr Bueso 提交于 9月 29, 2014

Add a "rw_lock" torture test to stress kernel rwlocks and their irq
variant. Reader critical regions are 5x longer than writers. As such
a similar ratio of lock acquisitions is seen in the statistics. In the
case of massive contention, both hold the lock for 1/10 of a second.
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

e34191fa

selftests/powerpc: Add test of load_unaligned_zero_pad() · fe2a1bb1

由 Michael Ellerman 提交于 9月 25, 2014

It is a rarely exercised case, so we want to have a test to ensure it
works as required.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

fe2a1bb1

perf bench futex: Sanitize -q option in requeue · e19685ed

由 Davidlohr Bueso 提交于 9月 29, 2014

When given the number of threads to requeue at once by user input,
there's always the risk of this value being larger than the total number
of threads.  This doesn't make any sense, and the kernel can easily deal
with such sort of situations, hence no big deal. We should however
prevent bogus output such as:

./perf bench --repeat 2 futex requeue -q 10
Run summary [PID 22210]: Requeuing 4 threads (from [private] 0x99ef3c to 0x99ef38), 10 at a time.

[Run 1]: Requeued 10 of 4 threads in 0.0040 ms
[Run 2]: Requeued 10 of 4 threads in 0.0030 ms
Requeued 10 of 4 threads in 0.0035 ms (+-14.29%)
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Link: http://lkml.kernel.org/r/1412008868-22328-2-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

e19685ed

perf bench futex: Support operations for shared futexes · 86c87e13

由 Davidlohr Bueso 提交于 9月 29, 2014

Unlike futex-hash, requeuing and wakeup benchmarks do not support shared
futexes, limiting the usefulness of the programs. Correct this, and
allow using the local -S parameter. The default remains using private
futexes.
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Link: http://lkml.kernel.org/r/1412008868-22328-1-git-send-email-dave@stgolabs.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

86c87e13

perf trace: Fix mmap return address truncation to 32-bit · 2c82c3ad

由 Chang Hyun Park 提交于 9月 26, 2014

Using 'perf trace' for mmap is truncating return values by stripping the
top 32 bits, actually printing only the lower 32 bits.

This was because the ret value was of an 'int' type and not a 'long'
type.

  The Problem:

  991258501.244 ( 0.004 ms): mmap(len: 40001536, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x56691000
  991258501.257 ( 0.000 ms): minfault [_int_malloc+0x1038] => //anon@0x7fa056691008 //(d.)

The first line shows an mmap, which succeeds and returns 0x56691000.

However the next line shows a memory access to that virtual memory area,
specifically to 0x7fa056691008. The upper 32 bit is lost due to the
problem mentioned above, and thus mmap's return value didn't have the
upper 0x7fa0.

Tested on 3.17-rc5 from the linus's tree, and the HEAD of tip/master
Signed-off-by: NChang Hyun Park <heartinpiece@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1411736041-8017-1-git-send-email-heartinpiece@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

2c82c3ad

perf tools: Refactor unit and scale function parameters · 46441bdc

由 Matt Fleming 提交于 9月 24, 2014

Passing pointers to alias modifiers 'unit' and 'scale' isn't very
future-proof since if we add more modifiers to the list we'll end up
passing more arguments.

Instead wrap everything up in a struct perf_pmu_info, which can easily
be expanded when additional alias modifiers are necessary in the future.
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
Acked-by: NJiri Olsa <jolsa@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1411567455-31264-3-git-send-email-matt@console-pimps.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

46441bdc

26 9月, 2014 23 次提交

perf tools: Fix line number in the config file error message · 49757c9c

由 Jiri Olsa 提交于 9月 23, 2014

If we fail to parse the config file within the callback function,
the line number counter 'could be' already on the next line.

This results in wrong line number report like:

  $ cat ~/.perfconfig
  [call-graph]
          sort-key = krava
  $ perf record ls
  Fatal: bad config file line 3 in /home/jolsa/.perfconfig

Fixing this by saving the current line number for this case.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Acked-by: NNamhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Milian Wolff <mail@milianw.de>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20140923115656.GC2979@krava.brq.redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

49757c9c

perf tools: Convert {record,top}.call-graph option to call-graph.record-mode · 5a2e5e85