- 16 2月, 2018 12 次提交
-
-
由 yuzhoujian 提交于
Introduce a new option to print counts for fixed number of times and update 'perf stat' documentation accordingly. Show below is the output of the new option for perf stat. $ perf stat -I 1000 --interval-count 2 -e cycles -a # time counts unit events 1.002827089 93,884,870 cycles 2.004231506 56,573,446 cycles We can just print the counts for several times with this newly introduced option. The usage of it is a little like 'vmstat', and it should be used together with "-I" option. $ vmstat -n 1 2 procs ---------memory-------------- --swap- ----io-- -system-- ------cpu--- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 0 78270544 547484 51732076 0 0 0 20 1 1 1 0 99 0 0 0 0 0 78270512 547484 51732080 0 0 0 16 477 1555 0 0 100 0 0 Changes since v3: - merge interval_count check and times check to one line. - fix the wrong indent in stat.h - use stat_config.times instead of 'times' in cmd_stat function. Changes since v2: - none. Changes since v1: - change the name of the new option "times-print" to "interval-count". - keep the new option interval specifically. Signed-off-by: Nyuzhoujian <yuzhoujian@didichuxing.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1517217923-8302-2-git-send-email-ufo19890607@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Add support to display group output for if non grouped events are detected and user forces --group option. Now for non-group events recorded like: $ perf record -e 'cycles,instructions' ls you can still get group output by using --group option in report: $ perf report --group --stdio ... # Overhead Command Shared Object Symbol # ................ ....... ................ ...................... # 17.67% 0.00% ls libc-2.25.so [.] _IO_do_write@@GLIB 15.59% 25.94% ls ls [.] calculate_columns 15.41% 31.35% ls libc-2.25.so [.] __strcoll_l ... Committer note: We should improve on this by making sure that the first line states that this is not a group, but since the user doesn't have to force group view when really using grouped events (e.g. '{cycles,instructions}'), the user better know what is being done... Requested-by: NStephane Eranian <eranian@google.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Tested-by: NStephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180209092734.GB20449@kravaSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
If we have the time in, keep the events in time order. Committer notes: Trying to be more verbose, what actual effect this will have in this particular case? Before and after this patch shows the artifacts: --- /tmp/before 2018-02-06 15:40:29.536411625 -0300 +++ /tmp/after 2018-02-06 15:40:51.963403599 -0300 @@ -5,34 +5,34 @@ 2540 2540 1818 | gnome-terminal- 3489 3489 2540 | bash 32433 32433 3489 | perf - 32434 32434 32433 | perf + 32434 32434 32433 | make 32441 32441 32434 | make 32514 32514 32441 | make 511 511 32514 | sh - 512 512 511 | sh + 512 512 511 | install <SNIP> We don't have 'perf' calling 'perf' calling 'make', etc, the second 'perf' actually is 'make', i.e. there was reordering of the relevant PERF_RECORD_COMM and PERF_RECORD_FORK records. Ditto for sh/install later on. Look for FORK and COMM meta events, for those tids: # perf report -D | egrep 'PERF_RECORD_(FORK|COMM)' | egrep '3243[34]' 0 14774650990679 0x1a3cd8 [0x38]: PERF_RECORD_FORK(32433:32433):(3489:3489) 1 14774652080381 0x1d6568 [0x30]: PERF_RECORD_COMM exec: perf:32433/32433 1 14774742473340 0x1dbb48 [0x38]: PERF_RECORD_FORK(32434:32434):(32433:32433) 0 14774752005779 0x1a4af8 [0x30]: PERF_RECORD_COMM exec: make:32434/32434 0 14774753997960 0x1a5578 [0x38]: PERF_RECORD_FORK(32435:32435):(32434:32434) 0 14774756070782 0x1a5618 [0x38]: PERF_RECORD_FORK(32438:32438):(32434:32434) 0 14774757772939 0x1a5680 [0x38]: PERF_RECORD_FORK(32440:32440):(32434:32434) 0 14774758230600 0x1a56e8 [0x38]: PERF_RECORD_FORK(32441:32441):(32434:32434) # First column is the cpu, second is the timestamp. So they are on different CPUs, thus ring buffers, and when we don't use the ordered_events class, we end up mixing that up, use it to take advantage of the PERF_RECORD_FINISHED_ROUND meta events to go on ordering the events using the PERF_SAMPLE_TIME present in the PERF_RECORD_{FORK,COMM,EXIT,SAMPLE,etc} records in the ring buffer. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180206181813.10943-2-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
In commit 2f15bd8c ("perf tools: Fix "Command" sort_entry's cmp and collapse function") we switched from pointer to string comparison. But failed to remove related comments. Removing them and adding another one to warn before pointer comparison in here. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180206181813.10943-18-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
When we strip the perf binary, dwarf unwind test stop to work. The reason is that strip will remove static function symbols, which we need to check for unwind. This change will keep this test working in cases where the global symbols are put into dynamic symbol table, which is the case on x86. It still won't work on powerpc. Making those 5 local functions global, and adding 'test_dwarf_unwind__' to their names. Committer testing: Before: # perf test dwarf 58: DWARF unwind : Ok # strip ~/bin/perf # perf test dwarf 58: DWARF unwind : FAILED! # perf test -v dwarf 58: DWARF unwind : --- start --- test child forked, pid 6590 unwind: thread map already set, dso=/home/acme/bin/perf <SNIP> unwind: access_mem addr 0x7ffce6c48098 val 48563f, offset 1144 unwind: test__dwarf_unwind:ip = 0x4a54e5 (0xa54e5) got: test__dwarf_unwind 0xa54e5, expecting test__dwarf_unwind unwind: '':ip = 0x4a50bb (0xa50bb) failed: got unresolved address 0xa50bb unwind failed test child finished with -1 ---- end ---- DWARF unwind: FAILED! # After: # perf test dwarf 58: DWARF unwind : Ok # strip ~/bin/perf # perf test dwarf 58: DWARF unwind : Ok # # perf test -v dwarf 58: DWARF unwind : --- start --- test child forked, pid 7219 unwind: thread map already set, dso=/home/acme/bin/perf <SNIP> unwind: access_mem addr 0x7fff007da2c8 val 48575f, offset 1144 unwind: test__arch_unwind_sample:ip = 0x589044 (0x189044) got: test__arch_unwind_sample 0x189044, expecting test__arch_unwind_sample unwind: test_dwarf_unwind__thread:ip = 0x4a52f7 (0xa52f7) got: test_dwarf_unwind__thread 0xa52f7, expecting test_dwarf_unwind__thread unwind: test_dwarf_unwind__compare:ip = 0x4a5468 (0xa5468) got: test_dwarf_unwind__compare 0xa5468, expecting test_dwarf_unwind__compare unwind: bsearch:ip = 0x7f6608ae94d8 (0x394d8) got: bsearch 0x394d8, expecting bsearch unwind: test_dwarf_unwind__krava_3:ip = 0x4a54d1 (0xa54d1) got: test_dwarf_unwind__krava_3 0xa54d1, expecting test_dwarf_unwind__krava_3 unwind: test_dwarf_unwind__krava_2:ip = 0x4a550b (0xa550b) got: test_dwarf_unwind__krava_2 0xa550b, expecting test_dwarf_unwind__krava_2 unwind: test_dwarf_unwind__krava_1:ip = 0x4a554b (0xa554b) got: test_dwarf_unwind__krava_1 0xa554b, expecting test_dwarf_unwind__krava_1 unwind: test__dwarf_unwind:ip = 0x4a5605 (0xa5605) got: test__dwarf_unwind 0xa5605, expecting test__dwarf_unwind test child finished with 0 ---- end ---- DWARF unwind: Ok # Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180206181813.10943-17-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding sysfs__read_xll function to be able to read sysfs files with hex numbers in, which do not have 0x prefix. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180206181813.10943-6-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding filename__read_xll function to be able to read files with hex numbers in, which do not have 0x prefix. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180206181813.10943-5-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding --show-round-event to display PERF_RECORD_FINISHED_ROUND events like: # perf script --show-round-events 2>/dev/null yes 8591 [002] 124177.397597: 18 cpu/mem-stores/P: ff... yes 8591 [002] 124177.397615: 1 cpu/mem-loads,ldlat=30/P: ff... PERF_RECORD_FINISHED_ROUND perf 10380 [001] 124177.397622: 6 cpu/mem-loads,ldlat=30/P: ff... PERF_RECORD_FINISHED_ROUND swapper 0 [000] 124177.400518: 88 cpu/mem-stores/P: ff... swapper 0 [000] 124177.400521: 88 cpu/mem-stores/P: ff... Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180206181813.10943-4-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
There's no new-line after target-override warning, now: $ perf record -a --per-thread Warning: SYSTEM/CPU switch overriding PER-THREAD^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ] with patch: $ perf record -a --per-thread Warning: SYSTEM/CPU switch overriding PER-THREAD ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ] Signed-off-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 16ad2ffb ("perf tools: Introduce perf_target__strerror()") Link: http://lkml.kernel.org/r/20180206181813.10943-3-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jessica Yu 提交于
Improve error handling when disarming ftrace-based kprobes. Like with arm_kprobe_ftrace(), propagate any errors from disarm_kprobe_ftrace() so that we do not disable/unregister kprobes that are still armed. In other words, unregister_kprobe() and disable_kprobe() should not report success if the kprobe could not be disarmed. disarm_all_kprobes() keeps its current behavior and attempts to disarm all kprobes. It returns the last encountered error and gives a warning if not all probes could be disarmed. This patch is based on Petr Mladek's original patchset (patches 2 and 3) back in 2015, which improved kprobes error handling, found here: https://lkml.org/lkml/2015/2/26/452 However, further work on this had been paused since then and the patches were not upstreamed. Based-on-patches-by: NPetr Mladek <pmladek@suse.com> Signed-off-by: NJessica Yu <jeyu@kernel.org> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S . Miller <davem@davemloft.net> Cc: Jiri Kosina <jikos@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/20180109235124.30886-3-jeyu@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jessica Yu 提交于
Improve error handling when arming ftrace-based kprobes. Specifically, if we fail to arm a ftrace-based kprobe, register_kprobe()/enable_kprobe() should report an error instead of success. Previously, this has lead to confusing situations where register_kprobe() would return 0 indicating success, but the kprobe would not be functional if ftrace registration during the kprobe arming process had failed. We should therefore take any errors returned by ftrace into account and propagate this error so that we do not register/enable kprobes that cannot be armed. This can happen if, for example, register_ftrace_function() finds an IPMODIFY conflict (since kprobe_ftrace_ops has this flag set) and returns an error. Such a conflict is possible since livepatches also set the IPMODIFY flag for their ftrace_ops. arm_all_kprobes() keeps its current behavior and attempts to arm all kprobes. It returns the last encountered error and gives a warning if not all probes could be armed. This patch is based on Petr Mladek's original patchset (patches 2 and 3) back in 2015, which improved kprobes error handling, found here: https://lkml.org/lkml/2015/2/26/452 However, further work on this had been paused since then and the patches were not upstreamed. Based-on-patches-by: NPetr Mladek <pmladek@suse.com> Signed-off-by: NJessica Yu <jeyu@kernel.org> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S . Miller <davem@davemloft.net> Cc: Jiri Kosina <jikos@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/20180109235124.30886-2-jeyu@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
Merge tag 'perf-core-for-mingo-4.17-20180215' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core fixes from Arnaldo Carvalho de Melo: - perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf top' using it, making it bearable to use it in large core count systems such as Knights Landing/Mill Intel systems (Kan Liang) - s/390 now uses syscall.tbl, just like x86-64 to generate the syscall table id -> string tables used by 'perf trace' (Hendrik Brueckner) - Use strtoull() instead of home grown function (Andy Shevchenko) - Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar) - Document missing 'perf data --force' option (Sangwon Hong) - Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William Cohen) Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 15 2月, 2018 28 次提交
-
-
由 Hendrik Brueckner 提交于
This reverts commit f120c7b187e6c418238710b48723ce141f467543 which is no longer required with the introduction of a syscall.tbl on s390. Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: linux-s390@vger.kernel.org LPU-Reference: 1518090470-2899-2-git-send-email-brueckner@linux.vnet.ibm.com Link: https://lkml.kernel.org/n/tip-q1lg0nvhha1tk39ri9aqalcb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Hendrik Brueckner 提交于
Recently, s390 uses a syscall.tbl input file to generate its system call table and unistd uapi header files. Hence, update mksyscalltbl to use it as input to create the system table for perf. Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: linux-s390@vger.kernel.org LPU-Reference: 1518090470-2899-4-git-send-email-brueckner@linux.vnet.ibm.com Link: https://lkml.kernel.org/n/tip-bdyhllhsq1zgxv2qx4m377y6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Hendrik Brueckner 提交于
Grab a copy of the s390 system call table file introduced with commit 857f46bf "s390/syscalls: add system call table". Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: linux-s390@vger.kernel.org LPU-Reference: 1518090470-2899-3-git-send-email-brueckner@linux.vnet.ibm.com Link: https://lkml.kernel.org/n/tip-hpw7vdjp7g92ivgpddrp5ydq@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ingo Molnar 提交于
Sync the following tooling headers with the latest kernel version: tools/arch/powerpc/include/uapi/asm/kvm.h tools/arch/x86/include/asm/cpufeatures.h tools/include/uapi/drm/i915_drm.h tools/include/uapi/linux/if_link.h tools/include/uapi/linux/kvm.h All the changes are new ABI additions which don't impact their use in existing tooling. Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Richter 提交于
On Intel test case trace+probe_libc_inet_pton.sh succeeds and the output is: [root@f27 perf]# ./perf trace --no-syscalls -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms 0.000 probe_libc:inet_pton:(7fa40ac618a0)) __GI___inet_pton (/usr/lib64/libc-2.26.so) getaddrinfo (/usr/lib64/libc-2.26.so) main (/usr/bin/ping) The kernel stack unwinder is used, it is specified implicitly as call-graph=fp (frame pointer). On s390x only dwarf is available for stack unwinding. It is also done in user space. This requires different parameter setup and result checking for s390x and Intel. This patch adds separate perf trace setup and result checking for Intel and s390x. On s390x specify this command line to get a call-graph and handle the different call graph result checking: [root@s35lp76 perf]# ./perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.041 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.041/0.041/0.041/0.000 ms 0.000 probe_libc:inet_pton:(3ffb9942060)) __GI___inet_pton (/usr/lib64/libc-2.26.so) gaih_inet (inlined) __GI_getaddrinfo (inlined) main (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) _start (/usr/bin/ping) [root@s35lp76 perf]# Before: [root@s8360047 perf]# ./perf test -vv 58 58: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 26349 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms 0.000 probe_libc:inet_pton:(3ff925c2060)) test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! [root@s8360047 perf]# After: [root@s35lp76 perf]# ./perf test -vv 57 57: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 38708 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.038 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.038/0.038/0.038/0.000 ms 0.000 probe_libc:inet_pton:(3ff87342060)) __GI___inet_pton (/usr/lib64/libc-2.26.so) gaih_inet (inlined) __GI_getaddrinfo (inlined) main (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) _start (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok [root@s35lp76 perf]# On Intel the test case runs unchanged and succeeds. Signed-off-by: NThomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180117083831.101001-1-tmricht@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Sangwon Hong 提交于
Add the --force option to the man page. Signed-off-by: NSangwon Hong <qpakzk@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/1517831315-31490-1-git-send-email-qpakzk@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andy Shevchenko 提交于
Instead of home grown function let's use what library provides us. Signed-off-by: NAndriy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20180129130359.1490-1-andriy.shevchenko@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
The latency of perf_top__mmap_read() should be lower than refresh time. If not, give some hints to reduce the latency. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-18-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
perf_top__mmap_read() has a severe performance issue in the Knights Landing/Mill platform, when monitoring heavy load systems. It costs several minutes to finish, which is unacceptable. Currently, 'perf top' uses the non overwrite mode. For non overwrite mode, it tries to read everything in the ringbuffer and doesn't pause it. Once there are lots of samples delivered persistently, the processing time could be very long. Also, the latest samples could be lost when the ringbuffer is full. For overwrite mode, it takes a snapshot for the system by pausing the ringbuffer, which could significantly reduce the processing time. Also, the overwrite mode always keep the latest samples. Considering the real time requirement for 'perf top', the overwrite mode is more suitable for it. Actually, 'perf top' was overwrite mode. It is changed to non overwrite mode since commit 93fc64f1 ("perf top: Switch to non overwrite mode"). It's better to change it back to overwrite mode by default. For the kernel which doesn't support overwrite mode, it will fall back to non overwrite mode. There would be some records lost in overwrite mode because of pausing the ringbuffer. It has little impact for the accuracy of the snapshot and can be tolerated. For overwrite mode, unconditionally wait 100 ms before each snapshot. It also reduces the overhead caused by pausing ringbuffer, especially on light load system. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-17-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
There would be some records lost in overwrite mode because of pausing the ringbuffer. It has little impact for the accuracy of the snapshot and could be tolerated by 'perf top'. Remove the lost events checking. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-16-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
For overwrite mode, the ringbuffer will be paused. The event lost is expected. It needs a way to notify the browser not print the warning. It will be used later for perf top to disable lost event warning in overwrite mode. There is no behavior change for now. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-15-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Switch to non-overwrite mode if kernel doesnot support overwrite ringbuffer. It's only effect when overwrite mode is supported. No change to current behavior. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-14-git-send-email-kan.liang@intel.com [ Use perf_missing_features.write_backward instead of the non merged is_write_backward_fail() ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
As tools may need to adjust to missing features, as 'perf top' will, in the next csets, to cope with a missing 'write_backward' feature. Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-jelngl9q1ooaizvkcput9tic@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Per-event overwrite term is not forbidden in 'perf top', which can bring problems. Because 'perf top' only support non-overwrite mode now. Add new rules and check regarding to overwrite term for 'perf top'. - All events either have same per-event term or don't have per-event mode setting. Otherwise, it will error out. - Per-event overwrite term should be consistent as opts->overwrite. If not, updating the opts->overwrite according to per-event term. Make it possible to support either non-overwrite or overwrite mode. The overwrite mode is forbidden now, which will be removed when the overwrite mode is supported later. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-12-git-send-email-kan.liang@intel.com [ Renamed perf_top_overwrite_check to perf_top__overwrite_check, to follow existing convention ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Discards perf_mmap__read_backward() and perf_mmap__read_catchup(). No tools use them. There are tools still use perf_mmap__read_forward(). Keep it, but add comments to point to the new interface for future use. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-11-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Use the new perf_mmap__read_* interfaces for overwrite ringbuffer test. Commiter notes: Testing: [root@seventh ~]# perf test -v backward 48: Read backward ring buffer : --- start --- test child forked, pid 8309 Using CPUID GenuineIntel-6-9E mmap size 1052672B mmap size 8192B Finished reading overwrite ring buffer: rewind test child finished with 0 ---- end ---- Read backward ring buffer: Ok [root@seventh ~]# Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-10-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Except for 'perf record', the other perf tools read events one by one from the ring buffer using perf_mmap__read_forward(). But it only supports non-overwrite mode. Introduce perf_mmap__read_event() to support both non-overwrite and overwrite mode. Usage: perf_mmap__read_init() while(event = perf_mmap__read_event()) { //process the event perf_mmap__consume() } perf_mmap__read_done() It cannot use perf_mmap__read_backward(). Because it always reads the stale buffer which is already processed. Furthermore, the forward and backward concepts have been removed. The perf_mmap__read_backward() will be replaced and discarded later. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-9-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
The direction of overwrite mode is backward. The last perf_mmap__read() will set tail to map->prev. Need to correct the map->prev to head which is the end of next read. It will be used later. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-8-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
The 'start' and 'prev' variables are duplicates in perf_mmap__read(). Use 'map->prev' to replace 'start' in perf_mmap__read_*(). Suggested-by: NWang Nan <wangnan0@huawei.com> Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-7-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
Improve the readability by using meaningful enum (-EAGAIN, -EINVAL and 0) to replace the three returning states (0, -1 and 1). Suggested-by: NWang Nan <wangnan0@huawei.com> Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-6-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
The new function perf_mmap__read_init() is factored out from perf_mmap__push(). It is to calculate the 'start' and 'end' of the available data in ringbuffer. No functional change. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-5-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
The first assignment for 'start' and 'end' is redundant. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1516310792-208685-4-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
In perf_mmap__push(), the 'size' need to be recalculated, otherwise the invalid data might be pushed to the record in overwrite mode. The issue is introduced by commit 7fb4b407 ("perf mmap: Don't discard prev in backward mode"). When the ring buffer is full in overwrite mode, backward_rb_find_range() will be called to recalculate the 'start' and 'end'. The 'size' needs to be recalculated accordingly. Unconditionally recalculate the 'size', not just for full ring buffer in overwrite mode. Because: - There is no harmful to recalculate the 'size' for other cases. - The code of calculating 'start' and 'end' will be factored out later. The new function does not need to return 'size'. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 7fb4b407 ("perf mmap: Don't discard prev in backward mode") Link: http://lkml.kernel.org/r/1516310792-208685-3-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
perf_evlist__mmap_read_catchup() and perf_evlist__mmap_read_backward() are only for overwrite mode. But they read the evlist->mmap buffer which is for non-overwrite mode. It did not bring any serious problem yet, because there is no one use it. Remove the unused interfaces. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NWang Nan <wangnan0@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1516310792-208685-2-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 William Cohen 提交于
Add JSON metrics for ARM Cortex-A53 Processor. Unlike the Intel processors there isn't a script that automatically generated these files. The patch was manually generated from the documentation and the previous oprofile ARM Cortex ac53 event file patch I made. The relevant documentation is in the "12.9 Events" section of the ARM Cortex A53 MPCore Processor Revision: r0p4 Technical Reference Manual. The ARM Cortex A53 manual is available at: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0500g/DDI0500G_cortex_a53_trm.pdf Use that to look for additional information about the events. Signed-off-by: NWilliam Cohen <wcohen@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180131032813.9564-1-wcohen@redhat.com [ Added references provided by William Cohen ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 fixes from Ingo Molnar: "Misc fixes all across the map: - /proc/kcore vsyscall related fixes - LTO fix - build warning fix - CPU hotplug fix - Kconfig NR_CPUS cleanups - cpu_has() cleanups/robustification - .gitignore fix - memory-failure unmapping fix - UV platform fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages x86/error_inject: Make just_return_func() globally visible x86/platform/UV: Fix GAM Range Table entries less than 1GB x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page x86/Kconfig: Further simplify the NR_CPUS config x86/Kconfig: Simplify NR_CPUS config x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()" x86/cpufeature: Update _static_cpu_has() to use all named variables x86/cpufeature: Reindent _static_cpu_has()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar: "Here's the latest set of Spectre and PTI related fixes and updates: Spectre: - Add entry code register clearing to reduce the Spectre attack surface - Update the Spectre microcode blacklist - Inline the KVM Spectre helpers to get close to v4.14 performance again. - Fix indirect_branch_prediction_barrier() - Fix/improve Spectre related kernel messages - Fix array_index_nospec_mask() asm constraint - KVM: fix two MSR handling bugs PTI: - Fix a paranoid entry PTI CR3 handling bug - Fix comments objtool: - Fix paranoid_entry() frame pointer warning - Annotate WARN()-related UD2 as reachable - Various fixes - Add Add Peter Zijlstra as objtool co-maintainer Misc: - Various x86 entry code self-test fixes - Improve/simplify entry code stack frame generation and handling after recent heavy-handed PTI and Spectre changes. (There's two more WIP improvements expected here.) - Type fix for cache entries There's also some low risk non-fix changes I've included in this branch to reduce backporting conflicts: - rename a confusing x86_cpu field name - de-obfuscate the naming of single-TLB flushing primitives" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) x86/entry/64: Fix CR3 restore in paranoid_exit() x86/cpu: Change type of x86_cache_size variable to unsigned int x86/spectre: Fix an error message x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping selftests/x86/mpx: Fix incorrect bounds with old _sigfault x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]() x86/speculation: Add <asm/msr-index.h> dependency nospec: Move array_index_nospec() parameter checking into separate macro x86/speculation: Fix up array_index_nospec_mask() asm constraint x86/debug: Use UD2 for WARN() x86/debug, objtool: Annotate WARN()-related UD2 as reachable objtool: Fix segfault in ignore_unreachable_insn() selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory selftests/x86/pkeys: Remove unused functions selftests/x86: Clean up and document sscanf() usage selftests/x86: Fix vDSO selftest segfault for vsyscall=none x86/entry/64: Remove the unused 'icebp' macro ...
-
由 Ingo Molnar 提交于
Josh Poimboeuf noticed the following bug: "The paranoid exit code only restores the saved CR3 when it switches back to the user GS. However, even in the kernel GS case, it's possible that it needs to restore a user CR3, if for example, the paranoid exception occurred in the syscall exit path between SWITCH_TO_USER_CR3_STACK and SWAPGS." Josh also confirmed via targeted testing that it's possible to hit this bug. Fix the bug by also restoring CR3 in the paranoid_exit_no_swapgs branch. The reason we haven't seen this bug reported by users yet is probably because "paranoid" entry points are limited to the following cases: idtentry double_fault do_double_fault has_error_code=1 paranoid=2 idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK idtentry machine_check do_mce has_error_code=0 paranoid=1 Amongst those entry points only machine_check is one that will interrupt an IRQS-off critical section asynchronously - and machine check events are rare. The other main asynchronous entries are NMI entries, which can be very high-freq with perf profiling, but they are special: they don't use the 'idtentry' macro but are open coded and restore user CR3 unconditionally so don't have this bug. Reported-and-tested-by: NJosh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: NAndy Lutomirski <luto@kernel.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180214073910.boevmg65upbk3vqb@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-