- 31 7月, 2012 2 次提交
-
-
由 Peter Zijlstra 提交于
Some PMUs don't provide a full register set for their sample, specifically 'advanced' PMUs like AMD IBS and Intel PEBS which provide 'better' than regular interrupt accuracy. In this case we use the interrupt regs as basis and over-write some fields (typically IP) with different information. The perf core however uses user_mode() to distinguish user/kernel samples, user_mode() relies on regs->cs. If the interrupt skid pushed us over a boundary the new IP might not be in the same domain as the interrupt. Commit ce5c1fe9 ("perf/x86: Fix USER/KERNEL tagging of samples") tried to fix this by making the perf core use kernel_ip(). This however is wrong (TM), as pointed out by Linus, since it doesn't allow for VM86 and non-zero based segments in IA32 mode. Therefore, provide a new helper to set the regs->ip field, set_linear_ip(), which massages the regs into a suitable state assuming the provided IP is in fact a linear address. Also modify perf_instruction_pointer() and perf_callchain_user() to deal with segments base offsets. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twinsSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andrew Morton 提交于
i386 allmodconfig: arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_hrtimer': arch/x86/kernel/cpu/perf_event_intel_uncore.c:728: warning: integer overflow in expression arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function 'uncore_pmu_start_hrtimer': arch/x86/kernel/cpu/perf_event_intel_uncore.c:735: warning: integer overflow in expression Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-h84qlqj02zrojmxxybzmy9hi@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 30 7月, 2012 14 次提交
-
-
由 Oleg Nesterov 提交于
Like do_wp_page(), __replace_page() should do munlock_vma_page() for the case when the old page still has other !VM_LOCKED mappings. Unfortunately this needs mm/internal.h. Also, move put_page() outside of ptl lock. This doesn't really matter but looks a bit better. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182249.GA20372@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
1. vma_address() returns loff_t, this looks confusing and this is unnecessary after the previous change. Make it return "ulong", all callers truncate the result anyway. 2. Its name conflicts with mm/rmap.c:vma_address(), rename it to offset_to_vaddr(), this matches vaddr_to_offset(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182247.GA20365@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
1. register_for_each_vma() checks that vma_address() == vaddr, but this is not enough. We should also ensure that vaddr >= vm_start, find_vma() guarantees "vaddr < vm_end" only. 2. After the prevous changes, register_for_each_vma() is the only reason why vma_address() has to return loff_t, all other users know that we have the valid mapping at this offset and thus the overflow is not possible. Change the code to use vaddr_to_offset() instead, imho this looks more clean/understandable and now we can change vma_address(). 3. While at it, remove the unnecessary type-cast. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182244.GA20362@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
Add the new helper, vaddr_to_offset(vma, vaddr) which returns the offset in vma->vm_file this vaddr is mapped at. Change build_probe_list() and find_active_uprobe() to use the new helper, the next patch adds another user. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182242.GA20355@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
Currently build_probe_list() builds the list of all uprobes attached to the given inode, and the caller should filter out those who don't fall into the [start,end) range, this is sub-optimal. This patch turns find_least_offset_node() into find_node_in_range() which returns the first node inside the [min,max] range, and changes build_probe_list() to use this node as a starting point for rb_prev() and rb_next() to find all other nodes the caller needs. The resulting list is no longer sorted but we do not care. This can speed up both build_probe_list() and the callers, but there is another reason to introduce find_node_in_range(). It can be used to figure out whether the given vma has uprobes or not, this will be needed soon. While at it, shift INIT_LIST_HEAD(tmp_list) into build_probe_list(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182240.GA20352@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
Remove insert_vm_struct()->uprobe_mmap(). It is not needed, nobody except arch/ia64/kernel/perfmon.c uses insert_vm_struct(vma) with vma->vm_file != NULL. And it is wrong. Again, get_user_pages() can not succeed before vma_link(vma) makes is visible to find_vma(). And even if this worked, we must not insert the new bp before this mapping is visible to vma_prio_tree_foreach() for uprobe_unregister(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182238.GA20349@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
Remove copy_vma()->uprobe_mmap(new_vma), it is absolutely wrong. This new_vma was just initialized to represent the new unmapped area, [vm_start, vm_end) was returned by get_unmapped_area() in the caller. This means that uprobe_mmap()->get_user_pages() will fail for sure, simply because find_vma() can never succeed. And I verified that sys_mremap()->mremap_to() indeed always fails with the wrong ENOMEM code if [addr, addr+old_len] is probed. And why this uprobe_mmap() was added? I believe the intent was wrong. Note that the caller is going to do move_page_tables(), all registered uprobes are already faulted in, we only change the virtual addresses. NOTE: However, somehow we need to close the race with uprobe_register() which relies on map_info->vaddr. This needs another fix I'll try to do later. Probably we need uprobe_mmap() in move_vma() but we can not do this right now, this can confuse uprobes_state.counter (which I still hope we are going to kill). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182236.GA20342@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
vma->vm_pgoff is "unsigned long", it should be promoted to loff_t before the multiplication to avoid the overflow. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182233.GA20339@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
uprobe_munmap() does get_user_pages() and it is also called from the final mmput()->exit_mmap() path. This slows down exit/mmput() for no reason, and I think it is simply dangerous/wrong to try to fault-in a page into the dying mm. If nothing else, this happens after the last sync_mm_rss(), afaics handle_mm_fault() can change the task->rss_stat and make the subsequent check_mm() unhappy. Change uprobe_munmap() to check mm->mm_users != 0. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182231.GA20336@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
The bug was introduced by me in 449d0d7c ("uprobes: Simplify the usage of uprobe->pending_list"). Yes, we do not care about uprobe->pending_list after return and nobody can remove the current list entry, but put_uprobe(uprobe) can actually free it and thus we need list_for_each_safe(). Reported-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Link: http://lkml.kernel.org/r/20120729182229.GA20329@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
The comment above write_opcode()->lock_page(old_page) tells about the race with do_wp_page(). I don't really understand which exactly race it means, but afaics this lock_page() was not enough to close all races with do_wp_page(). Anyway, since: 77fc4af1 uprobes: Change register_for_each_vma() to take mm->mmap_sem for writing this code is always called with ->mmap_sem held for writing, so we can forget about do_wp_page(). However, we can't simply remove this lock_page(), and the only (afaics) reason is __replace_page()->try_to_free_swap(). Nothing in write_opcode() needs it, move it into __replace_page() and fix the comment. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182220.GA20322@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
write_opcode() does lock_page(new_page) for no reason. Nobody can see this page until __replace_page() exposes it under ptl lock, and we do nothing with this page after pte_unmap_unlock(). If nothing else, the similar code in do_wp_page() doesn't lock the new page for page_add_new_anon_rmap/set_pte_at_notify. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182218.GA20315@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
page_address_in_vma(old_page) in __replace_page() is ugly and wrong. The caller already knows the correct virtual address, this page was found by get_user_pages(vaddr). However, page_address_in_vma() can actually fail if page->mapping was cleared by __delete_from_page_cache() after get_user_pages() returns. But this means the race with page reclaim, write_opcode() should not fail, it should retry and read this page again. Probably the race with remove_mapping() is not possible due to page_freeze_refs() logic, but afaics at least shmem_writepage()->shmem_delete_from_page_cache() can clear ->mapping. We could change __replace_page() to return -EAGAIN in this case, but it would be better to simply use the caller's vaddr and rely on page_check_address(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182216.GA20311@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
write_opcode() rechecks valid_vma() and ->f_mapping, this is pointless. The caller, register_for_each_vma() or uprobe_mmap(), has already done these checks under mmap_sem. To clarify, uprobe_mmap() checks valid_vma() only, but we can rely on build_probe_list(vm_file->f_mapping->host). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com> Cc: Anton Arapov <anton@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120729182212.GA20304@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 26 7月, 2012 9 次提交
-
-
由 Jovi Zhang 提交于
When CONFIG_PERF_EVENTS disabled, there will have a compiliation error, because missing struct before structure name. Signed-off-by: NJovi Zhang <bookjovi@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jiri Kosina <trivial@kernel.org> Link: http://lkml.kernel.org/r/CACV3sbKF%3DCX%2B2jWEWesfCA6rBoQ3wDM4-5ac9MuBtVbCtMRHdQ@mail.gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Yan, Zheng 提交于
The event control register of SNB-EP uncore QPI box has a one bit extension at bit position 21. Reported-by: NStephane Eranian <eranian@google.com> Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1343097850-4348-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Fix: arch/x86/kernel/cpu/perf_event.h:377:43: sparse: dubious one-bit signed bitfield Cc: Borislav Petkov <bp@amd64.org> Reported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-2jxkmktkppkclj1qe6qxd7ah@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Yan, Zheng 提交于
LLC-* and node-* events require using the OFFCORE_RESPONSE events on SandyBridge, but the hw_cache_extra_regs is left uninitialized. This patch adds the missing extra register configure table for SandyBridge. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1342517275-2875-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Yan, Zheng 提交于
The uncore subsystem in Nehalem-EX consists of 7 components (U-Box, C-Box, B-Box, S-Box, R-Box, M-Box and W-Box). This patch is large because the way to program these boxes is diverse. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FF534F1.3030307@intel.com [ Improved the code. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Yan, Zheng 提交于
The format definition of uncore PCU filter should be filter_band* instead of filter_brand*. Reported-by: NStephane Eranian <eranian@google.com> Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1343024611-4692-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andrew Vagin 提交于
Otherwise they can't be filtered for a defined task: perf record -e sched:sched_switch ./foo This command doesn't report any events without this patch. I think it isn't a security concern if someone knows who will be executed next - this can already be observed by polling /proc state. By default perf is disabled for non-root users in any case. I need these events for profiling sleep times. sched_switch is used for getting callchains and sched_stat_* is used for getting time periods. These events are combined in user space, then it can be analyzed by perf tools. Signed-off-by: NAndrew Vagin <avagin@openvz.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arun Sharma <asharma@fb.com> Link: http://lkml.kernel.org/r/1342088069-1005148-1-git-send-email-avagin@openvz.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core fixes and improvements from Arnaldo Carvalho de Melo: * libtraceevent build fixes from Namhyung Kim * Prevent overflow when calculating annotation buckets size, from Cody Schafer * Set breakpoint events sample period to 1, just like trace events, from Jovi Zhang * Fix strerror_r usage, from Kirill Shutemov * Fix duplicate function declaration problems with bison 2.6, from Kirill Shutemov * Add DSO caching facility (and perf test entry) to speed up binary file reading, will be used by the DWARF unwind code, from Jiri Olsa * Fix trace event storms by setting PERF_SAMPLE_PERIOD, from Frederic Weisbecker * perf kvm fixes from David Ahern Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Cody Schafer 提交于
A large enough symbol size causes an overflow in the size parameter to the histogram allocation, leading to a segfault in symbol__inc_addr_samples later on when this histogram is accessed. In the case of being called via perf-report, this returns back and gracefully ignores the sample, eventually ignoring the chained return value of perf_session_deliver_event in flush_sample_queue. Signed-off-by: NCody Schafer <cody@linux.vnet.ibm.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1342753525-4521-1-git-send-email-cody@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 7月, 2012 15 次提交
-
-
由 Namhyung Kim 提交于
The TRACEEVENT-CFLAGS file is used to detect any change on compiler flags. Just ignore it. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341559297-25725-3-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Cross compiling perf requires setting ARCH and CROSS_COMPILE variables, but libtraceevent couldn't detect the changes so it ends up believing no recompiling is required. Thus the linker failed like: LINK perf ../lib/traceevent//libtraceevent.a: member ../lib/traceevent//libtraceevent.a(event-parse.o) in archive is not an object collect2: ld returned 1 exit status make: *** [perf] Error 1 This patch fixes this by adding TRACEEVENT-CFLAGS file like PERF-CFLAGS to track those changes. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341559297-25725-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kirill A. Shutemov 提交于
Bison 2.6 started to generate parse_events_parse() declaration in header. In this case we have redundant redeclaration: util/parse-events.c:29:5: error: redundant redeclaration of ‘parse_events_parse’ [-Werror=redundant-decls] In file included from util/parse-events.c:14:0: util/parse-events-bison.h:99:5: note: previous declaration of ‘parse_events_parse’ was here cc1: all warnings being treated as errors Let's disable -Wredundant-decls for util/parse-events.c since it includes header we can't control. Signed-off-by: NKirill A. Shutemov <kirill@shutemov.name> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120723210407.GA25186@shutemov.nameSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kirill A. Shutemov 提交于
Perf uses GNU-specific version of strerror_r(). The GNU-specific strerror_r() returns a pointer to a string containing the error message. This may be either a pointer to a string that the function stores in buf, or a pointer to some (immutable) static string (in which case buf is unused). In glibc-2.16 GNU version was marked with attribute warn_unused_result. It triggers few warnings in perf: util/target.c: In function ‘perf_target__strerror’: util/target.c:114:13: error: ignoring return value of ‘strerror_r’, declared with attribute warn_unused_result [-Werror=unused-result] ui/browsers/hists.c: In function ‘hist_browser__dump’: ui/browsers/hists.c:981:13: error: ignoring return value of ‘strerror_r’, declared with attribute warn_unused_result [-Werror=unused-result] They are bugs. Let's fix strerror_r() usage. Signed-off-by: NKirill A. Shutemov <kirill@shutemov.name> Acked-by: NUlrich Drepper <drepper@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/20120723210654.GA25248@shutemov.name [ committer note: s/assert/BUG_ON/g ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jovi Zhang 提交于
There have one problem about hw_breakpoint perf event, as watched, the events reported to userspace is not correctly, sometime one trigger bp_event report several events, sometime bp_event cannot go through to user. The root cause is attr->freq is 1 passed to kernel defaultly in bp events, this make kernel calculate event period not as expect, make sample period to 1 will change attr->freq to 0, to fix this problem. This patch is similar with commit f92128 about tracepoint events: perf: Make the trace events sample period default to 1 Signed-off-by: NJovi Zhang <bookjovi@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/CACV3sbLF8taiCq_VYW-sgRJyupeMzg58C7ZXfMe3xZUiH_Mx6w@mail.gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding automated test for DSO data reading. Testing raw/cached reads from different file/cache locations. Signed-off-by: NJiri Olsa <jolsa@redhat.com> Cc: Arun Sharma <asharma@fb.com> Cc: Benjamin Redelings <benjamin.redelings@nescent.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/1342959280-5361-18-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding dso data caching so we don't need to open/read/close, each time we want dso data. The DSO data caching affects following functions: dso__data_read_offset dso__data_read_addr Each DSO read tries to find the data (based on offset) inside the cache. If it's not present it fills the cache from file, and returns the data. If it is present, data are returned with no file read. Each data read is cached by reading cache page sized/aligned amount of DSO data. The cache page size is hardcoded to 4096. The cache is using RB tree with file offset as a sort key. Signed-off-by: NJiri Olsa <jolsa@redhat.com> Cc: Arun Sharma <asharma@fb.com> Cc: Benjamin Redelings <benjamin.redelings@nescent.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/1342959280-5361-17-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding following interface for DSO object to allow reading of DSO image data: dso__data_fd - opens DSO and returns file descriptor Binary types are used to locate/open DSO in following order: DSO_BINARY_TYPE__BUILD_ID_CACHE DSO_BINARY_TYPE__SYSTEM_PATH_DSO In other word we first try to open DSO build-id path, and if that fails we try to open DSO system path. dso__data_read_offset - reads DSO data from specified offset dso__data_read_addr - reads DSO data from specified address/map. Signed-off-by: NJiri Olsa <jolsa@redhat.com> Cc: Arun Sharma <asharma@fb.com> Cc: Benjamin Redelings <benjamin.redelings@nescent.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/1342959280-5361-11-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding interface to access DSOs so it could be used from another place. New DSO binary type is added - making current SYMTAB__* types more general: DSO_BINARY_TYPE__* = SYMTAB__* Following function is added to return path based on the specified binary type: dso__binary_type_file Signed-off-by: NJiri Olsa <jolsa@redhat.com> Cc: Arun Sharma <asharma@fb.com> Cc: Benjamin Redelings <benjamin.redelings@nescent.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/1342959280-5361-10-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Frederic Weisbecker 提交于
Tiny cosmetic fix. The lack of a newline between hists callchains was looking slightly messy. Before: 0.24% swapper [kernel.kallsyms] [k] _raw_spin_lock_irq | --- _raw_spin_lock_irq run_timer_softirq __do_softirq call_softirq do_softirq irq_exit smp_apic_timer_interrupt apic_timer_interrupt default_idle amd_e400_idle cpu_idle start_secondary 0.10% perf [kernel.kallsyms] [k] lock_is_held | --- lock_is_held __might_sleep mutex_lock_nested perf_event_for_each_child perf_ioctl do_vfs_ioctl sys_ioctl system_call_fastpath ioctl cmd_record run_builtin main __libc_start_main After: 0.24% swapper [kernel.kallsyms] [k] _raw_spin_lock_irq | --- _raw_spin_lock_irq run_timer_softirq __do_softirq call_softirq do_softirq irq_exit smp_apic_timer_interrupt apic_timer_interrupt default_idle amd_e400_idle cpu_idle start_secondary 0.10% perf [kernel.kallsyms] [k] lock_is_held | --- lock_is_held __might_sleep mutex_lock_nested perf_event_for_each_child perf_ioctl do_vfs_ioctl sys_ioctl system_call_fastpath ioctl cmd_record run_builtin main __libc_start_main Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1342631456-7233-3-git-send-email-fweisbec@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Frederic Weisbecker 提交于
Trace events have a period (weight) of 1 by default. This can be overriden on events definition by using the __perf_count() macro. For example, the sched_stat_runtime() is weighted with the runtime of the task that fired the event. By default, perf handles such weighted event by dividing it into individual events carrying a weight of 1. For example if sched_stat_runtime is fired and the task has run 5000000 nsecs, perf divides it into 5000000 events in the buffer. This behaviour makes weighted events unusable because they quickly fullfill the buffers and we lose most events. The commit 5d81e5cf ("events: Don't divide events if it has field period") solves this problem by sending only one event when PERF_SAMPLE_PERIOD flag is set. The weight is carried in the sample itself such that we don't need to demultiplex it anymore. This patch provides the last missing piece to use this feature by setting PERF_SAMPLE_PERIOD from perf tools when we deal with trace events. Before: $ ./perf record -e sched:* -a sleep 1 [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 1.619 MB perf.data (~70749 samples) ] Warning: Processed 16909 events and lost 1 chunks! Check IO/CPU overload! $ ./perf script perf 1894 [003] 824.898327: sched_migrate_task: comm=perf pid=1898 prio=120 orig_cpu=2 dest_cpu=0 perf 1894 [003] 824.898335: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns] perf 1894 [003] 824.898336: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns] perf 1894 [003] 824.898337: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns] perf 1894 [003] 824.898338: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns] perf 1894 [003] 824.898339: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns] perf 1894 [003] 824.898340: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns] perf 1894 [003] 824.898341: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns] [...] After: $ ./perf record -e sched:* -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.074 MB perf.data (~3228 samples) ] $ ./perf script perf 1461 [000] 554.286957: sched_migrate_task: comm=perf pid=1465 prio=120 orig_cpu=3 dest_cpu=1 perf 1461 [000] 554.286964: sched_stat_sleep: comm=perf pid=1465 delay=133047190 [ns] perf 1461 [000] 554.286967: sched_wakeup: comm=perf pid=1465 prio=120 success=1 target_cpu=001 swapper 0 [001] 554.286976: sched_stat_wait: comm=perf pid=1465 delay=0 [ns] swapper 0 [001] 554.286983: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf [...] Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1342631456-7233-1-git-send-email-fweisbec@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Frederic Weisbecker 提交于
Include the omitted number of characters printed for the first entry. Not that it really matters because nobody seem to care about the number of printed characters for now. But just in case. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1342631456-7233-2-git-send-email-fweisbec@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 David Ahern 提交于
Adds the attributes to the event line in the header dump. From: event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, ... to event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 0, precise_ip = 0, ... Signed-off-by: NDavid Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1342826756-64663-8-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 David Ahern 提交于
After 7ed97ad4 use of the guestmount option without a subdir for *each* VM generates an error message for each sample related to that VM. Once per VM is enough. Signed-off-by: NDavid Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1342826756-64663-7-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 David Ahern 提交于
Guest kernel symbols are not resolved despite passing the information needed to resolve them. e.g., perf kvm --guest --guestmount=/tmp/guest-mount record -a -- sleep 1 perf kvm --guest --guestmount=/tmp/guest-mount report --stdio 36.55% [guest/11399] [unknown] [g] 0xffffffff81600bc8 33.19% [guest/10474] [unknown] [g] 0x00000000c0116e00 30.26% [guest/11094] [unknown] [g] 0xffffffff8100a288 43.69% [guest/10474] [unknown] [g] 0x00000000c0103d90 37.38% [guest/11399] [unknown] [g] 0xffffffff81600bc8 12.24% [guest/11094] [unknown] [g] 0xffffffff810aa91d 6.69% [guest/11094] [unknown] [u] 0x00007fa784d721c3 which is just pathetic. After a maddening 2 days sifting through perf minutia I found it -- id_hdr_size is not initialized for guest machines. This shows up on the report side as random garbage for the cpu and timestamp, e.g., 29816 7310572949125804849 0x1ac0 [0x50]: PERF_RECORD_MMAP ... That messes up the sample sorting such that synthesized guest maps are processed last. With this patch you get a much more helpful report: 12.11% [guest/11399] [guest.kernel.kallsyms.11399] [g] irqtime_account_process_tick 10.58% [guest/11399] [guest.kernel.kallsyms.11399] [g] run_timer_softirq 6.95% [guest/11094] [guest.kernel.kallsyms.11094] [g] printk_needs_cpu 6.50% [guest/11094] [guest.kernel.kallsyms.11094] [g] do_timer 6.45% [guest/11399] [guest.kernel.kallsyms.11399] [g] idle_balance 4.90% [guest/11094] [guest.kernel.kallsyms.11094] [g] native_read_tsc ... v2: - changed rbtree walk to use rb_first per Namhyung's suggestion Tested-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NDavid Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1342826756-64663-5-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-