- 27 5月, 2015 13 次提交
-
-
由 Peter Zijlstra 提交于
Lockdep is very good at finding incorrect IRQ state while locking and is far better at telling us if we hold a lock than the _is_locked() API. It also generates less code for !DEBUG kernels. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
For some obscure reason the current code accounts the current SMT thread's state on the remote thread and reads the remote's state on the local SMT thread. While internally consistent, and 'correct' its pointless confusion we can do without. Flip them the right way around. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Matt Fleming 提交于
Since we write RMID values to MSRs the correct type to use is 'u32' because that clearly articulates we're writing a hardware register value. Fix up all uses of RMID in this code to consistently use the correct data type. Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Will Auld <will.auld@intel.com> Link: http://lkml.kernel.org/r/1432285182-17180-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
'closid' (CLass Of Service ID) is used for the Class based Cache Allocation Technology (CAT). Add explicit storage to the per cpu cache for it, so it can be used later with the CAT support (requires to move the per cpu data). While at it: - Rename the structure to intel_pqr_state which reflects the actual purpose of the struct: cache values which go into the PQR MSR - Rename 'cnt' to rmid_usecnt which reflects the actual purpose of the counter. - Document the structure and the struct members. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NMatt Fleming <matt.fleming@intel.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Will Auld <will.auld@intel.com> Link: http://lkml.kernel.org/r/20150518235150.240899319@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
intel_cqm_event_del() is a 1:1 wrapper for intel_cqm_event_stop(). Remove the useless indirection. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NMatt Fleming <matt.fleming@intel.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Will Auld <will.auld@intel.com> Link: http://lkml.kernel.org/r/20150518235150.159779847@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
If the usage counter is non-zero there is no point to update the rmid in the PQR MSR. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NMatt Fleming <matt.fleming@intel.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Will Auld <will.auld@intel.com> Link: http://lkml.kernel.org/r/20150518235150.080844281@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
'struct intel_cqm_state' is a strict per CPU cache of the rmid and the usage counter. It can never be modified from a remote CPU. The three functions which modify the content: intel_cqm_event[start|stop|del] (del maps to stop) are called from the perf core with interrupts disabled which is enough protection for the per CPU state values. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NMatt Fleming <matt.fleming@intel.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Will Auld <will.auld@intel.com> Link: http://lkml.kernel.org/r/20150518235150.001006529@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
'int' is really not a proper data type for an MSR. Use u32 to make it clear that we are dealing with a 32-bit unsigned hardware value. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NMatt Fleming <matt.fleming@intel.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Will Auld <will.auld@intel.com> Link: http://lkml.kernel.org/r/20150518235149.919350144@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
The CQM code acts like it owns the PQR MSR completely. That's not true because only the lower 10 bits are used for CQM. The upper 32 bits are used for the 'CLass Of Service ID' (CLOSID). Document the abuse. Will be fixed in a later patch. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NMatt Fleming <matt.fleming@intel.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Will Auld <will.auld@intel.com> Link: http://lkml.kernel.org/r/20150518235149.823214798@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Don Zickus 提交于
I stumbled upon an AMD box that had the BIOS using a hardware performance counter. Instead of printing out a warning and continuing, it failed and blocked further perf counter usage. Looking through the history, I found this commit: a5ebe0ba ("perf/x86: Check all MSRs before passing hw check") which tweaked the rules for a Xen guest on an almost identical box and now changed the behaviour. Unfortunately the rules were tweaked incorrectly and will always lead to MSR failures even though the MSRs are completely fine. What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists(): <snip> for (i = 0; i < x86_pmu.num_counters; i++) { reg = x86_pmu_config_addr(i); ret = rdmsrl_safe(reg, &val); if (ret) goto msr_fail; if (val & ARCH_PERFMON_EVENTSEL_ENABLE) { bios_fail = 1; val_fail = val; reg_fail = reg; } } <snip> /* * Read the current value, change it and read it back to see if it * matches, this is needed to detect certain hardware emulators * (qemu/kvm) that don't trap on the MSR access and always return 0s. */ reg = x86_pmu_event_addr(0); ^^^^ if the first perf counter is enabled, then this routine will always fail because the counter is running. :-( if (rdmsrl_safe(reg, &val)) goto msr_fail; val ^= 0xffffUL; ret = wrmsrl_safe(reg, val); ret |= rdmsrl_safe(reg, &val_new); if (ret || val != val_new) goto msr_fail; The above bios_fail used to be a 'goto' which is why it worked in the past. Further, most vendors have migrated to using fixed counters to hide their evilness hence this problem rarely shows up now days except on a few old boxes. I fixed my problem and kept the spirit of the original Xen fix, by recording a safe non-enable register to be used safely for the reading/writing check. Because it is not enabled, this passes on bare metal boxes (like metal), but should continue to throw an msr_fail on Xen guests because the register isn't emulated yet. Now I get a proper bios_fail error message and Xen should still see their msr_fail message (untested). Signed-off-by: NDon Zickus <dzickus@redhat.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: george.dunlap@eu.citrix.com Cc: konrad.wilk@oracle.com Link: http://lkml.kernel.org/r/1431976608-56970-1-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Alexander Shishkin 提交于
Currently, pt_buffer_reset_markers() is a difficult to read knot of arithmetics with a redundant check for multiple-entry TOPA capability, a commented out wakeup marker placement and a logical error wrt to stop marker placement. The latter happens when write head is not page aligned and results in stop marker being placed one page earlier than it actually should. All these problems only affect PT implementations that support multiple-entry TOPA tables (read: proper scatter-gather). For single-entry TOPA implementations, there is no functional impact. This patch deals with all of the above. Tested on both single-entry and multiple-entry TOPA PT implementations. Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: adrian.hunter@intel.com Cc: hpa@zytor.com Link: http://lkml.kernel.org/r/1432308626-18845-4-git-send-email-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
The (SNB/IVB/HSW) HT bug only affects events that can be programmed onto GP counters, therefore we should only limit the number of GP counters that can be used per cpu -- iow we should not constrain the FP counters. Furthermore, we should only enfore such a limit when there are in fact exclusive events being scheduled on either sibling. Reported-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Fixed build fail for the !CONFIG_CPU_SUP_INTEL case. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Commit 43b45780 ("perf/x86: Reduce stack usage of x86_schedule_events()") violated the rule that 'fake' scheduling; as used for event/group validation; should not change the event state. This went mostly un-noticed because repeated calls of x86_pmu::get_event_constraints() would give the same result. And x86_pmu::put_event_constraints() would mostly not do anything. Commit e979121b ("perf/x86/intel: Implement cross-HT corruption bug workaround") made the situation much worse by actually setting the event->hw.constraint value to NULL, so when validation and actual scheduling interact we get NULL ptr derefs. Fix it by removing the constraint pointer from the event and move it back to an array, this time in cpuc instead of on the stack. validate_group() x86_schedule_events() event->hw.constraint = c; # store <context switch> perf_task_event_sched_in() ... x86_schedule_events(); event->hw.constraint = c2; # store ... put_event_constraints(event); # assume failure to schedule intel_put_event_constraints() event->hw.constraint = NULL; <context switch end> c = event->hw.constraint; # read -> NULL if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref This in particular is possible when the event in question is a cpu-wide event and group-leader, where the validate_group() tries to add an event to the group. Reported-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Hunter <ahh@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 43b45780 ("perf/x86: Reduce stack usage of x86_schedule_events()") Fixes: e979121b ("perf/x86/intel: Implement cross-HT corruption bug workaround") Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 5月, 2015 2 次提交
-
-
由 Stephane Eranian 提交于
This patch enables the uncore Memory Controller (IMC) PMU support for Intel Broadwell-U (Model 61) mobile processors. The IMC PMU enables measuring memory bandwidth. To use with perf: $ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/ sleep 10 Tested-by: NSonny Rao <sonnyrao@chromium.org> Signed-off-by: NStephane Eranian <eranian@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kan.liang@intel.com Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/20150423065642.GA4890@thinkpadSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Stephane Eranian 提交于
This patch enables RAPL counters (energy consumption counters) support for Intel Broadwell-U processors (Model 61): To use: $ perf stat -a -I 1000 -e power/energy-cores/,power/energy-pkg/,power/energy-ram/ sleep 10 Signed-off-by: NStephane Eranian <eranian@google.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jacob.jun.pan@linux.intel.com Cc: kan.liang@intel.com Cc: peterz@infradead.org Cc: sonnyrao@chromium.org Link: http://lkml.kernel.org/r/20150423070709.GA4970@thinkpadSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 08 5月, 2015 1 次提交
-
-
由 Kan Liang 提交于
iTLB-load-misses and LLC-load-misses count incorrectly on SLM. There is no ITLB.MISSES support on SLM. Event PAGE_WALKS.I_SIDE_WALK should be used to count iTLB-load-misses. This event counts when an instruction (I) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks. DMND_DATA_RD counts both demand and DCU prefetch data reads. However, LLC-load-misses should only count demand reads. There is no way to not include prefetches with a single counter on SLM. So the LLC-load-misses support should be removed on SLM. Signed-off-by: NKan Liang <kan.liang@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1429608881-5055-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 5月, 2015 3 次提交
-
-
由 Bobby Powers 提交于
The following commit: f893959b ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()") removed drop_init_fpu() usage from flush_thread(). This seems to break things for me - the Go 1.4 test suite fails all over the place with floating point comparision errors (offending commit found through bisection). The functional change was that flush_thread() after this commit only calls restore_init_xstate() when both use_eager_fpu() and !used_math() are true. drop_init_fpu() (now fpu_reset_state()) calls restore_init_xstate() regardless of whether current used_math() - apply the same logic here. Switch used_math() -> tsk_used_math(tsk) to consistently use the grabbed tsk instead of current, like in the rest of flush_thread(). Tested-by: NDave Hansen <dave.hansen@intel.com> Signed-off-by: NBobby Powers <bobbypowers@gmail.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: f893959b ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()") Link: http://lkml.kernel.org/r/1430147441-9820-1-git-send-email-bobbypowers@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Marc Dionne 提交于
Commit 75182b16 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()") changed current_thread_info to use this_cpu_sp0, and indirectly made it rely on init_tss which was exported with EXPORT_PER_CPU_SYMBOL_GPL. As a result some macros and inline functions such as set/get_fs, test_thread_flag and variants have been made unusable for external modules. Make cpu_tss exported with EXPORT_PER_CPU_SYMBOL so that these functions are accessible again, as they were previously. Signed-off-by: NMarc Dionne <marc.dionne@your-file-system.com> Acked-by: NAndy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/1430763404-21221-1-git-send-email-marc.dionne@your-file-system.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Boris Ostrovsky 提交于
Commit 61f01dd9 ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue") makes AMD processors set SS to __KERNEL_DS in __switch_to() to deal with cases when SS is NULL. This breaks Xen PV guests who do not want to load SS with__KERNEL_DS. Since the problem that the commit is trying to address would have to be fixed in the hypervisor (if it in fact exists under Xen) there is no reason to set X86_BUG_SYSRET_SS_ATTRS flag for PV VPCUs here. This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features op which will clear this flag. (And since this structure is no longer HVM-specific we should do some renaming). Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: NSander Eikelenboom <linux@eikelenboom.it> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
- 27 4月, 2015 2 次提交
-
-
由 Paolo Bonzini 提交于
This reverts commits 0a4e6be9 and 80f7fdb1. The task migration notifier was originally introduced in order to support the pvclock vsyscall with non-synchronized TSC, but KVM only supports it with synchronized TSC. Hence, on KVM the race condition is only needed due to a bad implementation on the host side, and even then it's so rare that it's mostly theoretical. As far as KVM is concerned it's possible to fix the host, avoiding the additional complexity in the vDSO and the (re)introduction of the task migration notifier. Xen, on the other hand, hasn't yet implemented vsyscall support at all, so we do not care about its plans for non-synchronized TSC. Reported-by: NPeter Zijlstra <peterz@infradead.org> Suggested-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andy Lutomirski 提交于
AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET with SS == 0 results in an invalid usermode state in which SS is apparently equal to __USER_DS but causes #SS if used. Work around the issue by setting SS to __KERNEL_DS __switch_to, thus ensuring that SYSRET never happens with SS set to NULL. This was exposed by a recent vDSO cleanup. Fixes: e7d6eefa x86/vdso32/syscall.S: Do not load __USER32_DS to %ss Signed-off-by: NAndy Lutomirski <luto@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Brian Gerst <brgerst@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 4月, 2015 3 次提交
-
-
由 Sonny Rao 提交于
This keeps all the related PCI IDs together in the driver where they are used. Signed-off-by: NSonny Rao <sonnyrao@chromium.org> Acked-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1429644791-25724-1-git-send-email-sonnyrao@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Sonny Rao 提交于
perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs This uncore is the same as the Haswell desktop part but uses a different PCI ID. Signed-off-by: NSonny Rao <sonnyrao@chromium.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1429569247-16697-1-git-send-email-sonnyrao@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jiri Olsa 提交于
The core_pmu does not define cpu_* callbacks, which handles allocation of 'struct cpu_hw_events::shared_regs' data, initialization of debug store and PMU_FL_EXCL_CNTRS counters. While this probably won't happen on bare metal, virtual CPU can define x86_pmu.extra_regs together with PMU version 1 and thus be using core_pmu -> using shared_regs data without it being allocated. That could could leave to following panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8152cd4f>] _spin_lock_irqsave+0x1f/0x40 SNIP [<ffffffff81024bd9>] __intel_shared_reg_get_constraints+0x69/0x1e0 [<ffffffff81024deb>] intel_get_event_constraints+0x9b/0x180 [<ffffffff8101e815>] x86_schedule_events+0x75/0x1d0 [<ffffffff810586dc>] ? check_preempt_curr+0x7c/0x90 [<ffffffff810649fe>] ? try_to_wake_up+0x24e/0x3e0 [<ffffffff81064ba2>] ? default_wake_function+0x12/0x20 [<ffffffff8109eb16>] ? autoremove_wake_function+0x16/0x40 [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90 [<ffffffff811a9517>] ? __d_lookup+0xa7/0x150 [<ffffffff8119db5f>] ? do_lookup+0x9f/0x230 [<ffffffff811a993a>] ? dput+0x9a/0x150 [<ffffffff8119c8f5>] ? path_to_nameidata+0x25/0x60 [<ffffffff8119e90a>] ? __link_path_walk+0x7da/0x1000 [<ffffffff8101d8f9>] ? x86_pmu_add+0xb9/0x170 [<ffffffff8101d7a7>] x86_pmu_commit_txn+0x67/0xc0 [<ffffffff811b07b0>] ? mntput_no_expire+0x30/0x110 [<ffffffff8119c731>] ? path_put+0x31/0x40 [<ffffffff8107c297>] ? current_fs_time+0x27/0x30 [<ffffffff8117d170>] ? mem_cgroup_get_reclaim_stat_from_page+0x20/0x70 [<ffffffff8111b7aa>] group_sched_in+0x13a/0x170 [<ffffffff81014a29>] ? sched_clock+0x9/0x10 [<ffffffff8111bac8>] ctx_sched_in+0x2e8/0x330 [<ffffffff8111bb7b>] perf_event_sched_in+0x6b/0xb0 [<ffffffff8111bc36>] perf_event_context_sched_in+0x76/0xc0 [<ffffffff8111eb3b>] perf_event_comm+0x1bb/0x2e0 [<ffffffff81195ee9>] set_task_comm+0x69/0x80 [<ffffffff81195fe1>] setup_new_exec+0xe1/0x2e0 [<ffffffff811ea68e>] load_elf_binary+0x3ce/0x1ab0 Adding cpu_(prepare|starting|dying) for core_pmu to have shared_regs data allocated for core_pmu. AFAICS there's no harm to initialize debug store and PMU_FL_EXCL_CNTRS either for core_pmu. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20150421152623.GC13169@krava.redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 4月, 2015 1 次提交
-
-
由 Ingo Molnar 提交于
Dan Carpenter reported that pt_event_add() has buggy error handling logic: it returns 0 instead of -EBUSY when it fails to start a newly added event. Furthermore, the control flow in this function is messy, with cleanup labels mixed with direct returns. Fix the bug and clean up the code by converting it to a straight fast path for the regular non-failing case, plus a clear sequence of cascading goto labels to do all cleanup. NOTE: I materially changed the existing clean up logic in the pt_event_start() failure case to use the direct perf_aux_output_end() path, not pt_event_del(), because perf_aux_output_end() is enough here. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150416103830.GB7847@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 17 4月, 2015 5 次提交
-
-
由 Borislav Petkov 提交于
So I was playing with gdb today and did this simple thing: gdb /bin/ls ... (gdb) run Box exploded with this splat: BUG: unable to handle kernel NULL pointer dereference at 00000000000001d0 IP: [<ffffffff8100fe5a>] xstateregs_get+0x7a/0x120 [...] Call Trace: ptrace_regset ptrace_request ? wait_task_inactive ? preempt_count_sub arch_ptrace ? ptrace_get_task_struct SyS_ptrace system_call_fastpath ... because we do cache &target->thread.fpu.state->xsave into the local variable xsave but that pointer is NULL at that time and it gets initialized later, in init_fpu(), see: e7f180dc ("x86/fpu: Change xstateregs_get()/set() to use ->xsave.i387 rather than ->fxsave") The fix is simple: load xsave *after* init_fpu() has run. Also do the same in xstateregs_set(), as suggested by Oleg Nesterov. Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tavis Ormandy <taviso@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1429209697-5902-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
Same as Haswell, Broadwell also support the LBR callstack. Signed-off-by: NKan Liang <kan.liang@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NAndi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1427962377-40955-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jacob Pan 提交于
RAPL energy hardware unit can vary within a single CPU package, e.g. HSW server DRAM has a fixed energy unit of 15.3 uJ (2^-16) whereas the unit on other domains can be enumerated from power unit MSR. There might be other variations in the future, this patch adds per cpu model quirk to allow special handling of certain cpus. hw_unit is also removed from per cpu data since it is not per cpu and the sampling rate for energy counter is typically not high. Without this patch, DRAM domain on HSW servers will be counted 4x higher than the real energy counter. Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NStephane Eranian <eranian@google.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1427405325-780-1-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Ingo reported that cycles:pp didn't work for him on some machines. It turns out that in this commit: af4bdcf6 perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events Andi forgot to explicitly allow that event when he disabled event flags for PEBS on those uarchs. Reported-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: af4bdcf6 ("perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events") Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Somehow we ended up with overlapping flags when merging the RDPMC control flag - this is bad, fix it. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 16 4月, 2015 2 次提交
-
-
由 Oleg Nesterov 提交于
When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: NEvan Teran <eteran@alum.rit.edu> Reported-by: NPedro Alves <palves@redhat.com> Tested-by: NAndres Freund <andres@anarazel.de> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Joe Perches 提交于
The seq_printf return value, because it's frequently misused, will eventually be converted to void. See: commit 1f33c41c ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public") Signed-off-by: NJoe Perches <joe@perches.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 4月, 2015 2 次提交
-
-
由 Kirill A. Shutemov 提交于
We would want to use number of page table level to define mm_struct. Let's expose it as CONFIG_PGTABLE_LEVELS. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ulrich Obergfell 提交于
Have kvm_guest_init() use hardlockup_detector_disable() instead of watchdog_enable_hardlockup_detector(false). Remove the watchdog_hardlockup_detector_is_enabled() and the watchdog_enable_hardlockup_detector() function which are no longer needed. Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 4月, 2015 1 次提交
-
-
由 Richard Weinberger 提交于
As execution domain support is gone we can remove signal translation from the signal code and remove exec_domain from thread_info. Signed-off-by: NRichard Weinberger <richard@nod.at>
-
- 12 4月, 2015 1 次提交
-
-
由 Ingo Molnar 提交于
Dan Carpenter pointed out that the control flow in pt_pmu_hw_init() is a bit messy: for example the kfree(de_attrs) is entirely superfluous. Another problem is the inconsistent mixing of label based and direct return error handling. Add modern, label based error handling instead and clean up the code a bit as well. Note that we'll still do a kfree(NULL) in the normal case - this does not matter as this is an init path and kfree() returns early if it sees a NULL. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20150409090805.GG17605@mwandaSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 4月, 2015 4 次提交
-
-
由 Denys Vlasenko 提交于
I don't see why we report e.g. orix_ax, which is not always meaningful, but don't report ax, which is meaningful. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-4-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
user_64bit_mode(regs) basically checks regs->cs to point to a 64-bit segment. This check used to be unreliable here because regs->cs was not always correct in syscalls. Now regs->cs is always correct: in syscalls, in interrupts, in exceptions. No need to emply heuristics here. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-3-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
Yes, it is true that cx contains return address. It's not clear why we trash it. Stop doing that. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-2-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Denys Vlasenko 提交于
After recent changes to syscall entry points, user_regs->{cs,ss,sp} are always correct. (They used to be undefined while in syscalls). We can report them reliably, without guessing. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1428671219-29341-1-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-