- 23 6月, 2013 3 次提交
-
-
由 Dave Hansen 提交于
This patch has been invaluable in my adventures finding issues in the perf NMI handler. I'm as big a fan of printk() as anybody is, but using printk() in NMIs is deadly when they're happening frequently. Even hacking in trace_printk() ended up eating enough CPU to throw off some of the measurements I was making. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: acme@ghostprotocols.net Cc: Dave Hansen <dave@sr71.net> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Hansen 提交于
This patch keeps track of how long perf's NMI handler is taking, and also calculates how many samples perf can take a second. If the sample length times the expected max number of samples exceeds a configurable threshold, it drops the sample rate. This way, we don't have a runaway sampling process eating up the CPU. This patch can tend to drop the sample rate down to level where perf doesn't work very well. *BUT* the alternative is that my system hangs because it spends all of its time handling NMIs. I'll take a busted performance tool over an entire system that's busted and undebuggable any day. BTW, my suspicion is that there's still an underlying bug here. Using the HPET instead of the TSC is definitely a contributing factor, but I suspect there are some other things going on. But, I can't go dig down on a bug like that with my machine hanging all the time. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: acme@ghostprotocols.net Cc: Dave Hansen <dave@sr71.net> [ Prettified it a bit. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Hansen 提交于
I have a system which is causing all kinds of problems. It has 8 NUMA nodes, and lots of cores that can fight over cachelines. If things are not working _perfectly_, then NMIs can take longer than expected. If we get too many of them backed up to each other, we can easily end up in a situation where we are doing nothing *but* running NMIs. The biggest problem, though, is that this happens _silently_. You might be lucky to get an hrtimer warning, but most of the time system simply hangs. This patch should at least give us some warning before we fall off the cliff. the warnings look like this: nmi_handle: perf_event_nmi_handler() took: 26095071 ns The message is triggered whenever we notice the longest NMI we've seen to date. You can always view and reset this value via the debugfs interface if you like. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: acme@ghostprotocols.net Cc: Dave Hansen <dave@sr71.net> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 20 6月, 2013 1 次提交
-
-
由 Masami Hiramatsu 提交于
Fix arch_prepare_kprobe() to handle failures in copy instruction correctly. This fix is related to the previous fix: 8101376d which made __copy_instruction return an error result if failed, but caller site was not updated to handle it. Thus, this is the other half of the bugfix. This fix is also related to the following bug-report: https://bugzilla.redhat.com/show_bug.cgi?id=910649Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Tested-by: NJonathan Lebon <jlebon@redhat.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: systemtap@sourceware.org Cc: yrl.pp-manager.tt@hitachi.com Link: http://lkml.kernel.org/r/20130605031216.15285.2001.stgit@mhiramat-M0-7522Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 6月, 2013 11 次提交
-
-
由 Andi Kleen 提交于
mem-loads is basically the same as Sandy Bridge, but we use a separate string for changes later. Haswell doesn't support the full precise store mode, so we emulate it using the "DataLA" facility. This allows to do everything, but for data sources we can only detect L1 hit or not. There is no explicit enable bit anymore, so we have to tie it to a perf internal only flag. The address is supported for all memory related PEBS events with DataLA. Instead of only logging for the load and store events we allow logging it for all (it will be simply 0 if the current event does not support it) Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Andi Kleen <ak@linux.jf.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1371515812-9646-7-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
Haswell has two additional LBR from flags for TSX: in_tx and abort_tx, implemented as a new "v4" version of the LBR format. Handle those in and adjust the sign extension code to still correctly extend. The flags are exported similarly in the LBR record to the existing misprediction flag Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Andi Kleen <ak@linux.jf.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1371515812-9646-6-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
This avoids some problems with spurious PMIs on Haswell. Haswell seems to behave more like P4 in this regard. Do the same thing as the P4 perf handler by unmasking the NMI only at the end. Shouldn't make any difference for earlier family 6 cores. (Tested on Haswell, IvyBridge, Westmere, Saltwell (Atom).) Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Andi Kleen <ak@linux.jf.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1371515812-9646-5-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
Add simple PEBS support for Haswell. The constraints are similar to SandyBridge with a few new events. Reviewed-by: NStephane Eranian <eranian@google.com> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Andi Kleen <ak@linux.jf.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1371515812-9646-4-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
Similar to SandyBridge, but has a few new events and two new counter bits. There are some new counter flags that need to be prevented from being set on fixed counters, and allowed to be set for generic counters. Also we add support for the counter 2 constraint to handle all raw events. (Contains fixes from Stephane Eranian.) Reviewed-by: NStephane Eranian <eranian@google.com> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Andi Kleen <ak@linux.jf.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1371515812-9646-3-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
Add support for the Haswell extended (fmt2) PEBS format. It has a superset of the nhm (fmt1) PEBS fields, but has a longer record so we need to adjust the code paths. The main advantage is the new "EventingRip" support which directly gives the instruction, not off-by-one instruction. So with precise == 2 we use that directly and don't try to use LBRs and walking basic blocks. This lowers the overhead of using precise significantly. Some other features are added in later patches. Reviewed-by: NStephane Eranian <eranian@google.com> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Andi Kleen <ak@linux.jf.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1371515812-9646-2-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1370421025-10986-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Suravee Suthikulpanit 提交于
Implement a perf PMU to handle IOMMU performance counters and events. The PMU only supports counting mode (e.g. perf stat). Since the counters are shared across all cores, the PMU is implemented as "system-wide" mode. To invoke the AMD IOMMU PMU, issue a perf tool command such as: ./perf stat -a -e amd_iommu/<events>/ <command> or: ./perf stat -a -e amd_iommu/config=<config-data>,config1=<config1-data>/ <command> For example: ./perf stat -a -e amd_iommu/mem_trans_total/ <command> The resulting count will be how many IOMMU total peripheral memory operations were performed during the command execution window. Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1370466709-3212-3-git-send-email-suravee.suthikulpanit@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Hansen 提交于
intel_pmu_handle_irq() has a warning in it if it does too many loops. It is a WARN_ONCE(), but the perf_event_print_debug() call beneath it is unconditional. For the first warning, you get a nice backtrace and message, but subsequent ones just dump the PMU state with no leading messages. I doubt this is what was intended. This patch will only print the PMU state when paired with the WARN_ON() text. It effectively open-codes WARN_ONCE()'s one-time-only logic. My suspicion is that the code really just wants to make sure we do not sit in the loop and spit out a warning for every loop iteration after the 100th. From what I've seen, this is very unlikely to happen since we also clear the PMU state. After this patch, instead of seeing the PMU state dumped each time, you will just see: [57494.894540] perf_event_intel: clearing PMU state on CPU#129 [57579.539668] perf_event_intel: clearing PMU state on CPU#10 [57587.137762] perf_event_intel: clearing PMU state on CPU#134 [57623.039912] perf_event_intel: clearing PMU state on CPU#114 [57644.559943] perf_event_intel: clearing PMU state on CPU#118 ... Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130530174559.0DB049F4@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andrew Hunter 提交于
x86_schedule_events() caches event constraints on the stack during scheduling. Given the number of possible events, this is 512 bytes of stack; since it can be invoked under schedule() under god-knows-what, this is causing stack blowouts. Trade some space usage for stack safety: add a place to cache the constraint pointer to struct perf_event. For 8 bytes per event (1% of its size) we can save the giant stack frame. This shouldn't change any aspect of scheduling whatsoever and while in theory the locality's a tiny bit worse, I doubt we'll see any performance impact either. Tested: `perf stat whatever` does not blow up and produces results that aren't hugely obviously wrong. I'm not sure how to run particularly good tests of perf code, but this should not produce any functional change whatsoever. Signed-off-by: NAndrew Hunter <ahh@google.com> Reviewed-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1369332423-4400-1-git-send-email-ahh@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Stephane Eranian 提交于
This patch fixes broken support of PEBS-LL on SNB-EP/IVB-EP. For some reason, the LDLAT extra reg definition for snb_ep showed up as duplicate in the snb table. This patch moves the definition of LDLAT back into the snb_ep table. Thanks to Don Zickus for tracking this one down. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130607212210.GA11849@quadSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 28 5月, 2013 6 次提交
-
-
由 Peter Zijlstra 提交于
Josh reported that his QEMU is a bad hardware emulator and trips a WARN in the AMD PMU init code. He requested the WARN be turned into a pr_err() or similar. While there, rework the code a little. Reported-by: NJosh Boyer <jwboyer@redhat.com> Acked-by: NRobert Richter <rric@kernel.org> Acked-by: NJacob Shin <jacob.shin@amd.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130521110537.GG26912@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Stephane Eranian 提交于
This patch moves commit 7cc23cd6 to the generic code: perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL The check is now implemented in generic code instead of x86 specific code. That way we do not have to repeat the test in each arch supporting branch sampling. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20130521105337.GA2879@quadSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dan Carpenter 提交于
We're trying to use 64 bit masks but the shifts wrap so we can't use the high 32 bits. I've fixed this by changing several types to unsigned long long. This is a static checker fix. The one change which is clearly needed is "mask = 0xff << (idx * 8);" where the author obviously intended to use all 64 bits. The other changes are mostly to silence my static checker. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20130518183452.GA14587@elgon.mountainSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jiri Olsa 提交于
Merging EFLAGS bit clearing into a single statement, to ensure EFLAGS bits are being cleared in a single instruction. Signed-off-by: NJiri Olsa <jolsa@redhat.com> Tested-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com> Originally-Reported-by: NVince Weaver <vincent.weaver@maine.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Stephane Eranian <eranian@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1367421944-19082-4-git-send-email-jolsa@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jiri Olsa 提交于
Clearing RF EFLAGS bit for signal handler. The reason is that this flag is set by debug exception code to prevent the recursive exception entry. Leaving it set for signal handler might prevent debug exception of the signal handler itself. Signed-off-by: NJiri Olsa <jolsa@redhat.com> Tested-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com> Originally-Reported-by: NVince Weaver <vincent.weaver@maine.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Stephane Eranian <eranian@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1367421944-19082-3-git-send-email-jolsa@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jiri Olsa 提交于
While porting Vince's perf overflow tests I found perf event breakpoint overflow does not work properly. I found the x86 RF EFLAG bit not being set when returning from debug exception after triggering signal handler. Which is exactly what you get when you set perf breakpoint overflow SIGIO handler. This patch and the next two patches fix the underlying bugs. This patch adds the RF EFLAGS bit to be restored on return from signal from the original register context before the signal was entered. This will prevent the RF flag to disappear when returning from exception due to the signal handler being executed. Signed-off-by: NJiri Olsa <jolsa@redhat.com> Tested-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com> Originally-Reported-by: NVince Weaver <vincent.weaver@maine.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Stephane Eranian <eranian@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1367421944-19082-2-git-send-email-jolsa@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 5月, 2013 2 次提交
-
-
由 Linus Torvalds 提交于
In commit 78d77df7 ("x86-64, init: Do not set NX bits on non-NX capable hardware") we added the early_pmd_flags that gets the NX bit set when a CPU supports NX. However, the new variable was marked __initdata, because the main _use_ of this is in an __init routine. However, the bit setting happens from secondary_startup_64(), which is called not only at bootup, but on every secondary CPU start. Including resuming from STR and at CPU hotplug time. So the value cannot be __initdata. Reported-bisected-and-tested-by: NMichal Hocko <mhocko@suse.cz> Cc: stable@vger.kernel.org # v3.9 Acked-by: NPeter Anvin <hpa@linux.intel.com> Cc: Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bjorn Helgaas 提交于
This reverts commit dd72be99. Andy Shevchenko <andy.shevchenko@gmail.com> reported that this commit broke Intel Medfield devices. Reference: https://lkml.kernel.org/r/CAHp75Vdf6gFZChS47=grUygHBDWcoOWDYPzw+Zj5bdVCWj85Jw@mail.gmail.comReported-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 15 5月, 2013 1 次提交
-
-
由 John Stultz 提交于
Kay Sievers noted that the ALWAYS_USE_PERSISTENT_CLOCK config, which enables some minor compile time optimization to avoid uncessary code in mostly the suspend/resume path could cause problems for userland. In particular, the dependency for RTC_HCTOSYS on !ALWAYS_USE_PERSISTENT_CLOCK, which avoids setting the time twice and simplifies suspend/resume, has the side effect of causing the /sys/class/rtc/rtcN/hctosys flag to always be zero, and this flag is commonly used by udev to setup the /dev/rtc symlink to /dev/rtcN, which can cause pain for older applications. While the udev rules could use some work to be less fragile, breaking userland should strongly be avoided. Additionally the compile time optimizations are fairly minor, and the code being optimized is likely to be reworked in the future, so lets revert this change. Reported-by: NKay Sievers <kay@vrfy.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: stable <stable@vger.kernel.org> #3.9 Cc: Feng Tang <feng.tang@intel.com> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Link: http://lkml.kernel.org/r/1366828376-18124-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 10 5月, 2013 4 次提交
-
-
由 Bjorn Helgaas 提交于
We now cache the MSI-X capability offset in the struct pci_dev, so no need to find the capability again. Acked-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Bjorn Helgaas 提交于
PCI_MSIX_FLAGS_BIRMASK is mis-named because the BIR mask is in the Table Offset register, not the flags ("Message Control" per spec) register. Acked-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Zhang Yanfei 提交于
Two sets of comments were lost during patch-series shuffling: - comments for init_range_memory_mapping() - comments in init_mem_mapping that is helpful for reminding people that the pagetable is setup top-down The comments were written by Yinghai in his patch in: https://lkml.org/lkml/2012/11/28/620 This patch reintroduces them. Originally-From: Yinghai Lu <yinghai@kernel.org> Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/518BC776.7010506@gmail.com [ Tidied it all up a bit. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 09 5月, 2013 5 次提交
-
-
由 Paolo Bonzini 提交于
This is an almost-undocumented instruction available in 32-bit mode. I say "almost" undocumented because AMD documents it in their opcode maps just to say that it is unavailable in 64-bit mode (sections "A.2.1 One-Byte Opcodes" and "B.3 Invalid and Reassigned Instructions in 64-Bit Mode"). It is roughly equivalent to "sbb %al, %al" except it does not set the flags. Use fastop to emulate it, but do not use the opcode directly because it would fail if the host is 64-bit! Reported-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@vger.kernel.org # 3.9 Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Paolo Bonzini 提交于
This is used by SGABIOS, KVM breaks with emulate_invalid_guest_state=1. It is just a MOV in disguise, with a funny source address. Reported-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@vger.kernel.org # 3.9 Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Paolo Bonzini 提交于
This is used by SGABIOS, KVM breaks with emulate_invalid_guest_state=1. AAM needs the source operand to be unsigned; do the same in AAD as well for consistency, even though it does not affect the result. Reported-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@vger.kernel.org # 3.9 Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Konrad Rzeszutek Wilk 提交于
This can easily be triggered if a new CPU is added (via ACPI hotplug mechanism) and from user-space you do: echo 1 > /sys/devices/system/cpu/cpu3/online (or wait for UDEV to do it) on a newly appeared physical CPU. The deadlock is that the "store_online" in drivers/base/cpu.c takes the cpu_hotplug_driver_lock() lock, then calls "cpu_up". "cpu_up" eventually ends up calling "save_mc_for_early" which also takes the cpu_hotplug_driver_lock() lock. And here is that lockdep thinks of it: smpboot: Stack at about ffff880075c39f44 smpboot: CPU3: has booted. microcode: CPU3 sig=0x206a7, pf=0x2, revision=0x25 ============================================= [ INFO: possible recursive locking detected ] 3.9.0upstream-10129-g167af0e #1 Not tainted --------------------------------------------- sh/2487 is trying to acquire lock: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20 but task is already holding lock: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(x86_cpu_hotplug_driver_mutex); lock(x86_cpu_hotplug_driver_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 6 locks held by sh/2487: #0: (sb_writers#5){.+.+.+}, at: [<ffffffff811ca48d>] vfs_write+0x17d/0x190 #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff812464ef>] sysfs_write_file+0x3f/0x160 #2: (s_active#20){.+.+.+}, at: [<ffffffff81246578>] sysfs_write_file+0xc8/0x160 #3: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20 #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810961c2>] cpu_maps_update_begin+0x12/0x20 #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810962a7>] cpu_hotplug_begin+0x27/0x60 Suggested-and-Acked-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: fenghua.yu@intel.com Cc: xen-devel@lists.xensource.com Cc: stable@vger.kernel.org # for v3.9 Link: http://lkml.kernel.org/r/1368029583-23337-1-git-send-email-konrad.wilk@oracle.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Gleb Natapov 提交于
The invalid guest state emulation loop does not check halt_request which causes 100% cpu loop while guest is in halt and in invalid state, but more serious issue is that this leaves halt_request set, so random instruction emulated by vm86 #GP exit can be interpreted as halt which causes guest hang. Fix both problems by handling halt_request in emulation loop. Reported-by: NTomas Papan <tomas.papan@gmail.com> Tested-by: NTomas Papan <tomas.papan@gmail.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> CC: stable@vger.kernel.org Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 08 5月, 2013 4 次提交
-
-
由 Zhenzhong Duan 提交于
On x2apic enabled pvm, doing sysrq+l, got NULL pointer dereference as below. SysRq : Show backtrace of all active CPUs BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8125e3cb>] memcpy+0xb/0x120 Call Trace: [<ffffffff81039633>] ? __x2apic_send_IPI_mask+0x73/0x160 [<ffffffff8103973e>] x2apic_send_IPI_all+0x1e/0x20 [<ffffffff8103498c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0 [<ffffffff81501be4>] ? _raw_spin_lock_irqsave+0x34/0x50 [<ffffffff8131654e>] sysrq_handle_showallcpus+0xe/0x10 [<ffffffff8131616d>] __handle_sysrq+0x7d/0x140 [<ffffffff81316230>] ? __handle_sysrq+0x140/0x140 [<ffffffff81316287>] write_sysrq_trigger+0x57/0x60 [<ffffffff811ca996>] proc_reg_write+0x86/0xc0 [<ffffffff8116dd8e>] vfs_write+0xce/0x190 [<ffffffff8116e3e5>] sys_write+0x55/0x90 [<ffffffff8150a242>] system_call_fastpath+0x16/0x1b That's because apic points to apic_x2apic_cluster or apic_x2apic_phys but the basic element like cpumask isn't initialized. Mask x2APIC feature in pvm to avoid overwrite of apic pointer, update commit message per Konrad's suggestion. Signed-off-by: NZhenzhong Duan <zhenzhong.duan@oracle.com> Tested-by: NTamon Shiose <tamon.shiose@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
During review of git commit cb9c6f15 ("xen/spinlock: Check against default value of -1 for IRQ line.") Stefano pointed out a bug in the patch. Unfortunatly due to vacation timing the fix was not applied and this patch fixes it up. Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Konrad Rzeszutek Wilk 提交于
As it will point to some data, but not event channel data (the shared_info has an array limited to 32). This means that for PVHVM guests with more than 32 VCPUs without the usage of VCPUOP_register_info any interrupts to VCPUs larger than 32 would have gone unnoticed during early bootup. That is OK, as during early bootup, in smp_init we end up calling the hotplug mechanism (xen_hvm_cpu_notify) which makes the VCPUOP_register_vcpu_info call for all VCPUs and we can receive interrupts on VCPUs 33 and further. This is just a cleanup. Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Marcelo Tosatti 提交于
Emulation of xcr0 writes zero guest_xcr0_loaded variable so that subsequent VM-entry reloads CPU's xcr0 with guests xcr0 value. However, this is incorrect because guest_xcr0_loaded variable is read to decide whether to reload hosts xcr0. In case the vcpu thread is scheduled out after the guest_xcr0_loaded = 0 assignment, and scheduler decides to preload FPU: switch_to { __switch_to __math_state_restore restore_fpu_checking fpu_restore_checking if (use_xsave()) fpu_xrstor_checking xrstor64 with CPU's xcr0 == guests xcr0 Fix by properly restoring hosts xcr0 during emulation of xcr0 writes. Analyzed-by: NUlrich Obergfell <uobergfe@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 07 5月, 2013 3 次提交
-
-
由 Thomas Gleixner 提交于
The core code expects the arch idle code to return with interrupts enabled. The conversion missed two x86 cases which fail to do that. Reported-and-tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Tested-by: NBorislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305021557030.3972@ionosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Michel Lespinasse 提交于
modify __down_write[_nested] and __down_write_trylock to grab the write lock whenever the active count is 0, even if there are queued waiters (they must be writers pending wakeup, since the active count is 0). Note that this is an optimization only; architectures without this optimization will still work fine: - __down_write() would take the slow path which would take the wait_lock and then try stealing the lock (as in the spinlocked rwsem implementation) - __down_write_trylock() would fail, but callers must be ready to deal with that - since there are some writers pending wakeup, they could have raced with us and obtained the lock before we steal it. Signed-off-by: NMichel Lespinasse <walken@google.com> Reviewed-by: NPeter Hurley <peter@hurleysoftware.com> Acked-by: NDavidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Konrad Rzeszutek Wilk 提交于
They are important structures and it is not clear at first look what they are for. The xen_vcpu is a pointer. By default it points to the shared_info structure (at the CPU offset location). However if the VCPUOP_register_vcpu_info hypercall is implemented we can make the xen_vcpu pointer point to a per-CPU location. Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> [v1: Added comments from Ian Campbell] Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-