- 18 March 2011: 29 commits
-
-
Committed by Xiao Guangrong

Only remove write access in the last sptes.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Lai Jiangshan

Use EFER_SCE, EFER_LME and EFER_LMA instead of magic numbers.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
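As a hedged illustration of this kind of cleanup (the helpers below are hypothetical; only the EFER_* constants come from the kernel's msr-index.h), replacing raw bit masks with the named EFER bits reads like this:

```c
#include <asm/msr-index.h>	/* EFER_SCE, EFER_LME, EFER_LMA */

/*
 * Hypothetical helper: clear the long-mode bits from an EFER value.
 * Before the cleanup this would have read efer & ~0x500ULL, leaving
 * the reader to decode what bits 8 (LME) and 10 (LMA) mean.
 */
static inline u64 efer_clear_long_mode(u64 efer)
{
	return efer & ~((u64)EFER_LME | (u64)EFER_LMA);
}

/* Likewise for the SYSCALL-enable bit (bit 0), formerly a bare 1: */
static inline bool efer_sce_enabled(u64 efer)
{
	return efer & EFER_SCE;
}
```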
-
Committed by Lai Jiangshan

The hash array of async gfns may still contain some leftover gfns after kvm_clear_async_pf_completion_queue() is called; they need to be cleared as well.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Gleb Natapov

Currently the vm86 task is initialized on each real mode entry and on vcpu reset. Initialization is done by zeroing the TSS and updating the relevant fields. But since all vcpus use the same TSS, there is a race where one vcpu may use the TSS while another vcpu is initializing it, so the vcpu using the TSS will see wrong TSS content and behave incorrectly. Fix that by initializing the TSS only once.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Gleb Natapov

When rmode.vm86 is active, the TR descriptor is updated with vm86 task values, but the selector is left intact. vmx_set_segment() makes sure that if the TR register is written to while vm86 is active, the new values are saved for use after vm86 is deactivated; but since the selector is not updated on vm86 activation/deactivation, the new value is lost. Fix this by writing the new selector into the vmcs immediately.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Lai Jiangshan

The changelog of 104f226b said "adds the __noclone attribute", but the attribute was missing from its patch. I think it is still needed.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
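For context, a hedged sketch of what the attribute buys (the body is elided; vmx_vcpu_run() is the function commit 104f226b discussed, and the rationale is paraphrased from that discussion):

```c
#include <linux/compiler.h>	/* __noclone */

/*
 * Sketch only: __noclone prevents gcc from emitting specialized
 * clones of the function. The VM entry path contains inline asm that
 * behaves roughly like setjmp/longjmp, so a compiler-generated clone
 * with a different frame layout could miscompile it.
 */
static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
{
	/* ... VMLAUNCH/VMRESUME inline asm elided ... */
}
```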
-
Committed by Jan Kiszka

Code under this lock requires non-preemptibility. Ensure this also holds on -rt by converting it to a raw spinlock.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
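A minimal sketch of the conversion pattern, with hypothetical names (the commit itself just switches an existing lock's type):

```c
#include <linux/spinlock.h>

/*
 * Sketch, not the actual lock from the patch: on PREEMPT_RT a plain
 * spinlock_t becomes a sleeping lock, so sections that must run
 * non-preemptibly switch to raw_spinlock_t, which remains a true
 * spinning lock even on -rt.
 */
static DEFINE_RAW_SPINLOCK(example_lock);	/* was: DEFINE_SPINLOCK(example_lock) */

static void example_critical_section(void)
{
	raw_spin_lock(&example_lock);		/* was: spin_lock(...) */
	/* work that must not be preempted */
	raw_spin_unlock(&example_lock);		/* was: spin_unlock(...) */
}
```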
-
Committed by Gleb Natapov

isr_ack logic was added by e4825800 to avoid unnecessary IPIs. Back then it made sense, but now the code checks that the vcpu is ready to accept an interrupt before sending an IPI, so this logic is no longer needed. This patch removes it, fixing a regression with Debian/Hurd.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reported-and-tested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Joseph Cihula

This patch fixes the logic used to detect whether the BIOS has disabled VMX, for the case where VMX is enabled only under SMX but tboot is not active.

Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
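A hedged sketch of the corrected check (the FEATURE_CONTROL_* names follow the kernel's msr-index.h conventions, but treat the exact condition as illustrative rather than the patch verbatim):

```c
#include <asm/msr.h>
#include <asm/msr-index.h>
#include <linux/tboot.h>

static int vmx_disabled_by_bios(void)
{
	u64 msr;

	rdmsrl(MSR_IA32_FEATURE_CONTROL, msr);
	if (!(msr & FEATURE_CONTROL_LOCKED))
		return 0;	/* unlocked: KVM may enable VMX itself */

	/* Running under tboot: VMX-inside-SMX must be allowed. */
	if (tboot_enabled())
		return !(msr & FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX);

	/*
	 * Not under tboot: VMX-outside-SMX must be allowed. A BIOS that
	 * enables VMX only inside SMX has effectively disabled it here,
	 * which is exactly the case the old logic got wrong.
	 */
	return !(msr & FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX);
}
```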
-
Committed by Jan Kiszka

Code under this lock requires non-preemptibility. Ensure this also holds on -rt by converting it to a raw spinlock (the same pattern as the sketch above).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Avi Kivity

When we enable an NMI window, we ask for an IRET intercept, since the IRET re-enables NMIs. However, the IRET intercept happens before the instruction executes, while the NMI window architecturally opens afterwards. To compensate for this mismatch, we only open the NMI window in the following exit, assuming that the IRET has by then executed; however, this assumption is not always correct: we may exit due to a host interrupt or page fault without having executed the instruction. Fix by checking for forward progress, recording and comparing the IRET's rip. This is somewhat of a hack, since an unchanging rip does not mean that no forward progress has been made, but it is the simplest fix for now.

Signed-off-by: Avi Kivity <avi@redhat.com>
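A hedged sketch of the rip check on the SVM side (field names such as nmi_iret_rip are assumptions that approximate, but may not exactly match, the patch):

```c
/* On the IRET intercept, remember where the guest trapped. */
static int iret_interception(struct vcpu_svm *svm)
{
	clr_intercept(svm, INTERCEPT_IRET);
	svm->vcpu.arch.hflags |= HF_IRET_MASK;
	svm->nmi_iret_rip = kvm_rip_read(&svm->vcpu);	/* assumed field */
	return 1;
}

/*
 * Hypothetical helper: on a later exit, only treat NMIs as unmasked
 * if rip has moved, i.e. the IRET plausibly executed.
 */
static void svm_complete_iret(struct vcpu_svm *svm)
{
	if ((svm->vcpu.arch.hflags & HF_IRET_MASK) &&
	    kvm_rip_read(&svm->vcpu) != svm->nmi_iret_rip)
		svm->vcpu.arch.hflags &= ~(HF_NMI_MASK | HF_IRET_MASK);
}
```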
-
Committed by Avi Kivity

The interrupt injection logic looks something like

  if an nmi is pending, and nmi injection allowed
      inject nmi

  if an nmi is pending
      request exit on nmi window

The problem is that "nmi is pending" can be set asynchronously by the PIT; if it happens to fire between the two if statements, we will request an nmi window even though nmi injection is allowed. On SVM, this has disastrous results, since it causes eflags.TF to be set in random guest code. The fix is simple: make nmi_pending synchronous using the standard vcpu->requests mechanism; this ensures the code above is completely synchronous wrt nmi_pending.

Signed-off-by: Avi Kivity <avi@redhat.com>
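A hedged sketch of the vcpu->requests pattern the fix leans on (the exact call sites differ, and process_nmi_request is a hypothetical name, but the shape is standard KVM):

```c
/*
 * Asynchronous producers (e.g. the PIT) no longer poke nmi_pending
 * directly; they raise a request bit and kick the vcpu out of guest
 * mode.
 */
void kvm_inject_nmi(struct kvm_vcpu *vcpu)
{
	kvm_make_request(KVM_REQ_NMI, vcpu);
	kvm_vcpu_kick(vcpu);
}

/*
 * The vcpu thread folds the request into nmi_pending at one
 * well-defined point on the entry path, so both "if" checks in the
 * injection logic above observe a stable value.
 */
static void process_nmi_request(struct kvm_vcpu *vcpu)
{
	if (kvm_check_request(KVM_REQ_NMI, vcpu))
		vcpu->arch.nmi_pending = true;
}
```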
-
Committed by Avi Kivity

Use the new support in the emulator, and drop the ad-hoc code in x86.c.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity

Mark some instructions as vendor specific, and allow the caller to request emulation only of vendor specific instructions. This is useful in some circumstances (e.g. responding to a #UD fault).

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity

x86_decode_insn() doesn't return X86EMUL_* values, so the check for X86EMUL_PROPAGATE_FAULT will always fail. There is a proper check later on, so no replacement for this code is needed.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Jan Kiszka

This warning was once used for debugging QEMU user space. Though uncommon, it is actually possible to send an INIT request to a running VCPU. So better to drop this warning before someone misuses it to flood the kernel logs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Glauber Costa

When a vcpu is reset, the kvmclock page keeps being written to, even to this day. This is wrong and inconsistent: a cpu reset should take it back to its initial state.

Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by john cooper

A correction to Intel cpu model CPUID data (patch queued) caused winxp to BSOD when booted with a Penryn model. This was traced to the CPUID "model" field correction from 6 -> 23 (as is proper for a Penryn class of cpu); only in this case does the problem surface.

The cause of this failure is winxp accessing the BBL_CR_CTL3 MSR, which is unsupported by current kvm. It appears to be a legacy MSR, not fully characterized yet still existing in current silicon, apparently carried forward in MSR space to accommodate vintage code. It is not yet conclusive whether this MSR implements any of its legacy functionality or is just an ornamental dud for compatibility. While I found no silicon-version-specific documentation link for this MSR, a general description exists in Intel's developer's reference, which agrees with the functional behavior of other bootloader/kernel code I've examined that accesses BBL_CR_CTL3. Regrettably, winxp appears to be setting bit #19, called out as "reserved" in the above document.

So, to minimally accommodate this MSR, kvm msr get will provide equivalent mock data and kvm msr write will simply toss the guest-passed data without interpretation. While this treatment of BBL_CR_CTL3 addresses the immediate problem, the approach may be modified pending clarification from Intel.

Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong

Currently we keep track of only two states: guest mode and host mode. This patch adds an "exiting guest mode" state that tells us that an IPI will happen soon, so unless we need to wait for the IPI, we can avoid it completely.

Also:
1. There is no need to atomically read/write ->mode in the vcpu's own thread.
2. struct kvm_vcpu is reorganized to explicitly put ->mode and ->requests in the same cache line.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
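A hedged sketch of the resulting three-state mode (this matches what later kernels carry in include/linux/kvm_host.h; treat the details as approximate for this exact patch):

```c
/* vcpu->mode: where the vcpu thread currently is. */
enum {
	OUTSIDE_GUEST_MODE,	/* host code: no IPI needed to kick */
	IN_GUEST_MODE,		/* in guest: a kick requires an IPI */
	EXITING_GUEST_MODE,	/* an exit is already on its way, so a
				 * second kicker can skip the IPI */
};

/*
 * A kicker flips IN_GUEST_MODE to EXITING_GUEST_MODE atomically; only
 * the caller that wins the cmpxchg needs to send the IPI.
 */
static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
{
	return cmpxchg(&vcpu->mode, IN_GUEST_MODE, EXITING_GUEST_MODE);
}
```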
-
Committed by Jan Kiszka

This case is a pure user space error that we do not need to record. Moreover, it can be misused to flood the kernel log. Remove it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong

Fix:

  [ 1001.499596] ===================================================
  [ 1001.499599] [ INFO: suspicious rcu_dereference_check() usage. ]
  [ 1001.499601] ---------------------------------------------------
  [ 1001.499604] include/linux/kvm_host.h:301 invoked rcu_dereference_check() without protection!
  ......
  [ 1001.499636] Pid: 6035, comm: qemu-system-x86 Not tainted 2.6.37-rc6+ #62
  [ 1001.499638] Call Trace:
  [ 1001.499644] [] lockdep_rcu_dereference+0x9d/0xa5
  [ 1001.499653] [] gfn_to_memslot+0x8d/0xc8 [kvm]
  [ 1001.499661] [] gfn_to_hva+0x16/0x3f [kvm]
  [ 1001.499669] [] kvm_read_guest_page+0x1e/0x5e [kvm]
  [ 1001.499681] [] kvm_read_guest_page_mmu+0x53/0x5e [kvm]
  [ 1001.499699] [] load_pdptrs+0x3f/0x9c [kvm]
  [ 1001.499705] [] ? vmx_set_cr0+0x507/0x517 [kvm_intel]
  [ 1001.499717] [] kvm_arch_vcpu_ioctl_set_sregs+0x1f3/0x3c0 [kvm]
  [ 1001.499727] [] kvm_vcpu_ioctl+0x6a5/0xbc5 [kvm]

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
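The trace points at load_pdptrs() reaching gfn_to_memslot() without the kvm SRCU read lock held. A hedged sketch of the usual remedy in such an ioctl path (assumed shape, not the patch verbatim; the function name is hypothetical):

```c
static int set_sregs_sketch(struct kvm_vcpu *vcpu,
			    struct kvm_sregs *sregs)
{
	int idx;

	/*
	 * Memslots are SRCU-protected; hold the read lock across
	 * anything that may end up in gfn_to_memslot(), such as
	 * load_pdptrs().
	 */
	idx = srcu_read_lock(&vcpu->kvm->srcu);
	/* ... load_pdptrs() and other memslot users run here ... */
	srcu_read_unlock(&vcpu->kvm->srcu, idx);

	return 0;
}
```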
-
Committed by Avi Kivity

Instead of exchanging the guest and host rcx, have separate storage for each. This allows us to avoid using the xchg instruction, which is a little slower than normal operations.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity

Change

  push top-of-stack
  pop guest-rcx
  pop dummy

to

  pop guest-rcx

which is the same thing, only simpler.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Rik van Riel

On some CPUs, a ple_gap of 41 is simply insufficient to ever trigger PLE exits, even with the minimalistic PLE test from kvm-unit-tests: http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=commitdiff;h=eda71b28fa122203e316483b35f37aaacd42f545

For example, the Xeon X5670 CPU needs a ple_gap of at least 48 in order to get pause loop exits:

  # modprobe kvm_intel ple_gap=47
  # taskset 1 /usr/local/bin/qemu-system-x86_64 \
      -device testdev,chardev=log -chardev stdio,id=log \
      -kernel x86/vmexit.flat -append ple-round-robin -smp 2
  VNC server running on `::1:5900'
  enabling apic
  enabling apic
  ple-round-robin 58298446

  # rmmod kvm_intel
  # modprobe kvm_intel ple_gap=48
  # taskset 1 /usr/local/bin/qemu-system-x86_64 \
      -device testdev,chardev=log -chardev stdio,id=log \
      -kernel x86/vmexit.flat -append ple-round-robin -smp 2
  VNC server running on `::1:5900'
  enabling apic
  enabling apic
  ple-round-robin 36616

Increase the ple_gap to 128 to be on the safe side.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Zhai, Edwin <edwin.zhai@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Joerg Roedel

This patch adds the necessary code to run perf-kvm on AMD machines.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Avi Kivity

When emulating real mode, we fake some state:

  - tr.base points to a fake vm86 tss
  - segment registers are made to conform to vm86 restrictions

Change vmx_get_segment() not to expose this fake state to userspace; instead, return the original state.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity

When emulating real mode we play with the tr hidden state but leave tr.selector alone. That works well, except for save/restore, since loading TR writes it to the hidden state in vmx->rmode. Fix by also saving and restoring the tr selector; this makes things more consistent and allows migration to work during the early boot stages of Windows XP.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Sedat Dilek

  WARNING: arch/x86/built-in.o(.text+0x1bb74): Section mismatch in reference
  from the function kvm_guest_cpu_online() to the function
  .cpuinit.text:kvm_guest_cpu_init()
  The function kvm_guest_cpu_online() references the function
  __cpuinit kvm_guest_cpu_init().
  This is often because kvm_guest_cpu_online lacks a __cpuinit
  annotation or the annotation of kvm_guest_cpu_init is wrong.

This patch fixes the warning. Tested with linux-next (next-20101231).

Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
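A hedged sketch of the fix implied by the warning text (bodies elided; the point is simply that the caller gets the same __cpuinit section annotation as its callee):

```c
#include <linux/init.h>

static void __cpuinit kvm_guest_cpu_init(void)
{
	/* ... per-cpu paravirt setup ... */
}

/*
 * Annotating the caller __cpuinit as well places both functions in
 * .cpuinit.text, so modpost no longer flags the cross-section call.
 */
static void __cpuinit kvm_guest_cpu_online(void *dummy)
{
	kvm_guest_cpu_init();
}
```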
-
Committed by Avi Kivity

Instead, drop large mappings, which were the reason we dropped shadow.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 16 March 2011: 2 commits
-
-
Committed by Boris Ostrovsky

Support for the Always Running APIC Timer (ARAT) was introduced in commit db954b58. This feature allows us to avoid switching timers from the LAPIC to something else (e.g. HPET) and going into timer broadcasts when entering deep C-states. AMD processors don't provide a CPUID bit for that feature, but they also keep APIC timers running in deep C-states (except for cases when the processor is affected by erratum 400). Therefore we should set the ARAT feature bit on AMD CPUs.

Tested-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
LKML-Reference: <1300205624-4813-1-git-send-email-ostr@amd64.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
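A hedged sketch of the change in the AMD CPU init path (set_cpu_cap() and X86_FEATURE_ARAT are real kernel identifiers; the erratum-400 guard and the function name are paraphrased from the changelog, not copied from the patch):

```c
static void __cpuinit init_amd_arat_sketch(struct cpuinfo_x86 *c)
{
	/*
	 * No CPUID bit exists on AMD for ARAT, but the APIC timer keeps
	 * ticking in deep C-states unless erratum 400 applies, so set
	 * the synthetic feature flag by hand.
	 */
	if (!cpu_has_amd_erratum(amd_erratum_400))
		set_cpu_cap(c, X86_FEATURE_ARAT);
}
```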
-
Committed by Andreas Herrmann

Commit 7f74f8f2 (x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systems) introduced a regression: it removed some SB600-specific code that determined the revision ID, without adapting the corresponding revision ID check for SB600. See this mail thread: http://marc.info/?l=linux-kernel&m=129980296006380&w=2

This patch adapts the corresponding check to cover all SB600 revisions.

Tested-by: Wang Lei <f3d27b@gmail.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org # 38.x, 37.x, 32.x
LKML-Reference: <20110315143137.GD29499@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 15 March 2011: 6 commits
-
-
Committed by Mathieu Desnoyers

Intel's Architecture Software Developer's Manual, section 7.1.3, specifies that a core serializing instruction such as "cpuid" should be executed on _each_ core before the new instruction is made visible. Failure to do so can lead to unspecified behavior (Intel's XMC errata include General Protection Fault in the list), so we should avoid this at all cost.

This problem can affect modified code executed by interrupt handlers after interrupts are re-enabled at the end of stop_machine, because no core serializing instruction is executed between the code modification and the moment interrupts are re-enabled.

Because stop_machine_text_poke performs the text modification from the first CPU decrementing stop_machine_first, modified code executed in thread context is also affected by this problem. To explain why, we have to split the CPUs into two categories: the CPU that initiates the text modification (calls text_poke_smp) and all the others. The scheduler, executed on all other CPUs after stop_machine, issues an "iret" core serializing instruction, and therefore handles core serialization for all these CPUs. However, the text modification initiator can continue its execution on the same thread and access the modified text without any scheduler call. Given that the CPU that initiates the code modification is not guaranteed to be the one actually performing the code modification, it falls into the XMC errata.

Q: Isn't this executed from an IPI handler, which will return with IRET (a serializing instruction) anyway?
A: No. stop_machine now uses a per-cpu workqueue, so that handler will be executed from worker threads. There is no iret anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
LKML-Reference: <20110303160137.GB1590@Krystal>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: <stable@kernel.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
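A hedged sketch of the resulting stop-machine callback (names follow arch/x86/kernel/alternative.c of that era; treat the details as approximate):

```c
/* As in alternative.c: synchronization state shared by all CPUs. */
static atomic_t stop_machine_first;
static int wrote_text;

static int stop_machine_text_poke(void *data)
{
	struct text_poke_params *tpp = data;

	if (atomic_dec_and_test(&stop_machine_first)) {
		/* Exactly one CPU performs the modification ... */
		text_poke(tpp->addr, tpp->opcode, tpp->len);
		smp_wmb();		/* publish the new text */
		wrote_text = 1;
	} else {
		/* ... while the rest spin until it is done. */
		while (!wrote_text)
			cpu_relax();
		smp_mb();		/* order the load of wrote_text */
	}

	/*
	 * The fix: every CPU, not just the writer, executes a core
	 * serializing instruction (cpuid) before running the new code.
	 */
	sync_core();
	return 0;
}
```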
-
Committed by Xiao Guangrong

native_flush_tlb_others() is called from:

  flush_tlb_current_task()
  flush_tlb_mm()
  flush_tlb_page()

All these functions disable preemption explicitly, so we can use smp_processor_id() instead of get_cpu() and put_cpu().

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <4D7EC791.4040003@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
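A minimal sketch of the pattern (the function is hypothetical; the point is that get_cpu()/put_cpu() disable and re-enable preemption, which is redundant when every caller already runs preempt-disabled):

```c
#include <linux/smp.h>
#include <linux/cpumask.h>

static void flush_tlb_others_sketch(const struct cpumask *cpumask,
				    struct mm_struct *mm, unsigned long va)
{
	/* was: unsigned int cpu = get_cpu(); */
	unsigned int cpu = smp_processor_id();

	if (cpumask_test_cpu(cpu, cpumask)) {
		/* ... local flush, then IPI the remaining CPUs ... */
	}
	/* was: put_cpu(); */
}
```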
-
Committed by Aneesh Kumar K.V

This patch adds new syscalls to x86_64.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Aneesh Kumar K.V

This patch adds new syscalls to x86_32.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Rafael J. Wysocki

The variable pm_flags is used to prevent APM from being enabled along with ACPI, which would lead to problems. However, acpi_init() is always called before apm_init(), and after acpi_init() has returned it is known whether or not ACPI will be used: if acpi_disabled is not set after acpi_init() has returned, ACPI is enabled. Thus it is sufficient to check acpi_disabled in apm_init() to prevent APM from being enabled in parallel with ACPI.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Len Brown <len.brown@intel.com>
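A hedged sketch of what the check in apm_init() reduces to (the message text and error code are illustrative assumptions, not the patch verbatim):

```c
static int __init apm_init(void)
{
	/*
	 * acpi_init() has already run by this point, so acpi_disabled
	 * is authoritative: bail out if ACPI won the machine.
	 */
	if (!acpi_disabled) {
		printk(KERN_NOTICE "apm: overridden by ACPI.\n");
		return -ENODEV;
	}

	/* ... regular APM probing and setup ... */
	return 0;
}
```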
-
Committed by Rafael J. Wysocki

From the user's point of view, CONFIG_PM is really only used to make it possible to set CONFIG_SUSPEND, CONFIG_HIBERNATION, CONFIG_PM_RUNTIME and (surprisingly enough) CONFIG_XEN_SAVE_RESTORE (CONFIG_PM_OPP also depends on CONFIG_PM, but quite artificially). However, both CONFIG_SUSPEND and CONFIG_HIBERNATION require platform support (independent of CONFIG_PM), and it is not quite obvious that CONFIG_PM has to be set for CONFIG_XEN_SAVE_RESTORE to be available. Thus, from the user's point of view, it would be more logical to automatically select CONFIG_PM if any of the above options depending on it are set.

Make CONFIG_PM depend on (CONFIG_PM_SLEEP || CONFIG_PM_RUNTIME), which will cause it to be selected when any of CONFIG_SUSPEND, CONFIG_HIBERNATION, CONFIG_PM_RUNTIME or CONFIG_XEN_SAVE_RESTORE is set, and will clarify its meaning.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
-
- 14 March 2011: 3 commits
-
-
Committed by Stefano Stabellini

If there is no proper PFN value in the M2P for the MFN (so we get 0xFFFFF.., 0x55555, or 0x0), we should consult the M2P override to see if there is an entry for this MFN. [Note: we also consult the M2P override if the MFN is past our machine_to_phys size.]

We consult the P2M with the PFN. In case the returned MFN is one of the special values, 0xFFF.. or 0x5555 (which signify that the MFN can be either "missing" or belong to DOMID_IO), or p2m(m2p(mfn)) != mfn, we check the M2P override. If the M2P override check fails, we reset the PFN value to INVALID_P2M_ENTRY.

Next we try to find the MFN in the P2M using the MFN value (not the PFN value), and if found, we know that this MFN is an identity value and return it as such. Otherwise we have exhausted all the possibilities, and we return the PFN, which at this stage can either be a real PFN value found in the machine_to_phys.. array or INVALID_P2M_ENTRY. (See the sketch after this message.)

[v1: Added Review-by tag]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
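A hedged rendering in C of the lookup order described above (every helper name suffixed _sketch is illustrative, not the exact Xen API; only INVALID_P2M_ENTRY is a real kernel constant):

```c
/* Sketch only: the decision flow, not the kernel's actual mfn_to_pfn(). */
static unsigned long m2p_lookup_sketch(unsigned long mfn)
{
	unsigned long pfn = machine_to_phys_sketch(mfn);	/* may be junk */

	/* Bogus M2P entry, or the P2M disagrees: try the override table. */
	if (m2p_entry_is_special_sketch(pfn) ||
	    pfn_to_mfn_sketch(pfn) != mfn) {
		long override_pfn = m2p_find_override_pfn_sketch(mfn);

		if (override_pfn >= 0)
			return override_pfn;
		pfn = INVALID_P2M_ENTRY;
	}

	/* Identity-mapped MFN: the P2M maps the MFN to itself. */
	if (pfn == INVALID_P2M_ENTRY && pfn_to_mfn_sketch(mfn) == mfn)
		return mfn;

	/* Either a real PFN from machine_to_phys or INVALID_P2M_ENTRY. */
	return pfn;
}
```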
-
Committed by Konrad Rzeszutek Wilk

.. beyond what we think is the end of memory. However, there might be more System RAM, but assigned to a guest. Hence jump to the M2P override check and consult it.

[v1: Added Review-by tag]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Committed by Konrad Rzeszutek Wilk

Only enabled if XEN_DEBUG is enabled. We print a warning when:

  pfn_to_mfn(pfn) == pfn, but no VM_IO (_PAGE_IOMAP) flag is set
  (and pfn is an identity mapped pfn)

  pfn_to_mfn(pfn) != pfn, and the VM_IO flag is set
  (ditto, pfn is an identity mapped pfn)

[v2: Make it dependent on CONFIG_XEN_DEBUG instead of ..DEBUG_FS]
[v3: Fix compiler warning]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-