- 14 9月, 2015 1 次提交
-
-
由 Dave Hansen 提交于
There are two concepts that have some confusing naming: 1. Extended State Component numbers (currently called XFEATURE_BIT_*) 2. Extended State Component masks (currently called XSTATE_*) The numbers are (currently) from 0-9. State component 3 is the bounds registers for MPX, for instance. But when we want to enable "state component 3", we go set a bit in XCR0. The bit we set is 1<<3. We can check to see if a state component feature is enabled by looking at its bit. The current 'xfeature_bit's are at best xfeature bit _numbers_. Calling them bits is at best inconsistent with ending the enum list with 'XFEATURES_NR_MAX'. This patch renames the enum to be 'xfeature'. These also happen to be what the Intel documentation calls a "state component". We also want to differentiate these from the "XSTATE_*" macros. The "XSTATE_*" macros are a mask, and we rename them to match. These macros are reasonably widely used so this patch is a wee bit big, but this really is just a rename. The only non-mechanical part of this is the s/XSTATE_EXTEND_MASK/XFEATURE_MASK_EXTEND/ We need a better name for it, but that's another patch. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: dave@sr71.net Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20150902233126.38653250@viggo.jf.intel.com [ Ported to v4.3-rc1. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 9月, 2015 1 次提交
-
-
由 Dave Young 提交于
There are two kexec load syscalls, kexec_load another and kexec_file_load. kexec_file_load has been splited as kernel/kexec_file.c. In this patch I split kexec_load syscall code to kernel/kexec.c. And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and use kexec_file_load only, or vice verse. The original requirement is from Ted Ts'o, he want kexec kernel signature being checked with CONFIG_KEXEC_VERIFY_SIG enabled. But kexec-tools use kexec_load syscall can bypass the checking. Vivek Goyal proposed to create a common kconfig option so user can compile in only one syscall for loading kexec kernel. KEXEC/KEXEC_FILE selects KEXEC_CORE so that old config files still work. Because there's general code need CONFIG_KEXEC_CORE, so I updated all the architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects KEXEC_CORE in arch Kconfig. Also updated general kernel code with to kexec_load syscall. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NDave Young <dyoung@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Josh Boyer <jwboyer@fedoraproject.org> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 9月, 2015 1 次提交
-
-
由 Vlastimil Babka 提交于
alloc_pages_exact_node() was introduced in commit 6484eb3e ("page allocator: do not check NUMA node ID when the caller knows the node is valid") as an optimized variant of alloc_pages_node(), that doesn't fallback to current node for nid == NUMA_NO_NODE. Unfortunately the name of the function can easily suggest that the allocation is restricted to the given node and fails otherwise. In truth, the node is only preferred, unless __GFP_THISNODE is passed among the gfp flags. The misleading name has lead to mistakes in the past, see for example commits 5265047a ("mm, thp: really limit transparent hugepage allocation to local node") and b360edb4 ("mm, mempolicy: migrate_to_node should only migrate to node"). Another issue with the name is that there's a family of alloc_pages_exact*() functions where 'exact' means exact size (instead of page order), which leads to more confusion. To prevent further mistakes, this patch effectively renames alloc_pages_exact_node() to __alloc_pages_node() to better convey that it's an optimized variant of alloc_pages_node() not intended for general usage. Both functions get described in comments. It has been also considered to really provide a convenience function for allocations restricted to a node, but the major opinion seems to be that __GFP_THISNODE already provides that functionality and we shouldn't duplicate the API needlessly. The number of users would be small anyway. Existing callers of alloc_pages_exact_node() are simply converted to call __alloc_pages_node(), with the exception of sba_alloc_coherent() which open-codes the check for NUMA_NO_NODE, so it is converted to use alloc_pages_node() instead. This means it no longer performs some VM_BUG_ON checks, and since the current check for nid in alloc_pages_node() uses a 'nid < 0' comparison (which includes NUMA_NO_NODE), it may hide wrong values which would be previously exposed. Both differences will be rectified by the next patch. To sum up, this patch makes no functional changes, except temporarily hiding potentially buggy callers. Restricting the checks in alloc_pages_node() is left for the next patch which can in turn expose more existing buggy callers. Signed-off-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NRobin Holt <robinmholt@gmail.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NChristoph Lameter <cl@linux.com> Acked-by: NMichael Ellerman <mpe@ellerman.id.au> Cc: Mel Gorman <mgorman@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Cliff Whickman <cpw@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 9月, 2015 3 次提交
-
-
由 Valdis Kletnieks 提交于
Compiler warning: CC [M] arch/x86/kvm/emulate.o arch/x86/kvm/emulate.c: In function "__do_insn_fetch_bytes": arch/x86/kvm/emulate.c:814:9: warning: "linear" may be used uninitialized in this function [-Wmaybe-uninitialized] GCC is smart enough to realize that the inlined __linearize may return before setting the value of linear, but not smart enough to realize the same X86EMU_CONTINUE blocks actual use of the value. However, the value of 'linear' can only be set to one value, so hoisting the one line of code upwards makes GCC happy with the code. Reported-by: NAruna Hewapathirane <aruna.hewapathirane@gmail.com> Tested-by: NAruna Hewapathirane <aruna.hewapathirane@gmail.com> Signed-off-by: NValdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alexander Kuleshov 提交于
The process_smi_save_seg_64() function called only in the process_smi_save_state_64() if the CONFIG_X86_64 is set. This patch adds #ifdef CONFIG_X86_64 around process_smi_save_seg_64() to prevent following warning message: arch/x86/kvm/x86.c:5946:13: warning: ‘process_smi_save_seg_64’ defined but not used [-Wunused-function] static void process_smi_save_seg_64(struct kvm_vcpu *vcpu, char *buf, int n) ^ Signed-off-by: NAlexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This does not show up on all compiler versions, so it sneaked into the first 4.3 pull request. The fix is to mimic the logic of the "print sptes" loop in the "fill array" loop. Then leaf and root can be both initialized unconditionally. Note that "leaf" now points to the first unused element of the array, not the last filled element. Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 8月, 2015 1 次提交
-
-
由 Andy Lutomirski 提交于
VMX encodes access rights differently from LAR, and the latter is most likely what x86 people think of when they think of "access rights". Rename them to avoid confusion. Cc: kvm@vger.kernel.org Signed-off-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 11 8月, 2015 2 次提交
-
-
由 Wei Huang 提交于
According to AMD programmer's manual, AMD PERFCTRn is 64-bit MSR which, unlike Intel perf counters, doesn't require signed extension. This patch removes the unnecessary conversion in SVM vPMU code when PERFCTRn is being updated. Signed-off-by: NWei Huang <wei@redhat.com> Reviewed-by: NAndrew Jones <drjones@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nicholas Krause 提交于
This fixes error handling in the function kvm_lapic_sync_from_vapic by checking if the call to kvm_read_guest_cached has returned a error code to signal to its caller the call to this function has failed and due to this we must immediately return to the caller of kvm_lapic_sync_from_vapic to avoid incorrectly call apic_set_tpc if a error has occurred here. Signed-off-by: NNicholas Krause <xerofoify@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 8月, 2015 2 次提交
-
-
由 Haozhong Zhang 提交于
When kvm_set_msr_common() handles a guest's write to MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data written by guest and then use it to adjust TSC offset by calling a call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset() indicates whether the adjustment is in host TSC cycles or in guest TSC cycles. If SVM TSC scaling is enabled, adjust_tsc_offset() [i.e. svm_adjust_tsc_offset()] will first scale the adjustment; otherwise, it will just use the unscaled one. As the MSR write here comes from the guest, the adjustment is in guest TSC cycles. However, the current kvm_set_msr_common() uses it as a value in host TSC cycles (by using true as the 3rd parameter of adjust_tsc_offset()), which can result in an incorrect adjustment of TSC offset if SVM TSC scaling is enabled. This patch fixes this problem. Signed-off-by: NHaozhong Zhang <haozhong.zhang@intel.com> Cc: stable@vger.linux.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The recent BlackHat 2015 presentation "The Memory Sinkhole" mentions that the IDT limit is zeroed on entry to SMM. This is not documented, and must have changed some time after 2010 (see http://www.ssi.gouv.fr/uploads/IMG/pdf/IT_Defense_2010_final.pdf). KVM was not doing it, but the fix is easy. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 05 8月, 2015 11 次提交
-
-
由 Xiao Guangrong 提交于
The logic used to check ept misconfig is completely contained in common reserved bits check for sptes, so it can be removed Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
The #PF with PFEC.RSV = 1 is designed to speed MMIO emulation, however, it is possible that the RSV #PF is caused by real BUG by mis-configure shadow page table entries This patch enables full check for the zero bits on shadow page table entries (which includes not only bits reserved by the hardware, but also bits that will never be set in the SPTE), then dump the shadow page table hierarchy. Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
We have the same data struct to check reserved bits on guest page tables and shadow page tables, split is_rsvd_bits_set() so that the logic can be shared between these two paths Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
We have abstracted the data struct and functions which are used to check reserved bit on guest page tables, now we extend the logic to check zero bits on shadow page tables The zero bits on sptes include not only reserved bits on hardware but also the bits that SPTEs willnever use. For example, shadow pages will never use GB pages unless the guest uses them too. Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Since shadow ept page tables and Intel nested guest page tables have the same format, split reset_rsvds_bits_mask_ept so that the logic can be reused by later patches which check zero bits on sptes Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Since softmmu & AMD nested shadow page tables and guest page tables have the same format, split reset_rsvds_bits_mask so that the logic can be reused by later patches which check zero bits on sptes Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
These two fields, rsvd_bits_mask and bad_mt_xwr, in "struct kvm_mmu" are used to check if reserved bits set on guest ptes, move them to a data struct so that the approach can be applied to check host shadow page table entries as well Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
FNAME(is_rsvd_bits_set) does not depend on guest mmu mode, move it to mmu.c to stop being compiled multiple times Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
We got the bug that qemu complained with "KVM: unknown exit, hardware reason 31" and KVM shown these info: [84245.284948] EPT: Misconfiguration. [84245.285056] EPT: GPA: 0xfeda848 [84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4 [84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3 [84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2 [84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1 This is because we got a mmio #PF and the handler see the mmio spte becomes normal (points to the ram page) However, this is valid after introducing fast mmio spte invalidation which increases the generation-number instead of zapping mmio sptes, a example is as follows: 1. QEMU drops mmio region by adding a new memslot 2. invalidate all mmio sptes 3. VCPU 0 VCPU 1 access the invalid mmio spte access the region originally was MMIO before set the spte to the normal ram map mmio #PF check the spte and see it becomes normal ram mapping !!! This patch fixes the bug just by dropping the check in mmio handler, it's good for backport. Full check will be introduced in later patches Reported-by: NPavel Shirshov <ru.pchel@gmail.com> Tested-by: NPavel Shirshov <ru.pchel@gmail.com> Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alex Williamson 提交于
The patch was munged on commit to re-order these tests resulting in excessive warnings when trying to do device assignment. Return to original ordering: https://lkml.org/lkml/2015/7/15/769 Fixes: 3e5d2fdc ("KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type") Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Reviewed-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alex Williamson 提交于
The patch was munged on commit to re-order these tests resulting in excessive warnings when trying to do device assignment. Return to original ordering: https://lkml.org/lkml/2015/7/15/769 Fixes: 3e5d2fdc ("KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type") Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Reviewed-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 30 7月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
The memory barriers are trying to protect against concurrent RCU-based interrupt injection, but the IRQ routing table is not valid at the time kvm->arch.vpic is written. Fix this by writing kvm->arch.vpic last. kvm_destroy_pic then need not set kvm->arch.vpic to NULL; modify it to take a struct kvm_pic* and reuse it if the IOAPIC creation fails. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 29 7月, 2015 2 次提交
-
-
由 Paolo Bonzini 提交于
There is no smp_rmb matching the smp_wmb. shared_msr_update is called from hardware_enable, which in turn is called via on_each_cpu. on_each_cpu and must imply a read memory barrier (on x86 the rmb is achieved simply through asm volatile in native_apic_mem_write). Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This is another remnant of ia64 support. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 23 7月, 2015 11 次提交
-
-
由 Paolo Bonzini 提交于
We can disable CD unconditionally when there is no assigned device. KVM now forces guest PAT to all-writeback in that case, so it makes sense to also force CR0.CD=0. When there are assigned devices, emulate cache-disabled operation through the page tables. This behavior is consistent with VMX microcode, where CD/NW are not touched by vmentry/vmexit. However, keep this dependent on the quirk because OVMF enables the caches too late. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Mihai Donțu 提交于
Allow a nested hypervisor to single step its guests. Signed-off-by: NMihai Donțu <mihai.dontu@gmail.com> [Fix overlong line. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrey Smetanin 提交于
Sending of notification is done by exiting vcpu to user space if KVM_REQ_HV_CRASH is enabled for vcpu. At exit to user space the kvm_run structure contains system_event with type KVM_SYSTEM_EVENT_CRASH to notify about guest crash occurred. Signed-off-by: NAndrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: NDenis V. Lunev <den@openvz.org> Reviewed-by: NPeter Hornyack <peterhornyack@google.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrey Smetanin 提交于
Added kvm Hyper-V context hv crash variables as storage of Hyper-V crash msrs. Signed-off-by: NAndrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: NDenis V. Lunev <den@openvz.org> Reviewed-by: NPeter Hornyack <peterhornyack@google.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Andrey Smetanin 提交于
This patch introduce Hyper-V related source code file - hyperv.c and per vm and per vcpu hyperv context structures. All Hyper-V MSR's and hypercall code moved into hyperv.c. All Hyper-V kvm/vcpu fields moved into appropriate hyperv context structures. Copyrights and authors information copied from x86.c to hyperv.c. Signed-off-by: NAndrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: NDenis V. Lunev <den@openvz.org> Reviewed-by: NPeter Hornyack <peterhornyack@google.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Eugene Korenevsky 提交于
According to Intel SDM several checks must be applied for memory operands of VMX instructions. Long mode: #GP(0) or #SS(0) depending on the segment must be thrown if the memory address is in a non-canonical form. Protected mode, checks in chronological order: - The segment type must be checked with access type (read or write) taken into account. For write access: #GP(0) must be generated if the destination operand is located in a read-only data segment or any code segment. For read access: #GP(0) must be generated if if the source operand is located in an execute-only code segment. - Usability of the segment must be checked. #GP(0) or #SS(0) depending on the segment must be thrown if the segment is unusable. - Limit check. #GP(0) or #SS(0) depending on the segment must be thrown if the memory operand effective address is outside the segment limit. Signed-off-by: NEugene Korenevsky <ekorenevsky@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Make them clearly architecture-dependent; the capability is valid for all architectures, but the argument is not. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
OVMF depends on WB to boot fast, because it only clears caches after it has set up MTRRs---which is too late. Let's do writeback if CR0.CD is set to make it happy, similar to what SVM is already doing. Signed-off-by: NXiao Guangrong <guangrong.xiao@intel.com> Tested-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The logic of the disabled_quirks field usually results in a double negation. Wrap it in a simple function that checks the bit and negates it. Based on a patch from Xiao Guangrong. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
kvm_mtrr_get_guest_memory_type never returns -1 which is implied in the current code since if @type = -1 (means no MTRR contains the range), iter.partial_map must be true Simplify the code to indicate this fact Signed-off-by: NXiao Guangrong <guangrong.xiao@intel.com> Tested-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Currently code uses default memory type if MTRR is fully disabled, fix it by using UC instead. Signed-off-by: NXiao Guangrong <guangrong.xiao@intel.com> Tested-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 10 7月, 2015 4 次提交
-
-
由 Wanpeng Li 提交于
[ 68.196974] WARNING: CPU: 1 PID: 2140 at arch/x86/kvm/x86.c:3161 kvm_arch_vcpu_ioctl+0xe88/0x1340 [kvm]() [ 68.196975] Modules linked in: snd_hda_codec_hdmi i915 rfcomm bnep bluetooth i2c_algo_bit rfkill nfsd drm_kms_helper nfs_acl nfs drm lockd grace sunrpc fscache snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_dummy snd_seq_oss x86_pkg_temp_thermal snd_seq_midi kvm_intel snd_seq_midi_event snd_rawmidi kvm snd_seq ghash_clmulni_intel fuse snd_timer aesni_intel parport_pc ablk_helper snd_seq_device cryptd ppdev snd lp parport lrw dcdbas gf128mul i2c_core glue_helper lpc_ich video shpchp mfd_core soundcore serio_raw acpi_cpufreq ext4 mbcache jbd2 sd_mod crc32c_intel ahci libahci libata e1000e ptp pps_core [ 68.197005] CPU: 1 PID: 2140 Comm: qemu-system-x86 Not tainted 4.2.0-rc1+ #2 [ 68.197006] Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015 [ 68.197007] ffffffffa03b0657 ffff8800d984bca8 ffffffff815915a2 0000000000000000 [ 68.197009] 0000000000000000 ffff8800d984bce8 ffffffff81057c0a 00007ff6d0001000 [ 68.197010] 0000000000000002 ffff880211c1a000 0000000000000004 ffff8800ce0288c0 [ 68.197012] Call Trace: [ 68.197017] [<ffffffff815915a2>] dump_stack+0x45/0x57 [ 68.197020] [<ffffffff81057c0a>] warn_slowpath_common+0x8a/0xc0 [ 68.197022] [<ffffffff81057cfa>] warn_slowpath_null+0x1a/0x20 [ 68.197029] [<ffffffffa037bed8>] kvm_arch_vcpu_ioctl+0xe88/0x1340 [kvm] [ 68.197035] [<ffffffffa037aede>] ? kvm_arch_vcpu_load+0x4e/0x1c0 [kvm] [ 68.197040] [<ffffffffa03696a6>] kvm_vcpu_ioctl+0xc6/0x5c0 [kvm] [ 68.197043] [<ffffffff811252d2>] ? perf_pmu_enable+0x22/0x30 [ 68.197044] [<ffffffff8112663e>] ? perf_event_context_sched_in+0x7e/0xb0 [ 68.197048] [<ffffffff811a6882>] do_vfs_ioctl+0x2c2/0x4a0 [ 68.197050] [<ffffffff8107bf33>] ? finish_task_switch+0x173/0x220 [ 68.197053] [<ffffffff8123307f>] ? selinux_file_ioctl+0x4f/0xd0 [ 68.197055] [<ffffffff8122cac3>] ? security_file_ioctl+0x43/0x60 [ 68.197057] [<ffffffff811a6ad9>] SyS_ioctl+0x79/0x90 [ 68.197060] [<ffffffff81597e57>] entry_SYSCALL_64_fastpath+0x12/0x6a [ 68.197061] ---[ end trace 558a5ebf9445fc80 ]--- After commit (0c4109be 'x86/fpu/xstate: Fix up bad get_xsave_addr() assumptions'), there is no assumption an xsave bit is present in the hardware (pcntxt_mask) that it is always present in a given xsave buffer. An enabled state to be present on 'pcntxt_mask', but *not* in 'xstate_bv' could happen when the last 'xsave' did not request that this feature be saved (unlikely) or because the "init optimization" caused it to not be saved. This patch kill the assumption. Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Currently guest MTRR is avoided if kvm_is_reserved_pfn returns true. However, the guest could prefer a different page type than UC for such pages. A good example is that pass-throughed VGA frame buffer is not always UC as host expected. This patch enables full use of virtual guest MTRRs. Suggested-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Tested-by: Joerg Roedel <jroedel@suse.de> (on AMD) Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jan Kiszka 提交于
When hardware supports the g_pat VMCB field, we can use it for emulating the PAT configuration that the guest configures by writing to the corresponding MSR. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Tested-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Right now, NPT page attributes are not used, and the final page attribute depends solely on gPAT (which however is not synced correctly), the guest MTRRs and the guest page attributes. However, we can do better by mimicking what is done for VMX. In the absence of PCI passthrough, the guest PAT can be ignored and the page attributes can be just WB. If passthrough is being used, instead, keep respecting the guest PAT, and emulate the guest MTRRs through the PAT field of the nested page tables. The only snag is that WP memory cannot be emulated correctly, because Linux's default PAT setting only includes the other types. Tested-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-