- 21 1月, 2015 3 次提交
-
-
由 Borislav Petkov 提交于
arch/x86/kvm/emulate.c: In function ‘check_cr_write’: arch/x86/kvm/emulate.c:3552:4: warning: left shift count >= width of type rsvd = CR3_L_MODE_RESERVED_BITS & ~CR3_PCID_INVD; happens because sizeof(UL) on 32-bit is 4 bytes but we shift it 63 bits to the left. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Marcelo Tosatti 提交于
SuSE's 2.6.16 kernel fails to boot if the delta between tsc_timestamp and rdtsc is larger than a given threshold: * If we get more than the below threshold into the future, we rerequest * the real time from the host again which has only little offset then * that we need to adjust using the TSC. * * For now that threshold is 1/5th of a jiffie. That should be good * enough accuracy for completely broken systems, but also give us swing * to not call out to the host all the time. */ #define PVCLOCK_DELTA_MAX ((1000000000ULL / HZ) / 5) Disable masterclock support (which increases said delta) in case the boot vcpu does not use MSR_KVM_SYSTEM_TIME_NEW. Upstreams kernels which support pvclock vsyscalls (and therefore make use of PVCLOCK_STABLE_BIT) use MSR_KVM_SYSTEM_TIME_NEW. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Fengguang Wu 提交于
arch/x86/kvm/x86.c:495:5: sparse: symbol 'kvm_read_nested_guest_page' was not declared. Should it be static? arch/x86/kvm/x86.c:646:5: sparse: symbol '__kvm_set_xcr' was not declared. Should it be static? arch/x86/kvm/x86.c:1183:15: sparse: symbol 'max_tsc_khz' was not declared. Should it be static? arch/x86/kvm/x86.c:1237:6: sparse: symbol 'kvm_track_tsc_matching' was not declared. Should it be static? Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 19 1月, 2015 2 次提交
-
-
由 Kai Huang 提交于
No TLB flush is needed when there's no valid rmap in memory slot. Signed-off-by: NKai Huang <kai.huang@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Rickard Strandqvist 提交于
Removes some functions that are not used anywhere: cpu_has_vmx_eptp_writeback() cpu_has_vmx_eptp_uncacheable() This was partially found by using a static code analysis program called cppcheck. Signed-off-by: NRickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 1月, 2015 23 次提交
-
-
由 Nadav Amit 提交于
When emulating an instruction that reads the destination memory operand (i.e., instructions without the Mov flag in the emulator), the operand is first read. If a page-fault is detected in this phase, the error-code which would be delivered to the VM does not indicate that the access that caused the exception is a write one. This does not conform with real hardware, and may cause the VM to enter the page-fault handler twice for no reason (once for read, once for write). Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Kai Huang 提交于
When software changes D bit (either from 1 to 0, or 0 to 1), the corresponding TLB entity in the hardware won't be updated immediately. We should flush it to guarantee the consistence of D bit between TLB and MMU page table in memory. This is especially important when clearing the D bit, since it may cause false negatives in reporting dirtiness. Sanity test was done on my machine with Intel processor. Signed-off-by: NKai Huang <kai.huang@linux.intel.com> [Check A bit too. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
Emulation does not utilize the feature. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nicholas Krause 提交于
Adds a function kvm_vcpu_set_pending_timer instead of calling kvm_make_request in lapic.c. Signed-off-by: NNicholas Krause <xerofoify@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
When access to descriptor in LDT/GDT wraparound outside long-mode, the address of the descriptor should be truncated to 32-bit. Citing Intel SDM 2.1.1.1 "Global and Local Descriptor Tables in IA-32e Mode": "GDTR and LDTR registers are expanded to 64-bits wide in both IA-32e sub-modes (64-bit mode and compatibility mode)." So in other cases, we need to truncate. Creating new function to return a pointer to descriptor table to avoid too much code duplication. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> [Wrap 64-bit check with #ifdef CONFIG_X86_64, to avoid a "right shift count >= width of type" warning and consequent undefined behavior. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
When segment is loaded, the segment access bit is set unconditionally. In fact, it should be set conditionally, based on whether the segment had the accessed bit set before. In addition, it can improve performance. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
According to Intel SDM: "If the ESP register is used as a base register for addressing a destination operand in memory, the POP instruction computes the effective address of the operand after it increments the ESP register." The current emulation does not behave so. The fix required to waste another of the precious instruction flags and to check the flag in decode_modrm. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
Currently, if em_call_far fails it returns success instead of the resulting error-code. Fix it. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
The KVM emulator does not emulate JMP and CALL that target a call gate or a task gate. This patch does not try to implement these scenario as they are presumably rare; yet it returns X86EMUL_UNHANDLEABLE error in such cases instead of generating an exception. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
Since the operand size of fnstcw and fnstsw is updated during the execution, the emulation may cause spurious exceptions as it reads the memory beforehand. Marking these instructions as Mov (since the previous value is ignored) and DstMem16 to simplify the setting of operand size. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Amit 提交于
Although pop sreg updates RSP according to the operand size, only 2 bytes are read. The current behavior may result in incorrect #GP or #PF exceptions. Signed-off-by: NNadav Amit <namit@cs.technion.ac.il> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This makes the direction of the conditions consistent with code that is already using WARN_ON. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Because ASSERT is just a printk, these would oops right away. The assertion thus hardly adds anything. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The initialization function in mmu.c can always use walk_mmu, which is known to be vcpu->arch.mmu. Only init_kvm_nested_mmu is used to initialize vcpu->arch.nested_mmu. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This is, pedantically, not valid C. It also looks weird. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Marcelo Tosatti 提交于
Add tracepoint to wait_lapic_expire. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> [Remind reader if early or late. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Marcelo Tosatti 提交于
For the hrtimer which emulates the tscdeadline timer in the guest, add an option to advance expiration, and busy spin on VM-entry waiting for the actual expiration time to elapse. This allows achieving low latencies in cyclictest (or any scenario which requires strict timing regarding timer expiration). Reduces average cyclictest latency from 12us to 8us on Core i5 desktop. Note: this option requires tuning to find the appropriate value for a particular hardware/guest combination. One method is to measure the average delay between apic_timer_fn and VM-entry. Another method is to start with 1000ns, and increase the value in say 500ns increments until avg cyclictest numbers stop decreasing. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Marcelo Tosatti 提交于
kvm_x86_ops->test_posted_interrupt() returns true/false depending whether 'vector' is set. Next patch makes use of this interface. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Tiejun Chen 提交于
In most cases calling hwapic_isr_update(), we always check if kvm_apic_vid_enabled() == 1, but actually, kvm_apic_vid_enabled() -> kvm_x86_ops->vm_has_apicv() -> vmx_vm_has_apicv() or '0' in svm case -> return enable_apicv && irqchip_in_kernel(kvm) So its a little cost to recall vmx_vm_has_apicv() inside hwapic_isr_update(), here just NULL out hwapic_isr_update() in case of !enable_apicv inside hardware_setup() then make all related stuffs follow this. Note we don't check this under that condition of irqchip_in_kernel() since we should make sure definitely any caller don't work without in-kernel irqchip. Signed-off-by: NTiejun Chen <tiejun.chen@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nicholas Krause 提交于
Remove FIXME comments about needing fault addresses to be returned. These are propaagated from walk_addr_generic to gva_to_gpa and from there to ops->read_std and ops->write_std. Signed-off-by: NNicholas Krause <xerofoify@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Eugene Korenevsky 提交于
When generating #PF VM-exit, check equality: (PFEC & PFEC_MASK) == PFEC_MATCH If there is equality, the 14 bit of exception bitmap is used to take decision about generating #PF VM-exit. If there is inequality, inverted 14 bit is used. Signed-off-by: NEugene Korenevsky <ekorenevsky@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Eugene Korenevsky 提交于
This patch improve checks required by Intel Software Developer Manual. - SMM MSRs are not allowed. - microcode MSRs are not allowed. - check x2apic MSRs only when LAPIC is in x2apic mode. - MSR switch areas must be aligned to 16 bytes. - address of first and last byte in MSR switch areas should not set any bits beyond the processor's physical-address width. Also it adds warning messages on failures during MSR switch. These messages are useful for people who debug their VMMs in nVMX. Signed-off-by: NEugene Korenevsky <ekorenevsky@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wincy Van 提交于
Several hypervisors need MSR auto load/restore feature. We read MSRs from VM-entry MSR load area which specified by L1, and load them via kvm_set_msr in the nested entry. When nested exit occurs, we get MSRs via kvm_get_msr, writing them to L1`s MSR store area. After this, we read MSRs from VM-exit MSR load area, and load them via kvm_set_msr. Signed-off-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 1月, 2015 1 次提交
-
-
由 Daniel Borkmann 提交于
Commit a074335a ("x86, um: Mark system call tables readonly") was supposed to mark the sys_call_table in UML as RO by adding the const, but it doesn't have the desired effect as it's nevertheless being placed into the data section since __cacheline_aligned enforces sys_call_table being placed into .data..cacheline_aligned instead. We need to use the ____cacheline_aligned version instead to fix this issue. Before: $ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table" U sys_writev 0000000000000000 D sys_call_table 0000000000000000 D syscall_table_size After: $ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table" U sys_writev 0000000000000000 R sys_call_table 0000000000000000 D syscall_table_size Fixes: a074335a ("x86, um: Mark system call tables readonly") Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
- 28 12月, 2014 2 次提交
-
-
由 Paolo Bonzini 提交于
Since most virtual machines raise this message once, it is a bit annoying. Make it KERN_DEBUG severity. Cc: stable@vger.kernel.org Fixes: 7a2e8aafSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Tiejun Chen 提交于
The commit 34a1cd60, "x86: vmx: move some vmx setting from vmx_init() to hardware_setup()", tried to refactor some codes specific to vmx hardware setting into hardware_setup(), but some msr writing should depend on our previous setting condition like enable_apicv, enable_ept and so on. Reported-by: NJamie Heilman <jamie@audible.transient.net> Tested-by: NJamie Heilman <jamie@audible.transient.net> Signed-off-by: NTiejun Chen <tiejun.chen@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 12月, 2014 5 次提交
-
-
由 Andy Lutomirski 提交于
It turns out that there's a lurking ABI issue. GCC, when compiling this in a 32-bit program: struct user_desc desc = { .entry_number = idx, .base_addr = base, .limit = 0xfffff, .seg_32bit = 1, .contents = 0, /* Data, grow-up */ .read_exec_only = 0, .limit_in_pages = 1, .seg_not_present = 0, .useable = 0, }; will leave .lm uninitialized. This means that anything in the kernel that reads user_desc.lm for 32-bit tasks is unreliable. Revert the .lm check in set_thread_area(). The value never did anything in the first place. Fixes: 0e58af4e ("x86/tls: Disallow unusual TLS segments") Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org # Only if 0e58af4e is backported Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Christian Borntraeger 提交于
ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the gup code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Christian Borntraeger 提交于
ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the spinlock code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paolo Bonzini 提交于
They are not used anymore by IA64, move them away. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Linus Torvalds 提交于
My commit 26178ec1 ("x86: mm: consolidate VM_FAULT_RETRY handling") had a really stupid typo: the FAULT_FLAG_USER bit is in the 'flags' variable, not the 'fault' variable. Duh, The one silver lining in this is that Dave finding this at least confirms that trinity actually triggers this special path easily, in a way normal use does not. Reported-by: NDave Jones <davej@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 12月, 2014 4 次提交
-
-
由 Jiang Liu 提交于
Use helpers to access irq_cfg data structure associated with IRQ, instead of accessing irq_data->chip_data directly. Later we can rewrite those helpers to support hierarchy irqdomain. Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Grant Likely <grant.likely@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1414397531-28254-17-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jiang Liu 提交于
Now we have splitted functions to support MSI and HT_IRQ into vector.c, and they have no dependency on IOAPIC any more. So change Kconfig files to make MSI and HT_IRQ independent of X86_IO_APIC. Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1414397531-28254-16-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jiang Liu 提交于
Move IRQ initialization routines from io_apic.c into vector.c, preparing for enabling hierarchy irqdomain. Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Grant Likely <grant.likely@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1414397531-28254-15-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jiang Liu 提交于
Clean up code by moving IOAPIC related declarations from hw_irq.h into io_apic.h. Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Aubrey <aubrey.li@linux.intel.com> Cc: Ryan Desfosses <ryan@desfo.org> Cc: Quentin Lambert <lambert.quentin@gmail.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: http://lkml.kernel.org/r/1414397531-28254-14-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-