- 29 Jan 2013, 3 commits
-
-
Committed by Yang Zhang

Virtual interrupt delivery avoids the need for KVM to inject vAPIC interrupts manually; this is fully taken care of by the hardware. It requires some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of injecting it directly, we may need to update architecture-specific indicators before resuming the guest.
- A pending interrupt that is masked by the ISR should also be considered in the above update, since hardware will decide when to inject it at the right time.

Currently, has_interrupt and get_interrupt only return a valid vector from the injection point of view.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
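A sketch of what the changed injection path looks like; all helper names here are illustrative, not the actual KVM symbols:

    /*
     * Sketch only: with virtual interrupt delivery enabled, a pending
     * vector is written into the VMCS guest interrupt status (RVI) and
     * the CPU delivers it once the guest can accept it; otherwise fall
     * back to classic event injection on VM entry.
     */
    static void update_or_inject_pending_irq(struct kvm_vcpu *vcpu)
    {
        int vector = apic_find_highest_irr(vcpu);   /* may be masked by ISR */

        if (vector < 0)
            return;

        if (virt_interrupt_delivery_enabled(vcpu))  /* hypothetical check */
            vmx_set_rvi(vcpu, vector);  /* hardware injects at the right time */
        else if (kvm_cpu_has_injectable_intr(vcpu))
            vmx_inject_irq(vcpu, kvm_cpu_get_interrupt(vcpu));
    }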
-
Committed by Yang Zhang

Basically, to benefit from APICv we need to enable virtualized x2apic mode. Currently, we only enable it when the guest is actually using x2apic.

Also, clear the MSR bitmap for the corresponding x2apic MSRs when the guest has enabled x2apic: 0x800 - 0x8ff gets no read intercept under APICv register virtualization, except for APIC ID and TMCCT, which need software's assistance to return the right value.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
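A minimal sketch of the bitmap update described above, assuming a helper that disables read interception for a single MSR (the helper name is illustrative):

    /* 0x800-0x8ff is the x2apic MSR range; APIC ID (0x802) and TMCCT
     * (0x839) stay intercepted because software must fix up their values. */
    static void clear_x2apic_read_intercepts(void)
    {
        u32 msr;

        for (msr = 0x800; msr <= 0x8ff; msr++) {
            if (msr == 0x802 || msr == 0x839)
                continue;   /* APIC ID and TMCCT keep the intercept */
            vmx_disable_intercept_msr_read(msr);    /* hypothetical helper */
        }
    }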
-
Committed by Yang Zhang

- APIC reads don't cause a VM-Exit.
- APIC writes become trap-like.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 22 Jan 2013, 1 commit
-
-
Committed by Xiao Guangrong

The current reexecute_instruction cannot reliably detect failed instruction emulation: it allows the guest to retry all instructions except those that access an error pfn.

For example, some cases are nested write protection: the page we want to write is used as a PDE but chains to itself. In this case, we should stop the emulation and report the case to userspace.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 14 Jan 2013, 1 commit
-
-
Committed by Takuya Yoshikawa

Not needed any more.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 03 Jan 2013, 1 commit
-
-
Committed by Jesse Larrew

Correct a typo in the comment explaining hypercalls.

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 14 Dec 2012, 3 commits
-
-
Committed by Alex Williamson

With the 3 private slots, this gives us a nice round 128 slots total. The primary motivation is to support more assigned devices. Each assigned device can theoretically use up to 8 slots (6 MMIO BARs, 1 ROM BAR, 1 spare for a split MSI-X table mapping), though it's far more typical for a device to use 3-4 slots. If we assume a typical VM uses a dozen slots for non-assigned-device purposes, we should always be able to support 14 worst-case assigned devices, or 28 to 37 typical ones.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Alex Williamson

Seems like everyone copied x86 and defined 4 private memory slots that never actually get used. Even x86 only uses 3 of the 4. These aren't exposed, so there's no need to add padding.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Alex Williamson

It's easy to confuse KVM_MEMORY_SLOTS and KVM_MEM_SLOTS_NUM. One is the user-accessible slots and the other is user + private. Make this more obvious.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
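How the renamed constants fit together after this series; a sketch using the x86 values from the patches above:

    #define KVM_USER_MEM_SLOTS     125  /* slots exposed to userspace */
    #define KVM_PRIVATE_MEM_SLOTS  3    /* internal: TSS, identity map, APIC access page */
    #define KVM_MEM_SLOTS_NUM      (KVM_USER_MEM_SLOTS + KVM_PRIVATE_MEM_SLOTS)  /* 128 */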
-
- 12 Dec 2012, 3 commits
-
-
Committed by Michel Lespinasse

Update the x86_64 arch_get_unmapped_area[_topdown] functions to make use of vm_unmapped_area() instead of implementing a brute-force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
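A condensed sketch of the new pattern (the real function also honors the addr hint and MAP_FIXED before falling back to the search):

    unsigned long
    arch_get_unmapped_area(struct file *filp, unsigned long addr,
                           unsigned long len, unsigned long pgoff,
                           unsigned long flags)
    {
        struct vm_unmapped_area_info info;

        info.flags = 0;                     /* bottom-up search */
        info.length = len;
        info.low_limit = TASK_UNMAPPED_BASE;
        info.high_limit = TASK_SIZE;
        info.align_mask = 0;                /* no special alignment */
        info.align_offset = pgoff << PAGE_SHIFT;
        return vm_unmapped_area(&info);
    }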
-
Committed by Andi Kleen

There was some desire in large applications using MAP_HUGETLB or SHM_HUGETLB to use 1GB huge pages on some mappings and stay with 2MB on others. This is useful together with NUMA policy: use 2MB interleaving on some mappings, but 1GB on local mappings.

This patch extends the IPC/SHM syscall interfaces slightly to allow specifying the page size. It borrows some upper bits in the existing flag arguments and allows encoding the log of the desired page size in addition to the *_HUGETLB flag. When 0 is specified, the default size is used; this makes the change fully compatible.

Extending the internal hugetlb code to handle this is straightforward. Instead of a single mount, it just keeps an array of them and selects the right mount based on the specified page size. When no page size is specified, it uses the mount of the default page size. The change is not visible in /proc/mounts because internal mounts don't appear there. It also has very little overhead: the additional mounts just consume a super block, but no more memory when not used.

I also exported the new flags to the user headers (they were previously under __KERNEL__). Right now only symbols for x86 and some other architectures for 1GB and 2MB are defined. The interface should already work for all other architectures, though. Only architectures that define multiple hugetlb sizes actually need it (currently x86, tile, powerpc). However, tile and powerpc have user-configurable hugetlb sizes, so it's not easy to add defines. A program on those architectures would need to query sysfs and use the appropriate log2.

[akpm@linux-foundation.org: cleanups]
[rientjes@google.com: fix build]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hillf Danton <dhillf@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
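A userspace sketch of the resulting mmap() flag encoding (the constant values below match the newly exported headers, but check your own headers before relying on them):

    #include <stddef.h>
    #include <sys/mman.h>

    /* log2 of the desired page size, encoded above the normal flag bits */
    #ifndef MAP_HUGE_SHIFT
    #define MAP_HUGE_SHIFT  26
    #endif
    #define MAP_HUGE_2MB    (21 << MAP_HUGE_SHIFT)  /* log2(2MB) = 21 */
    #define MAP_HUGE_1GB    (30 << MAP_HUGE_SHIFT)  /* log2(1GB) = 30 */

    static void *map_one_gig_pages(size_t len)
    {
        /* passing 0 in the size bits falls back to the default huge size */
        return mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                    -1, 0);
    }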
-
Committed by Zhang Yanfei

This removes the sparse warning:

arch/x86/kernel/crash.c:49:32: sparse: incompatible types in comparison expression (different address spaces)

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 08 Dec 2012, 1 commit
-
-
Committed by Daniel J Blueman

Add a NumaChip-specific PCI access mechanism via MMCONFIG cycles, while preventing access to AMD northbridges which shouldn't respond.

Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 07 Dec 2012, 1 commit
-
-
Committed by Zhang Yanfei

This patch provides a way to VMCLEAR the VMCSs related to guests on all cpus before executing VMXOFF when doing kdump. This ensures that the VMCSs in the vmcore are up to date and not corrupted.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
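Sketched in terms of the per-cpu list of loaded VMCSs that VMX already maintains (treat field and function names as approximate):

    /* called on the kdump path before VMXOFF */
    static void crash_vmclear_local_loaded_vmcss(void)
    {
        int cpu = raw_smp_processor_id();
        struct loaded_vmcs *v;

        /* flush each VMCS cached on this cpu so the vmcore captures
         * consistent, up-to-date images */
        list_for_each_entry(v, &per_cpu(loaded_vmcss_on_cpu, cpu),
                            loaded_vmcss_on_cpu_link)
            vmcs_clear(v->vmcs);
    }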
-
- 06 Dec 2012, 1 commit
-
-
Committed by Matthew Garrett

EFI provides support for supplying PCI ROMs via means other than the ROM BAR. This support vanishes after we've exited boot services, so add support for stashing copies of the ROMs in setup_data if they're not otherwise available.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
-
- 05 Dec 2012, 1 commit
-
-
Committed by Zhang Xiantao

Bit 24 in VMX_EPT_VPID_CAP_MASK is not used for address-specific invalidation capability reporting, so remove it from KVM to avoid conflicts in the future.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 01 Dec 2012, 4 commits
-
-
Committed by Vincent Palatin

When a cpu enters the S3 state, the FPU state is lost. After resuming from S3, if we try to lazily restore the FPU for a process running on the same CPU, this will result in a corrupted FPU context.

Ensure that "fpu_owner_task" is properly invalidated when (re-)initializing a CPU, so nobody will try to lazily restore a state which doesn't exist in the hardware.

Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off, by doing thousands of suspend/resume cycles with 4 processes performing FPU operations. Without the patch, a process is killed after a few hundred cycles by a SIGFPE.

Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: <stable@kernel.org> v3.4+ # for 3.4, need to replace this_cpu_write by percpu_write
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
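The core of the fix is one line; a sketch of its placement (the exact init function may differ):

    static void fpu_init(void)
    {
        /* forget any lazy-FPU owner from before the CPU was powered
         * down, so no task can "restore" state that S3 destroyed */
        this_cpu_write(fpu_owner_task, NULL);

        /* ...remaining per-cpu FPU setup... */
    }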
-
Committed by Will Auld

CPUID.7.0.EBX[1]=1 indicates that the IA32_TSC_ADJUST MSR (0x3b) is supported.

The basic design is to emulate the MSR by allowing reads and writes to a guest-vcpu-specific location that stores the value of the emulated MSR, while adding that value to the vmcs tsc_offset. This way the IA32_TSC_ADJUST value will be included in all reads of the TSC MSR, whether through rdmsr or rdtsc, as long as the "use TSC counter offsetting" VM-execution control is enabled, as well as the IA32_TSC_ADJUST control.

However, because hardware will only return TSC + IA32_TSC_ADJUST + vmcs tsc_offset for a guest process when it does a rdtsc (with the correct settings), the value of our virtualized IA32_TSC_ADJUST must be stored in one of these three locations. The argument against storing it in the actual MSR is performance: it is likely to be seldom used, while the save/restore would be required on every transition. IA32_TSC_ADJUST was created as a way to solve some issues with writing the TSC itself, so that is not an option either. The remaining option, defined above as our solution, has the problem of returning incorrect vmcs tsc_offset values (unless we intercept and fix them, not done here) as mentioned above.

More problematic, however, is that storing the data in vmcs tsc_offset will have a different semantic effect on the system than using the actual MSR. This is illustrated by the following example: the hypervisor sets IA32_TSC_ADJUST, then the guest sets it, and a guest process performs a rdtsc. In this case the guest process will get TSC + IA32_TSC_ADJUST_hypervisor + vmcs tsc_offset, the latter including IA32_TSC_ADJUST_guest. While the total system semantics change, the semantics as seen by the guest do not, and hence this will not cause a problem.

Signed-off-by: Will Auld <will.auld@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
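A sketch of the WRMSR side of the emulation described above (field and helper names follow KVM style but should be treated as illustrative):

    #define MSR_IA32_TSC_ADJUST 0x0000003b

    static void set_tsc_adjust(struct kvm_vcpu *vcpu, u64 data)
    {
        /* fold the change into the vmcs tsc_offset so guest rdtsc and
         * TSC-MSR reads both observe TSC + IA32_TSC_ADJUST */
        s64 delta = (s64)data - (s64)vcpu->arch.ia32_tsc_adjust_msr;

        adjust_tsc_offset_guest(vcpu, delta);
        vcpu->arch.ia32_tsc_adjust_msr = data;
    }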
-
Committed by Will Auld

In order to track who initiated a call (host or guest) to modify an MSR value, I have changed the function call parameters along the call path. The specific change is to add a struct pointer parameter that points to (index, data, caller) information, rather than having this information passed as individual parameters.

The initial use for this capability is updating the IA32_TSC_ADJUST MSR while setting the TSC value. It is anticipated that this capability will be useful for other tasks.

Signed-off-by: Will Auld <will.auld@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
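The shape of the new parameter block, as introduced by this patch:

    struct msr_data {
        bool host_initiated;  /* true: host ioctl; false: guest wrmsr */
        u32 index;            /* MSR number */
        u64 data;             /* value to write */
    };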
-
Committed by Frederic Weisbecker

Create a new subsystem that probes on kernel boundaries to keep track of the transitions between contexts, with two basic initial contexts: user or kernel. This is an abstraction of some RCU code that uses such tracking to implement its userspace extended quiescent state.

We need to pull this up from RCU into this new level of indirection because this tracking is also going to be used to implement "on demand" generic virtual cputime accounting, a necessary step to shut down the tick while still accounting the cputime.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
[ paulmck: fix whitespace error and email address. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
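A condensed sketch of the user-entry hook, following the structure this patch gives the subsystem:

    void user_enter(void)
    {
        unsigned long flags;

        local_irq_save(flags);
        if (__this_cpu_read(context_tracking.active) &&
            __this_cpu_read(context_tracking.state) != IN_USER) {
            __this_cpu_write(context_tracking.state, IN_USER);
            rcu_user_enter();  /* userspace counts as an RCU quiescent state */
        }
        local_irq_restore(flags);
    }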
-
- 30 Nov 2012, 7 commits
-
-
Committed by H. Peter Anvin

Simplify the implementation of sync_core() for the case where the CPUID instruction may not be available.

[ v2: stylistic cleanup of the #else clause per suggestion by Borislav Petkov. ]

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-9-git-send-email-hpa@linux.intel.com
Cc: Borislav Petkov <bp@alien8.de>
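For reference, a sketch of the CPUID-based body that the patch leaves alone (the rework only touches the no-CPUID fallback):

    static inline void sync_core(void)
    {
        int tmp;

        /* CPUID is architecturally serializing: it drains the pipeline
         * and forces a refetch of the (possibly modified) instruction
         * stream */
        asm volatile("cpuid"
                     : "=a" (tmp)
                     : "0" (1)
                     : "ebx", "ecx", "edx", "memory");
    }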
-
Committed by H. Peter Anvin

All 486+ CPUs support WP in supervisor mode, so remove the fallback 386 support code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-7-git-send-email-hpa@linux.intel.com
-
Committed by H. Peter Anvin

All 486+ CPUs support INVLPG, so remove the fallback 386 support code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-6-git-send-email-hpa@linux.intel.com
-
Committed by H. Peter Anvin

All 486+ CPUs support BSWAP, so remove the fallback 386 support code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-5-git-send-email-hpa@linux.intel.com
-
Committed by H. Peter Anvin

All 486+ CPUs support XADD, so remove the fallback 386 support code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-4-git-send-email-hpa@linux.intel.com
-
Committed by H. Peter Anvin

All 486+ CPUs support CMPXCHG, so remove the fallback 386 support code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-3-git-send-email-hpa@linux.intel.com
-
Committed by H. Peter Anvin

Remove the CONFIG_M386 symbol from Kconfig so that it cannot be selected.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-2-git-send-email-hpa@linux.intel.com
-
- 29 Nov 2012, 6 commits
-
-
Committed by Ian Campbell

We use XENMEM_add_to_physmap_range, which is the preferred interface for foreign mappings.

Acked-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Committed by Al Viro

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro

Now it can be done...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 28 Nov 2012, 7 commits
-
-
Committed by Marcelo Tosatti

With the master clock, a pvclock clock read calculates:

ret = system_timestamp + [ (rdtsc + tsc_offset) - tsc_timestamp ]

where 'rdtsc' is the host TSC. system_timestamp and tsc_timestamp are unique, one tuple per VM: the "master clock". Given a host with synchronized TSCs, it's obvious that the guest TSCs must be matched for the above to guarantee monotonicity. Allow master clock usage only if the guest TSCs are synchronized.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
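The same computation spelled out against the pvclock ABI (a sketch; helper names approximate, and in the guest rdtsc already returns host TSC + tsc_offset, so the bracketed term reduces to guest_tsc - tsc_timestamp):

    static u64 pvclock_read_sketch(struct pvclock_vcpu_time_info *ti)
    {
        u64 guest_tsc = native_read_tsc();  /* hardware applies tsc_offset */
        u64 delta = guest_tsc - ti->tsc_timestamp;

        /* scale the TSC delta to nanoseconds and add the base */
        return ti->system_time +
               pvclock_scale_delta(delta, ti->tsc_to_system_mul,
                                   ti->tsc_shift);
    }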
-
Committed by Marcelo Tosatti

KVM added a global variable to guarantee monotonicity in the guest. One of the reasons for that is that the time between

1. ktime_get_ts(&timespec);
2. rdtscll(tsc);

is variable. That is, given a host with a stable TSC, suppose that two VCPUs read the same time via ktime_get_ts() above. The time required to execute 2. is not the same on those two instances executing on different VCPUs (cache misses, interrupts...).

If the TSC value that is used by the host to interpolate when calculating the monotonic time is the same value used to calculate the tsc_timestamp value stored in the pvclock data structure, and a single <system_timestamp, tsc_timestamp> tuple is visible to all vcpus simultaneously, this problem disappears. See the comment on top of pvclock_update_vm_gtod_copy for details.

Monotonicity is then guaranteed by the synchronicity of the host TSCs and guest TSCs. Set the TSC stable pvclock flag in that case, allowing the guest to read the clock from userspace.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti

Allow the caller to pass a host tsc value to kvm_x86_ops->read_l1_tsc().

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti

Improve the performance of time system calls when using the Linux pvclock, by reading the time info from a fixmap-visible copy of the pvclock data. Originally from Jeremy Fitzhardinge.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti

Hook into the generic pvclock vsyscall code, with the aim of allowing userspace to have visibility into pvclock data.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti

Originally from Jeremy Fitzhardinge. Introduce generic, non-hypervisor-specific pvclock initialization routines.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Marcelo Tosatti

As noted by Gleb, not advertising SSE2 support implies no RDTSC barriers.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-