- 11 Jan, 2013 2 commits
-
-
Committed by Xiao Guangrong
We have two issues in the current code:
  - If the target gfn is used as its page table, the guest will refault and kvm will then use the small page size to map it. We need two #PFs to fix its shadow page table.
  - Sometimes, say when an exception is triggered during a vm-exit caused by #PF (see handle_exception() in vmx.c), we remove all the shadow pages shadowed by the target gfn before going into the page fault path. This causes an infinite loop: delete shadow pages shadowed by the gfn -> try to use large page size to map the gfn -> retry the access -> ...

To fix these, we can adjust the page size early if the target gfn is used as a page table.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Xiao Guangrong
If the write-fault access is from supervisor mode and CR0.WP is not set on the vcpu, kvm fixes it by adjusting the pte access: it sets the W bit on the pte and clears the U bit. This is the one point where kvm can change the pte access from read-only to writable.

Unfortunately, the pte access is also the access of the 'direct' shadow page table, meaning direct sp.role.access = pte_access, so we create a writable spte entry on a read-only shadow page table. As a result, the Dirty bit is not tracked when two guest ptes point to the same large page. Note that this has no impact other than on the Dirty bit, since cr0.wp is encoded into sp.role.

It can be fixed by adjusting the pte access before establishing the shadow page table. After that, no mmu-specific code remains in the common function, and two parameters are dropped from set_spte.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
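The access adjustment described above can be sketched roughly as follows. This is a minimal illustration, not the kernel code: the bit values and the names `ACC_WRITE`, `ACC_USER` and `adjust_pte_access` are assumptions chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical access bits, named for illustration only. */
#define ACC_WRITE 0x2u
#define ACC_USER  0x4u

/*
 * Adjust the guest pte access for a supervisor write fault taken with
 * CR0.WP clear: grant write access and drop user access. All other
 * combinations leave the access bits untouched.
 */
static unsigned adjust_pte_access(unsigned pte_access, bool write_fault,
                                  bool user_fault, bool cr0_wp)
{
    if (write_fault && !user_fault && !cr0_wp && !(pte_access & ACC_WRITE)) {
        pte_access |= ACC_WRITE;
        pte_access &= ~ACC_USER;
    }
    return pte_access;
}
```

Doing this before the shadow page is looked up means the adjusted value, not the original read-only one, is what ends up in the page's role.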
-
- 10 Jan, 2013 9 commits
-
-
Committed by Christian Borntraeger
In rare cases a virtio command might try to issue a ccw before a former ccw was answered with a tsch. This will cause CC=2 (busy). Let's just retry in that case.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
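A retry-on-busy loop of this kind might look like the sketch below. It is not the driver code: the callback type, the constant names and the retry bound are all assumptions made for the example (the commit message does not say whether the retry is bounded).

```c
/* Hypothetical condition codes and callback type, for illustration only. */
enum { CC_OK = 0, CC_BUSY = 2 };

typedef int (*issue_fn)(void *ctx);

/*
 * Call issue() until it reports something other than "busy", giving up
 * after max_tries attempts. Returns the last condition code seen.
 */
static int issue_with_retry(issue_fn issue, void *ctx, int max_tries)
{
    int cc = CC_BUSY;
    for (int i = 0; i < max_tries && cc == CC_BUSY; i++)
        cc = issue(ctx);
    return cc;
}

/* Example stub: reports busy until the counter behind ctx reaches zero. */
static int busy_n_times(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return CC_BUSY;
    }
    return CC_OK;
}
```

A real driver would normally also yield or wait between attempts rather than spin.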
-
Committed by Cornelia Huck
Dynamically allocate any data structures, such as ccws, used when doing channel I/O. Otherwise, we'd need to add extra serialization for the different callbacks using the same data structures.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
Opcodes: TEST CMP ADD ADC SUB SBB XOR OR AND

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
Instead of disabling writeback via OP_NONE, just specify NoWrite.

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Avi Kivity
We emulate arithmetic opcodes by executing a "similar" (same operation, different operands) instruction on the cpu. This ensures accurate emulation, esp. wrt. eflags. However, the prologue and epilogue around the opcode are fairly long, consisting of a switch (for the operand size) and code to load and save the operands. This is repeated for every opcode.

This patch introduces an alternative way to emulate arithmetic opcodes. Instead of the above, we have four (three on i386) functions consisting of just the opcode and a ret; one for each operand size. For example:

    .align 8
    em_notb:
        not %al
        ret

    .align 8
    em_notw:
        not %ax
        ret

    .align 8
    em_notl:
        not %eax
        ret

    .align 8
    em_notq:
        not %rax
        ret

The prologue and epilogue are shared across all opcodes. Note the functions use a special calling convention; notably eflags is an input/output parameter and is not clobbered.

Rather than dispatching the four functions through a jump table, the functions are declared as a constant size (8) so their address can be calculated.

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
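To illustrate the address calculation that replaces the jump table, here is a small sketch of how the offset of a stub could be derived from the operand size. The names `FASTOP_SIZE` and `fastop_offset` are assumptions for the example; only the constant size of 8 comes from the commit message.

```c
#include <stddef.h>

#define FASTOP_SIZE 8  /* each stub is padded to 8 bytes, per ".align 8" */

/*
 * Byte offset of the stub for a given operand size, relative to the first
 * (byte-sized) stub. Sizes 1, 2, 4, 8 map to stub indices 0, 1, 2, 3.
 */
static size_t fastop_offset(unsigned operand_bytes)
{
    size_t index = 0;
    while (operand_bytes > 1) {   /* index = log2(operand_bytes) */
        operand_bytes >>= 1;
        index++;
    }
    return index * FASTOP_SIZE;
}
```

Adding this offset to the address of the byte-sized stub (em_notb in the example above) would select the word, long or quad variant without a table lookup.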
-
- 09 Jan, 2013 2 commits
-
-
Committed by Marcelo Tosatti
CPL is always 0 in real mode, and always 3 in virtual 8086 mode. Using values other than those can cause failures on operations that check CPL.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
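The rule stated above can be written down as a small helper. This is only a sketch: `effective_cpl` and its parameters are hypothetical names, and how the protected-mode CPL is obtained is left to the caller.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: pick the CPL to report for a vcpu.
 * protected_cpl is whatever the segment state yields in protected mode.
 */
static int effective_cpl(bool protected_mode, bool vm86, int protected_cpl)
{
    if (!protected_mode)
        return 0;              /* real mode: always 0 */
    if (vm86)
        return 3;              /* virtual 8086 mode: always 3 */
    return protected_cpl & 3;  /* otherwise trust the segment state */
}
```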
-
Committed by Gleb Natapov
Fix compilation warning.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 08 Jan, 2013 9 commits
-
-
Committed by Gleb Natapov
MMU code tries to avoid if()s that the HW is not able to predict reliably by using bitwise operations to streamline code execution, but in the case of dirty bit folding this gives us nothing, since write_fault is checked right before the folding code. Let's just piggyback onto the if() to make the code clearer.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
trace_kvm_mmu_delay_free_pages() is no longer used.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Cornelia Huck
Add a new capability, KVM_CAP_S390_CSS_SUPPORT, which will pass intercepts for channel I/O instructions to userspace. Only I/O instructions interacting with I/O interrupts need to be handled in-kernel:
  - TEST PENDING INTERRUPTION (tpi) dequeues and stores pending interrupts entirely in-kernel.
  - TEST SUBCHANNEL (tsch) dequeues pending interrupts in-kernel and exits via KVM_EXIT_S390_TSCH to userspace for subchannel-related processing.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Cornelia Huck
Make s390 support KVM_ENABLE_CAP.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Cornelia Huck
Explicitly catch all channel I/O related instruction intercepts in the kernel and set condition code 3 for them. This paves the way for properly handling these instructions later on.

Note: This is not architecture compliant (the previous code wasn't either), since setting cc 3 is not the correct thing to do for some of these instructions. For Linux guests, however, it still has the intended effect of stopping css probing.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Cornelia Huck
Add support for injecting machine checks (only repressible conditions for now). This is a bit more involved than I/O interrupts, for these reasons:
  - Machine checks come in both floating and cpu varieties.
  - We don't have a bit for enabling machine checks, but have to use a roundabout approach of trapping PSW-changing instructions and watching for opened machine checks.

Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Cornelia Huck
Add support for handling I/O interrupts (standard, subchannel-related ones and rudimentary adapter interrupts). The subchannel-identifying parameters are encoded into the interrupt type. I/O interrupts are floating, so they can't be injected on a specific vcpu.

Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Cornelia Huck
Introduce helper functions for decoding the various base/displacement instruction formats.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
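As an illustration of what such a helper does, the sketch below decodes the operand address of an S-format instruction from the raw 32-bit instruction word. It is not the helper added by the patch: the function name, the argument layout and the use of a plain register array are assumptions for the example.

```c
#include <stdint.h>

/*
 * Decode the second-operand address of an S-format instruction.
 * insn holds the full 32-bit instruction: opcode in the top 16 bits,
 * then a 4-bit base register number (B2) and a 12-bit displacement (D2).
 * A base register number of 0 means "no base", not the contents of gpr 0.
 */
static uint64_t base_disp_s(uint32_t insn, const uint64_t gprs[16])
{
    uint32_t base2 = (insn >> 12) & 0xf;
    uint32_t disp2 = insn & 0xfff;
    return (base2 ? gprs[base2] : 0) + disp2;
}
```

Keeping the decoding in one place per instruction format avoids repeating the shifts and masks in every intercept handler.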
-
Committed by Cornelia Huck
These tables are never modified.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 03 Jan, 2013 7 commits
-
-
Committed by Gleb Natapov
With emulate_invalid_guest_state=0, if a vcpu is in real mode, VMX can enter the vcpu with a smaller segment limit than the guest configured. If the guest tries to access past this limit it will get a #GP, at which point the instruction will be emulated with the correct segment limit applied. If IO is detected during the emulation, it is not handled correctly. The vcpu thread should exit to userspace to serve the IO, but it returns to the guest instead. Since emulation is not completed until userspace completes the IO, the faulting instruction is re-executed ad infinitum.

The patch fixes that by exiting to userspace if IO happens during instruction emulation.

Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
Segment registers will be fixed according to the current emulation policy when switching to real mode for the first time.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
Currently, when emulation of invalid guest state is enabled (emulate_invalid_guest_state=1), segment registers are sometimes still fixed for entry to vm86 mode. Segment register fixing is avoided in enter_rmode(), but vmx_set_segment() still does it unconditionally. The patch fixes it.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
Currently it allows entering vm86 mode if the segment limit is greater than 0xffff or the db bit is set. Either of those can cause incorrect execution of instructions by the cpu, since in vm86 mode the limit will be set to 0xffff and db will be forced to 0.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
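The two conditions named in this message can be expressed as a small predicate. This sketch covers only those two conditions and uses a hypothetical, trimmed-down segment structure; the actual validity check in the kernel may test more fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down segment descriptor. */
struct seg {
    uint32_t limit;
    bool db;
};

/*
 * vm86 mode forces limit = 0xffff and db = 0, so a segment whose limit
 * exceeds 0xffff or whose db bit is set cannot be represented faithfully.
 */
static bool vm86_can_represent(const struct seg *s)
{
    return s->limit <= 0xffff && !s->db;
}
```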
-
Committed by Gleb Natapov
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Gleb Natapov
According to Intel SDM Vol3 Section 5.5 "Privilege Levels" and 5.6 "Privilege Level Checking When Accessing Data Segments", RPL checking is done during loading of a segment selector, not during data access. We already do the checking during segment selector loading, so drop the check during data access. Checking RPL during data access triggers a #GP if, after a transition from real mode to protected mode, the RPL bits in a segment selector are set.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
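For reference, the load-time privilege check for a data segment can be sketched as below. The function name and parameters are assumptions for the example; the rule itself (DPL must be numerically at least the larger of CPL and RPL) is the architectural one for data segments.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Privilege check applied when a data-segment selector is loaded:
 * the descriptor's DPL must be numerically >= both CPL and the
 * selector's RPL (its low two bits).
 */
static bool data_segment_load_allowed(uint16_t selector, int cpl, int dpl)
{
    int rpl = selector & 3;
    int effective = cpl > rpl ? cpl : rpl;
    return dpl >= effective;
}
```

Because this runs once at load time, a selector carried over from real mode with stray RPL bits is never re-examined on each memory access.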
-
Committed by Jesse Larrew
Correct a typo in the comment explaining hypercalls.

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 24 Dec, 2012 1 commit
-
-
Committed by Gleb Natapov
Move a repetitive code sequence to a separate function.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 23 Dec, 2012 9 commits
-
-
Committed by Gleb Natapov
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
Move all vm86_active logic into one place.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
The segment descriptor's base is fixed by the call to fix_rmode_seg(). No need to do it twice.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
The code for SS and CS does the same thing fix_rmode_seg() is doing. Use it instead of hand-crafted code.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
VMX without unrestricted mode cannot virtualize real mode, so if emulate_invalid_guest_state=0 kvm uses vm86 mode to approximate it. Sometimes, when the guest moves from protected mode to real mode, it leaves segment descriptors in a state not suitable for use by vm86 mode virtualization, so we keep a shadow copy of the segment descriptors for internal use and load fake registers into the VMCS for guest entry to succeed.

Until now we kept a shadow for all segments except SS and CS (for SS and CS we returned parameters directly from the VMCS), but since commit a5625189 the emulator enforces segment limits in real mode. This causes a #GP during the move from protected mode to real mode when the emulator fetches the first instruction after moving to real mode, since it uses an incorrect CS base and limit to linearize the %rip. Fix by keeping a shadow for SS and CS too.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
rmode_segment_valid() checks if a segment descriptor can be used to enter vm86 mode. The VMX spec mandates that in vm86 mode the CS register will be of type data, not code. Let's allow guest entry with vm86 mode if the only problem with the CS register is an incorrect type. Otherwise the entire real mode will be emulated.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Gleb Natapov
Set segment fields explicitly instead of using binary operations. No behaviour changes.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
Committed by Alex Williamson
Previous patch "kvm: Minor memory slot optimization" (b7f69c55) overlooked the generation field of the memory slots. Re-using the original memory slots left us with two slightly different memory slots with the same generation. To fix this, make update_memslots() take a new parameter to specify the last generation. This also makes generation management more explicit to avoid such problems in the future.

Reported-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
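The idea of passing the last generation explicitly can be sketched as follows. The structure and function here are hypothetical stand-ins, and deriving the new generation as "last + 1" is an assumption for the example rather than something the commit message states.

```c
#include <stdint.h>

/* Hypothetical stand-in for the memory slot array. */
struct memslots {
    uint64_t generation;
};

/*
 * Stamp the new slot array with a generation derived from the one it
 * replaces, so a copy never shares a generation with its predecessor.
 */
static void update_generation(struct memslots *slots, uint64_t last_generation)
{
    slots->generation = last_generation + 1;
}
```

Making the caller supply the previous generation keeps the bookkeeping visible at every call site instead of relying on whatever value the copied structure happens to carry.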
-
Committed by Yang Zhang
This hack is wrong. The PIT is connected to pin 2, not pin 0. This means the hack never takes effect, so it is ok to remove it.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 18 Dec, 2012 1 commit
-
-
Committed by Cornelia Huck
Add a driver for kvm guests that matches virtual ccw devices provided by the host as virtio bridge devices. These virtio-ccw devices use a special set of channel commands in order to perform virtio functions.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-