Commits · fc3a9157d3148ab91039c75423da8ef97be3e105 · openeuler / raspberrypi-kernel

12 Jan, 2011 21 commits

KVM: X86: Don't report L2 emulation failures to user-space · fc3a9157

Authored by Joerg Roedel on Nov 29, 2010

This patch prevents emulation failures that result from
emulating an instruction for an L2 guest from being
reported to userspace.
Without this patch a malicious L2 guest would be able to
kill the L1 by triggering a race condition between a vmexit
and the instruction emulator.
With this patch the L2 will most likely only kill itself in
this situation.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

fc3a9157
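
The policy described above can be sketched in user-space C as follows. This is only an illustration: `struct vcpu_sketch`, `enum emul_result` and the choice of queuing a #UD for the guest are assumptions made for the example, since the commit message itself does not say which exception the guest receives.

```c
#include <stdbool.h>

/* Hypothetical result codes, not the kernel's own. */
enum emul_result { EMUL_DONE, EMUL_FAIL_TO_USERSPACE };

/* Hypothetical, minimal vcpu state for the sketch. */
struct vcpu_sketch {
    bool in_guest_mode;     /* true while running a nested (L2) guest */
    bool ud_queued;         /* an exception is pending for the guest */
};

/* Decide what to do when the instruction emulator gives up. */
static enum emul_result handle_emulation_failure(struct vcpu_sketch *v)
{
    enum emul_result r = EMUL_DONE;

    if (!v->in_guest_mode)
        r = EMUL_FAIL_TO_USERSPACE;  /* L1 failure: report to userspace */

    v->ud_queued = true;             /* assumed: the guest gets an exception */
    return r;
}
```

With this shape, a failure triggered by L2 never reaches userspace, so it cannot take the L1 guest down with it.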

KVM: Pull extra page fault information into struct x86_exception · 6389ee94

Authored by Avi Kivity on Nov 29, 2010

Currently page fault cr2 and nesting information are carried outside
the fault data structure.  Instead they are placed in the vcpu struct,
which results in confusion as global variables are manipulated instead
of passing parameters.

Fix this issue by adding address and nested fields to struct x86_exception,
so this struct can carry all information associated with a fault.
Signed-off-by: Avi Kivity <avi@redhat.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

6389ee94
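
A minimal sketch of a fault record that carries its own address and nesting flag is shown here. The commit message only names the `address` and `nested` fields; the remaining fields, the struct name `x86_exception_sketch` and the helper `make_page_fault` are hypothetical and chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for struct x86_exception, not the kernel definition. */
struct x86_exception_sketch {
    uint8_t  vector;            /* exception vector, e.g. 14 for #PF */
    bool     error_code_valid;
    uint16_t error_code;
    bool     nested;            /* fault came from the nested page tables */
    uint64_t address;           /* faulting address (what cr2 would hold) */
};

enum { PF_VECTOR_SKETCH = 14 };

/* Hypothetical helper: describe a page fault without touching vcpu state. */
static void make_page_fault(struct x86_exception_sketch *e, uint64_t addr,
                            uint16_t error_code, bool nested)
{
    e->vector = PF_VECTOR_SKETCH;
    e->error_code_valid = true;
    e->error_code = error_code;
    e->nested = nested;
    e->address = addr;
}
```

Because everything about the fault travels in one value, a caller can pass it down and up the call chain as a parameter rather than reading it back from shared per-vcpu fields.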

KVM: Push struct x86_exception into the various gva_to_gpa variants · ab9ae313

Authored by Avi Kivity on Nov 22, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ab9ae313

KVM: x86 emulator: make emulator memory callbacks return full exception · bcc55cba

Authored by Avi Kivity on Nov 22, 2010

This way, they can return #GP, not just #PF.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

bcc55cba

KVM: x86 emulator: introduce struct x86_exception to communicate faults · da9cb575

Authored by Avi Kivity on Nov 22, 2010

Introduce a structure that can contain an exception to be passed back
to main kvm code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

da9cb575

KVM: Mask KVM_GET_SUPPORTED_CPUID data with Linux cpuid info · 945ee35e

Authored by Avi Kivity on Nov 09, 2010

This allows Linux to mask cpuid bits if, for example, nx is enabled on only
some cpus.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

945ee35e
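
The masking idea can be illustrated with a small sketch. The function name `mask_supported_cpuid` and the `linux_bits` parameter are hypothetical; the only hardware fact relied on is that NX is bit 20 of EDX in CPUID leaf 0x80000001.

```c
#include <stdint.h>

/* Bit 20 of CPUID.80000001H:EDX is the NX feature flag. */
#define CPUID_EDX_NX (UINT32_C(1) << 20)

/*
 * raw_bits:   what the CPUID instruction reports on the current CPU
 * kvm_bits:   features the hypervisor is able to virtualize
 * linux_bits: features the host kernel uses system-wide (hypothetical
 *             input; the kernel keeps this in its own capability tables)
 */
static uint32_t mask_supported_cpuid(uint32_t raw_bits, uint32_t kvm_bits,
                                     uint32_t linux_bits)
{
    /* A feature is advertised only if all three parties agree on it. */
    return raw_bits & kvm_bits & linux_bits;
}
```

If the host has cleared NX because only some cpus support it, the bit drops out of the result even though the cpu that answered the query reports it.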

KVM: MMU: fix apf prefault if nested guest is enabled · c4806acd

Authored by Xiao Guangrong on Nov 12, 2010

If an apf is generated in the L2 guest and is completed in the L1 guest,
it will be prefaulted in the L1 guest's mmu context.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c4806acd

KVM: MMU: clear apfs if page state is changed · e5f3f027

Authored by Xiao Guangrong on Nov 12, 2010

If CR0.PG is changed, the page fault cannot be avoided when the prefault
address is accessed later.

This also fixes a bug: a #PF taken with paging enabled could be retried in a
paging-disabled context if the mmu uses shadow pages.

This idea is from Gleb Natapov.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e5f3f027

KVM: Clean up vm creation and release · d89f5eff

Authored by Jan Kiszka on Nov 09, 2010

IA64 support forces us to abstract the allocation of the kvm structure.
But instead of mixing this up with arch-specific initialization and
doing the same on destruction, split both steps. This allows moving
generic destruction calls into generic code.

It also fixes error clean-up on failures of kvm_create_vm for IA64.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d89f5eff

KVM: avoid unnecessary wait for an async pf · e6d53e3b

Authored by Xiao Guangrong on Nov 01, 2010

The current code checks for async pf completion outside of the wait
context, like this:

	if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
	    !vcpu->arch.apf.halted)
		r = vcpu_enter_guest(vcpu);
	else {
		......
		kvm_vcpu_block(vcpu)
		 ^- waiting until 'async_pf.done' is not empty
	}

	kvm_check_async_pf_completion(vcpu)
	 ^- delete list from async_pf.done

So, if we check async pf completion first, the vcpu can be blocked in
kvm_vcpu_block.

Fix this by marking the vcpu as unhalted in the
kvm_check_async_pf_completion() path.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e6d53e3b

KVM: fix searching async gfn in kvm_async_pf_gfn_slot · c7d28c24

Authored by Xiao Guangrong on Nov 01, 2010

Don't search later slots if the slot is empty.
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c7d28c24
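
The rule "stop at an empty slot" is the usual termination condition for lookups in an open-addressing hash table. The sketch below uses linear probing over a tiny table; the table size, the `EMPTY` marker and the helpers `hash_gfn` and `next_probe` are assumptions made for the example, and it further assumes that deletions re-pack the probe chain so that no entry is stranded behind a hole.

```c
#include <stdint.h>

#define NSLOTS 8u                      /* power of two, illustrative size */
#define EMPTY  UINT64_MAX              /* marker for an unused slot */

static uint32_t hash_gfn(uint64_t gfn)   { return (uint32_t)(gfn & (NSLOTS - 1)); }
static uint32_t next_probe(uint32_t key) { return (key + 1) & (NSLOTS - 1); }

/*
 * Return the slot holding gfn, or the first empty slot on its probe path.
 * Probing stops as soon as an empty slot is seen: under the assumption
 * above, gfn cannot be stored anywhere beyond that hole.
 */
static uint32_t gfn_slot(const uint64_t table[NSLOTS], uint64_t gfn)
{
    uint32_t key = hash_gfn(gfn);

    for (uint32_t i = 0; i < NSLOTS &&
         table[key] != gfn && table[key] != EMPTY; i++)
        key = next_probe(key);

    return key;
}
```

Without the `EMPTY` check, a lookup for an absent gfn would walk the whole table and could return a slot that belongs to an unrelated entry.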

KVM: x86: Avoid issuing wbinvd twice · 2eec7343

Authored by Jan Kiszka on Nov 01, 2010

Micro optimization to avoid calling wbinvd twice on the CPU that has to
emulate it. As we might be preempted between smp_call_function_many and
the local wbinvd, the cache might be filled again so that real work
could be done uselessly.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2eec7343

KVM: pre-allocate one more dirty bitmap to avoid vmalloc() · 515a0127

Authored by Takuya Yoshikawa on Oct 27, 2010

Currently x86's kvm_vm_ioctl_get_dirty_log() has to vmalloc() a bitmap for use
in the next round of logging. This has been hurting VGA updates and live
migration: vmalloc() consumes extra system time, triggers TLB flushes, etc.

This patch resolves this issue by pre-allocating one more bitmap and switching
between two bitmaps during dirty logging.

Performance improvement:
  I measured performance for the case of VGA update by trace-cmd.
  The result was 1.5 times faster than the original one.

  In the case of live migration, the improvement ratio depends on the workload
  and the guest memory size. In general, the larger the memory size is the more
  benefits we get.

Note:
  This does not change other architectures' logic, but the allocation size
  doubles. This will increase the actual memory consumption only when
  the new size changes the number of pages allocated by vmalloc().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

515a0127
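
The two-bitmap scheme can be sketched in user-space C as follows. The type `struct dirty_log` and its helpers are hypothetical names for this illustration, and `calloc()` stands in for the kernel allocator; the point is only that both bitmaps are allocated once, so each logging round switches buffers without allocating.

```c
#include <stdlib.h>
#include <string.h>

struct dirty_log {
    unsigned long *buf;     /* one allocation holding both bitmaps */
    size_t words;           /* words per bitmap */
    int active;             /* index (0 or 1) of the bitmap being written */
};

/* Allocate twice the bitmap size up front; returns 0 on success. */
static int dirty_log_init(struct dirty_log *d, size_t words)
{
    d->buf = calloc(2 * words, sizeof(*d->buf));
    d->words = words;
    d->active = 0;
    return d->buf ? 0 : -1;
}

/* Bitmap that new dirty bits are currently recorded in. */
static unsigned long *dirty_log_active(struct dirty_log *d)
{
    return d->buf + (size_t)d->active * d->words;
}

/* Switch to the spare bitmap (cleared first) and hand back the filled one. */
static unsigned long *dirty_log_swap(struct dirty_log *d)
{
    unsigned long *full = dirty_log_active(d);

    d->active ^= 1;
    memset(dirty_log_active(d), 0, d->words * sizeof(*d->buf));
    return full;
}
```

The cost is the doubled allocation mentioned in the note above; the gain is that the per-round work shrinks to a `memset()` and an index flip.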

KVM: MMU: remove kvm_mmu_set_base_ptes · 982c2565

Authored by Marcelo Tosatti on Oct 22, 2010

Unused.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

982c2565

KVM: Send async PF when guest is not in userspace too. · fc5f06fa

Authored by Gleb Natapov on Oct 14, 2010

If the guest indicates that it can handle async pf in kernel mode too, send
it, but only if interrupts are enabled.
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

fc5f06fa

KVM: Let host know whether the guest can handle async PF in non-userspace context. · 6adba527

Authored by Gleb Natapov on Oct 14, 2010

If the guest can detect that it runs in a non-preemptible context, it can
handle async PFs at any time, so let the host know that it can send async
PFs even if the guest cpu is not in userspace.
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

6adba527

KVM: Inject asynchronous page fault into a PV guest if page is swapped out. · 7c90705b

Authored by Gleb Natapov on Oct 14, 2010

Send an async page fault to a PV guest if it accesses swapped out memory.
The guest will choose another task to run upon receiving the fault.

Allow async page fault injection only when the guest is in user mode, since
otherwise the guest may be in a non-sleepable context and will not be able
to reschedule.

The vcpu will be halted if the guest faults on the same page again or if
the vcpu executes kernel code.
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7c90705b

KVM: Add PV MSR to enable asynchronous page faults delivery. · 344d9588

Authored by Gleb Natapov on Oct 14, 2010

Guest enables async PF vcpu functionality using this MSR.
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

344d9588

KVM: Add memory slot versioning and use it to provide fast guest write interface · 49c7754c

Authored by Gleb Natapov on Oct 18, 2010

Keep track of memslot changes by keeping a generation number in the memslots
structure. Provide a kvm_write_guest_cached() function that skips the
gfn_to_hva() translation if the memslots have not changed since the previous
invocation.
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

49c7754c
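
Generation-based caching of a translation can be sketched like this. All names (`struct memslots_sketch`, `struct hva_cache`, `translate`, `write_guest_cached`) are hypothetical, and a single flat memory region stands in for the real memslot array; the counter `slow_lookups` exists only to make the saved work visible.

```c
#include <stddef.h>
#include <stdint.h>

struct memslots_sketch {
    uint64_t generation;     /* bumped on every memslot change */
    uint8_t *base;           /* single flat slot, for illustration */
    uint64_t size;
};

struct hva_cache {
    uint64_t generation;     /* generation the cached pointer belongs to */
    uint64_t gpa;
    uint8_t *hva;            /* NULL if the last lookup failed */
};

static int slow_lookups;     /* counts full translations, for demonstration */

/* Stand-in for the slow guest-physical to host-virtual translation. */
static uint8_t *translate(const struct memslots_sketch *s, uint64_t gpa)
{
    slow_lookups++;
    return gpa < s->size ? s->base + gpa : NULL;
}

static void cache_init(struct hva_cache *c, const struct memslots_sketch *s,
                       uint64_t gpa)
{
    c->generation = s->generation;
    c->gpa = gpa;
    c->hva = translate(s, gpa);
}

/* Write one byte; re-translate only if the memslots changed since caching. */
static int write_guest_cached(const struct memslots_sketch *s,
                              struct hva_cache *c, uint8_t val)
{
    if (c->generation != s->generation)
        cache_init(c, s, c->gpa);
    if (!c->hva)
        return -1;
    *c->hva = val;
    return 0;
}
```

Repeated writes to the same guest address then cost one comparison plus the store, and the slow path runs again only after the generation number moves.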

KVM: Retry fault before vmentry · 56028d08

Authored by Gleb Natapov on Oct 17, 2010

When a page is swapped in, it is mapped into guest memory only after the
guest tries to access it again and generates another fault. To save this
fault we can map it immediately, since we know that the guest is going to
access the page. Do it only when tdp is enabled for now. The shadow paging
case is more complicated: the CR[034] and EFER registers should be switched
before doing the mapping and then switched back.
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

56028d08

KVM: Halt vcpu if page it tries to access is swapped out · af585b92

Authored by Gleb Natapov on Oct 14, 2010

If a guest accesses swapped out memory, do not swap it in from vcpu thread
context. Schedule work to do the swapping and put the vcpu into halted state
instead.

Interrupts will still be delivered to the guest, and if an interrupt
causes a reschedule the guest will continue by running another task.

[avi: remove call to get_user_pages_noio(), nacked by Linus; this
      makes everything synchronous again]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

af585b92

02 Jan, 2011 1 commit

KVM: Don't reset mmu context unnecessarily when updating EFER · 010c520e

Authored by Avi Kivity on Oct 11, 2010

The only bit of EFER that affects the mmu is NX, and this is already
accounted for (LME only takes effect when changing cr0).

Based on a patch by Hillf Danton.
Signed-off-by: Avi Kivity <avi@redhat.com>

010c520e

16 Dec, 2010 1 commit

KVM: Fix preemption counter leak in kvm_timer_init() · 3e26f230

Authored by Avi Kivity on Dec 16, 2010

Based on a patch from Thomas Meyer.
Signed-off-by: Avi Kivity <avi@redhat.com>

3e26f230

08 Dec, 2010 2 commits

KVM: SVM: Do not report xsave in supported cpuid · 24d1b15f

Authored by Joerg Roedel on Dec 07, 2010

To support xsave properly for the guest, the SVM module needs
software support for it. As long as this is not present, do
not report xsave as a supported feature in cpuid.
As a side effect, this patch moves the bit() helper function
into the x86.h file so that it can be used in svm.c too.

KVM-Stable-Tag.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

24d1b15f

KVM: Fix OSXSAVE after migration · 3ea3aa8c

Authored by Sheng Yang on Dec 08, 2010

CPUID's OSXSAVE is a mirror of the CR4.OSXSAVE bit. We need to update the
CPUID after migration.

KVM-Stable-Tag.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3ea3aa8c

06 Nov, 2010 3 commits

KVM: x86: Issue smp_call_function_many with preemption disabled · 453d9c57

Authored by Jan Kiszka on Nov 01, 2010

smp_call_function_many is specified to be called only with preemption
disabled. Fulfill this requirement.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

453d9c57

KVM: x86: fix information leak to userland · 97e69aa6

Authored by Vasiliy Kulikov on Oct 30, 2010

Structures kvm_vcpu_events, kvm_debugregs, kvm_pit_state2 and
kvm_clock_data are copied to userland with some padding and reserved
fields uninitialized.  This leaks the contents of kernel stack
memory.  We have to initialize them to zero.

In patch v1 Jan Kiszka suggested filling reserved fields with zeros
instead of memset'ting the whole struct.  It makes sense as these
fields are explicitly marked as padding.  No more fields need zeroing.

KVM-Stable-Tag.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

97e69aa6
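
The class of bug is easy to reproduce in miniature. In the sketch below, `struct clock_data_sketch` is a hypothetical payload with an explicit reserved array, and `fill_clock_data` is an illustrative helper; the fix amounts to the single `memset()` over the reserved field before the structure leaves the function.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical ioctl payload with explicit padding, for illustration only. */
struct clock_data_sketch {
    uint64_t clock;
    uint32_t flags;
    uint32_t pad[9];        /* reserved, must not carry stale stack data */
};

/* Fill the payload so that no leftover memory contents reach the caller. */
static void fill_clock_data(struct clock_data_sketch *out, uint64_t now)
{
    out->clock = now;
    out->flags = 0;
    memset(out->pad, 0, sizeof(out->pad));   /* zero the reserved fields */
}
```

Zeroing only the reserved array, rather than the whole struct, works here because every other member is assigned explicitly and the layout has no implicit padding.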

KVM: Write protect memory after slot swap · edde99ce

Authored by Michael S. Tsirkin on Oct 25, 2010

I have observed the following bug trigger:

1. userspace calls GET_DIRTY_LOG
2. kvm_mmu_slot_remove_write_access is called and makes a page ro
3. page fault happens and makes the page writeable
   fault is logged in the bitmap appropriately
4. kvm_vm_ioctl_get_dirty_log swaps slot pointers

a lot of time passes

5. guest writes into the page
6. userspace calls GET_DIRTY_LOG

At point (5), bitmap is clean and page is writeable,
thus, guest modification of memory is not logged
and GET_DIRTY_LOG returns an empty bitmap.

The rule is that all pages are either dirty in the current bitmap,
or write-protected, which is violated here.

It seems that just moving kvm_mmu_slot_remove_write_access down
to after the slot pointer swap should fix this bug.

KVM-Stable-Tag.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

edde99ce
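
The invariant stated above (every page is either dirty in the current bitmap or write-protected) can be modelled for a single page. This is a deliberately simplified, single-threaded sketch: `struct slot_sketch`, `guest_write` and `get_dirty_log` are hypothetical names, and swapping bitmaps is reduced to clearing one bit.

```c
#include <stdbool.h>

struct slot_sketch {
    bool dirty_bit;     /* the page's bit in the current bitmap */
    bool writable;      /* whether the guest mapping allows writes */
};

/* Guest write: a write-protected page faults, which logs it and maps it rw. */
static void guest_write(struct slot_sketch *s)
{
    if (!s->writable) {
        s->dirty_bit = true;
        s->writable = true;
    }
}

/* Dirty log read in the fixed order: swap bitmaps first, then write-protect. */
static bool get_dirty_log(struct slot_sketch *s)
{
    bool was_dirty = s->dirty_bit;

    s->dirty_bit = false;       /* swap in a clean bitmap */
    s->writable = false;        /* only now remove write access */
    return was_dirty;
}
```

Because write access is removed after the clean bitmap is installed, the page can never be left both clean and writable, so the next guest write is always logged.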

24 Oct, 2010 12 commits

KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED · 5854dbca

Authored by Huang Ying on Oct 08, 2010

Now that we have MCG_SER_P (and corresponding SRAO/SRAR MCE) support in
the kernel and QEMU-KVM, MCG_SER_P should be added to
KVM_MCE_CAP_SUPPORTED to make all this code really work.
Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

5854dbca

KVM: fix typo in copyright notice · 9611c187

Authored by Nicolas Kaiser on Oct 06, 2010

Fix typo in copyright notice.
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

9611c187

KVM: Disable interrupts around get_kernel_ns() · 395c6b0a

Authored by Avi Kivity on Oct 04, 2010

get_kernel_ns() wants preemption disabled.  It doesn't make a lot of sense
during the get/set ioctls (no way to make them non-racy) but the callee wants
it.
Signed-off-by: Avi Kivity <avi@redhat.com>

395c6b0a

KVM: x86: Fix constant type in kvm_get_time_scale · 50933623

Authored by Jan Kiszka on Sep 26, 2010

Older gcc versions complain about the improper type (for x86-32); 4.5
seems to fix this silently. However, we had better use the right type
from the start.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

50933623

KVM: x86: TSC catchup mode · c285545f

Authored by Zachary Amsden on Sep 18, 2010

Negate the effects of AN TYM spell while kvm thread is preempted by tracking
conversion factor to the highest TSC rate and catching the TSC up when it has
fallen behind the kernel view of time.  Note that once triggered, we don't
turn off catchup mode.

A slightly more clever version of this is possible, which only does catchup
when TSC rate drops, and which specifically targets only CPUs with broken
TSC, but since these all are considered unstable_tsc(), this patch covers
all necessary cases.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c285545f

KVM: x86: Rename timer function · 34c238a1

Authored by Zachary Amsden on Sep 18, 2010

This just changes some names to better reflect the usage they
will be given.  Separated out to keep confusion to a minimum.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

34c238a1

KVM: x86: Make math work for other scales · 5f4e3f88

Authored by Zachary Amsden on Sep 18, 2010

The math in kvm_get_time_scale relies on the fact that
NSEC_PER_SEC < 2^32.  To use the same function to compute
arbitrary time scales, we must extend the first reduction
step to shrink the base rate to a 32-bit value, and
possibly reduce the scaled rate to a 32-bit value as well.

Note we must take care to avoid an arithmetic overflow
when scaling up the tps32 value (this could not happen
with the fixed scaled value of NSEC_PER_SEC, but can
happen with scaled rates above 2^31).
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

5f4e3f88
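
The reduction steps described above can be sketched in portable C. This is modelled on the description, not copied from the kernel: the names `get_time_scale` and `scale_ticks` are hypothetical, rates are passed in Hz, and the final division is written out inline. The goal is a shift and a 32-bit multiplier such that a tick count at the base rate converts to the scaled rate with one shift, one multiply and one `>> 32`.

```c
#include <stdint.h>

/* Compute shift and multiplier for converting base_hz ticks to scaled_hz. */
static void get_time_scale(uint64_t scaled_hz, uint64_t base_hz,
                           int8_t *pshift, uint32_t *pmult)
{
    uint64_t scaled64 = scaled_hz;
    uint64_t tps64 = base_hz;
    uint32_t tps32;
    int shift = 0;

    /* First reduction: shrink the base rate until it fits in 32 bits. */
    while (tps64 > scaled64 * 2 || (tps64 >> 32)) {
        tps64 >>= 1;
        shift--;
    }

    tps32 = (uint32_t)tps64;
    /* Scale tps32 up, or halve scaled64 when doubling tps32 would overflow. */
    while (tps32 <= scaled64 || (scaled64 >> 32)) {
        if ((scaled64 >> 32) || (tps32 & 0x80000000u))
            scaled64 >>= 1;
        else
            tps32 <<= 1;
        shift++;
    }

    *pshift = (int8_t)shift;
    /* Here scaled64 < tps32 <= 2^32 - 1, so the quotient fits in 32 bits. */
    *pmult = (uint32_t)((scaled64 << 32) / tps32);
}

/* Apply the scale; assumes the shifted tick count fits in 32 bits. */
static uint64_t scale_ticks(uint64_t ticks, int8_t shift, uint32_t mult)
{
    if (shift < 0)
        ticks >>= -shift;
    else
        ticks <<= shift;
    return (ticks * mult) >> 32;
}
```

The `tps32 & 0x80000000u` test is the overflow guard the note refers to: once the top bit of `tps32` is set, doubling it would wrap, so the scaled rate is halved instead.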

KVM: Add kvm_inject_realmode_interrupt() wrapper · 63995653

Authored by Mohammed Gamal on Sep 19, 2010

This adds a wrapper function kvm_inject_realmode_interrupt() around the
emulator function emulate_int_real() to allow real mode interrupt injection.

[avi: initialize operand and address sizes before emulating interrupts]
[avi: initialize rip for real mode interrupt injection]
[avi: clear interrupt pending flag after emulating interrupt injection]
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

63995653

KVM: Convert PIC lock from raw spinlock to ordinary spinlock · f4f51050

Authored by Avi Kivity on Sep 19, 2010

The PIC code used to be called from preempt_disable() context, which
wasn't very good for PREEMPT_RT.  That is no longer the case, so move
back from raw_spinlock_t to spinlock_t.
Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f4f51050

KVM: x86: Fix kvmclock bug · 28e4639a

Authored by Zachary Amsden on Sep 18, 2010

If preempted after kvmclock values are updated, but before hardware
virtualization is entered, the last tsc time as read by the guest is
never set. It underflows the next time kvmclock is updated if there
has not yet been a successful entry / exit into hardware virt.

Fix this by simply setting last_tsc to the newly read tsc value so
that any computed nsec advance of kvmclock is nulled.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

28e4639a

KVM: MMU: Don't track nested fault info in error-code · 0959ffac

Authored by Joerg Roedel on Sep 14, 2010

This patch moves the detection of whether a page fault was
nested or not out of the error code and into a
separate variable in the fault struct.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0959ffac

KVM: Non-atomic interrupt injection · b463a6f7

Authored by Avi Kivity on Jul 20, 2010

Change the interrupt injection code to work from preemptible, interrupts
enabled context.  This works by adding a ->cancel_injection() operation
that undoes an injection in case we were not able to actually enter the guest
(this condition could never happen with atomic injection).
Signed-off-by: Avi Kivity <avi@redhat.com>

b463a6f7