Commits · 75b7127c3858261fc080dd52a022424a7e7f6ae5 · openeuler / raspberrypi-kernel

12 Jan, 2011 40 commits

KVM: rename hardware_[dis|en]able() to *_nolock() and add locking wrappers · 75b7127c

Authored by Takuya Yoshikawa on Nov 16, 2010

The naming convention of the hardware_[dis|en]able family is a little bit confusing
because only hardware_[dis|en]able_all are using the _nolock suffix.

Renaming the current hardware_[dis|en]able() to *_nolock() and using
hardware_[dis|en]able() as wrapper functions which take kvm_lock for them
reduces the confusion.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

75b7127c
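
The naming pattern described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: a pthread mutex named `demo_lock` stands in for kvm_lock, the `demo_lock_held` flag is a crude substitute for a lock-held assertion, and every `demo_*` name is hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins: demo_lock plays the role of kvm_lock. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_lock_held;	/* debug flag, only written while demo_lock is held */
static int demo_hw_enabled;	/* state protected by demo_lock */

/* _nolock variants: the caller must already hold demo_lock. */
void demo_hw_enable_nolock(void)
{
	assert(demo_lock_held);	/* catches callers that forgot the lock */
	demo_hw_enabled = 1;
}

void demo_hw_disable_nolock(void)
{
	assert(demo_lock_held);
	demo_hw_enabled = 0;
}

/* Plain names: wrappers that take the lock around the _nolock variant. */
void demo_hw_enable(void)
{
	pthread_mutex_lock(&demo_lock);
	demo_lock_held = 1;
	demo_hw_enable_nolock();
	demo_lock_held = 0;
	pthread_mutex_unlock(&demo_lock);
}

void demo_hw_disable(void)
{
	pthread_mutex_lock(&demo_lock);
	demo_lock_held = 1;
	demo_hw_disable_nolock();
	demo_lock_held = 0;
	pthread_mutex_unlock(&demo_lock);
}
```

With this split, a caller that already holds the lock uses the `_nolock` variant directly, and everyone else calls the plain name.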

KVM: take kvm_lock for hardware_disable() during cpu hotplug · 97e91e28

Authored by Takuya Yoshikawa on Nov 16, 2010

In kvm_cpu_hotplug(), only the CPU_STARTING case is protected by kvm_lock.
This patch adds the missing protection for the CPU_DYING case.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

97e91e28

KVM: MMU: don't mark spte notrap if reserved bit set · e730b63c

Authored by Xiao Guangrong on Nov 17, 2010

If a reserved bit is set, we need to inject the #PF with PFEC.RSVD=1,
but shadow_notrap_nonpresent_pte only injects the #PF with PFEC.RSVD=0.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e730b63c

KVM: Document device assigment API · 49f48172

Authored by Jan Kiszka on Nov 16, 2010

Adds API documentation for KVM_[DE]ASSIGN_PCI_DEVICE,
KVM_[DE]ASSIGN_DEV_IRQ, KVM_SET_GSI_ROUTING, KVM_ASSIGN_SET_MSIX_NR, and
KVM_ASSIGN_SET_MSIX_ENTRY.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

49f48172

KVM: Clean up kvm_vm_ioctl_assigned_device · 51de271d

Authored by Jan Kiszka on Nov 16, 2010

Any arch not supporting device assignment will also not build
assigned-dev.c. So testing for KVM_CAP_DEVICE_DEASSIGNMENT is pointless.
KVM_CAP_ASSIGN_DEV_IRQ is unconditionally set. Moreover, add a default
case for dispatching the ioctl.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

51de271d

KVM: Save/restore state of assigned PCI device · ed78661f

Authored by Jan Kiszka on Nov 16, 2010

The guest may change states that pci_reset_function does not touch. So
we better save/restore the assigned device across guest usage.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ed78661f

KVM: Refactor IRQ names of assigned devices · 1e001d49

Authored by Jan Kiszka on Nov 16, 2010

Cosmetic change, but it helps to correlate IRQs with PCI devices.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

1e001d49

KVM: Switch assigned device IRQ forwarding to threaded handler · 0645211c

Authored by Jan Kiszka on Nov 16, 2010

This improves the IRQ forwarding for assigned devices: By using the
kernel's threaded IRQ scheme, we can get rid of the latency-prone work
queue and simplify the code in the same run.

Moreover, we no longer have to hold assigned_dev_lock while raising the
guest IRQ, which can be a lengthy operation as we may have to iterate
over all VCPUs. The lock is now only used for synchronizing masking vs.
unmasking of INTx-type IRQs, and is thus renamed to intx_lock.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0645211c

KVM: Clear assigned guest IRQ on release · 0c106b5a

Authored by Jan Kiszka on Nov 16, 2010

When we deassign a guest IRQ, clear the potentially asserted guest line.
There might be no chance for the guest to do this, specifically if we
switch from INTx to MSI mode.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0c106b5a

KVM: Mask KVM_GET_SUPPORTED_CPUID data with Linux cpuid info · 945ee35e

Authored by Avi Kivity on Nov 09, 2010

This allows Linux to mask cpuid bits if, for example, nx is enabled on only
some cpus.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

945ee35e

KVM: SVM: Replace svm_has() by standard Linux cpuid accessors · 2a6b20b8

Authored by Avi Kivity on Nov 09, 2010

Instead of querying cpuid directly, use the Linux accessors (boot_cpu_has,
etc.). This allows things like the clearcpuid kernel command line to
work (when it's fixed wrt scattered cpuid bits).

Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2a6b20b8

KVM: MMU: fix apf prefault if nested guest is enabled · c4806acd

Authored by Xiao Guangrong on Nov 12, 2010

If an apf is generated in the L2 guest and is completed in the L1 guest, it will
prefault this apf in the L1 guest's mmu context.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c4806acd

KVM: MMU: support apf for nonpaing guest · 060c2abe

Authored by Xiao Guangrong on Nov 12, 2010

Let's support apf for nonpaging guests.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

060c2abe

KVM: MMU: clear apfs if page state is changed · e5f3f027

Authored by Xiao Guangrong on Nov 12, 2010

If CR0.PG is changed, the page fault cannot be avoided when the prefault address
is accessed later.

It also fixes a bug: it can retry a page-enabled #PF in a page-disabled context
if the mmu is shadow page.

This idea is from Gleb Natapov.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e5f3f027

KVM: MMU: fix missing post sync audit · 5054c0de

Authored by Xiao Guangrong on Nov 12, 2010

Add the AUDIT_POST_SYNC audit for long mode shadow pages.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

5054c0de

KVM: Clean up vm creation and release · d89f5eff

Authored by Jan Kiszka on Nov 09, 2010

IA64 support forces us to abstract the allocation of the kvm structure.
But instead of mixing this up with arch-specific initialization and
doing the same on destruction, split both steps. This allows moving
generic destruction calls into generic code.

It also fixes error clean-up on failures of kvm_create_vm for IA64.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d89f5eff

KVM: x86: Makefile clean up · 9d893c6b

Authored by Tracey Dent on Nov 06, 2010

Changed the makefile to use the ccflags-y option instead of EXTRA_CFLAGS.

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

9d893c6b

KVM: remove unused function declaration · 2a126faa

Authored by Xiao Guangrong on Nov 04, 2010

Remove the declaration of kvm_mmu_set_base_ptes().

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

2a126faa

KVM: Refactor srcu struct release on early errors · 57e7fbee

Authored by Jan Kiszka on Nov 09, 2010

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

57e7fbee

KVM: VMX: Disallow NMI while blocked by STI · 30bd0c4c

Authored by Avi Kivity on Nov 01, 2010

While not mandated by the spec, Linux relies on NMI being blocked by an
IF-enabling STI. VMX also refuses to enter a guest in this state, at
least on some implementations.

Disallow NMI while blocked by STI by checking for the condition, and
requesting an interrupt window exit if it occurs.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

30bd0c4c
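
The check described above can be sketched as a small predicate over the guest interruptibility state. This is a hedged illustration, not the patch itself: the `DEMO_*` names and `struct demo_vcpu` are hypothetical, the plain struct field stands in for the VMCS field, and the bit positions are assumed to follow the VMX interruptibility-state layout (verify against the Intel SDM before relying on them).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed interruptibility-state bits; the DEMO_ names are hypothetical. */
#define DEMO_INTR_STATE_STI	(1u << 0)	/* blocking by STI */
#define DEMO_INTR_STATE_MOV_SS	(1u << 1)	/* blocking by MOV SS */
#define DEMO_INTR_STATE_NMI	(1u << 3)	/* blocking by NMI */

struct demo_vcpu {
	uint32_t interruptibility;	/* stand-in for the VMCS field */
	bool window_exit_requested;	/* set when injection must wait */
};

/* An NMI may be injected only if none of the blocking conditions is set. */
bool demo_nmi_allowed(const struct demo_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return !(vcpu->interruptibility &
		 (DEMO_INTR_STATE_STI | DEMO_INTR_STATE_MOV_SS |
		  DEMO_INTR_STATE_NMI));
}

/* Returns true if the NMI can go in now; otherwise asks for a window exit. */
bool demo_try_inject_nmi(struct demo_vcpu *vcpu)
{
	if (demo_nmi_allowed(vcpu))
		return true;
	vcpu->window_exit_requested = true;
	return false;
}
```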

KVM: fix the race while wakeup all pv guest · 64f638c7

Authored by Xiao Guangrong on Nov 01, 2010

In kvm_async_pf_wakeup_all(), we add a dummy apf to vcpu->async_pf.done
without holding vcpu->async_pf.lock. This breaks if we are handling apfs
at the same time.

Also use 'list_empty_careful()' instead of 'list_empty()'.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

64f638c7

KVM: handle more completed apfs if possible · 15096ffc

Authored by Xiao Guangrong on Nov 02, 2010

If there is no need to inject an async #PF into the PV guest, we can handle
more completed apfs at one time, so we can retry the guest #PF
as early as possible.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

15096ffc

KVM: avoid unnecessary wait for a async pf · e6d53e3b

Authored by Xiao Guangrong on Nov 01, 2010

In the current code, async pf completion is checked outside the wait context,
like this:

    if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
        !vcpu->arch.apf.halted)
            r = vcpu_enter_guest(vcpu);
    else {
            ......
            kvm_vcpu_block(vcpu)
             ^- waiting until 'async_pf.done' is not empty
    }

    kvm_check_async_pf_completion(vcpu)
     ^- delete list from async_pf.done

So, if we check async pf completion first, it can be blocked at
kvm_vcpu_block.

Fix this by marking the vcpu as unhalted in the kvm_check_async_pf_completion()
path.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e6d53e3b

KVM: fix searching async gfn in kvm_async_pf_gfn_slot · c7d28c24

Authored by Xiao Guangrong on Nov 01, 2010

Don't search later slots if the slot is empty.

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c7d28c24
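
The idea of stopping at an empty slot can be pictured as an open-addressing lookup with linear probing. The sketch below is a userspace illustration under stated assumptions, not the kernel function: the table size, the trivial hash, the `DEMO_EMPTY` marker, and all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SLOTS 8u			/* hypothetical table size, power of two */
#define DEMO_EMPTY UINT64_MAX		/* marker stored in unused slots */

/* Trivial hash for illustration: low bits of the gfn. */
uint32_t demo_hash(uint64_t gfn)
{
	return (uint32_t)(gfn & (DEMO_SLOTS - 1));
}

/*
 * Return the slot holding gfn, or the first empty slot on its probe
 * sequence. Probing stops at an empty slot: with linear probing and no
 * holes left behind, gfn cannot live in any later slot.
 */
uint32_t demo_gfn_slot(const uint64_t *table, uint64_t gfn)
{
	uint32_t key = demo_hash(gfn);
	uint32_t i;

	assert(gfn != DEMO_EMPTY);	/* the marker is not a valid key */
	for (i = 0; i < DEMO_SLOTS &&
		    table[key] != gfn && table[key] != DEMO_EMPTY; i++)
		key = (key + 1) & (DEMO_SLOTS - 1);

	return key;
}
```

A caller then compares `table[slot]` with the gfn it asked for to tell a hit from a free slot.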

KVM: cleanup async_pf tracepoints · 0730388b

Authored by Xiao Guangrong on Nov 01, 2010

Use 'DECLARE_EVENT_CLASS' to clean up the async_pf tracepoints.

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0730388b

KVM: fix tracing kvm_try_async_get_page · c9b263d2

Authored by Xiao Guangrong on Nov 01, 2010

Tracing 'async' and '*pfn' is useless, since 'async' is always true
and '*pfn' is always 'fault_pfn'.

We can trace 'gva' and 'gfn' instead, which helps us to see the
life-cycle of an async_pf.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

c9b263d2

KVM: replace vmalloc and memset with vzalloc · 26535037

Authored by Takuya Yoshikawa on Nov 02, 2010

Let's use the newly introduced vzalloc().

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

26535037

KVM: handle exit due to INVD in VMX · ec25d5e6

Authored by Gleb Natapov on Nov 01, 2010

Currently the exit is unhandled, so the guest halts with an error if it tries
to execute the INVD instruction. Call into the emulator when the INVD instruction
is executed by a guest instead. This instruction is not needed by ordinary
guests, but firmware (like OpenBIOS) uses it and fails.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ec25d5e6

KVM: x86: Avoid issuing wbinvd twice · 2eec7343

Authored by Jan Kiszka on Nov 01, 2010

Micro optimization to avoid calling wbinvd twice on the CPU that has to
emulate it. As we might be preempted between smp_call_function_many and
the local wbinvd, the cache might be filled again so that real work
could be done uselessly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2eec7343

KVM: get rid of warning within kvm_dev_ioctl_create_vm · aac87636

Authored by Heiko Carstens on Oct 27, 2010

Fixes this:

  CC      arch/s390/kvm/../../../virt/kvm/kvm_main.o
arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function 'kvm_dev_ioctl_create_vm':
arch/s390/kvm/../../../virt/kvm/kvm_main.c:1828:10: warning: unused variable 'r'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

aac87636

KVM: add cast within kvm_clear_guest_page to fix warning · 3bcc8a8c

Authored by Heiko Carstens on Oct 27, 2010

Fixes this:

  CC      arch/s390/kvm/../../../virt/kvm/kvm_main.o
arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function 'kvm_clear_guest_page':
arch/s390/kvm/../../../virt/kvm/kvm_main.c:1224:2: warning: passing argument 3 of 'kvm_write_guest_page' makes pointer from integer without a cast
arch/s390/kvm/../../../virt/kvm/kvm_main.c:1185:5: note: expected 'const void *' but argument is of type 'long unsigned int'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3bcc8a8c

KVM: use kmalloc() for small dirty bitmaps · 6f9e5c17

Authored by Takuya Yoshikawa on Nov 01, 2010

Currently we are using vmalloc() for all dirty bitmaps even if
they are small enough, say less than K bytes.

We use kmalloc() if the dirty bitmap size is less than or equal to
PAGE_SIZE so that we can avoid vmalloc area usage for VGA.

This will also make the logging start/stop faster.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

6f9e5c17

KVM: pre-allocate one more dirty bitmap to avoid vmalloc() · 515a0127

Authored by Takuya Yoshikawa on Oct 27, 2010

Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate a bitmap by
vmalloc() which will be used in the next logging, and this has been hurting
VGA and live migration: vmalloc() consumes extra systime,
triggers tlb flush, etc.

This patch resolves this issue by pre-allocating one more bitmap and switching
between two bitmaps during dirty logging.

Performance improvement:
  I measured performance for the case of VGA update by trace-cmd.
  The result was 1.5 times faster than the original one.

  In the case of live migration, the improvement ratio depends on the workload
  and the guest memory size. In general, the larger the memory size is the more
  benefits we get.

Note:
  This does not change other architectures' logic but the allocation size
  doubles. This will increase the actual memory consumption only when
  the new size changes the number of pages allocated by vmalloc().

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

515a0127
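
The two-bitmap scheme can be sketched in userspace C as a double buffer: both halves are allocated once, and each log read only clears and switches to the other half. This is an illustration under stated assumptions, not the kernel implementation: `calloc()` replaces the kernel allocator, and `struct demo_dirty_log` and the `demo_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_dirty_log {
	unsigned long *head;	/* both halves, allocated once up front */
	unsigned long *active;	/* half currently receiving dirty bits */
	size_t words;		/* number of words in each half */
};

/* Allocate twice the bitmap size so no allocation is needed per log read. */
int demo_dirty_log_init(struct demo_dirty_log *log, size_t words)
{
	assert(log != NULL && words > 0);
	log->head = calloc(2 * words, sizeof(*log->head));
	if (!log->head)
		return -1;
	log->active = log->head;
	log->words = words;
	return 0;
}

/*
 * Clear the idle half, make it active, and return the half that was
 * active so the caller can copy it out as the dirty log.
 */
unsigned long *demo_dirty_log_swap(struct demo_dirty_log *log)
{
	unsigned long *old = log->active;
	unsigned long *next = (old == log->head) ? log->head + log->words
						 : log->head;

	memset(next, 0, log->words * sizeof(*next));
	log->active = next;
	return old;
}

void demo_dirty_log_free(struct demo_dirty_log *log)
{
	free(log->head);
	log->head = NULL;
	log->active = NULL;
}
```

The trade-off matches the note in the commit message: the allocation is twice as large, in exchange for an allocation-free read path.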

KVM: introduce wrapper functions for creating/destroying dirty bitmaps · a36a57b1

Authored by Takuya Yoshikawa on Oct 27, 2010

This makes it easy to change the way of allocating/freeing dirty bitmaps.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

a36a57b1

KVM: x86: trace "exit to userspace" event · 64be5007

Authored by Gleb Natapov on Oct 24, 2010

Add a tracepoint for userspace exit.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

64be5007

KVM: propagate fault r/w information to gup(), allow read-only memory · 612819c3

Authored by Marcelo Tosatti on Oct 22, 2010

As suggested by Andrea, pass the r/w error code to gup(), upgrading a read fault
to writable if the host pte allows it.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

612819c3

KVM: MMU: flush TLBs on writable -> read-only spte overwrite · 7905d9a5

Authored by Marcelo Tosatti on Oct 22, 2010

This can happen in the following scenario:

vcpu0                   vcpu1
read fault
gup(.write=0)
                        gup(.write=1)
                        reuse swap cache, no COW
                        set writable spte
                        use writable spte
set read-only spte

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

7905d9a5
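
The rule in the commit title can be sketched as a predicate evaluated when one spte value overwrites another. This is a hedged userspace illustration, not the MMU code: `demo_set_spte()` and the `DEMO_PTE_*` names are hypothetical, and only the present and writable bits (bits 0 and 1 of an x86 page-table entry) are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PTE_PRESENT	(1ull << 0)
#define DEMO_PTE_WRITABLE	(1ull << 1)

/*
 * Install new_spte and report whether remote TLBs must be flushed.
 * A flush is needed when a present, writable entry is replaced by a
 * read-only one: another vcpu may still hold the writable translation.
 */
bool demo_set_spte(uint64_t *sptep, uint64_t new_spte)
{
	uint64_t old_spte;

	assert(sptep != NULL);
	old_spte = *sptep;
	*sptep = new_spte;

	return (old_spte & DEMO_PTE_PRESENT) &&
	       (old_spte & DEMO_PTE_WRITABLE) &&
	       !(new_spte & DEMO_PTE_WRITABLE);
}
```

In the scenario above, vcpu0's final "set read-only spte" step is the case where this predicate returns true.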

KVM: MMU: remove kvm_mmu_set_base_ptes · 982c2565

Authored by Marcelo Tosatti on Oct 22, 2010

Unused.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

982c2565

KVM: VMX: remove setting of shadow_base_ptes for EPT · ff1fcb9e

Authored by Marcelo Tosatti on Oct 22, 2010

The EPT present/writable bits use the same position as normal
pagetable bits.

Since direct_map passes ACC_ALL to mmu_set_spte, thus always setting
the writable bit on sptes, use the generic PT_PRESENT shadow_base_pte.

Also pass present/writable error code information from EPT violation
to the generic pagefault handler.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

ff1fcb9e

KVM: Avoid double interrupt injection with vapic · 83bcacb1

Authored by Avi Kivity on Oct 25, 2010

After an interrupt injection, the PPR changes, and we have to reflect that
into the vapic. This causes a KVM_REQ_EVENT to be set, which causes the
whole interrupt injection routine to be run again (harmlessly).

Optimize by only setting KVM_REQ_EVENT if the ppr was lowered; otherwise
there is no chance that a new injection is needed.

Signed-off-by: Avi Kivity <avi@redhat.com>

83bcacb1
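
The optimization can be sketched as a PPR update that raises an event request only when the priority drops. This is a minimal illustration under stated assumptions, not the lapic code: `struct demo_apic`, its `event_requested` flag (standing in for KVM_REQ_EVENT), and `demo_update_ppr()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_apic {
	uint32_t ppr;		/* current processor priority */
	bool event_requested;	/* stand-in for a pending KVM_REQ_EVENT */
};

/*
 * Store the new PPR. Only a lowered PPR can unmask a pending interrupt,
 * so only that case requests another pass through interrupt injection.
 */
void demo_update_ppr(struct demo_apic *apic, uint32_t new_ppr)
{
	uint32_t old_ppr;

	assert(apic != NULL);
	old_ppr = apic->ppr;
	if (new_ppr == old_ppr)
		return;

	apic->ppr = new_ppr;
	if (new_ppr < old_ppr)
		apic->event_requested = true;
}
```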