Commits · 8f964525a121f2ff2df948dac908dcc65be21b5b · openeuler / raspberrypi-kernel

07 Apr, 2013 1 commit

KVM: Allow cross page reads and writes from cached translations. · 8f964525

Committed by Andrew Honig on Mar 29, 2013

This patch adds support for kvm_gfn_to_hva_cache_init functions for
reads and writes that will cross a page.  If the range falls within
the same memslot, then this will be a fast operation.  If the range
is split between two memslots, then the slower kvm_read_guest and
kvm_write_guest are used.
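
The fast-path/slow-path decision described above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct slot`, `find_slot` and `range_in_one_slot` are hypothetical names, the slot layout is simplified, and 4 KiB pages are assumed. It is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, simplified memslot: a contiguous run of guest frames. */
struct slot {
    uint64_t base_gfn; /* first guest frame number in the slot */
    uint64_t npages;   /* number of frames in the slot */
};

/* Return the slot containing gfn, or NULL if no slot covers it. */
static const struct slot *find_slot(const struct slot *slots, size_t n,
                                    uint64_t gfn)
{
    for (size_t i = 0; i < n; i++)
        if (gfn >= slots[i].base_gfn &&
            gfn < slots[i].base_gfn + slots[i].npages)
            return &slots[i];
    return NULL;
}

/*
 * True when [gpa, gpa + len) lies entirely inside one slot, so a cached
 * translation of the first byte can serve the whole range (fast path).
 * False means the caller would fall back to the slower generic access.
 */
static bool range_in_one_slot(const struct slot *slots, size_t n,
                              uint64_t gpa, uint64_t len)
{
    if (len == 0)
        return false;

    uint64_t start_gfn = gpa >> PAGE_SHIFT;
    uint64_t end_gfn = (gpa + len - 1) >> PAGE_SHIFT;
    const struct slot *s = find_slot(slots, n, start_gfn);

    return s != NULL && end_gfn < s->base_gfn + s->npages;
}
```

With two adjacent two-page slots, a 16-byte access at 0xff8 crosses a page but stays in the first slot, whereas the same access at 0x1ff8 spans both slots and would take the slow path.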

Tested: Tested against kvm_clock unit tests.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

20 Mar, 2013 2 commits

KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797) · 0b79459b

Committed by Andy Honig on Feb 20, 2013

There is a potential use after free issue with the handling of
MSR_KVM_SYSTEM_TIME. If the guest specifies a GPA in a movable or removable
memory such as frame buffers then KVM might continue to write to that
address even after it's removed via KVM_SET_USER_MEMORY_REGION. KVM pins
the page in memory so it's unlikely to cause an issue, but if the user
space component re-purposes the memory previously used for the guest, then
the guest will be able to corrupt that memory.

Tested: Tested against kvmclock unit test
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) · c300aa64

Committed by Andy Honig on Mar 11, 2013

If the guest sets the GPA of the time_page so that the request to update the
time straddles a page then KVM will write onto an incorrect page.  The
write is done by using kmap_atomic to get a pointer to the page for the time
structure and then performing a memcpy to that page starting at an offset
that the guest controls.  Well behaved guests always provide a 32-byte aligned
address, however a malicious guest could use this to corrupt host kernel
memory.
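
A minimal sketch of the kind of check that rules out the straddling write. The 32-byte structure size is taken from the alignment mentioned above; the 4 KiB page size and all function names are assumptions for illustration, not the code of the actual fix.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE      4096u /* assumption: 4 KiB pages */
#define TIME_INFO_SIZE 32u   /* assumed size of the time structure */

/* Offset of a guest physical address within its page. */
static uint32_t page_offset(uint64_t gpa)
{
    return (uint32_t)(gpa & (PAGE_SIZE - 1));
}

/*
 * A write of TIME_INFO_SIZE bytes stays inside one page exactly when it
 * does not run past the end of the page.
 */
static bool write_stays_in_page(uint64_t gpa)
{
    return page_offset(gpa) + TIME_INFO_SIZE <= PAGE_SIZE;
}

/*
 * Stricter check: accept only 32-byte aligned addresses. Because
 * PAGE_SIZE is a multiple of TIME_INFO_SIZE, every aligned address
 * also satisfies write_stays_in_page().
 */
static bool is_time_info_aligned(uint64_t gpa)
{
    return (gpa & (TIME_INFO_SIZE - 1)) == 0;
}
```

For example, 0x1fe0 is aligned and the write ends exactly at the page boundary, while 0x1ff0 is unaligned and would spill 16 bytes into the next page.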

Tested: Tested against kvmclock unit test.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

19 Mar, 2013 1 commit

KVM: x86: fix deadlock in clock-in-progress request handling · c09664bb

Committed by Marcelo Tosatti on Mar 18, 2013

There is a deadlock in pvclock handling:

cpu0:                                         cpu1:
kvm_gen_update_masterclock()
                                              kvm_guest_time_update()
 spin_lock(pvclock_gtod_sync_lock)
                                               local_irq_save(flags)

                                               spin_lock(pvclock_gtod_sync_lock)

 kvm_make_mclock_inprogress_request(kvm)
  make_all_cpus_request()
   smp_call_function_many()

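Now if smp_call_function_many(), called by cpu0, tries to call a function on
cpu1, there will be a deadlock: cpu1 spins on the lock with interrupts
disabled and cannot service the call, while cpu0 holds the lock and waits
for cpu1.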

Fix by moving pvclock_gtod_sync_lock protected section outside irq
disabled section.

Analyzed by Gleb Natapov <gleb@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Reported-and-Tested-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

28 Feb, 2013 1 commit

hlist: drop the node parameter from iterators · b67bfe0d

Committed by Sasha Levin on Feb 27, 2013

I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones. The list iterator takes three parameters:

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
do they not really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small number of places were using the 'node' parameter; these
 were modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.
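
To illustrate the three-argument form, here is a simplified user-space sketch of an entry iterator that needs no separate node cursor. The names `hnode`, `hhead`, `entry_of`, `entry_or_null` and `for_each_entry` are hypothetical stand-ins (the node type omits the kernel's back pointer), and the macro relies on the GNU C `__typeof__` extension. It is not the kernel's definition in linux/list.h.

```c
#include <stddef.h>

/* Simplified singly linked hash-list node and head (assumed layout). */
struct hnode { struct hnode *next; };
struct hhead { struct hnode *first; };

/* Recover the enclosing object from a pointer to its embedded node. */
#define entry_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Same as entry_of(), but maps a NULL node to a NULL entry. */
#define entry_or_null(ptr, type, member) \
    ((ptr) ? entry_of(ptr, type, member) : NULL)

/*
 * Three-argument iterator: the entry pointer 'pos' is the only cursor.
 * The next node is reached through pos->member, so no extra node
 * variable has to be declared by the caller.
 */
#define for_each_entry(pos, head, member)                                 \
    for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member);  \
         pos;                                                             \
         pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

struct item {
    int value;
    struct hnode link;
};

/* Insert an item at the front of the list. */
static void push_front(struct hhead *head, struct item *it)
{
    it->link.next = head->first;
    head->first = &it->link;
}

/* Sum all values, using only the entry pointer as the loop cursor. */
static int sum_items(struct hhead *head)
{
    struct item *it;
    int sum = 0;

    for_each_entry(it, head, link)
        sum += it->value;
    return sum;
}
```

Code that previously read the node cursor inside the loop body can reach the same node through the entry itself (here `it->link`), which is the 'obj->member' substitution mentioned in the list above.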

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foundation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

21 Feb, 2013 1 commit

Revert "KVM: MMU: lazily drop large spte" · 6b73a960

Committed by Marcelo Tosatti on Feb 20, 2013

This reverts commit caf6900f.

It is causing migration failures; see
https://bugzilla.kernel.org/show_bug.cgi?id=54061.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

20 Feb, 2013 1 commit

x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs · 2e32b719

Committed by Borislav Petkov on Feb 19, 2013

The "x86, AMD: Enable WC+ memory type on family 10 processors" patch
currently in -tip added a workaround for AMD F10h CPUs which #GPs my
guest when booted in kvm. This is because it accesses MSR_AMD64_BU_CFG2
which is not currently ignored by kvm. Do that because this MSR is only
baremetal-relevant anyway. While at it, move the ignored MSRs to the
beginning of kvm_set_msr_common so that we exit then and there.
Acked-by: Gleb Natapov <gleb@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Andre Przywara <andre@andrep.de>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1361298793-31834-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

14 Feb, 2013 2 commits

KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr · cbd29cb6

Committed by Jan Kiszka on Feb 11, 2013

We already pass vmcs12 as argument.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

x86 emulator: fix parity calculation for AAD instruction · f583c29b

Committed by Gleb Natapov on Feb 13, 2013

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

11 Feb, 2013 2 commits

KVM: Remove user_alloc from struct kvm_memory_slot · 7a905b14

Committed by Takuya Yoshikawa on Feb 07, 2013

This field was needed to differentiate memory slots created by the new
API, KVM_SET_USER_MEMORY_REGION, from those by the old equivalent,
KVM_SET_MEMORY_REGION, whose support was dropped long before:

  commit b74a07be
  KVM: Remove kernel-allocated memory regions

Although we also have private memory slots to which KVM allocates
memory with vm_mmap(), !user_alloc slots in other words, the slot id
should be enough for differentiating them.

Note: corresponding function parameters will be removed later.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

KVM: VMX: disable apicv by default · 257090f7

Committed by Yang Zhang on Feb 10, 2013

Without Posted Interrupt, current code is broken. Just disable by
default until Posted Interrupt is ready.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

07 Feb, 2013 5 commits

KVM: MMU: cleanup __direct_map · 24db2734

Committed by Xiao Guangrong on Feb 05, 2013

Use link_shadow_page to link the sp to the spte in __direct_map.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: remove pt_access in mmu_set_spte · f7616203

Committed by Xiao Guangrong on Feb 05, 2013

It is only used in debug code, so drop it.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: cleanup mapping-level · 55dd98c3

Committed by Xiao Guangrong on Feb 05, 2013

Use min() to clean up mapping_level.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: lazily drop large spte · caf6900f

Committed by Xiao Guangrong on Feb 05, 2013

Currently, kvm zaps the large spte when write protection is needed, so a
later read can fault on that spte. Instead, we can make the large spte
read-only rather than not present, which avoids the page fault caused by
read access.

The idea is from Avi:
| As I mentioned before, write-protecting a large spte is a good idea,
| since it moves some work from protect-time to fault-time, so it reduces
| jitter.  This removes the need for the return value.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: cleanup vmx_set_cr0(). · 5037878e

Committed by Gleb Natapov on Feb 04, 2013

When calculating hw_cr0, the current code masks bits that should always be
on and re-adds them immediately after. Clean up the code by masking
only those bits that should be dropped from hw_cr0. This allows us to
get rid of some defines.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

06 Feb, 2013 2 commits

KVM: VMX: disable SMEP feature when guest is in non-paging mode · c08800a5

Committed by Dongxiao Xu on Feb 04, 2013

SMEP is disabled if CPU is in non-paging mode in hardware.
However KVM always uses paging mode to emulate guest non-paging
mode with TDP. To emulate this behavior, SMEP needs to be manually
disabled when guest switches to non-paging mode.

We hit an issue where an SMP Linux guest with a recent kernel (one with
SMEP support enabled, for example 3.5.3) would crash with a triple fault
when unrestricted_guest=0 was set. This is because KVM uses an identity
mapping page table to emulate the non-paging mode, where the page
table is set with the USER flag. If SMEP is still enabled in this case,
the guest hits an unhandleable page fault and then crashes.
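
The rule described above can be sketched as a pure function over the guest's control registers. The bit positions are the architectural ones (CR0.PG is bit 31, CR4.SMEP is bit 20); the function name `effective_hw_cr4` is hypothetical, and the sketch assumes the case where KVM emulates non-paging mode with paging, ignoring every other adjustment KVM makes to CR4.

```c
#include <stdint.h>

#define X86_CR0_PG   (1ull << 31) /* CR0.PG: paging enabled */
#define X86_CR4_SMEP (1ull << 20) /* CR4.SMEP: supervisor-mode execution prevention */

/*
 * Compute the CR4 value loaded into hardware for a guest whose
 * non-paging mode is emulated with an identity-mapped page table.
 * While the guest has paging off, SMEP is cleared so that supervisor
 * code can execute from the USER-flagged identity mapping.
 */
static uint64_t effective_hw_cr4(uint64_t guest_cr0, uint64_t guest_cr4)
{
    uint64_t hw_cr4 = guest_cr4;

    if (!(guest_cr0 & X86_CR0_PG))
        hw_cr4 &= ~X86_CR4_SMEP;
    return hw_cr4;
}
```

Once the guest sets CR0.PG, the function passes the guest's SMEP bit through unchanged, matching the hardware behaviour the message describes.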
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Revert "KVM: MMU: split kvm_mmu_free_page" · 834be0d8

Committed by Gleb Natapov on Jan 30, 2013

This reverts commit bd4c86ea.

There is no user of kvm_mmu_isolate_page() any more.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

05 Feb, 2013 5 commits

KVM: MMU: drop superfluous is_present_gpte() check. · eb3fce87

Committed by Gleb Natapov on Jan 30, 2013

The guest page walker puts only present ptes into the ptes[] array. No need
to check it again.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: drop superfluous min() call. · 116eb3d3

Committed by Gleb Natapov on Jan 30, 2013

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: set base_role.nxe during mmu initialization. · 2c9afa52

Committed by Gleb Natapov on Jan 30, 2013

Move base_role.nxe initialisation to where all other roles are initialized.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: drop unneeded checks. · 9bb4f6b1

Committed by Gleb Natapov on Jan 30, 2013

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: make spte_is_locklessly_modifiable() more clear · feb3eb70

Committed by Gleb Natapov on Jan 30, 2013

spte_is_locklessly_modifiable() checks that both SPTE_HOST_WRITEABLE and
SPTE_MMU_WRITEABLE are present on spte. Make it more explicit.
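
An explicit "both bits present" test can be written as a mask comparison. The sketch below is a user-space rendition, not the kernel source: the two bit positions are illustrative assumptions, and only the shape of the comparison is the point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are illustrative assumptions, not taken from the commit. */
#define SPTE_HOST_WRITEABLE (1ull << 9)
#define SPTE_MMU_WRITEABLE  (1ull << 10)

/*
 * True only when BOTH flags are set. Comparing the masked value against
 * the full mask states the intent directly; a plain (spte & mask) test
 * would also accept an spte carrying just one of the two flags.
 */
static bool spte_is_locklessly_modifiable(uint64_t spte)
{
    const uint64_t mask = SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE;

    return (spte & mask) == mask;
}
```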
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

29 Jan, 2013 3 commits

x86, apicv: add virtual interrupt delivery support · c7c9c56c

Committed by Yang Zhang on Jan 25, 2013

Virtual interrupt delivery spares KVM from injecting vAPIC interrupts
manually; injection is fully taken care of by the hardware. This needs
some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.

- A pending interrupt that is masked by ISR should also be
  considered in the above update action, since hardware will decide
  when to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection p.o.v.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

x86, apicv: add virtual x2apic support · 8d14695f

Committed by Yang Zhang on Jan 25, 2013

Basically, to benefit from apicv, we need to enable virtualized x2apic mode.
Currently, we only enable it when guest is really using x2apic.

Also, clear MSR bitmap for corresponding x2apic MSRs when guest enabled x2apic:
0x800 - 0x8ff: no read intercept for apicv register virtualization,
               except APIC ID and TMCCT which need software's assistance to
               get right value.
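
A small sketch of the MSR range named above. It relies on the architectural x2APIC mapping (MSR index = 0x800 + APIC register offset / 16, so APIC ID at offset 0x20 is MSR 0x802 and TMCCT at offset 0x390 is MSR 0x839). The helper names are hypothetical, and treating MSRs outside the range as intercepted is a conservative assumption of this sketch, not something the commit states.

```c
#include <stdbool.h>
#include <stdint.h>

#define X2APIC_MSR_BASE   0x800u
#define X2APIC_MSR_LAST   0x8ffu
#define APIC_ID_OFFSET    0x20u  /* APIC ID register */
#define APIC_TMCCT_OFFSET 0x390u /* timer current count register */

/* Map an xAPIC register offset to its x2APIC MSR index. */
static uint32_t x2apic_msr(uint32_t apic_offset)
{
    return X2APIC_MSR_BASE + (apic_offset >> 4);
}

/*
 * Decide whether a read of the given MSR should still be intercepted.
 * Inside 0x800 - 0x8ff only APIC ID and TMCCT keep the intercept,
 * because software has to supply their values.
 */
static bool read_needs_intercept(uint32_t msr)
{
    if (msr < X2APIC_MSR_BASE || msr > X2APIC_MSR_LAST)
        return true; /* outside the x2APIC range: not decided here */

    return msr == x2apic_msr(APIC_ID_OFFSET) ||
           msr == x2apic_msr(APIC_TMCCT_OFFSET);
}
```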
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

x86, apicv: add APICv register virtualization support · 83d4c286

Committed by Yang Zhang on Jan 25, 2013

- APIC read doesn't cause VM-Exit
- APIC write becomes trap-like
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

27 Jan, 2013 1 commit

KVM: x86 emulator: fix test_cc() build failure on i386 · 3f0c3d0b

Committed by Avi Kivity on Jan 26, 2013

'pushq' doesn't exist on i386.  Replace with 'push', which should work
since the operand is a register.
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

24 Jan, 2013 13 commits

KVM: VMX: set vmx->emulation_required only when needed. · 14168786