Commits · 2e2e3738af33575cba59597acd5e80cdd5ec11ee · openanolis / cloud-kernel

20 Jul, 2008 2 commits

KVM: Handle vma regions with no backing page · 2e2e3738

Authored by Anthony Liguori on Apr 30, 2008

This patch allows VMAs that contain no backing page to be used for guest
memory.  This is useful for assigning mmio regions to a guest.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: remove long -> void *user -> long cast · 1e1c65e0

Authored by Christian Borntraeger on Apr 21, 2008

kvm_dev_ioctl casts the arg value to void __user *, just to recast it
again to long. This seems unnecessary.

According to objdump the binary code on x86 is unchanged by this patch.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

06 Jul, 2008 1 commit

KVM: IOAPIC: Fix level-triggered irq injection hang · 35baff25

Authored by Mark McLoughlin on Jul 04, 2008

The "remote_irr" variable is used to indicate an interrupt
which has been received by the LAPIC, but not acked.

In our EOI handler, we unset remote_irr and re-inject the
interrupt if the interrupt line is still asserted.

However, we do not set remote_irr here, leading to a
situation where if kvm_ioapic_set_irq() is called, then we go
ahead and call ioapic_service(). This means that IRR is
re-asserted even though the interrupt is currently in service
(i.e. LAPIC IRR is cleared and ISR/TMR set)

The issue with this is that when the currently executing
interrupt handler finishes and writes LAPIC EOI, then TMR is
unset and EOI sent to the IOAPIC. Since IRR is now asserted,
but TMR is not, then when the second interrupt is handled,
no EOI is sent and if there is any pending interrupt, it is
not re-injected.

This fixes a hang only seen while running mke2fs -j on an
8Gb virtio disk backed by a fully sparse raw file, with
aliguori "avoid fragmented virtio-blk transfers by copying"
changes.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

26 Jun, 2008 2 commits

on_each_cpu(): kill unused 'retry' parameter · 15c8b6c1

Authored by Jens Axboe on May 09, 2008

It's not even passed on to smp_call_function() anymore, since that
was removed. So kill it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

smp_call_function: get rid of the unused nonatomic/retry argument · 8691e5a8

Authored by Jens Axboe on Jun 06, 2008

It's never used and the comments refer to nonatomic and retry
interchangeably. So get rid of it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

24 Jun, 2008 1 commit

KVM: ioapic: fix lost interrupt when changing a device's irq · 4fa6b9c5

Authored by Avi Kivity on Jun 17, 2008

The ioapic acknowledge path translates interrupt vectors to irqs.  It
currently uses a first match algorithm, stopping when it finds the first
redirection table entry containing the vector.  That fails however if the
guest changes the irq to a different line, leaving the old redirection table
entry in place (though masked).  Result is interrupts not making it to the
guest.

Fix by always scanning the entire redirection table.
Signed-off-by: Avi Kivity <avi@qumranet.com>
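
The difference between the first-match lookup and the full scan described in the ioapic commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct redir_entry`, `NUM_PINS`, `ack_first_match()` and `ack_scan_all()` are hypothetical names invented for the sketch, not the kernel's own data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PINS 24  /* hypothetical table size for this sketch */

/* Simplified redirection table entry; field names are illustrative. */
struct redir_entry {
    uint8_t vector;      /* interrupt vector programmed by the guest */
    bool    masked;      /* entry is masked (e.g. a stale, old line) */
    bool    remote_irr;  /* delivered but not yet acknowledged */
};

/* First-match behaviour: stop at the first entry carrying the vector. */
int ack_first_match(struct redir_entry *table, uint8_t vector)
{
    assert(table != NULL);
    for (size_t i = 0; i < NUM_PINS; i++) {
        if (table[i].vector == vector) {
            table[i].remote_irr = false;
            return 1;  /* later entries with the same vector are skipped */
        }
    }
    return 0;
}

/* Full-scan behaviour: acknowledge every entry carrying the vector. */
int ack_scan_all(struct redir_entry *table, uint8_t vector)
{
    int acked = 0;

    assert(table != NULL);
    for (size_t i = 0; i < NUM_PINS; i++) {
        if (table[i].vector == vector) {
            table[i].remote_irr = false;
            acked++;
        }
    }
    return acked;
}
```

If a stale masked entry at a low index and the live entry at a higher index share a vector, `ack_first_match()` clears only the stale one and the live line stays unacknowledged, while `ack_scan_all()` reaches both.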

07 Jun, 2008 1 commit

KVM: IOAPIC: only set remote_irr if interrupt was injected · ff4b9df8

Authored by Marcelo Tosatti on Jun 05, 2008

There's a bug in the IOAPIC code for level-triggered interrupts. It's
relatively easy to trigger by sharing (virtio-blk + usbtablet was the
testcase, initially reported by Gerd von Egidy).

The "remote_irr" variable is used to indicate accepted but not yet acked
interrupts. It's cleared from the EOI handler.

Problem is that the EOI handler clears remote_irr unconditionally, even
if it reinjected another pending interrupt.

In that case, kvm_ioapic_set_irq() proceeds to ioapic_service() which
sets remote_irr even if it failed to inject (since the IRR was high due
to EOI reinjection).

Since the TMR bit has been cleared by the first EOI, the second one
fails to clear remote_irr.

End result is interrupt line dead.

Fix it by setting remote_irr only if a new pending interrupt has been
generated (and the TMR bit for vector in question set).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

18 May, 2008 1 commit

KVM: Fix kvm_vcpu_block() task state race · e5c239cf

Authored by Marcelo Tosatti on May 08, 2008

There's still a race in kvm_vcpu_block(), if a wake_up_interruptible()
call happens before the task state is set to TASK_INTERRUPTIBLE:

CPU0                            CPU1

kvm_vcpu_block

add_wait_queue

kvm_cpu_has_interrupt = 0
                                set interrupt
                                if (waitqueue_active())
                                        wake_up_interruptible()

kvm_cpu_has_pending_timer
kvm_arch_vcpu_runnable
signal_pending

set_current_state(TASK_INTERRUPTIBLE)
schedule()

Can be fixed by using prepare_to_wait() which sets the task state before
testing for the wait condition.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
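
The lost wake-up in the kvm_vcpu_block() commit above comes down to ordering: the condition is tested before the task state is set. The following is a single-threaded model of that interleaving, not kernel code. The CPU1 step is hard-coded into the gap where the race occurs, and every name here (`struct vcpu_model`, `model_waker()`, `model_block_racy()`, `model_block_fixed()`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

/* Minimal stand-in for a blocking vcpu thread. */
struct vcpu_model {
    enum task_state state;
    bool interrupt_pending;
};

/* CPU1 in the diagram: post the interrupt, then wake the sleeper. */
void model_waker(struct vcpu_model *v)
{
    assert(v != NULL);
    v->interrupt_pending = true;
    v->state = TASK_RUNNING;  /* no effect if the task is still running */
}

/* Racy order: test the condition, then set the task state.
 * Returns true if the task would go to sleep with an interrupt pending. */
bool model_block_racy(struct vcpu_model *v)
{
    bool runnable = v->interrupt_pending;  /* sees false */

    model_waker(v);                        /* CPU1 runs in the gap */
    if (runnable)
        return false;
    v->state = TASK_INTERRUPTIBLE;         /* overwrites the wake-up */
    return v->state != TASK_RUNNING;       /* schedule() would sleep */
}

/* Fixed order, as with prepare_to_wait(): set the state, then test. */
bool model_block_fixed(struct vcpu_model *v)
{
    bool runnable;

    v->state = TASK_INTERRUPTIBLE;
    runnable = v->interrupt_pending;       /* still sees false */
    model_waker(v);                        /* wake-up resets the state */
    if (runnable) {
        v->state = TASK_RUNNING;
        return false;
    }
    return v->state != TASK_RUNNING;       /* schedule() returns at once */
}
```

In the racy order the waker's state change lands before the sleeper marks itself interruptible, so it is overwritten. In the fixed order the same wake-up arrives after the state is set and therefore sticks.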

04 May, 2008 1 commit

KVM: Export necessary function for EPT · 0d150298

Authored by Sheng Yang on Apr 25, 2008

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

02 May, 2008 1 commit

[PATCH] sanitize anon_inode_getfd() · 2030a42c

Authored by Al Viro on Feb 23, 2008

a) none of the callers even looks at inode or file returned by anon_inode_getfd()
b) any caller that would try to look at those would be racy, since by the time
it returns we might have raced with close() from another thread and that
file would be pining for fjords.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

27 Apr, 2008 14 commits

KVM: kill file->f_count abuse in kvm · 66c0b394

Authored by Al Viro on Apr 19, 2008

Use kvm's own refcounting instead of playing with ->filp->f_count.
That will allow us to get rid of a lot of crap in anon_inode_getfd() and
kill a race in kvm_dev_ioctl_create_vm() (file might have been closed
immediately by another thread, so ->filp might point to already freed
struct file when we get around to setting it).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Rename debugfs_dir to kvm_debugfs_dir · 76f7c879

Authored by Hollis Blanchard on Apr 15, 2008

It's a globally exported symbol now.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: add ioctls to save/store mpstate · 62d9f0db

Authored by Marcelo Tosatti on Apr 11, 2008

So userspace can save/restore the mpstate during migration.

[avi: export the #define constants describing the value]
[christian: add s390 stubs]
[avi: ditto for ia64]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: hlt emulation should take in-kernel APIC/PIT timers into account · 3d80840d

Authored by Marcelo Tosatti on Apr 11, 2008

Timers that fire between guest hlt and vcpu_block's add_wait_queue() are
ignored, possibly resulting in hangs.

Also make sure that atomic_inc and waitqueue_active tests happen in the
specified order, otherwise the following race is open:

CPU0                                        CPU1
                                            if (waitqueue_active(wq))
add_wait_queue()
if (!atomic_read(pit_timer->pending))
    schedule()
                                            atomic_inc(pit_timer->pending)
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Add kvm trace userspace interface · d4c9ff2d

Authored by Feng (Eric) Liu on Apr 10, 2008

This interface allows a user space application to read the trace of kvm
related events through relayfs.
Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Don't assume struct page for x86 · 35149e21

Authored by Anthony Liguori on Apr 02, 2008

This patch introduces a gfn_to_pfn() function and corresponding functions like
kvm_release_pfn_dirty().  Using these new functions, we can modify the x86
MMU to no longer assume that it can always get a struct page for any given gfn.

We don't want to eliminate gfn_to_page() entirely because a number of places
assume they can do gfn_to_page() and then kmap() the results.  When we support
IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will
succeed.

This does not implement support for avoiding reference counting for reserved
RAM or for IO memory.  However, it should make those things pretty straight
forward.

Since we're only introducing new common symbols, I don't think it will break
the non-x86 architectures but I haven't tested those.  I've tested Intel,
AMD, NPT, and hugetlbfs with Windows and Linux guests.

[avi: fix overflow when shifting left pfns by adding casts]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
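
The bracketed note in the commit above about an overflow when shifting pfns left describes a common C pitfall: if the frame number lives in a 32-bit type, the shift is carried out in 32 bits and the high bits are lost before the result is widened. A minimal sketch follows, assuming a 32-bit frame number and 4 KiB pages. `SKETCH_PAGE_SHIFT` and both function names are made up for the illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumed 4 KiB pages */

/* Shift performed in 32 bits: the top bits are discarded first,
 * and only the truncated value is widened to 64 bits. */
uint64_t pfn_to_addr_truncating(uint32_t pfn)
{
    uint32_t shifted = pfn << SKETCH_PAGE_SHIFT;

    return shifted;
}

/* Cast first, then shift: the full 64-bit address is preserved. */
uint64_t pfn_to_addr_widened(uint32_t pfn)
{
    assert(SKETCH_PAGE_SHIFT < 32);  /* result always fits in 64 bits */
    return (uint64_t)pfn << SKETCH_PAGE_SHIFT;
}
```

For frame number 0x100000 (the page at the 4 GiB boundary) the first function returns 0 and the second returns 0x100000000. For small frame numbers the two agree, which is why this kind of bug tends to stay hidden until memory above 4 GiB is involved.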

KVM: add vm refcounting · d39f13b0

Authored by Izik Eidus on Mar 30, 2008

The main purpose of adding these functions is the ability to release the
spinlock that protects the kvm list while still being able to do operations
on a specific kvm in a safe way.
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
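
The idea behind the vm refcounting commit above is that a caller takes a reference while holding the list lock, drops the lock, and can then keep using the object safely until it releases the reference. A minimal userspace sketch of such a get/put pair using C11 atomics is shown below. `struct vm_model` and the `vm_model_*` functions are hypothetical names for illustration only, and the `destroyed` flag stands in for actually freeing the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for a VM object that several users may hold at once. */
struct vm_model {
    atomic_int users;   /* number of outstanding references */
    bool       destroyed;
};

/* The creator starts out holding one reference. */
void vm_model_init(struct vm_model *vm)
{
    atomic_init(&vm->users, 1);
    vm->destroyed = false;
}

/* Take an extra reference, e.g. while still holding the list lock. */
void vm_model_get(struct vm_model *vm)
{
    atomic_fetch_add(&vm->users, 1);
}

/* Drop a reference; the last one out tears the object down.
 * Returns true if this call destroyed the object. */
bool vm_model_put(struct vm_model *vm)
{
    int old = atomic_fetch_sub(&vm->users, 1);

    assert(old > 0);  /* a put without a matching get is a bug */
    if (old == 1) {
        vm->destroyed = true;
        return true;
    }
    return false;
}
```

A list walker would call `vm_model_get()` under the lock, release the lock, do its work, and finish with `vm_model_put()`. The object cannot disappear in between because the walker's reference keeps the count above zero.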

KVM: Use kzalloc to avoid allocating kvm_regs from kernel stack · 3e4bb3ac

Authored by Xiantao Zhang on Feb 25, 2008

Since the size of kvm_regs is too big to allocate from kernel stack on ia64,
use kzalloc to allocate it.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: large page support · 05da4558

Authored by Marcelo Tosatti on Feb 23, 2008

Create large pages mappings if the guest PTE's are marked as such and
the underlying memory is hugetlbfs backed.  If the largepage contains
write-protected pages, a large pte is not used.

Gives a consistent 2% improvement for data copies on ram mounted
filesystem, without NPT/EPT.

Anthony measures a 4% improvement on 4-way kernbench, with NPT.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: ignore zapped root pagetables · 2e53d63a

Authored by Marcelo Tosatti on Feb 20, 2008

Mark zapped root pagetables as invalid and ignore such pages during lookup.

This is a problem with the cr3-target feature, where a zapped root table fools
the faulting code into creating a read-only mapping. The result is a lockup
if the instruction can't be emulated.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Disable pagefaults during copy_from_user_inatomic() · 0aac03f0

Authored by Andrea Arcangeli on Jan 30, 2008

With CONFIG_PREEMPT=n, this is needed in order to disable the fault-in
code from sleeping.
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Limit vcpu mmap size to one page on non-x86 · adb1ff46

Authored by Avi Kivity on Jan 24, 2008

The second page is only needed on archs that support pio.

Noted by Carsten Otte.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Only x86 has pio · 09566765

Authored by Avi Kivity on Jan 23, 2008

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: constify function pointer tables · 5c502742

Authored by Jan Engelhardt on Jan 22, 2008

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>

04 Mar, 2008 2 commits

KVM: Route irq 0 to vcpu 0 exclusively · 8c35f237

Authored by Avi Kivity on Feb 25, 2008

Some Linux versions allow the timer interrupt to be processed by more than
one cpu, leading to hangs due to tsc instability.  Work around the issue
by only dispatching the interrupt to vcpu 0.

Problem analyzed (and patch tested) by Sheng Yang.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: remove the usage of the mmap_sem for the protection of the memory slots. · 72dc67a6

Authored by Izik Eidus on Feb 10, 2008

This patch replaces the mmap_sem lock for the memory slots with a new
kvm private lock. It is needed because until now there were cases where
kvm accessed user memory while holding the mmap semaphore.
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

09 Feb, 2008 1 commit

libfs: allow error return from simple attributes · 8b88b099

Authored by Christoph Hellwig on Feb 08, 2008

Sometimes simple attributes might need to return an error, e.g. for
acquiring a mutex interruptibly.  In fact we have that situation in
spufs already which is the original user of the simple attributes.  This
patch merged the temporarily forked attributes in spufs back into the
main ones and allows to return errors.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <stefano.brivio@polimi.it>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg KH <greg@kroah.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

31 Jan, 2008 5 commits

KVM: MMU: Switch to mmu spinlock · aaee2c94

Authored by Marcelo Tosatti on Dec 20, 2007

Convert the synchronization of the shadow handling to a separate mmu_lock
spinlock.

Also guard fetch() by mmap_sem in read-mode to protect against alias
and memslot changes.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Add kvm_read_guest_atomic() · 7ec54588

Authored by Marcelo Tosatti on Dec 20, 2007

In preparation for a mmu spinlock, add kvm_read_guest_atomic()
and use it in fetch() and prefetch_page().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Concurrent guest walkers · 10589a46

Authored by Marcelo Tosatti on Dec 20, 2007

Do not hold kvm->lock mutex across the entire pagefault code,
only acquire it in places where it is necessary, such as mmu
hash list, active list, rmap and parent pte handling.

Allow concurrent guest walkers by switching walk_addr() to use
mmap_sem in read-mode.

And get rid of the lockless __gfn_to_page.

[avi: move kvm_mmu_pte_write() locking inside the function]
[avi: add locking for real mode]
[avi: fix cmpxchg locking]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Move ioapic code to common directory. · 0eb8f498

Authored by Zhang Xiantao on Dec 17, 2007

Move ioapic code to common, since IA64 also needs it.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Move drivers/kvm/* to virt/kvm/ · 0fce5623

Authored by Avi Kivity on Dec 16, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>
