提交 · 87c00572ba05aa8c9db118da75c608f47eb10b9e · openanolis / cloud-kernel

08 5月, 2014 2 次提交

kvm: x86: emulate monitor and mwait instructions as nop · 87c00572

由 Gabriel L. Somlo 提交于 5月 07, 2014

Treat monitor and mwait instructions as nop, which is architecturally
correct (but inefficient) behavior. We do this to prevent misbehaving
guests (e.g. OS X <= 10.7) from crashing after they fail to check for
monitor/mwait availability via cpuid.

Since mwait-based idle loops relying on these nop-emulated instructions
would keep the host CPU pegged at 100%, do NOT advertise their presence
via cpuid, to prevent compliant guests from using them inadvertently.
Signed-off-by: NGabriel L. Somlo <somlo@cmu.edu>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

87c00572

kvm/x86: implement hv EOI assist · b63cf42f

由 Michael S. Tsirkin 提交于 5月 07, 2014

It seems that it's easy to implement the EOI assist
on top of the PV EOI feature: simply convert the
page address to the format expected by PV EOI.

Notes:
-"No EOI required" is set only if interrupt injected
 is edge triggered; this is true because level interrupts are going
 through IOAPIC which disables PV EOI.
 In any case, if guest triggers EOI the bit will get cleared on exit.
-For migration, set of HV_X64_MSR_APIC_ASSIST_PAGE sets
 KVM_PV_EOI_EN internally, so restoring HV_X64_MSR_APIC_ASSIST_PAGE
 seems sufficient
 In any case, bit is cleared on exit so worst case it's never re-enabled
-no handling of PV EOI data is performed at HV_X64_MSR_EOI write;
 HV_X64_MSR_EOI is a separate optimization - it's an X2APIC
 replacement that lets you do EOI with an MSR and not IO.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b63cf42f

07 5月, 2014 6 次提交

KVM: x86: Mark bit 7 in long-mode PDPTE according to 1GB pages support · 5f7dde7b

由 Nadav Amit 提交于 5月 07, 2014

In long-mode, bit 7 in the PDPTE is not reserved only if 1GB pages are
supported by the CPU. Currently the bit is considered by KVM as always
reserved.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

5f7dde7b

KVM: vmx: handle_dr does not handle RSP correctly · a4ab9d0c

由 Nadav Amit 提交于 5月 07, 2014

The RSP register is not automatically cached, causing mov DR instruction with
RSP to fail. Instead the regular register accessing interface should be used.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a4ab9d0c

KVM: nVMX: move vmclear and vmptrld pre-checks to nested_vmx_check_vmptr · 4291b588

由 Bandan Das 提交于 5月 06, 2014

Some checks are common to all, and moreover,
according to the spec, the check for whether any bits
beyond the physical address width are set are also
applicable to all of them
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4291b588

KVM: nVMX: fail on invalid vmclear/vmptrld pointer · 96ec1463

由 Bandan Das 提交于 5月 06, 2014

The spec mandates that if the vmptrld or vmclear
address is equal to the vmxon region pointer, the
instruction should fail with error "VMPTRLD with
VMXON pointer" or "VMCLEAR with VMXON pointer"
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

96ec1463

KVM: nVMX: additional checks on vmxon region · 3573e22c

由 Bandan Das 提交于 5月 06, 2014

Currently, the vmxon region isn't used in the nested case.
However, according to the spec, the vmxon instruction performs
additional sanity checks on this region and the associated
pointer. Modify emulated vmxon to better adhere to the spec
requirements
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3573e22c

KVM: nVMX: rearrange get_vmx_mem_address · 19677e32

由 Bandan Das 提交于 5月 06, 2014

Our common function for vmptr checks (in 2/4) needs to fetch
the memory address
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

19677e32

06 5月, 2014 5 次提交

Merge tag 'kvm-s390-20140506' of... · 2ce316f0

由 Paolo Bonzini 提交于 5月 06, 2014

Merge tag 'kvm-s390-20140506' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next

1. Fixes an error return code for the breakpoint setup

2. External interrupt fixes
2.1. Some interrupt conditions like cpu timer or clock comparator
stay pending even after the interrupt is injected. If the external
new PSW is enabled for interrupts this will result in an endless
loop. Usually this indicates a programming error in the guest OS.
Lets detect such situations and go to userspace. We will provide
a QEMU patch that sets the guest in panicked/crashed state to avoid
wasting CPU cycles.
2.2 Resend external interrupts back to the guest if the HW could
not do it.
-

2ce316f0

KVM: s390: Fix external interrupt interception · f14d82e0

由 Thomas Huth 提交于 1月 15, 2014

The external interrupt interception can only occur in rare cases, e.g.
when the PSW of the interrupt handler has a bad value. The old handler
for this interception simply ignored these events (except for increasing
the exit_external_interrupt counter), but for proper operation we either
have to inject the interrupts manually or we should drop to userspace in
case of errors.
Signed-off-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

f14d82e0

KVM: s390: Add clock comparator and CPU timer IRQ injection · e029ae5b

由 Thomas Huth 提交于 3月 26, 2014

Add an interface to inject clock comparator and CPU timer interrupts
into the guest. This is needed for handling the external interrupt
interception.
Signed-off-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

e029ae5b

KVM: s390: return -EFAULT if copy_from_user() fails · fcc9aec3

由 Dan Carpenter 提交于 5月 03, 2014

When copy_from_user() fails, this code returns the number of bytes
remaining instead of a negative error code.  The positive number is
returned to the user but otherwise it is harmless.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

fcc9aec3

KVM: x86: improve the usability of the 'kvm_pio' tracepoint · 1171903d

由 Ulrich Obergfell 提交于 5月 02, 2014

This patch moves the 'kvm_pio' tracepoint to emulator_pio_in_emulated()
and emulator_pio_out_emulated(), and it adds an argument (a pointer to
the 'pio_data'). A single 8-bit or 16-bit or 32-bit data item is fetched
from 'pio_data' (depending on 'size'), and the value is included in the
trace record ('val'). If 'count' is greater than one, this is indicated
by the string "(...)" in the trace output.
Signed-off-by: NUlrich Obergfell <uobergfe@redhat.com>
Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1171903d

05 5月, 2014 1 次提交

kvm/irqchip: Speed up KVM_SET_GSI_ROUTING · 719d93cd

由 Christian Borntraeger 提交于 1月 16, 2014

When starting lots of dataplane devices the bootup takes very long on
Christian's s390 with irqfd patches. With larger setups he is even
able to trigger some timeouts in some components. Turns out that the
KVM_SET_GSI_ROUTING ioctl takes very long (strace claims up to 0.1 sec)
when having multiple CPUs. This is caused by the synchronize_rcu and
the HZ=100 of s390. By changing the code to use a private srcu we can
speed things up. This patch reduces the boot time till mounting root
from 8 to 2 seconds on my s390 guest with 100 disks.

Uses of hlist_for_each_entry_rcu, hlist_add_head_rcu, hlist_del_init_rcu
are fine because they do not have lockdep checks (hlist_for_each_entry_rcu
uses rcu_dereference_raw rather than rcu_dereference, and write-sides
do not do rcu lockdep at all).

Note that we're hardly relying on the "sleepable" part of srcu. We just
want SRCU's faster detection of grace periods.

Testing was done by Andrew Theurer using netperf tests STREAM, MAERTS
and RR. The difference between results "before" and "after" the patch
has mean -0.2% and standard deviation 0.6%. Using a paired t-test on the
data points says that there is a 2.5% probability that the patch is the
cause of the performance difference (rather than a random fluctuation).

(Restricting the t-test to RR, which is the most likely to be affected,
changes the numbers to respectively -0.3% mean, 0.7% stdev, and 8%
probability that the numbers actually say something about the patch.
The probability increases mostly because there are fewer data points).

Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # s390
Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

719d93cd

30 4月, 2014 1 次提交

Merge tag 'kvm-s390-20140429' of... · 57b5981c

由 Paolo Bonzini 提交于 4月 30, 2014

Merge tag 'kvm-s390-20140429' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next

1. Guest handling fixes
The handling of MVPG, PFMF and Test Block is fixed to better follow
the architecture. None of these fixes is critical for any current
Linux guests, but let's play safe.

2. Optimization for single CPU guests
We can enable the IBS facility if only one VCPU is running (!STOPPED
state). We also enable this optimization for guest > 1 VCPU as soon
as all but one VCPU is in stopped state. Thus will help guests that
have tools like cpuplugd (from s390-utils) that do dynamic offline/
online of CPUs.

3. NOTES
There is one non-s390 change in include/linux/kvm_host.h that
introduces 2 defines for VCPU requests:
define KVM_REQ_ENABLE_IBS        23
define KVM_REQ_DISABLE_IBS       24

57b5981c

29 4月, 2014 7 次提交

KVM: x86: expose invariant tsc cpuid bit (v2) · e4c9a5a1

由 Marcelo Tosatti 提交于 4月 26, 2014

Invariant TSC is a property of TSC, no additional
support code necessary.
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e4c9a5a1

KVM: s390: enable IBS for single running VCPUs · 8ad35755

由 David Hildenbrand 提交于 3月 14, 2014

This patch enables the IBS facility when a single VCPU is running.
The facility is dynamically turned on/off as soon as other VCPUs
enter/leave the stopped state.

When this facility is operating, some instructions can be executed
faster for single-cpu guests.
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NDominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

8ad35755

KVM: s390: introduce kvm_s390_vcpu_{start,stop} · 6852d7b6

由 David Hildenbrand 提交于 3月 14, 2014

This patch introduces two new functions to set/clear the CPUSTAT_STOPPED bit and
makes use of it at all applicable places. These functions prepare the additional
execution of code when starting/stopping a vcpu.

The CPUSTAT_STOPPED bit should not be touched outside of these functions.
Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

6852d7b6

KVM: s390: Add low-address protection to TEST BLOCK · e45efa28

由 Thomas Huth 提交于 3月 07, 2014

TEST BLOCK is also subject to the low-address protection, so we need
to check the destination address in our handler.
Signed-off-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

e45efa28

KVM: s390: Fixes for PFMF · fb34c603

由 Thomas Huth 提交于 9月 09, 2013

Add a check for low-address protection to the PFMF handler and
convert real-addresses to absolute if necessary, as it is defined
in the Principles of Operations specification.
Signed-off-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

fb34c603

KVM: s390: Add a function for checking the low-address protection · f8232c8c

由 Thomas Huth 提交于 3月 03, 2014

The s390 architecture has a special protection mechanism that can
be used to prevent write access to the vital data in the low-core
memory area. This patch adds a new helper function that can be used
to check for such write accesses and in case of protection, it also
sets up the exception data accordingly.
Signed-off-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

f8232c8c

KVM: s390: Handle MVPG partial execution interception · 9a558ee3

由 Thomas Huth 提交于 2月 03, 2014

When the guest executes the MVPG instruction with DAT disabled,
and the source or destination page is not mapped in the host,
the so-called partial execution interception occurs. We need to
handle this event by setting up a mapping for the corresponding
user pages.
Signed-off-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

9a558ee3

28 4月, 2014 2 次提交

KVM: async_pf: change async_pf_execute() to use get_user_pages(tsk => NULL) · e9545b9f

由 Oleg Nesterov 提交于 4月 28, 2014

async_pf_execute() passes tsk == current to gup(), this is doesn't
hurt but unnecessary and misleading. "tsk" is only used to account
the number of faults and current is the random workqueue thread.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Suggested-by: NAndrea Arcangeli <aarcange@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e9545b9f

KVM: async_pf: kill the unnecessary use_mm/unuse_mm async_pf_execute() · d72d946d

由 Oleg Nesterov 提交于 4月 21, 2014

async_pf_execute() has no reasons to adopt apf->mm, gup(current, mm)
should work just fine even if current has another or NULL ->mm.

Recently kvm_async_page_present_sync() was added insedie the "use_mm"
section, but it seems that it doesn't need current->mm too.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d72d946d

24 4月, 2014 9 次提交

KVM: MMU: flush tlb out of mmu lock when write-protect the sptes · 198c74f4

由 Xiao Guangrong 提交于 4月 17, 2014

Now we can flush all the TLBs out of the mmu lock without TLB corruption when
write-proect the sptes, it is because:
- we have marked large sptes readonly instead of dropping them that means we
  just change the spte from writable to readonly so that we only need to care
  the case of changing spte from present to present (changing the spte from
  present to nonpresent will flush all the TLBs immediately), in other words,
  the only case we need to care is mmu_spte_update()

- in mmu_spte_update(), we haved checked
  SPTE_HOST_WRITEABLE | PTE_MMU_WRITEABLE instead of PT_WRITABLE_MASK, that
  means it does not depend on PT_WRITABLE_MASK anymore
Acked-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

198c74f4

KVM: MMU: flush tlb if the spte can be locklessly modified · 7f31c959

由 Xiao Guangrong 提交于 4月 17, 2014

Relax the tlb flush condition since we will write-protect the spte out of mmu
lock. Note lockless write-protection only marks the writable spte to readonly
and the spte can be writable only if both SPTE_HOST_WRITEABLE and
SPTE_MMU_WRITEABLE are set (that are tested by spte_is_locklessly_modifiable)

This patch is used to avoid this kind of race:

      VCPU 0                         VCPU 1
lockless wirte protection:
      set spte.w = 0
                                 lock mmu-lock

                                 write protection the spte to sync shadow page,
                                 see spte.w = 0, then without flush tlb

				 unlock mmu-lock

                                 !!! At this point, the shadow page can still be
                                     writable due to the corrupt tlb entry
     Flush all TLB
Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

7f31c959

KVM: MMU: lazily drop large spte · c126d94f

由 Xiao Guangrong 提交于 4月 17, 2014

Currently, kvm zaps the large spte if write-protected is needed, the later
read can fault on that spte. Actually, we can make the large spte readonly
instead of making them un-present, the page fault caused by read access can
be avoided

The idea is from Avi:
| As I mentioned before, write-protecting a large spte is a good idea,
| since it moves some work from protect-time to fault-time, so it reduces
| jitter.  This removes the need for the return value.

This version has fixed the issue reported in 6b73a960, the reason of that
issue is that fast_page_fault() directly sets the readonly large spte to
writable but only dirty the first page into the dirty-bitmap that means
other pages are missed. Fixed it by only the normal sptes (on the
PT_PAGE_TABLE_LEVEL level) can be fast fixed
Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

c126d94f

KVM: MMU: properly check last spte in fast_page_fault() · 92a476cb

由 Xiao Guangrong 提交于 4月 17, 2014

Using sp->role.level instead of @level since @level is not got from the
page table hierarchy

There is no issue in current code since the fast page fault currently only
fixes the fault caused by dirty-log that is always on the last level
(level = 1)

This patch makes the code more readable and avoids potential issue in the
further development
Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

92a476cb

Revert "KVM: Simplify kvm->tlbs_dirty handling" · a086f6a1

由 Xiao Guangrong 提交于 4月 17, 2014

This reverts commit 5befdc38.

Since we will allow flush tlb out of mmu-lock in the later
patch
Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

a086f6a1

KVM: x86: Processor mode may be determined incorrectly · 42bf549f

由 Nadav Amit 提交于 4月 18, 2014

If EFER.LMA is off, cs.l does not determine execution mode.
Currently, the emulation engine assumes differently.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

42bf549f

KVM: x86: IN instruction emulation should ignore REP-prefix · e6e39f04

由 Nadav Amit 提交于 4月 18, 2014

The IN instruction is not be affected by REP-prefix as INS is. Therefore, the
emulation should ignore the REP prefix as well. The current emulator
implementation tries to perform writeback when IN instruction with REP-prefix
is emulated. This causes it to perform wrong memory write or spurious #GP
exception to be injected to the guest.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

e6e39f04

KVM: x86: Fix CR3 reserved bits · 346874c9

由 Nadav Amit 提交于 4月 18, 2014

According to Intel specifications, PAE and non-PAE does not have any reserved
bits.  In long-mode, regardless to PCIDE, only the high bits (above the
physical address) are reserved.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

346874c9

KVM: x86: Fix wrong/stuck PMU when guest does not use PMI · 671bd993

由 Nadav Amit 提交于 4月 18, 2014

If a guest enables a performance counter but does not enable PMI, the
hypervisor currently does not reprogram the performance counter once it
overflows. As a result the host performance counter is kept with the original
sampling period which was configured according to the value of the guest's
counter when the counter was enabled.

Such behaviour can cause very bad consequences. The most distrubing one can
cause the guest not to make any progress at all, and keep exiting due to host
PMI before any guest instructions is exeucted. This situation occurs when the
performance counter holds a very high value when the guest enables the
performance counter. As a result the host's sampling period is configured to be
very short. The host then never reconfigures the sampling period and get stuck
at entry->PMI->exit loop. We encountered such a scenario in our experiments.

The solution is to reprogram the counter even if the guest does not use PMI.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

671bd993

23 4月, 2014 4 次提交

KVM: nVMX: Advertise support for interrupt acknowledgement · e0ba1a6f

由 Bandan Das 提交于 4月 19, 2014

Some Type 1 hypervisors such as XEN won't enable VMX without it present
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

e0ba1a6f

KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to · 77b0f5d6

由 Bandan Das 提交于 4月 19, 2014

This feature emulates the "Acknowledge interrupt on exit" behavior.
We can safely emulate it for L1 to run L2 even if L0 itself has it
disabled (to run L1).
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

77b0f5d6

KVM: nVMX: Don't advertise single context invalidation for invept · 4b855078

由 Bandan Das 提交于 4月 19, 2014

For single context invalidation, we fall through to global
invalidation in handle_invept() except for one case - when
the operand supplied by L1 is different from what we have in
vmcs12. However, typically hypervisors will only call invept
for the currently loaded eptp, so the condition will
never be true.
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

4b855078

KVM: VMX: Advance rip to after an ICEBP instruction · fd2a445a

由 Huw Davies 提交于 4月 16, 2014

When entering an exception after an ICEBP, the saved instruction
pointer should point to after the instruction.

This fixes the bug here: https://bugs.launchpad.net/qemu/+bug/1119686Signed-off-by: NHuw Davies <huw@codeweavers.com>
Reviewed-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

fd2a445a

22 4月, 2014 3 次提交

Merge tag 'kvm-s390-20140422' of... · 63b5cf04

由 Marcelo Tosatti 提交于 4月 22, 2014

Merge tag 'kvm-s390-20140422' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into queue

Lazy storage key handling
-------------------------
Linux does not use the ACC and F bits of the storage key. Newer Linux
versions also do not use the storage keys for dirty and reference
tracking. We can optimize the guest handling for those guests for faults
as well as page-in and page-out by simply not caring about the guest
visible storage key. We trap guest storage key instruction to enable
those keys only on demand.

Migration bitmap

Until now s390 never provided a proper dirty bitmap. Let's provide a
proper migration bitmap for s390. We also change the user dirty tracking
to a fault based mechanism. This makes the host completely independent
from the storage keys. Long term this will allow us to back guest memory
with large pages.

per-VM device attributes
------------------------
To avoid the introduction of new ioctls, let's provide the
attribute semanantic also on the VM-"device".

Userspace controlled CMMA
-------------------------
The CMMA assist is changed from "always on" to "on if requested" via
per-VM device attributes. In addition a callback to reset all usage
states is provided.

Proper guest DAT handling for intercepts
----------------------------------------
While instructions handled by SIE take care of all addressing aspects,
KVM/s390 currently does not care about guest address translation of
intercepts. This worked out fine, because
- the s390 Linux kernel has a 1:1 mapping between kernel virtual<->real
for all pages up to memory size
- intercepts happen only for a small amount of cases
- all of these intercepts happen to be in the kernel text for current
distros

Of course we need to be better for other intercepts, kernel modules etc.
We provide the infrastructure and rework all in-kernel intercepts to work
on logical addresses (paging etc) instead of real ones. The code has
been running internally for several months now, so it is time for going
public.

GDB support
-----------
We provide breakpoints, single stepping and watchpoints.

Fixes/Cleanups
--------------
- Improve program check delivery
- Factor out the handling of transactional memory on program checks
- Use the existing define __LC_PGM_TDB
- Several cleanups in the lowcore structure
- Documentation

NOTES
-----
- All patches touching base s390 are either ACKed or written by the s390
maintainers
- One base KVM patch "KVM: add kvm_is_error_gpa() helper"
- One patch introduces the notion of VM device attributes
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

Conflicts:
include/uapi/linux/kvm.h

63b5cf04

KVM: s390: Factor out handle_itdb to handle TX aborts · e325fe69

由 Michael Mueller 提交于 3月 13, 2014

Factor out the new function handle_itdb(), which copies the ITDB into
guest lowcore to fully handle a TX abort.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

e325fe69

KVM: s390: replace TDB_ADDR by __LC_PGM_TDB · a86dcc24

由 Michael Mueller 提交于 3月 13, 2014

The generically assembled low core labels already contain the
address for the TDB.
Signed-off-by: NMichael Mueller <mimu@linux.vnet.ibm.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

a86dcc24

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功