1. 10 Nov, 2015 17 commits
  2. 05 Nov, 2015 1 commit
    • K
      KVM: VMX: Fix commit which broke PML · a3eaa864
      Kai Huang committed
      I found that PML has been broken since the commit below:
      
      	commit feda805f
      	Author: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      	Date:   Wed Sep 9 14:05:55 2015 +0800
      
      	KVM: VMX: unify SECONDARY_VM_EXEC_CONTROL update
      
      	Unify the update in vmx_cpuid_update()
      	Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      	[Rewrite to use vmcs_set_secondary_exec_control. - Paolo]
      	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      
      The reason is that, in the above commit, vmx_cpuid_update calls
      vmx_secondary_exec_control, which currently clears the
      SECONDARY_EXEC_ENABLE_PML bit unconditionally (as PML is enabled when
      creating the vcpu). Therefore, if vmx_cpuid_update is called after the
      vcpu is created, PML is disabled unexpectedly while the log-dirty code
      still thinks PML is in use.
      
      Fix this by clearing SECONDARY_EXEC_ENABLE_PML in vmx_secondary_exec_control
      only when PML is not supported or not enabled (!enable_pml). This is more
      reasonable, as PML is currently either always enabled or always disabled.
      With this change, explicitly updating SECONDARY_EXEC_ENABLE_PML in
      vmx_enable{disable}_pml is no longer needed, so also rename
      vmx_enable{disable}_pml to vmx_create{destroy}_pml_buffer.
      
      Fixes: feda805f
      Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
      [While at it, change a wrong ASSERT to an "if".  The condition can happen
       if creating the VCPU fails with ENOMEM. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a3eaa864
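The fix described in a3eaa864 can be sketched roughly as follows. This is a minimal illustration, not the kernel code: the `sketch_` names and the bit value are assumptions made for the example, and the real function takes a vcpu and handles many other control bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PML enable bit; the value is illustrative. */
#define SKETCH_EXEC_ENABLE_PML (1u << 17)

/*
 * Behaviour the message describes as broken: the bit is cleared on every
 * recomputation, so a later CPUID update silently turns PML off.
 */
static uint32_t sketch_exec_control_old(uint32_t base)
{
    return base & ~SKETCH_EXEC_ENABLE_PML;
}

/*
 * Behaviour after the fix: the bit is cleared only when PML is not
 * supported or not enabled, so recomputing the controls is idempotent.
 */
static uint32_t sketch_exec_control_new(uint32_t base, bool cpu_has_pml,
                                        bool enable_pml)
{
    if (!cpu_has_pml || !enable_pml)
        base &= ~SKETCH_EXEC_ENABLE_PML;
    return base;
}
```

With the second variant, calling the function again after vcpu creation leaves the PML bit as it was whenever PML is supported and enabled.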
  3. 04 Nov, 2015 11 commits
  4. 29 Oct, 2015 3 commits
    • C
      KVM: s390: use simple switch statement as multiplexer · 46b708ea
      Christian Borntraeger committed
      We currently do some magic shifting (by exploiting that exit codes
      are always a multiple of 4) and a table lookup to jump into the
      exit handlers. This causes some calculations and checks, just to
      do a potentially expensive function call.
      
      Changing that to a switch statement gives the compiler the chance
      to inline and dynamically decide between jump tables or inline
      compare and branches. In addition it makes the code more readable.
      
      bloat-o-meter gives me a small reduction in code size:
      
      add/remove: 0/7 grow/shrink: 1/1 up/down: 986/-1334 (-348)
      function                                     old     new   delta
      kvm_handle_sie_intercept                      72    1058    +986
      handle_prog                                  704     696      -8
      handle_noop                                   54       -     -54
      handle_partial_execution                      60       -     -60
      intercept_funcs                              120       -    -120
      handle_instruction                           198       -    -198
      handle_validity                              210       -    -210
      handle_stop                                  316       -    -316
      handle_external_interrupt                    368       -    -368
      
      Right now my gcc emits conditional branches instead of jump tables.
      The inlining seems to save us a few cycles, as some micro-benchmarking
      shows minimal improvements, but they are still within the noise.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      46b708ea
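The two dispatch styles discussed in 46b708ea can be compared in a small sketch. The handler names and exit codes below are made up for illustration and are not the s390 intercept codes; only the shape (shift-and-lookup versus a plain switch) follows the message.

```c
#include <stddef.h>

typedef int (*sketch_handler_t)(void);

static int sketch_handle_noop(void)     { return 0; }
static int sketch_handle_stop(void)     { return 1; }
static int sketch_handle_validity(void) { return 2; }

#define SKETCH_ENOTSUP (-95)

/* Old style: codes are multiples of 4, so (code >> 2) indexes a table. */
static const sketch_handler_t sketch_table[] = {
    [0x00 >> 2] = sketch_handle_noop,
    [0x10 >> 2] = sketch_handle_stop,
    [0x20 >> 2] = sketch_handle_validity,
};

static int sketch_dispatch_table(unsigned int code)
{
    size_t n = sizeof(sketch_table) / sizeof(sketch_table[0]);
    sketch_handler_t fn;

    /* Alignment check, bounds check, NULL check, then an indirect call. */
    if ((code & 3) || (code >> 2) >= n)
        return SKETCH_ENOTSUP;
    fn = sketch_table[code >> 2];
    return fn ? fn() : SKETCH_ENOTSUP;
}

/* New style: the compiler may inline handlers and pick branches or a table. */
static int sketch_dispatch_switch(unsigned int code)
{
    switch (code) {
    case 0x00: return sketch_handle_noop();
    case 0x10: return sketch_handle_stop();
    case 0x20: return sketch_handle_validity();
    default:   return SKETCH_ENOTSUP;
    }
}
```

Both functions return the same result for every code; the switch version simply leaves the lowering decision to the compiler, which is what makes the handlers inlinable.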
    • C
      KVM: s390: drop useless newline in debugging data · 58c383c6
      Christian Borntraeger committed
      The s390 debug feature does not need newlines. In fact, they
      result in empty lines. Get rid of 4 leftovers.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      58c383c6
    • D
      KVM: s390: SCA must not cross page boundaries · c5c2c393
      David Hildenbrand committed
      We seem to have missed a few corner cases in commit f6c137ff
      ("KVM: s390: randomize sca address").
      
      The SCA has a maximum size of 2112 bytes. By setting the sca_offset to
      some unlucky numbers, we exceed the page.
      
      0x7c0 (1984) -> Fits exactly
      0x7d0 (2000) -> 16 bytes out
      0x7e0 (2016) -> 32 bytes out
      0x7f0 (2032) -> 48 bytes out
      
      One VCPU entry is 32 bytes long.
      
      For the last two cases, we actually write data to the other page:
      1. The address of the VCPU.
      2. Injection/delivery/clearing of SIGP external calls via SIGP IF.
      
      The second case in particular happens regularly. This could produce two problems:
      1. The guest losing/getting external calls.
      2. Random memory overwrites in the host.
      
      So this problem hits the 127th and 128th of every 128 created VMs with 64 VCPUs.
      
      Cc: stable@vger.kernel.org # v3.15+
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      c5c2c393
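The offset table in c5c2c393 follows from simple arithmetic, sketched below. The 2112-byte SCA size comes from the message; the 4096-byte page size and the helper names are assumptions for the example, and the code is not the actual fix.

```c
#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_SCA_SIZE  2112u  /* maximum SCA size stated in the message */

/* Number of SCA bytes that land beyond the end of the page; 0 if it fits. */
static unsigned int sketch_sca_overflow(unsigned int sca_offset)
{
    unsigned int end = sca_offset + SKETCH_SCA_SIZE;

    return end > SKETCH_PAGE_SIZE ? end - SKETCH_PAGE_SIZE : 0;
}

/* Largest 16-byte-aligned offset that keeps the whole SCA inside one page. */
static unsigned int sketch_max_safe_offset(void)
{
    return (SKETCH_PAGE_SIZE - SKETCH_SCA_SIZE) & ~0xfu;
}
```

Evaluating `sketch_sca_overflow` at 0x7c0, 0x7d0, 0x7e0 and 0x7f0 reproduces the 0, 16, 32 and 48 bytes listed in the message.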
  5. 23 Oct, 2015 8 commits
    • M
      arm64: kvm: restore EL1N SP for panic · db85c55f
      Mark Rutland committed
      If we panic in hyp mode, we inject a call to panic() into the EL1N host
      kernel. If a guest context is active, we first attempt to restore the
      minimal amount of state necessary to execute the host kernel with
      restore_sysregs.
      
      However, the SP is restored as part of restore_common_regs, and so we
      may return to the host's panic() function with the SP of the guest. Any
      calculations based on the SP will be bogus, and any attempt to access
      the stack will result in recursive data aborts.
      
      When running Linux as a guest, the guest's EL1N SP is likely to be some
      valid kernel address. In this case, the host kernel may use that region
      as a stack for panic(), corrupting it in the process.
      
      Avoid the problem by restoring the host SP prior to returning to the
      host. To prevent misleading backtraces in the host, the FP is zeroed at
      the same time. We don't need any of the other "common" registers in
      order to panic successfully.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: <kvmarm@lists.cs.columbia.edu>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      db85c55f
    • C
      arm/arm64: KVM: Improve kvm_exit tracepoint · b5905dc1
      Christoffer Dall committed
      The ARM architecture only saves the exit class to the HSR (ESR_EL2 for
      arm64) on synchronous exceptions, not on asynchronous exceptions like an
      IRQ.  However, we only report the exception class on kvm_exit, which is
      confusing because an IRQ looks like it exited at some PC with the same
      reason as the previous exit.  Add a lookup table for the exception index
      and prepend the kvm_exit tracepoint text with the exception type to
      clarify this situation.
      
      Also resolve the exception class (EC) to a human-friendly text version
      so the trace output becomes immediately usable for debugging this code.
      
      Cc: Wei Huang <wei@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      b5905dc1
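The lookup-table idea from b5905dc1 can be illustrated with a short sketch. The enum, the table contents and the output format are assumptions for the example; the kernel tracepoint uses its own macros and constants.

```c
#include <stdio.h>

/* Hypothetical exception indexes; not the kernel's constants. */
enum sketch_exit_idx { SKETCH_EXIT_IRQ, SKETCH_EXIT_TRAP, SKETCH_EXIT_COUNT };

static const char *const sketch_exit_names[SKETCH_EXIT_COUNT] = {
    [SKETCH_EXIT_IRQ]  = "IRQ",
    [SKETCH_EXIT_TRAP] = "TRAP",
};

/* Map an exception index to text, falling back for unknown values. */
static const char *sketch_exit_name(unsigned int idx)
{
    return idx < SKETCH_EXIT_COUNT ? sketch_exit_names[idx] : "UNKNOWN";
}

/* Prepend the exception type so an IRQ exit is not mistaken for a trap. */
static int sketch_format_exit(char *buf, size_t len, unsigned int idx,
                              unsigned int ec, unsigned long pc)
{
    return snprintf(buf, len, "%s: EC: 0x%02x, PC: 0x%08lx",
                    sketch_exit_name(idx), ec, pc);
}
```

Because the exception class is only meaningful for synchronous exits, leading with the exit type tells the reader whether the class field on that line can be trusted.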
    • E
      KVM: arm/arm64: implement kvm_arm_[halt,resume]_guest · 3b92830a
      Eric Auger committed
      We introduce the kvm_arm_halt_guest and kvm_arm_resume_guest
      functions. They will be used for IRQ forward state changes.
      
      Halt is synchronous and prevents the guest from being re-entered.
      We use the same mechanism that was put in place for the former PSCI
      pause flag, now renamed power_off. A new flag, pause, is introduced in
      the arch vcpu state; it is only meant to be used by those functions.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      3b92830a
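The interplay of the power_off and pause flags in this series can be sketched as below. The struct and function names are hypothetical, and the sketch leaves out the kicking of running VCPUs and the wait-queue wakeups that a real implementation would need.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the per-VCPU state. */
struct sketch_vcpu {
    bool power_off; /* set when the VCPU is powered off via PSCI */
    bool pause;     /* set by halt, cleared by resume */
};

/* A VCPU may enter the guest only while both flags are false. */
static bool sketch_vcpu_may_enter_guest(const struct sketch_vcpu *v)
{
    return !v->power_off && !v->pause;
}

/* Halt: mark every VCPU paused so none of them is re-entered. */
static void sketch_halt_guest(struct sketch_vcpu *vcpus, int n)
{
    for (int i = 0; i < n; i++)
        vcpus[i].pause = true;
}

/* Resume: clear pause only; a powered-off VCPU stays off. */
static void sketch_resume_guest(struct sketch_vcpu *vcpus, int n)
{
    for (int i = 0; i < n; i++)
        vcpus[i].pause = false;
}
```

Keeping the two flags separate means a resume cannot accidentally power a VCPU back on, which is the reason the series renames the old field before adding the new one.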
    • E
      KVM: arm/arm64: check power_off in critical section before VCPU run · 101d3da0
      Eric Auger committed
      If a PSCI call powering off a vcpu arrives just after we have executed
      the vcpu_sleep check, we can enter the guest even though power_off
      is set. Check the power_off state in the critical section,
      just before entering the guest.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      101d3da0
    • E
      KVM: arm/arm64: check power_off in kvm_arch_vcpu_runnable · 4f5f1dc0
      Eric Auger committed
      kvm_arch_vcpu_runnable now also checks whether the power_off
      flag is set.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      4f5f1dc0
    • E
      KVM: arm/arm64: rename pause into power_off · 3781528e
      Eric Auger committed
      The kvm_vcpu_arch pause field is renamed to power_off to prepare
      for the introduction of a new pause field. vcpu_pause is also renamed
      to vcpu_sleep, since we will sleep until both power_off and pause are
      false.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      3781528e
    • W
      arm/arm64: KVM : Enable vhost device selection under KVM config menu · 75755c6d
      Wei Huang committed
      vhost drivers provide guest VMs with better I/O performance and lower
      CPU utilization. This patch allows users to select vhost devices under
      the KVM configuration menu on ARM. This puts vhost support on arm/arm64
      on a par with other architectures (e.g. x86, ppc).
      Signed-off-by: Wei Huang <wei@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      75755c6d
    • C
      arm/arm64: KVM: Rework the arch timer to use level-triggered semantics · 4b4b4512
      Christoffer Dall committed
      The arch timer currently uses edge-triggered semantics in the sense that
      the line is never sampled by the vgic and lowering the line from the
      timer to the vgic doesn't have any effect on the pending state of
      virtual interrupts in the vgic.  This means that we do not support a
      guest with the otherwise valid behavior of (1) disable interrupts (2)
      enable the timer (3) disable the timer (4) enable interrupts.  Such a
      guest would validly not expect to see any interrupts on real hardware,
      but will see interrupts on KVM.
      
      This patch fixes this shortcoming through the following series of
      changes.
      
      First, we change the flow of the timer/vgic sync/flush operations.  Now
      the timer is always flushed/synced before the vgic, because the vgic
      samples the state of the timer output.  This has the implication that we
      move the timer operations into non-preemptible sections, but that is
      fine after the previous commit getting rid of hrtimer schedules on every
      entry/exit.
      
      Second, we change the internal behavior of the timer, letting the timer
      keep track of its previous output state, and only lower/raise the line
      to the vgic when the state changes.  Note that in theory this could have
      been accomplished more simply by signalling the vgic every time the
      state *potentially* changed, but we don't want to be hitting the vgic
      more often than necessary.
      
      Third, we get rid of the use of the map->active field in the vgic and
      instead simply set the interrupt as active on the physical distributor
      whenever the input to the GIC is asserted and conversely clear the
      physical active state when the input to the GIC is deasserted.
      
      Fourth, and finally, we now initialize the timer PPIs (and all the other
      unused PPIs for now), to be level-triggered, and modify the sync code to
      sample the line state on HW sync and re-inject a new interrupt if it is
      still pending at that time.
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      4b4b4512
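The second change in 4b4b4512, where the timer remembers its previous output and only signals the interrupt controller on a change, can be sketched as follows. All names and the simplified firing condition are assumptions for the example; the counter stands in for the call that raises or lowers the line to the vgic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced timer state. */
struct sketch_timer {
    bool enabled;
    bool masked;
    uint64_t cval;        /* compare value */
    bool line_level;      /* last level reported to the controller */
    unsigned int updates; /* how often the controller was signalled */
};

/* Level of the timer output at time 'now'. */
static bool sketch_timer_should_fire(const struct sketch_timer *t, uint64_t now)
{
    return t->enabled && !t->masked && now >= t->cval;
}

/* Signal the controller only when the output level actually changes. */
static bool sketch_timer_update(struct sketch_timer *t, uint64_t now)
{
    bool level = sketch_timer_should_fire(t, now);

    if (level == t->line_level)
        return false;
    t->line_level = level;
    t->updates++; /* stand-in for raising/lowering the line */
    return true;
}
```

In the guest sequence from the message (enable the timer, then disable it while interrupts are off), the line goes high and then low again, so a controller that samples the level when interrupts are re-enabled sees nothing pending.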