1. 18 7月, 2013 1 次提交
  2. 27 6月, 2013 1 次提交
  3. 03 6月, 2013 1 次提交
    • A
      ARM: KVM: prevent NULL pointer dereferences with KVM VCPU ioctl · e8180dca
      Andre Przywara 提交于
      Some ARM KVM VCPU ioctls require the vCPU to be properly initialized
      with the KVM_ARM_VCPU_INIT ioctl before being used with further
      requests. KVM_RUN checks whether this initialization has been
      done, but other ioctls do not.
      Namely KVM_GET_REG_LIST will dereference an array with index -1
      without initialization and thus leads to a kernel oops.
      Fix this by adding checks before executing the ioctl handlers.
      
       [ Removed superflous comment from static function - Christoffer ]
      
      Changes from v1:
       * moved check into a static function with a meaningful name
      Signed-off-by: NAndre Przywara <andre.przywara@linaro.org>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      e8180dca
  4. 29 4月, 2013 7 次提交
    • A
      ARM: KVM: iterate over all CPUs for CPU compatibility check · d4e071ce
      Andre Przywara 提交于
      kvm_target_cpus() checks the compatibility of the used CPU with
      KVM, which is currently limited to ARM Cortex-A15 cores.
      However by calling it only once on any random CPU it assumes that
      all cores are the same, which is not necessarily the case (for example
      in Big.Little).
      
      [ I cut some of the commit message and changed the formatting of the
        code slightly to pass checkpatch and look more like the rest of the
        kvm/arm init code - Christoffer ]
      Signed-off-by: NAndre Przywara <andre.przywara@linaro.org>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      d4e071ce
    • M
      ARM: KVM: promote vfp_host pointer to generic host cpu context · 3de50da6
      Marc Zyngier 提交于
      We use the vfp_host pointer to store the host VFP context, should
      the guest start using VFP itself.
      
      Actually, we can use this pointer in a more generic way to store
      CPU speficic data, and arm64 is using it to dump the whole host
      state before switching to the guest.
      
      Simply rename the vfp_host field to host_cpu_context, and the
      corresponding type to kvm_cpu_context_t. No change in functionnality.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      3de50da6
    • M
      ARM: KVM: add architecture specific hook for capabilities · 17b1e31f
      Marc Zyngier 提交于
      Most of the capabilities are common to both arm and arm64, but
      we still need to handle the exceptions.
      
      Introduce kvm_arch_dev_ioctl_check_extension, which both architectures
      implement (in the 32bit case, it just returns 0).
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      17b1e31f
    • M
      ARM: KVM: perform HYP initilization for hotplugged CPUs · d157f4a5
      Marc Zyngier 提交于
      Now that we have the necessary infrastructure to boot a hotplugged CPU
      at any point in time, wire a CPU notifier that will perform the HYP
      init for the incoming CPU.
      
      Note that this depends on the platform code and/or firmware to boot the
      incoming CPU with HYP mode enabled and return to the kernel by following
      the normal boot path (HYP stub installed).
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      d157f4a5
    • M
      ARM: KVM: switch to a dual-step HYP init code · 5a677ce0
      Marc Zyngier 提交于
      Our HYP init code suffers from two major design issues:
      - it cannot support CPU hotplug, as we tear down the idmap very early
      - it cannot perform a TLB invalidation when switching from init to
        runtime mappings, as pages are manipulated from PL1 exclusively
      
      The hotplug problem mandates that we keep two sets of page tables
      (boot and runtime). The TLB problem mandates that we're able to
      transition from one PGD to another while in HYP, invalidating the TLBs
      in the process.
      
      To be able to do this, we need to share a page between the two page
      tables. A page that will have the same VA in both configurations. All we
      need is a VA that has the following properties:
      - This VA can't be used to represent a kernel mapping.
      - This VA will not conflict with the physical address of the kernel text
      
      The vectors page seems to satisfy this requirement:
      - The kernel never maps anything else there
      - The kernel text being copied at the beginning of the physical memory,
        it is unlikely to use the last 64kB (I doubt we'll ever support KVM
        on a system with something like 4MB of RAM, but patches are very
        welcome).
      
      Let's call this VA the trampoline VA.
      
      Now, we map our init page at 3 locations:
      - idmap in the boot pgd
      - trampoline VA in the boot pgd
      - trampoline VA in the runtime pgd
      
      The init scenario is now the following:
      - We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
        runtime stack, runtime vectors
      - Enable the MMU with the boot pgd
      - Jump to a target into the trampoline page (remember, this is the same
        physical page!)
      - Now switch to the runtime pgd (same VA, and still the same physical
        page!)
      - Invalidate TLBs
      - Set stack and vectors
      - Profit! (or eret, if you only care about the code).
      
      Note that we keep the boot mapping permanently (it is not strictly an
      idmap anymore) to allow for CPU hotplug in later patches.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      5a677ce0
    • M
      ARM: KVM: rework HYP page table freeing · 4f728276
      Marc Zyngier 提交于
      There is no point in freeing HYP page tables differently from Stage-2.
      They now have the same requirements, and should be dealt with the same way.
      
      Promote unmap_stage2_range to be The One True Way, and get rid of a number
      of nasty bugs in the process (good thing we never actually called free_hyp_pmds
      before...).
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      4f728276
    • M
      ARM: KVM: add support for minimal host vs guest profiling · 210552c1
      Marc Zyngier 提交于
      In order to be able to correctly profile what is happening on the
      host, we need to be able to identify when we're running on the guest,
      and log these events differently.
      
      Perf offers a simple way to register callbacks into KVM. Mimic what
      x86 does and enjoy being able to profile your KVM host.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <cdall@cs.columbia.edu>
      210552c1
  5. 24 4月, 2013 1 次提交
  6. 17 4月, 2013 2 次提交
  7. 07 3月, 2013 8 次提交
  8. 05 3月, 2013 4 次提交
  9. 25 2月, 2013 1 次提交
  10. 12 2月, 2013 8 次提交
  11. 24 1月, 2013 6 次提交
    • M
      KVM: ARM: Power State Coordination Interface implementation · aa024c2f
      Marc Zyngier 提交于
      Implement the PSCI specification (ARM DEN 0022A) to control
      virtual CPUs being "powered" on or off.
      
      PSCI/KVM is detected using the KVM_CAP_ARM_PSCI capability.
      
      A virtual CPU can now be initialized in a "powered off" state,
      using the KVM_ARM_VCPU_POWER_OFF feature flag.
      
      The guest can use either SMC or HVC to execute a PSCI function.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      aa024c2f
    • C
      KVM: ARM: Handle I/O aborts · 45e96ea6
      Christoffer Dall 提交于
      When the guest accesses I/O memory this will create data abort
      exceptions and they are handled by decoding the HSR information
      (physical address, read/write, length, register) and forwarding reads
      and writes to QEMU which performs the device emulation.
      
      Certain classes of load/store operations do not support the syndrome
      information provided in the HSR.  We don't support decoding these (patches
      are available elsewhere), so we report an error to user space in this case.
      
      This requires changing the general flow somewhat since new calls to run
      the VCPU must check if there's a pending MMIO load and perform the write
      after userspace has made the data available.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      45e96ea6
    • C
      KVM: ARM: Emulation framework and CP15 emulation · 5b3e5e5b
      Christoffer Dall 提交于
      Adds a new important function in the main KVM/ARM code called
      handle_exit() which is called from kvm_arch_vcpu_ioctl_run() on returns
      from guest execution. This function examines the Hyp-Syndrome-Register
      (HSR), which contains information telling KVM what caused the exit from
      the guest.
      
      Some of the reasons for an exit are CP15 accesses, which are
      not allowed from the guest and this commit handles these exits by
      emulating the intended operation in software and skipping the guest
      instruction.
      
      Minor notes about the coproc register reset:
      1) We reserve a value of 0 as an invalid cp15 offset, to catch bugs in our
         table, at cost of 4 bytes per vcpu.
      
      2) Added comments on the table indicating how we handle each register, for
         simplicity of understanding.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      5b3e5e5b
    • C
      KVM: ARM: World-switch implementation · f7ed45be
      Christoffer Dall 提交于
      Provides complete world-switch implementation to switch to other guests
      running in non-secure modes. Includes Hyp exception handlers that
      capture necessary exception information and stores the information on
      the VCPU and KVM structures.
      
      The following Hyp-ABI is also documented in the code:
      
      Hyp-ABI: Calling HYP-mode functions from host (in SVC mode):
         Switching to Hyp mode is done through a simple HVC #0 instruction. The
         exception vector code will check that the HVC comes from VMID==0 and if
         so will push the necessary state (SPSR, lr_usr) on the Hyp stack.
         - r0 contains a pointer to a HYP function
         - r1, r2, and r3 contain arguments to the above function.
         - The HYP function will be called with its arguments in r0, r1 and r2.
         On HYP function return, we return directly to SVC.
      
      A call to a function executing in Hyp mode is performed like the following:
      
              <svc code>
              ldr     r0, =BSYM(my_hyp_fn)
              ldr     r1, =my_param
              hvc #0  ; Call my_hyp_fn(my_param) from HYP mode
              <svc code>
      
      Otherwise, the world-switch is pretty straight-forward. All state that
      can be modified by the guest is first backed up on the Hyp stack and the
      VCPU values is loaded onto the hardware. State, which is not loaded, but
      theoretically modifiable by the guest is protected through the
      virtualiation features to generate a trap and cause software emulation.
      Upon guest returns, all state is restored from hardware onto the VCPU
      struct and the original state is restored from the Hyp-stack onto the
      hardware.
      
      SMP support using the VMPIDR calculated on the basis of the host MPIDR
      and overriding the low bits with KVM vcpu_id contributed by Marc Zyngier.
      
      Reuse of VMIDs has been implemented by Antonios Motakis and adapated from
      a separate patch into the appropriate patches introducing the
      functionality. Note that the VMIDs are stored per VM as required by the ARM
      architecture reference manual.
      
      To support VFP/NEON we trap those instructions using the HPCTR. When
      we trap, we switch the FPU.  After a guest exit, the VFP state is
      returned to the host.  When disabling access to floating point
      instructions, we also mask FPEXC_EN in order to avoid the guest
      receiving Undefined instruction exceptions before we have a chance to
      switch back the floating point state.  We are reusing vfp_hard_struct,
      so we depend on VFPv3 being enabled in the host kernel, if not, we still
      trap cp10 and cp11 in order to inject an undefined instruction exception
      whenever the guest tries to use VFP/NEON. VFP/NEON developed by
      Antionios Motakis and Rusty Russell.
      
      Aborts that are permission faults, and not stage-1 page table walk, do
      not report the faulting address in the HPFAR.  We have to resolve the
      IPA, and store it just like the HPFAR register on the VCPU struct. If
      the IPA cannot be resolved, it means another CPU is playing with the
      page tables, and we simply restart the guest.  This quirk was fixed by
      Marc Zyngier.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAntonios Motakis <a.motakis@virtualopensystems.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      f7ed45be
    • C
      KVM: ARM: Inject IRQs and FIQs from userspace · 86ce8535
      Christoffer Dall 提交于
      All interrupt injection is now based on the VM ioctl KVM_IRQ_LINE.  This
      works semantically well for the GIC as we in fact raise/lower a line on
      a machine component (the gic).  The IOCTL uses the follwing struct.
      
      struct kvm_irq_level {
      	union {
      		__u32 irq;     /* GSI */
      		__s32 status;  /* not used for KVM_IRQ_LEVEL */
      	};
      	__u32 level;           /* 0 or 1 */
      };
      
      ARM can signal an interrupt either at the CPU level, or at the in-kernel irqchip
      (GIC), and for in-kernel irqchip can tell the GIC to use PPIs designated for
      specific cpus.  The irq field is interpreted like this:
      
        bits:  | 31 ... 24 | 23  ... 16 | 15    ...    0 |
        field: | irq_type  | vcpu_index |   irq_number   |
      
      The irq_type field has the following values:
      - irq_type[0]: out-of-kernel GIC: irq_number 0 is IRQ, irq_number 1 is FIQ
      - irq_type[1]: in-kernel GIC: SPI, irq_number between 32 and 1019 (incl.)
                     (the vcpu_index field is ignored)
      - irq_type[2]: in-kernel GIC: PPI, irq_number between 16 and 31 (incl.)
      
      The irq_number thus corresponds to the irq ID in as in the GICv2 specs.
      
      This is documented in Documentation/kvm/api.txt.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      86ce8535
    • C
      KVM: ARM: Memory virtualization setup · d5d8184d
      Christoffer Dall 提交于
      This commit introduces the framework for guest memory management
      through the use of 2nd stage translation. Each VM has a pointer
      to a level-1 table (the pgd field in struct kvm_arch) which is
      used for the 2nd stage translations. Entries are added when handling
      guest faults (later patch) and the table itself can be allocated and
      freed through the following functions implemented in
      arch/arm/kvm/arm_mmu.c:
       - kvm_alloc_stage2_pgd(struct kvm *kvm);
       - kvm_free_stage2_pgd(struct kvm *kvm);
      
      Each entry in TLBs and caches are tagged with a VMID identifier in
      addition to ASIDs. The VMIDs are assigned consecutively to VMs in the
      order that VMs are executed, and caches and tlbs are invalidated when
      the VMID space has been used to allow for more than 255 simultaenously
      running guests.
      
      The 2nd stage pgd is allocated in kvm_arch_init_vm(). The table is
      freed in kvm_arch_destroy_vm(). Both functions are called from the main
      KVM code.
      
      We pre-allocate page table memory to be able to synchronize using a
      spinlock and be called under rcu_read_lock from the MMU notifiers.  We
      steal the mmu_memory_cache implementation from x86 and adapt for our
      specific usage.
      
      We support MMU notifiers (thanks to Marc Zyngier) through
      kvm_unmap_hva and kvm_set_spte_hva.
      
      Finally, define kvm_phys_addr_ioremap() to map a device at a guest IPA,
      which is used by VGIC support to map the virtual CPU interface registers
      to the guest. This support is added by Marc Zyngier.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <c.dall@virtualopensystems.com>
      d5d8184d