1. 23 Jul 2016, 1 commit
    • KVM: arm/arm64: Enable irqchip routing · 180ae7b1
      Committed by Eric Auger
      This patch adds compilation of and linking against the irqchip code.
      
      The main motivation behind using the irqchip code is to enable MSI
      routing. In the future, irqchip routing may also be useful
      when targeting multiple irqchips.
      
      The standard routing callbacks are now implemented in vgic-irqfd:
      - kvm_set_routing_entry
      - kvm_set_irq
      - kvm_set_msi
      
      They are only supported with the new_vgic code.
      
      Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
      KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.
      
      So from now on IRQCHIP routing is enabled, and a routing table entry
      must exist for irqfd injection to succeed for a given SPI. This patch
      builds a default flat irqchip routing table (gsi=irqchip.pin) covering
      all the VGIC SPI indexes. This routing table is overwritten by the
      first user-space call to the KVM_SET_GSI_ROUTING ioctl (a userspace
      sketch of that ioctl follows this entry).
      
      MSI routing setup is not yet allowed.
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
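      A minimal userspace sketch, under stated assumptions, of how a flat
      gsi=irqchip.pin routing table like the default one described above could
      be pushed with the KVM_SET_GSI_ROUTING ioctl. The vm_fd argument and the
      nr_spis count are illustrative placeholders, not values taken from the patch.

      #include <linux/kvm.h>
      #include <stdlib.h>
      #include <sys/ioctl.h>

      /* Sketch only: vm_fd is an already-created KVM VM file descriptor and
       * nr_spis is a hypothetical number of VGIC SPIs to cover. */
      static int set_flat_gsi_routing(int vm_fd, unsigned int nr_spis)
      {
              struct kvm_irq_routing *table;
              size_t sz = sizeof(*table) +
                          nr_spis * sizeof(struct kvm_irq_routing_entry);
              unsigned int i;
              int ret;

              table = calloc(1, sz);
              if (!table)
                      return -1;

              table->nr = nr_spis;
              for (i = 0; i < nr_spis; i++) {
                      table->entries[i].gsi = i;      /* flat: gsi == irqchip.pin */
                      table->entries[i].type = KVM_IRQ_ROUTING_IRQCHIP;
                      table->entries[i].u.irqchip.irqchip = 0;
                      table->entries[i].u.irqchip.pin = i;
              }

              ret = ioctl(vm_fd, KVM_SET_GSI_ROUTING, table);
              free(table);
              return ret;
      }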
  2. 19 Jul 2016, 2 commits
  3. 04 Jul 2016, 9 commits
  4. 01 Jul 2016, 1 commit
  5. 29 Jun 2016, 3 commits
  6. 27 Jun 2016, 1 commit
    • KVM: arm/arm64: Stop leaking vcpu pid references · 591d215a
      Committed by James Morse
      kvm provides kvm_vcpu_uninit(), which, amongst other things, releases the
      last reference to the struct pid of the task that was last running the vcpu.
      
      On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool,
      then killing it with SIGKILL results (after some considerable time) in:
      > cat /sys/kernel/debug/kmemleak
      > unreferenced object 0xffff80007d5ea080 (size 128):
      >  comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s)
      >  hex dump (first 32 bytes):
      >    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      >    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      >  backtrace:
      >    [<ffff8000001b30ec>] create_object+0xfc/0x278
      >    [<ffff80000071da34>] kmemleak_alloc+0x34/0x70
      >    [<ffff80000019fa2c>] kmem_cache_alloc+0x16c/0x1d8
      >    [<ffff8000000d0474>] alloc_pid+0x34/0x4d0
      >    [<ffff8000000b5674>] copy_process.isra.6+0x79c/0x1338
      >    [<ffff8000000b633c>] _do_fork+0x74/0x320
      >    [<ffff8000000b66b0>] SyS_clone+0x18/0x20
      >    [<ffff800000085cb0>] el0_svc_naked+0x24/0x28
      >    [<ffffffffffffffff>] 0xffffffffffffffff
      
      On x86, kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm();
      on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free()
      (a sketch of the resulting function follows this entry).
      Signed-off-by: James Morse <james.morse@arm.com>
      Fixes: 749cf76c ("KVM: ARM: Initial skeleton to compile KVM support")
      Cc: <stable@vger.kernel.org> # 3.10+
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
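      A minimal sketch of what the fixed kvm_arch_vcpu_free() looks like once the
      missing kvm_vcpu_uninit() call is added; the surrounding teardown calls are
      an approximation of the arm/arm64 free path of that era, not a verbatim copy
      of the patch.

      void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
      {
              kvm_mmu_free_memory_caches(vcpu);
              kvm_timer_vcpu_terminate(vcpu);
              kvm_vgic_vcpu_destroy(vcpu);
              kvm_vcpu_uninit(vcpu);  /* the missing call: drops the struct pid reference */
              kmem_cache_free(kvm_vcpu_cache, vcpu);
      }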
  7. 14 Jun 2016, 1 commit
  8. 20 May 2016, 6 commits
  9. 10 May 2016, 1 commit
    • kvm: arm64: Enable hardware updates of the Access Flag for Stage 2 page tables · 06485053
      Committed by Catalin Marinas
      The ARMv8.1 architecture extensions introduce support for hardware
      updates of the access and dirty information in page table entries. With
      VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the
      PTE_AF bit cleared in the stage 2 page table, instead of raising an
      Access Flag fault to EL2, the CPU sets the actual page table entry bit
      (10). To ensure that kernel modifications to the page table do not
      inadvertently revert a bit set by hardware updates, certain Stage 2
      software pte/pmd operations must be performed atomically.
      
      The main user of the AF bit is the kvm_age_hva() mechanism. The
      kvm_age_hva_handler() function performs a "test and clear young" action
      on the pte/pmd. This needs to be atomic with respect to automatic hardware
      updates of the AF bit (a sketch of such an atomic update follows this entry).
      Since the AF bit is in the same position for both
      Stage 1 and Stage 2, the patch reuses the existing
      ptep_test_and_clear_young() functionality if
      __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the
      existing pte_young/pte_mkold mechanism is preserved.
      
      The kvm_set_s2pte_readonly() function (and its pmd equivalent) has to
      perform atomic modifications in order to avoid racing with hardware updates
      of the AF bit. The arm64 implementation has been rewritten using
      exclusives.
      
      Currently, kvm_set_s2pte_writable() (and pmd equivalent) take a pointer
      argument and modify the pte/pmd in place. However, these functions are
      only used on local variables rather than actual page table entries, so
      it makes more sense to follow the pte_mkwrite() approach for stage 1
      attributes. The change to kvm_s2pte_mkwrite() makes it clear that these
      functions do not modify the actual page table entries.
      
      The (pte|pmd)_mkyoung() uses on Stage 2 entries (setting the AF bit
      explicitly) do not need to be modified since hardware updates of the
      dirty status are not supported by KVM, so there is no possibility of
      losing such information.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
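      A generic sketch of the kind of atomic "test and clear young" the message
      describes, one that cannot lose an AF bit set concurrently by hardware. It
      uses a portable compare-and-swap loop instead of the arm64 exclusives of the
      actual patch, and the s2_pte_t type is a placeholder.

      #include <stdbool.h>
      #include <stdint.h>

      #define PTE_AF  (1UL << 10)     /* Access Flag: bit 10 at both Stage 1 and Stage 2 */

      typedef uint64_t s2_pte_t;      /* placeholder for a stage-2 page table entry */

      /* Atomically test and clear the AF bit. The CAS loop retries if the entry
       * changed under us (e.g. hardware just set the AF bit), so a concurrent
       * hardware update is never silently reverted. */
      static bool s2_pte_test_and_clear_young(s2_pte_t *ptep)
      {
              uint64_t old, new;

              old = __atomic_load_n(ptep, __ATOMIC_RELAXED);
              do {
                      new = old & ~PTE_AF;
              } while (!__atomic_compare_exchange_n(ptep, &old, new, false,
                                                    __ATOMIC_RELAXED, __ATOMIC_RELAXED));

              return old & PTE_AF;    /* was the page young? */
      }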
  10. 06 May 2016, 1 commit
    • mm: thp: kvm: fix memory corruption in KVM with THP enabled · 127393fb
      Committed by Andrea Arcangeli
      After the THP refcounting change, obtaining a compound page from
      get_user_pages() no longer allows us to assume the entire compound page
      is immediately mappable from a secondary MMU.
      
      A secondary MMU doesn't want to call get_user_pages() more than once for
      each compound page just to find out whether it can map the whole compound
      page.  So a secondary MMU needs to know, from a single get_user_pages()
      invocation, when it can immediately map the entire compound page, to avoid
      a flood of unnecessary secondary MMU faults and spurious
      atomic_inc()/atomic_dec() calls (pages don't have to be pinned by MMU
      notifier users).
      
      Ideally, instead of the page->_mapcount < 1 check, get_user_pages()
      should return the granularity of the "page" mapping in the "mm" passed
      to get_user_pages().  However, it is a non-trivial change to pass the
      "pmd" status belonging to the "mm" walked by get_user_pages() up the
      stack (up to the caller of get_user_pages()).  So the fix just checks
      whether there is not a single pte mapping on the page returned by
      get_user_pages(), in which case the caller can assume that the whole
      compound page is mapped in the current "mm" (by a pmd_trans_huge() pmd);
      see the sketch after this entry.  In such a case the entire compound page
      is safe to map into the secondary MMU without additional
      get_user_pages() calls on the surrounding tail/head pages.  In addition
      to being faster, not having to run other get_user_pages() calls also
      reduces the memory footprint of the secondary MMU fault in case the pmd
      split happened as a result of memory pressure.
      
      Without this fix, after a MADV_DONTNEED (as invoked by QEMU during
      postcopy live migration or ballooning) or after generic swapping (with a
      failure in split_huge_page() that would only result in pmd splitting and
      not a physical page split), KVM would map the whole compound page into
      the shadow pagetables, even though regular faults or userfaults (like
      UFFDIO_COPY) may map regular pages into the primary MMU as a result of the
      pte faults, leading to guest mode and userland mode going out of
      sync and not working on the same memory at all times.
      
      Any other secondary MMU notifier user (KVM is just one of the many
      MMU notifier users) will need the same information if it doesn't want to
      run a flood of get_user_pages_fast() calls and it can support multiple
      granularities in its secondary MMU mappings, so I think it is justified
      to expose this not just to KVM.
      
      The other option would be to move transparent_hugepage_adjust() to
      mm/huge_memory.c, but that function currently has all kinds of KVM data
      structures in it, so it's definitely not cut-and-paste work, and I
      couldn't produce a cleaner fix than this one for 4.6.
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: "Li, Liang Z" <liang.z.li@intel.com>
      Cc: Amit Shah <amit.shah@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
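      A rough sketch of the check the message describes: a secondary MMU only maps
      a compound page returned by get_user_pages() at huge granularity if the
      compound page is still mapped as a whole (by a pmd_trans_huge() mapping) and
      not by individual ptes. The predicate name below is hypothetical; the real
      helper introduced by the patch is not reproduced here.

      #include <linux/mm.h>

      /* Sketch only: decide whether a secondary MMU (e.g. KVM's stage-2 tables)
       * may map the whole compound page backing 'page'.
       * compound_page_is_pmd_mapped() is a hypothetical stand-in for "the compound
       * page is still mapped by a single huge pmd in the current mm". */
      static bool secondary_mmu_can_map_huge(struct page *page)
      {
              if (!PageCompound(page))
                      return false;   /* small page: map at base-page granularity */

              return compound_page_is_pmd_mapped(compound_head(page));
      }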
  11. 29 Apr 2016, 1 commit
  12. 28 Apr 2016, 1 commit
    • arm64: kvm: allows kvm cpu hotplug · 67f69197
      Committed by AKASHI Takahiro
      The current kvm implementation on arm64 does cpu-specific initialization
      at system boot, and has no way to gracefully shut down a core as far as
      kvm is concerned. This prevents kexec from rebooting the system at EL2.
      
      This patch adds a cpu tear-down function and also moves the existing
      cpu-init code into a separate function, kvm_arch_hardware_disable() and
      kvm_arch_hardware_enable() respectively (a sketch of this hook pair
      follows this entry). We don't need the arm64-specific cpu hotplug hook
      any more.
      
      Since this patch modifies common code between arm and arm64, one stub
      definition, __cpu_reset_hyp_mode(), is added on the arm side to avoid
      compilation errors.
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      [Rebase, added separate VHE init/exit path, changed resets use of
       kvm_call_hyp() to the __version, en/disabled hardware in init_subsystems(),
       added icache maintenance to __kvm_hyp_reset() and removed lr restore, removed
       guest-enter after teardown handling]
      Signed-off-by: James Morse <james.morse@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
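      A minimal sketch of the hook pair described above: KVM's generic hardware
      enable/disable callbacks bring a core into and out of Hyp/EL2, so a core can
      be torn down cleanly before kexec. The per-cpu flag and the two helper names
      are placeholders for the per-cpu init and tear-down paths, not verbatim from
      the patch.

      static DEFINE_PER_CPU(int, hyp_enabled);

      int kvm_arch_hardware_enable(void)
      {
              if (!__this_cpu_read(hyp_enabled)) {
                      cpu_init_hyp_mode();            /* placeholder: enter EL2, install vectors/stack */
                      __this_cpu_write(hyp_enabled, 1);
              }
              return 0;
      }

      void kvm_arch_hardware_disable(void)
      {
              if (__this_cpu_read(hyp_enabled)) {
                      cpu_reset_hyp_mode();           /* placeholder: tear the core down out of EL2 */
                      __this_cpu_write(hyp_enabled, 0);
              }
      }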
  13. 21 Apr 2016, 8 commits
  14. 06 Apr 2016, 1 commit
  15. 31 Mar 2016, 1 commit
  16. 21 Mar 2016, 1 commit
  17. 01 Mar 2016, 1 commit
    • KVM: arm/arm64: timer: Add active state caching · 9b4a3004
      Committed by Marc Zyngier
      Programming the active state in the (re)distributor can be an
      expensive operation, so it makes sense to try to reduce
      the number of accesses as much as possible. So far, we
      program the active state on each VM entry, but there is some
      opportunity to do less.
      
      An obvious solution is to cache the active state in memory
      and only program it in the HW when conditions change. But
      because the HW can also change things under our feet (the active
      state can transition from 1 to 0 when the guest does an EOI),
      some precautions have to be taken, which amount to only caching
      an "inactive" state and always programming it otherwise (see the
      sketch after this entry).
      
      With this in place, we observe a reduction of around 700 cycles
      on a 2GHz GICv2 platform for a NULL hypercall.
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
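      A minimal sketch of the caching policy described above, with hypothetical
      names: only an "inactive" value is ever cached, because the HW may clear the
      active state behind our back, so a cached "active" could go stale.

      #include <stdbool.h>

      /* Hypothetical stand-in for the expensive (re)distributor access. */
      static void write_dist_active_state(bool active);

      static bool cached_inactive;    /* only the "inactive" state is ever cached */

      static void set_irq_active_state(bool active)
      {
              if (!active && cached_inactive)
                      return;         /* already known inactive: skip the HW access */

              write_dist_active_state(active);

              /* The HW can clear the active state on its own (guest EOI), so never
               * trust a cached "active"; cache only the inactive case. */
              cached_inactive = !active;
      }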