1. 08 February 2017 — 5 commits
    • Merge tag 'kvm_mips_4.11_1' of... · d9c0e59f
      Paolo Bonzini committed
      Merge tag 'kvm_mips_4.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/kvm-mips into HEAD
      
      KVM: MIPS: GVA/GPA page tables, dirty logging, SYNC_MMU etc
      
       Numerous MIPS KVM fixes, improvements, and features for 4.11, many of
       which continue to pave the way for VZ support. The most interesting
       are:
      
       - Add GVA->HPA page tables for T&E, to cache GVA mappings.
       - Generate fast-path TLB refill exception handler which loads host TLB
         entries from GVA page table, avoiding repeated guest memory
         translation and guest TLB lookups.
       - Use uaccess macros when T&E needs to access guest memory, which with
         GVA page tables and the Linux TLB refill handler improves robustness
         against TLB faults and fixes EVA hosts.
       - Use BadInstr/BadInstrP registers when available to obtain instruction
         encodings after a synchronous trap.
       - Add GPA->HPA page tables to replace the inflexible linear array,
         allowing for multiple sparsely arranged memory regions.
       - Properly implement dirty page logging.
       - Add KVM_CAP_SYNC_MMU support so that changes in GPA mappings become
         effective in guests even if they are already running, allowing for
         copy-on-write, KSM, idle page tracking, swapping, and guest memory
         ballooning.
       - Add KVM_CAP_READONLY_MEM support, so writes to specified memory
         regions are treated as MMIO.
       - Implement proper CP0_EBase support in T&E.
       - Expose a few more missing CP0 registers to userland.
       - Add KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS support, and allow up to 8
         VCPUs to be created in a VM.
       - Various cleanups and dropping of dead and duplicated code.
    • Merge branch 'kvm-ppc-next' of... · d5b798c1
      Paolo Bonzini committed
      Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
      
      The big feature this time is support for POWER9 using the radix-tree
      MMU for host and guest.  This required some changes to arch/powerpc
      code, so I talked with Michael Ellerman and he created a topic branch
      with this patchset, which I merged into kvm-ppc-next and which Michael
      will pull into his tree.  Michael also put in some patches from Nick
      Piggin which fix bugs in the interrupt vector code in relocatable
      kernels when coming from a KVM guest.
      
      Other notable changes include:
      
      * Add the ability to change the size of the hashed page table,
        from David Gibson.
      
      * XICS (interrupt controller) emulation fixes and improvements,
        from Li Zhong.
      
      * Bug fixes from myself and Thomas Huth.
      
      These patches define some new KVM capabilities and ioctls, but there
      should be no conflicts with anything else currently upstream, as far
      as I am aware.
    • KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall · 55dd00a7
      Marcelo Tosatti committed
      Add a hypercall to retrieve the host realtime clock and the TSC value
      used to calculate that clock read.
      
      Used to implement clock synchronization between host and guest.
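       
       As a rough sketch of the guest side (struct layout and constants as
       described in the KVM hypercall documentation; treat the details here
       as illustrative rather than authoritative):
       
         struct kvm_clock_pairing {
                 __s64 sec;      /* host realtime seconds */
                 __s64 nsec;     /* host realtime nanoseconds */
                 __u64 tsc;      /* guest TSC value matching (sec, nsec) */
                 __u32 flags;
                 __u32 pad[9];
         };
         
         /* gpa must point at guest memory large enough for the struct;
          * returns 0 on success, a negative KVM error on older hosts. */
         ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING, gpa,
                              KVM_CLOCK_PAIRING_WALLCLOCK);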
       Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: vmx_complete_nested_posted_interrupt() can't fail · 6342c50a
      David Hildenbrand committed
      vmx_complete_nested_posted_interrupt() can't fail, let's turn it into
      a void function.
       Signed-off-by: David Hildenbrand <david@redhat.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: kmap() can't fail · 42cf014d
      David Hildenbrand committed
      kmap() can't fail, therefore it will always return a valid pointer. Let's
      just get rid of the unnecessary checks.
       Signed-off-by: David Hildenbrand <david@redhat.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. 03 February 2017 — 35 commits
    • KVM: MIPS: Allow multiple VCPUs to be created · 12ed1fae
      James Hogan committed
      Increase the maximum number of MIPS KVM VCPUs to 8, and implement the
       KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS capabilities which expose the
      recommended and maximum number of VCPUs to userland. The previous
      maximum of 1 didn't allow for any form of SMP guests.
      
      We calculate the values similarly to ARM, recommending as many VCPUs as
      there are CPUs online in the system. This will allow userland to know
      how many VCPUs it is possible to create.
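       
       A minimal sketch of the corresponding check-extension hunk (the exact
       file and surrounding code are assumptions):
       
         case KVM_CAP_NR_VCPUS:
                 /* recommended count: one VCPU per online host CPU */
                 r = num_online_cpus();
                 break;
         case KVM_CAP_MAX_VCPUS:
                 /* hard limit, raised to 8 by this patch */
                 r = KVM_MAX_VCPUS;
                 break;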
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Expose read-only CP0_IntCtl register · ad58d4d4
      James Hogan committed
      Expose the CP0_IntCtl register through the KVM register access API,
      which is a required register since MIPS32r2. It is currently read-only
      since the VS field isn't implemented due to lack of Config3.VInt or
      Config3.VEIC.
      
      It is implemented in trap_emul.c so that a VZ implementation can allow
      writes.
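       
       From userland the register is reached through the usual one_reg API;
       a hedged sketch (the register id macro for CP0_IntCtl is an assumed
       name):
       
         uint64_t val;
         struct kvm_one_reg reg = {
                 .id   = KVM_REG_MIPS_CP0_INTCTL,  /* assumed id name */
                 .addr = (uint64_t)(uintptr_t)&val,
         };
         
         if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg) < 0)
                 perror("KVM_GET_ONE_REG");
         /* A KVM_SET_ONE_REG that changes the VS field is refused while
          * the register is read-only. */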
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Expose CP0_EntryLo0/1 registers · 013044cc
      James Hogan committed
      Expose the CP0_EntryLo0 and CP0_EntryLo1 registers through the KVM
      register access API. This is fairly straightforward for trap & emulate
      since we don't support the RI and XI bits. For the sake of future
      proofing (particularly for VZ) it is explicitly specified that the API
      always exposes the 64-bit version of these registers (i.e. with the RI
      and XI bits in bit positions 63 and 62 respectively), and they are
      implemented in trap_emul.c rather than mips.c to allow them to be
      implemented differently for VZ.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Default to reset vector · be67a0be
      James Hogan committed
      Set the default VCPU state closer to the architectural reset state, with
      PC pointing at the reset vector (uncached PA 0x1fc00000, which for KVM
       T&E is VA 0x5fc00000), and with CP0_Status.BEV and CP0_Status.ERL set to 1.
      
      Although QEMU at least will overwrite this state, it makes sense to do
      this now that CP0_EBase is properly implemented to check BEV, and now
      that we support a sparse GPA layout potentially with a boot ROM at GPA
      0x1fc00000.
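       
       In outline, the reset path might look like this (a sketch only: the
       status helper follows the existing kvm_write_c0_guest_*() naming, and
       the guest-address helper is an assumed name):
       
         /* architectural reset state: BEV and ERL set */
         kvm_write_c0_guest_status(vcpu->arch.cop0, ST0_BEV | ST0_ERL);
         /* reset vector: uncached PA 0x1fc00000, which the KVM T&E guest
          * address map places at VA 0x5fc00000 */
         vcpu->arch.pc = KVM_GUEST_CKSEG1ADDR(0x1fc00000);  /* assumed */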
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Implement CP0_EBase register · 7801bbe1
      James Hogan committed
      The CP0_EBase register is a standard feature of MIPS32r2, so we should
      always have been implementing it properly. However the register value
      was ignored and wasn't exposed to userland.
      
      Fix the emulation of exceptions and interrupts to use the value stored
      in guest CP0_EBase, and fix the masks so that the top 3 bits (rather
      than the standard 2) are fixed, so that it is always in the guest KSeg0
      segment.
      
       Also add CP0_EBASE to the KVM one_reg interface so it can be accessed
       by userland, and allow the CPU number field to be written (which isn't
       permitted by the guest).
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Move CP0 register access into T&E · 654229a0
      James Hogan committed
      Access to various CP0 registers via the KVM register access API needs to
      be implementation specific to allow restrictions to be made on changes,
      for example when VZ guest registers aren't present, so move them all
      into trap_emul.c in preparation for VZ.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS: Claim KVM_CAP_READONLY_MEM support · 230c5724
      James Hogan committed
      Now that load/store faults due to read only memory regions are treated
      as MMIO accesses it is safe to claim support for read only memory
      regions (KVM_CAP_READONLY_MEM).
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Implement KVM_CAP_SYNC_MMU · 411740f5
      James Hogan committed
      Implement the SYNC_MMU capability for KVM MIPS, allowing changes in the
      underlying user host virtual address (HVA) mappings to be promptly
      reflected in the corresponding guest physical address (GPA) mappings.
      
      This allows for several features to work with guest RAM which require
      mappings to be altered or protected, such as copy-on-write, KSM (Kernel
      Samepage Merging), idle page tracking, memory swapping, and guest memory
      ballooning.
      
      There are two main aspects of this change, described below.
      
      The KVM MMU notifier architecture callbacks are implemented so we can be
      notified of changes in the HVA mappings. These arrange for the guest
      physical address (GPA) page tables to be modified and possibly for
      derived mappings (GVA page tables and TLBs) to be flushed.
      
       - kvm_unmap_hva[_range]() - These deal with HVA mappings being removed,
         for example before a copy-on-write takes place, which requires the
         corresponding GPA page table mappings to be removed too.
      
       - kvm_set_spte_hva() - These update a GPA page table entry to match the
         new HVA entry, but must be careful to respect KVM specific
         configuration such as not dirtying a clean guest page which is dirty
         to the host, and write protecting writable pages in read only
         memslots (which will soon be supported).
      
       - kvm[_test]_age_hva() - These update GPA page table entries to be old
         (invalid) so that access can be tracked, making them young again.
      
      The GPA page fault handling (kvm_mips_map_page) is updated to use
      gfn_to_pfn_prot() (which may provide read-only pages), to handle
      asynchronous page table invalidation from MMU notifier callbacks, and to
      handle more cases in the fast path.
      
        - mmu_notifier_seq is used to detect asynchronous page table
          invalidations while we're holding a pfn from gfn_to_pfn_prot()
          outside of kvm->mmu_lock, retrying if invalidations have taken place,
          e.g. a COW or a KSM page merge (see the sketch after this list).
      
       - The fast path (_kvm_mips_map_page_fast) now handles marking old pages
         as young / accessed, and disallowing dirtying of clean pages that
         aren't actually writable (e.g. shared pages that should COW, and
         read-only memory regions when they are enabled in a future patch).
      
       - Due to the use of MMU notifications we no longer need to keep the
         page references after we've updated the GPA page tables.
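       
       The race check in the fault handler follows the usual KVM
       mmu_notifier pattern, roughly (a sketch; MIPS-specific details
       omitted):
       
         retry:
                 mmu_seq = kvm->mmu_notifier_seq;
                 smp_rmb();      /* read the seq before the pfn lookup */
                 
                 pfn = gfn_to_pfn_prot(kvm, gfn, write_fault, &writeable);
                 
                 spin_lock(&kvm->mmu_lock);
                 if (mmu_notifier_retry(kvm, mmu_seq)) {
                         /* an invalidation (COW, KSM merge, ...) raced
                          * with us; drop the pfn and try again */
                         spin_unlock(&kvm->mmu_lock);
                         kvm_release_pfn_clean(pfn);
                         goto retry;
                 }
                 /* ... install/update the GPA page table entry ... */
                 spin_unlock(&kvm->mmu_lock);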
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Pass GPA PTE bits to mapped GVA PTEs · f9b11e51
      James Hogan committed
      Propagate the GPA PTE protection bits on to the GVA PTEs on a mapped
      fault (except _PAGE_WRITE, and filtered by the guest TLB entry), rather
      than always overriding the protection. This allows dirty page tracking
      to work in mapped guest segments as a clear dirty bit in the GPA PTE
      will propagate to the GVA PTEs even when the guest TLB has the dirty bit
      set.
      
      Since the filtering of protection bits is now abstracted, if the buddy
      GVA PTE is also valid, we obtain the corresponding GPA PTE using a
      simple non-allocating walk and load that into the GVA PTE similarly
      (which may itself be invalid).
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Pass GPA PTE bits to KSeg0 GVA PTEs · b584f460
      James Hogan committed
      Propagate the GPA PTE protection bits on to the GVA PTEs on a KSeg0
      fault (except _PAGE_WRITE), rather than always overriding the
      protection. This allows dirty page tracking to work in KSeg0 as a clear
      dirty bit in the GPA PTE will propagate to the GVA PTEs.
      
      This makes it simpler to use a single kvm_mips_map_page() to obtain both
      the main GPA PTE and its buddy (which may be invalid), which also allows
      memory regions to be fully accessible when they don't start and end on a
      2*PAGE_SIZE boundary.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Handle dirty logging on GPA faults · b5f1dd1b
      James Hogan committed
      Update kvm_mips_map_page() to handle logging of dirty guest physical
      pages. Upcoming patches will propagate the dirty bit to the GVA page
      tables.
      
      A fast path is added for handling protection bits that can be resolved
      without calling into KVM, currently just dirtying of clean pages being
      written to.
      
      The slow path marks the GPA page table entry writable only on writes,
      and at the same time marks the page dirty in the dirty page logging
      bitmask.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS: Clean & flush on dirty page logging enable · a1ac9e17
      James Hogan committed
      When an existing memory region has dirty page logging enabled, make the
      entire slot clean (read only) so that writes will immediately start
      logging dirty pages (once the dirty bit is transferred from GPA to GVA
      page tables in an upcoming patch).
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Use generic dirty log & protect helper · e88643ba
      James Hogan committed
      MIPS hasn't up to this point properly supported dirty page logging, as
      pages in slots with dirty logging enabled aren't made clean, and tlbmod
      exceptions from writes to clean pages have been assumed to be due to
      guest TLB protection and unconditionally passed to the guest.
      
      Use the generic dirty logging helper kvm_get_dirty_log_protect() to
      properly implement kvm_vm_ioctl_get_dirty_log(), similar to how ARM
      does. This uses xchg to clear the dirty bits when reading them, rather
      than wiping them out afterwards with a memset, which would potentially
      wipe recently set bits that weren't caught by kvm_get_dirty_log(). It
      also makes the pages clean again using the
      kvm_arch_mmu_enable_log_dirty_pt_masked() architecture callback so that
      further writes after the shadow memslot is flushed will trigger tlbmod
      exceptions and dirty handling.
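       
       Putting it together, the ioctl ends up shaped much like the ARM one.
       A sketch (whether the final flush goes through
       kvm_arch_flush_shadow_memslot() or a MIPS-internal callback is a
       detail this sketch glosses over):
       
         int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
                                        struct kvm_dirty_log *log)
         {
                 bool is_dirty = false;
                 int r;
                 
                 mutex_lock(&kvm->slots_lock);
                 
                 /* xchg()s the dirty bitmap out and write-protects the
                  * pages recorded in it via the arch callback */
                 r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
                 
                 if (is_dirty) {
                         struct kvm_memslots *slots = kvm_memslots(kvm);
                         struct kvm_memory_slot *memslot =
                                 id_to_memslot(slots, log->slot);
                         
                         /* flush derived GVA mappings so the next writes
                          * fault and get logged again */
                         kvm_arch_flush_shadow_memslot(kvm, memslot);
                 }
                 
                 mutex_unlock(&kvm->slots_lock);
                 return r;
         }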
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Add GPA PT mkclean helper · f0c0c330
      James Hogan committed
      Add a helper function to make a range of guest physical address (GPA)
      mappings in the GPA page table clean so that writes can be caught. This
      will be used in a few places to manage dirty page logging.
      
      Note that until the dirty bit is transferred from GPA page table entries
      to GVA page table entries in an upcoming patch this won't trigger a TLB
      modified exception on write.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Handle read only GPA in TLB mod · 64ebc9e2
      James Hogan committed
      Rewrite TLB modified exception handling to handle read only GPA memory
      regions, instead of unconditionally passing the exception to the guest.
      
      If the guest TLB is not the cause of the exception we call into the
      normal TLB fault handling depending on the memory segment, which will
      soon attempt to remap the physical page to be writable (handling dirty
      page tracking or copy on write in the process).
      
      Failing that we fall back to treating it as MMIO, due to a read only
      memory region. Once the capability is enabled, this will allow read only
      memory regions (such as the Malta boot flash as emulated by QEMU) to
      have writes treated as MMIO, while still allowing reads to run
      untrapped.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Treat unhandled guest KSeg0 as MMIO · b8f79ddb
      James Hogan committed
      Treat unhandled accesses to guest KSeg0 as MMIO, rather than only host
      KSeg0 addresses. This will allow read only memory regions (such as the
      Malta boot flash as emulated by QEMU) to have writes (before reads)
      treated as MMIO, and unallocated physical addresses to have all accesses
      treated as MMIO.
      
      The MMIO emulation uses the gva_to_gpa callback, so this is also updated
      for trap & emulate to handle guest KSeg0 addresses.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Abstract bad access handling · 420ea09b
      James Hogan committed
      Abstract the handling of bad guest loads and stores which may need to
      trigger an MMIO, so that the same code can be used in a later patch for
      guest KSeg0 addresses (TLB exception handling) as well as for host KSeg1
      addresses (existing address error exception and TLB exception handling).
      
      We now use kvm_mips_emulate_store() and kvm_mips_emulate_load() directly
      rather than the more generic kvm_mips_emulate_inst(), as there is no
      need to expose emulation of any other instructions.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS: Pass type of fault down to kvm_mips_map_page() · 577ed7f7
      James Hogan committed
      kvm_mips_map_page() will need to know whether the fault was due to a
      read or a write in order to support dirty page tracking,
      KVM_CAP_SYNC_MMU, and read only memory regions, so get that information
      passed down to it via new bool write_fault arguments to various
      functions.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Ignore user writes to CP0_Config7 · 89d6ad8a
      James Hogan committed
      Ignore userland writes to CP0_Config7 rather than reporting an error,
      since we do allow reads of this register and it is claimed to exist in
      the ioctl API.
      
      This allows userland to blindly save and restore KVM registers without
      having to special case certain registers as not being writable, for
      example during live migration once dirty page logging is fixed.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS: Implement kvm_arch_flush_shadow_all/memslot · b6209110
      James Hogan committed
      Implement the kvm_arch_flush_shadow_all() and
      kvm_arch_flush_shadow_memslot() KVM functions for MIPS to allow guest
      physical mappings to be safely changed.
      
      The general MIPS KVM code takes care of flushing of GPA page table
      entries. kvm_arch_flush_shadow_all() flushes the whole GPA page table,
      and is always called on the cleanup path so there is no need to acquire
      the kvm->mmu_lock. kvm_arch_flush_shadow_memslot() flushes only the
      range of mappings in the GPA page table corresponding to the slot being
      flushed, and happens when memory regions are moved or deleted.
      
      MIPS KVM implementation callbacks are added for handling the
      implementation specific flushing of mappings derived from the GPA page
      tables. These are implemented for trap_emul.c using
      kvm_flush_remote_tlbs() which should now be functional, and will flush
       the per-VCPU GVA page tables and ASIDs synchronously (before next
      entering guest mode or directly accessing GVA space).
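       
       A hedged outline of the memslot variant (the GPA flush helper name
       and its range convention are assumptions; kvm_mips_callbacks is the
       existing implementation-callback structure):
       
         void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
                                            struct kvm_memory_slot *slot)
         {
                 /* drop GPA mappings for just this slot's gfn range */
                 spin_lock(&kvm->mmu_lock);
                 kvm_mips_flush_gpa_pt(kvm, slot->base_gfn,
                                       slot->base_gfn + slot->npages - 1);
                 spin_unlock(&kvm->mmu_lock);
                 
                 /* implementation callback: flush derived GVA mappings,
                  * e.g. via kvm_flush_remote_tlbs() for trap & emulate */
                 kvm_mips_callbacks->flush_shadow_memslot(kvm, slot);
         }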
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/Emulate: Use lockless GVA helpers for cache emulation · 4cf74c9c
      James Hogan committed
       Use the lockless GVA helpers to implement the emulation of guest CACHE
       instructions, which operate directly on GVA space. This will allow it
       to handle asynchronous TLB flushes when they are implemented.
      
      This is a little more complicated than the other two cases (get_inst()
      and dynamic translation) due to the need to emulate the appropriate
      guest TLB exception when the address isn't present or isn't valid in the
      guest TLB.
      
      Since there are several protected cache ops that may need to be
      performed safely, this is abstracted by kvm_mips_guest_cache_op() which
      is passed a protected cache op function pointer and takes care of the
      lockless operation and fault handling / retry if the op should fail,
      taking advantage of the new errors which the protected cache ops can now
      return. This allows the existing advance fault handling which relied on
      host TLB lookups to be removed, along with the now unused
       kvm_mips_host_tlb_lookup().
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Use lockless GVA helpers for get_inst() · 5207ce14
      James Hogan committed
      Use the lockless GVA helpers to implement the reading of guest
      instructions for emulation. This will allow it to handle asynchronous
      TLB flushes when they are implemented.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Use lockless GVA helpers for dyntrans · 4b21e8ab
      James Hogan committed
      Use the lockless GVA helpers to implement the dynamic translation of
      guest instructions. This will allow it to handle asynchronous TLB
      flushes when they are implemented.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Add lockless GVA access helpers · 1880afd6
      James Hogan committed
      Add helpers to allow for lockless direct access to the GVA space, by
      changing the VCPU mode to READING_SHADOW_PAGE_TABLES for the duration of
      the access. This allows asynchronous TLB flush requests in future
      patches to safely trigger either a TLB flush before the direct GVA space
      access, or a delay until the in-progress lockless direct access is
      complete.
      
      The kvm_trap_emul_gva_lockless_begin() and
      kvm_trap_emul_gva_lockless_end() helpers take care of guarding the
      direct GVA accesses, and kvm_trap_emul_gva_fault() tries to handle a
      uaccess fault resulting from a flush having taken place.
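       
       Conceptually the guard pair looks like this (a sketch of the idea
       rather than the exact patch):
       
         void kvm_trap_emul_gva_lockless_begin(struct kvm_vcpu *vcpu)
         {
                 /* stay on this CPU so a flush IPI can interrupt us */
                 preempt_disable();
                 
                 /* publish the mode before touching GVA space: a remote
                  * flush request either sees it (and IPIs us), or we see
                  * the request on the next guest entry */
                 smp_store_mb(vcpu->mode, READING_SHADOW_PAGE_TABLES);
         }
         
         void kvm_trap_emul_gva_lockless_end(struct kvm_vcpu *vcpu)
         {
                 smp_store_mb(vcpu->mode, OUTSIDE_GUEST_MODE);
                 preempt_enable();
         }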
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Reduce stale ASID checks · 91737ea2
      James Hogan committed
      The stale ASID checks taking place on VCPU load can be reduced:
      
      - Now that we check for a stale ASID on guest re-entry, there is no need
        to do so when loading the VCPU outside of guest context, since it will
        happen before entering the guest. Note that a lot of KVM VCPU ioctls
        will cause the VCPU to be loaded but guest context won't be entered.
      
      - There is no need to check for a stale kernel_mm ASID when the guest is
        in user mode and vice versa. In fact doing so can potentially be
        problematic since the user_mm ASID regeneration may trigger a new ASID
        cycle, which would cause the kern_mm ASID to become stale after it has
        been checked for staleness.
      
      Therefore only check the ASID for the mm corresponding to the current
      guest mode, and only if we're already in guest context. We drop some of
      the related kvm_debug() calls here too.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Handle TLB invalidation requests · b29e115a
      James Hogan committed
      Add handling of TLB invalidation requests before entering guest mode.
       This will allow asynchronous invalidation of the VCPU mappings when
      physical memory regions are altered. Should the CPU running the VCPU
      already be in guest mode an IPI will be sent to trigger a guest exit.
      
      The reload_asid path will be used in a future patch for when GVA is
      about to be directly accessed by KVM.
      
      In the process, the stale user ASID check in the re-entry path (for lazy
      user GVA flushing) is generalised to check the ASID for the current
      guest mode, in case a TLB invalidation request was handled. This has the
      side effect of making the ASID checks on vcpu_load too conservative,
      which will be addressed in a later patch.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS: Update vcpu->mode and vcpu->cpu · 4841e0dd
      James Hogan committed
      Keep the vcpu->mode and vcpu->cpu variables up to date so that
      kvm_make_all_cpus_request() has a chance of functioning correctly. This
      will soon need to be used for kvm_flush_remote_tlbs().
      
      We can easily update vcpu->cpu when the VCPU context is loaded or saved,
      which will happen when accessing guest context and when the guest is
      scheduled in and out.
      
      We need to be a little careful with vcpu->mode though, as we will in
      future be checking for outstanding VCPU requests, and this must be done
      after the value of IN_GUEST_MODE in vcpu->mode is visible to other CPUs.
      Otherwise the other CPU could fail to trigger an IPI to wait for
       completion despite the VCPU request not being seen.
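       
       The required ordering is the usual store-then-check pattern (sketched;
       in this era the pending requests are checked via vcpu->requests
       directly):
       
         /* guest entry: publish the mode before the final request check,
          * so a concurrent kvm_make_all_cpus_request() either sees
          * IN_GUEST_MODE and sends an IPI, or we see its request here */
         smp_store_mb(vcpu->mode, IN_GUEST_MODE);
         if (vcpu->requests) {
                 vcpu->mode = OUTSIDE_GUEST_MODE;
                 /* bail out of entry and service the requests */
         }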
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Convert guest physical map to page table · 06c158c9
      James Hogan committed
      Current guest physical memory is mapped to host physical addresses using
      a single linear array (guest_pmap of length guest_pmap_npages). This was
       only really meant to be temporary, and isn't sparse, so it's wasteful of
      memory. A small amount of RAM at GPA 0 and a small boot exception vector
      at GPA 0x1fc00000 cannot be represented without a full 128KiB guest_pmap
      allocation (MIPS32 with 16KiB pages), which is one reason why QEMU
      currently runs its boot code at the top of RAM instead of the usual boot
      exception vector address.
      
      Instead use the existing infrastructure for host virtual page table
      management to allocate a page table for guest physical memory too. This
      should be sufficient for now, assuming the size of physical memory
      doesn't exceed the size of virtual memory. It may need extending in
      future to handle XPA (eXtended Physical Addressing) in 32-bit guests, as
      supported by VZ guests on P5600.
      
      Some of this code is based loosely on Cavium's VZ KVM implementation.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS: Use CP0_BadInstr[P] for emulation · 6a97c775
      James Hogan committed
      When exiting from the guest, store the values of the CP0_BadInstr and
      CP0_BadInstrP registers if they exist, which contain the encodings of
      the instructions which caused the last synchronous exception.
      
      When the instruction is needed for emulation, kvm_get_badinstr() and
      kvm_get_badinstrp() are used instead of calling kvm_get_inst() directly,
      to decide whether to read the saved CP0_BadInstr/CP0_BadInstrP registers
      (if they exist), or read the instruction from memory (if not).
      
      The use of these registers should be more robust than using
      kvm_get_inst(), as it actually gives the instruction encoding seen by
      the hardware rather than relying on user accessors after the fact, which
      can be fooled by incoherent icache or a racing code modification. It
      will also work with VZ, where the guest virtual memory isn't directly
      accessible by the host with user accessors.
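       
       In outline (a sketch: the saved-register field name is an assumption,
       while cpu_has_badinstr is the existing CPU feature test):
       
         int kvm_get_badinstr(u32 *opc, struct kvm_vcpu *vcpu, u32 *out)
         {
                 if (cpu_has_badinstr) {
                         /* encoding captured by hardware at the trap */
                         *out = vcpu->arch.host_cp0_badinstr;
                         return 0;
                 }
                 /* no BadInstr: fall back to reading guest memory */
                 return kvm_get_inst(opc, vcpu, out);
         }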
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS: Improve kvm_get_inst() error return · 122e51d4
      James Hogan committed
      Currently kvm_get_inst() returns KVM_INVALID_INST in the event of a
      fault reading the guest instruction. This has the rather arbitrary magic
      value 0xdeadbeef. This API isn't very robust, and in fact 0xdeadbeef is
      a valid MIPS64 instruction encoding, namely "ld t1,-16657(s5)".
      
      Therefore change the kvm_get_inst() API to return 0 or -EFAULT, and to
      return the instruction via a u32 *out argument. We can then drop the
      KVM_INVALID_INST definition entirely.
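       
       Callers then follow the normal kernel error convention:
       
         u32 inst;
         int err;
         
         err = kvm_get_inst(opc, vcpu, &inst);
         if (err)
                 return err;     /* -EFAULT reading guest memory */
         /* ... decode and emulate inst ... */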
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/T&E: Don't treat code fetch faults as MMIO · a1ecc54d
      James Hogan committed
      In order to make use of the CP0_BadInstr & CP0_BadInstrP registers we
      need to be a bit more careful not to treat code fetch faults as MMIO,
      lest we hit an UNPREDICTABLE register value when we try to emulate the
      MMIO load instruction but there was no valid instruction word available
      to the hardware.
      
      Add a kvm_is_ifetch_fault() helper to try to figure out whether a load
      fault was due to a code fetch, and prevent MMIO instruction emulation in
      that case.
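       
       The check can be sketched from the description above (field and
       helper names are best-effort; the branch-delay-slot detail is an
       inference, not quoted from the patch):
       
         static bool kvm_is_ifetch_fault(struct kvm_vcpu_arch *vcpu)
         {
                 unsigned long badvaddr = vcpu->host_cp0_badvaddr;
                 unsigned long epc = msk_isa16_mode(vcpu->pc);
                 u32 cause = vcpu->host_cp0_cause;
                 
                 /* faulting address is the PC itself: a code fetch */
                 if (epc == badvaddr)
                         return true;
                 
                 /* in a branch delay slot the fetch is at EPC + 4 */
                 if ((cause & CAUSEF_BD) && epc + 4 == badvaddr)
                         return true;
                 
                 return false;
         }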
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/MMU: Drop kvm_get_new_mmu_context() · a98dd741
      James Hogan committed
      MIPS KVM uses its own variation of get_new_mmu_context() which takes an
      extra vcpu pointer (unused) and does exactly the same thing.
      
      Switch to just using get_new_mmu_context() directly and drop KVM's
      version of it as it doesn't really serve any purpose.
      
      The nearby declarations of kvm_mips_alloc_new_mmu_context(),
      kvm_mips_vcpu_load() and kvm_mips_vcpu_put() are also removed from
      kvm_host.h, as no definitions or users exist.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/Emulate: Drop redundant TLB flushes on exceptions · 7071a885
      James Hogan committed
      When exceptions are injected into the MIPS KVM guest, the whole host TLB
      is flushed (except any entries in the guest KSeg0 range). This is
      certainly not mandated by the architecture when exceptions are taken
      (userland can't directly change TLB mappings anyway), and is a pretty
      heavyweight operation:
      
       - There may be hundreds of TLB entries especially when a 512 entry FTLB
         is present. These are walked and read and conditionally invalidated,
         so the TLBINV feature can't be used either.
      
       - It'll indiscriminately wipe out entries belonging to other memory
         spaces. A simple ASID regeneration would be much faster to perform,
         although it'd wipe out the guest KSeg0 mappings too.
      
      My suspicion is that this was simply to plaster over the fact that
      kvm_mips_host_tlb_inv() incorrectly only invalidated TLB entries in the
      ASID for guest usermode, and not the ASID for guest kernelmode.
      
      Now that the recent commit "KVM: MIPS/TLB: Flush host TLB entry in
      kernel ASID" fixes kvm_mips_host_tlb_inv() to flush TLB entries in the
       kernelmode ASID when the guest TLB changes, let's drop these calls and
      the otherwise unused kvm_mips_flush_host_tlb().
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/TLB: Drop kvm_local_flush_tlb_all() · 49ec508e
      James Hogan committed
      Now that KVM no longer uses wired entries we can safely use
      local_flush_tlb_all() when we need to flush the entire TLB (on the start
      of a new ASID cycle). This doesn't flush wired entries, which allows
       other code to use them without KVM clobbering them all the time. It is
       also more up to date: it knows about the tlbinv architectural feature,
       flushes the micro TLB on cores where that is necessary (Loongson, I
       believe), and stops the HTW while doing so.
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
    • KVM: MIPS/Emulate: Fix CACHE emulation for EVA hosts · 8af0e3c2
      James Hogan committed
      Use protected_writeback_dcache_line() instead of flush_dcache_line(),
      and protected_flush_icache_line() instead of flush_icache_line(), so
      that CACHEE (the EVA variant) is used on EVA host kernels.
      
      Without this, guest floating point branch delay slot emulation via a
      trampoline on the user stack fails on EVA host kernels due to failure of
      the icache sync, resulting in the break instruction getting skipped and
      execution from the stack.
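       
       The substitution itself is mechanical; schematically (the call-site
       argument is illustrative, the four function names are real MIPS
       cache APIs):
       
         -       flush_dcache_line(curr_pc);
         -       flush_icache_line(curr_pc);
         +       /* protected_* variants assemble as CACHEE on EVA kernels,
         +        * so guest/user addresses are hit on EVA hosts */
         +       protected_writeback_dcache_line(curr_pc);
         +       protected_flush_icache_line(curr_pc);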
       Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org