1. 28 11月, 2017 1 次提交
    • J
      KVM: Let KVM_SET_SIGNAL_MASK work as advertised · 20b7035c
      Jan H. Schönherr 提交于
      KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
      "any unblocked signal received [...] will cause KVM_RUN to return with
      -EINTR" and that "the signal will only be delivered if not blocked by
      the original signal mask".
      
      This, however, is only true, when the calling task has a signal handler
      registered for a signal. If not, signal evaluation is short-circuited for
      SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
      returning or the whole process is terminated.
      
      Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
      to that in do_sigtimedwait() to avoid short-circuiting of signals.
      Signed-off-by: NJan H. Schönherr <jschoenh@amazon.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      20b7035c
  2. 15 9月, 2017 1 次提交
  3. 08 8月, 2017 1 次提交
    • L
      KVM: add spinlock optimization framework · 199b5763
      Longpeng(Mike) 提交于
      If a vcpu exits due to request a user mode spinlock, then
      the spinlock-holder may be preempted in user mode or kernel mode.
      (Note that not all architectures trap spin loops in user mode,
      only AMD x86 and ARM/ARM64 currently do).
      
      But if a vcpu exits in kernel mode, then the holder must be
      preempted in kernel mode, so we should choose a vcpu in kernel mode
      as a more likely candidate for the lock holder.
      
      This introduces kvm_arch_vcpu_in_kernel() to decide whether the
      vcpu is in kernel-mode when it's preempted.  kvm_vcpu_on_spin's
      new argument says the same of the spinning VCPU.
      Signed-off-by: NLongpeng(Mike) <longpeng2@huawei.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      199b5763
  4. 07 4月, 2017 1 次提交
  5. 28 3月, 2017 13 次提交
    • J
      KVM: MIPS/VZ: Trace guest mode changes · edec9d7b
      James Hogan 提交于
      Create a trace event for guest mode changes, and enable VZ's
      GuestCtl0.MC bit after the trace event is enabled to trap all guest mode
      changes.
      
      The MC bit causes Guest Hardware Field Change (GHFC) exceptions whenever
      a guest mode change occurs (such as an exception entry or return from
      exception), so we need to handle this exception now. The MC bit is only
      enabled when restoring register state, so enabling the trace event won't
      take immediate effect.
      
      Tracing guest mode changes can be particularly handy when trying to work
      out what a guest OS gets up to before something goes wrong, especially
      if the problem occurs as a result of some previous guest userland
      exception which would otherwise be invisible in the trace.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      edec9d7b
    • J
      KVM: MIPS/VZ: Support hardware guest timer · f4474d50
      James Hogan 提交于
      Transfer timer state to the VZ guest context (CP0_GTOffset & guest
      CP0_Count) when entering guest mode, enabling direct guest access to it,
      and transfer back to soft timer when saving guest register state.
      
      This usually allows guest code to directly read CP0_Count (via MFC0 and
      RDHWR) and read/write CP0_Compare, without trapping to the hypervisor
      for it to emulate the guest timer. Writing to CP0_Count or CP0_Cause.DC
      is much less common and still triggers a hypervisor GPSI exception, in
      which case the timer state is transferred back to an hrtimer before
      emulating the write.
      
      We are careful to prevent small amounts of drift from building up due to
      undeterministic time intervals between reading of the ktime and reading
      of CP0_Count. Some drift is expected however, since the system
      clocksource may use a different timer to the local CP0_Count timer used
      by VZ. This is permitted to prevent guest CP0_Count from appearing to go
      backwards.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      f4474d50
    • J
      KVM: MIPS: Implement VZ support · c992a4f6
      James Hogan 提交于
      Add the main support for the MIPS Virtualization ASE (A.K.A. VZ) to MIPS
      KVM. The bulk of this work is in vz.c, with various new state and
      definitions elsewhere.
      
      Enough is implemented to be able to run on a minimal VZ core. Further
      patches will fill out support for guest features which are optional or
      can be disabled.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      c992a4f6
    • J
      KVM: MIPS: Update exit handler for VZ · ea1bdbf6
      James Hogan 提交于
      The general guest exit handler needs a few tweaks for VZ compared to
      trap & emulate, which for now are made directly depending on
      CONFIG_KVM_MIPS_VZ:
      
      - There is no need to re-enable the hardware page table walker (HTW), as
        it can be left enabled during guest mode operation with VZ.
      
      - There is no need to perform a privilege check, as any guest privilege
        violations should have already been detected by the hardware and
        triggered the appropriate guest exception.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      ea1bdbf6
    • J
      KVM: MIPS/Entry: Update entry code to support VZ · 1934a3ad
      James Hogan 提交于
      Update MIPS KVM entry code to support VZ:
      
       - We need to set GuestCtl0.GM while in guest mode.
      
       - For cores supporting GuestID, we need to set the root GuestID to
         match the main GuestID while in guest mode so that the root TLB
         refill handler writes the correct GuestID into the TLB.
      
       - For cores without GuestID where the root ASID dealiases RVA/GPA
         mappings, we need to load that ASID from the gpa_mm rather than the
         per-VCPU guest_kernel_mm or guest_user_mm, since the root TLB maps
         guest physical addresses. We also need to restore the normal process
         ASID on exit.
      
       - The normal linux process pgd needs restoring on exit, as we can't
         leave the GPA mappings active for kernel code.
      
       - GuestCtl0 needs saving on exit for the GExcCode field, as it may be
         clobbered if a preemption occurs.
      
      We also need to move the TLB refill handler to the XTLB vector at offset
      0x80 on 64-bit VZ kernels, as hardware will use Root.Status.KX to
      determine whether a TLB refill or XTLB Refill exception is to be taken
      on a root TLB miss from guest mode, and KX needs to be set for kernel
      code to be able to access the 64-bit segments.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      1934a3ad
    • J
      KVM: MIPS: Abstract guest CP0 register access for VZ · a27660f3
      James Hogan 提交于
      Abstract the MIPS KVM guest CP0 register access macros into inline
      functions which are generated by macros. This allows them to be
      generated differently for VZ, where they will usually need to access the
      hardware guest CP0 context rather than the saved values in RAM.
      
      Accessors for each individual register are generated using these macros:
      
       - __BUILD_KVM_*_SW() for registers which are not present in the VZ
         hardware guest context, so kvm_{read,write}_c0_guest_##name() will
         access the saved value in RAM regardless of whether VZ is enabled.
      
       - __BUILD_KVM_*_HW() for registers which are present in the VZ hardware
         guest context, so kvm_{read,write}_c0_guest_##name() will access the
         hardware register when VZ is enabled.
      
      These build the underlying accessors using further macros:
      
       - __BUILD_KVM_*_SAVED() builds e.g. kvm_{read,write}_sw_gc0_##name()
         functions for accessing the saved versions of the registers in RAM.
         This is used for implementing the common
         kvm_{read,write}_c0_guest_##name() accessors with T&E where registers
         are always stored in RAM, but are also available with VZ HW registers
         to allow them to be accessed while saved.
      
       - __BUILD_KVM_*_VZ() builds e.g. kvm_{read,write}_vz_gc0_##name()
         functions for accessing the VZ hardware guest context registers
         directly. This is used for implementing the common
         kvm_{read,write}_c0_guest_##name() accessors with VZ.
      
       - __BUILD_KVM_*_WRAP() builds wrappers with different names, which
         allows the common kvm_{read,write}_c0_guest_##name() functions to be
         implemented using the VZ accessors while still having the SAVED
         accessors available too.
      
       - __BUILD_KVM_SAVE_VZ() builds functions for saving and restoring VZ
         hardware guest context register state to RAM, improving conciseness
         of VZ context saving and restoring.
      
      Similar macros exist for generating modifiers (set, clear, change),
      either with a normal unlocked read/modify/write, or using atomic LL/SC
      sequences.
      
      These changes change the types of 32-bit registers to u32 instead of
      unsigned long, which requires some changes to printk() functions in MIPS
      KVM.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      a27660f3
    • J
      KVM: MIPS: Add guest exit exception callback · 28c1e762
      James Hogan 提交于
      Add a callback for MIPS KVM implementations to handle the VZ guest
      exit exception. Currently the trap & emulate implementation contains a
      stub which reports an internal error, but the callback will be used
      properly by the VZ implementation.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      28c1e762
    • J
      KVM: MIPS: Add hardware_{enable,disable} callback · edab4fe1
      James Hogan 提交于
      Add an implementation callback for the kvm_arch_hardware_enable() and
      kvm_arch_hardware_disable() architecture functions, with simple stubs
      for trap & emulate. This is in preparation for VZ which will make use of
      them.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      edab4fe1
    • J
      KVM: MIPS: Add callback to check extension · 607ef2fd
      James Hogan 提交于
      Add an implementation callback for checking presence of KVM extensions.
      This allows implementation specific extensions to be provided without
      ifdefs in mips.c.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      607ef2fd
    • J
      KVM: MIPS: Init timer frequency from callback · a517c1ad
      James Hogan 提交于
      Currently the software emulated timer is initialised to a frequency of
      100MHz by kvm_mips_init_count(), but this isn't suitable for VZ where
      the frequency of the guest timer matches that of the host.
      
      Add a count_hz argument so the caller can specify the default frequency,
      and move the call from kvm_arch_vcpu_create() to the implementation
      specific vcpu_setup() callback, so that VZ can specify a different
      frequency.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      a517c1ad
    • J
      KVM: MIPS: Add VZ & TE capabilities · a8a3c426
      James Hogan 提交于
      Add new KVM_CAP_MIPS_VZ and KVM_CAP_MIPS_TE capabilities, and in order
      to allow MIPS KVM to support VZ without confusing old users (which
      expect the trap & emulate implementation), define and start checking
      KVM_CREATE_VM type codes.
      
      The codes available are:
      
       - KVM_VM_MIPS_TE = 0
      
         This is the current value expected from the user, and will create a
         VM using trap & emulate in user mode, confined to the user mode
         address space. This may in future become unavailable if the kernel is
         only configured to support VZ, in which case the EINVAL error will be
         returned and KVM_CAP_MIPS_TE won't be available even though
         KVM_CAP_MIPS_VZ is.
      
       - KVM_VM_MIPS_VZ = 1
      
         This can be provided when the KVM_CAP_MIPS_VZ capability is available
         to create a VM using VZ, with a fully virtualized guest virtual
         address space. If VZ support is unavailable in the kernel, the EINVAL
         error will be returned (although old kernels without the
         KVM_CAP_MIPS_VZ capability may well succeed and create a trap &
         emulate VM).
      
      This is designed to allow the desired implementation (T&E vs VZ) to be
      potentially chosen at runtime rather than being fixed in the kernel
      configuration.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      a8a3c426
    • J
      KVM: MIPS: Extend counters & events for VZ GExcCodes · a7244920
      James Hogan 提交于
      Extend MIPS KVM stats counters and kvm_transition trace event codes to
      cover hypervisor exceptions, which have their own GExcCode field in
      CP0_GuestCtl0 with up to 32 hypervisor exception cause codes.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      a7244920
    • J
      KVM: MIPS: Update kvm_lose_fpu() for VZ · c58cf741
      James Hogan 提交于
      Update the implementation of kvm_lose_fpu() for VZ, where there is no
      need to enable the FPU/MSA in the root context if the FPU/MSA state is
      loaded but disabled in the guest context.
      
      The trap & emulate implementation needs to disable FPU/MSA in the root
      context when the guest disables them in order to catch the COP1 unusable
      or MSA disabled exception when they're used and pass it on to the guest.
      
      For VZ however as long as the context is loaded and enabled in the root
      context, the guest can enable and disable it in the guest context
      without the hypervisor having to do much, and will take guest exceptions
      without hypervisor intervention if used without being enabled in the
      guest context.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      c58cf741
  6. 02 3月, 2017 1 次提交
  7. 17 2月, 2017 1 次提交
    • P
      KVM: race-free exit from KVM_RUN without POSIX signals · 460df4c1
      Paolo Bonzini 提交于
      The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
      a VCPU out of KVM_RUN through a POSIX signal.  A signal is attached
      to a dummy signal handler; by blocking the signal outside KVM_RUN and
      unblocking it inside, this possible race is closed:
      
                VCPU thread                     service thread
         --------------------------------------------------------------
              check flag
                                                set flag
                                                raise signal
              (signal handler does nothing)
              KVM_RUN
      
      However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
      tsk->sighand->siglock on every KVM_RUN.  This lock is often on a
      remote NUMA node, because it is on the node of a thread's creator.
      Taking this lock can be very expensive if there are many userspace
      exits (as is the case for SMP Windows VMs without Hyper-V reference
      time counter).
      
      As an alternative, we can put the flag directly in kvm_run so that
      KVM can see it:
      
                VCPU thread                     service thread
         --------------------------------------------------------------
                                                raise signal
              signal handler
                set run->immediate_exit
              KVM_RUN
                check run->immediate_exit
      Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
      Reviewed-by: NDavid Hildenbrand <david@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      460df4c1
  8. 03 2月, 2017 20 次提交
    • J
      KVM: MIPS: Allow multiple VCPUs to be created · 12ed1fae
      James Hogan 提交于
      Increase the maximum number of MIPS KVM VCPUs to 8, and implement the
      KVM_CAP_NR_VCPUS and KVM_CAP_MAX_CPUS capabilities which expose the
      recommended and maximum number of VCPUs to userland. The previous
      maximum of 1 didn't allow for any form of SMP guests.
      
      We calculate the values similarly to ARM, recommending as many VCPUs as
      there are CPUs online in the system. This will allow userland to know
      how many VCPUs it is possible to create.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      12ed1fae
    • J
      KVM: MIPS/T&E: Move CP0 register access into T&E · 654229a0
      James Hogan 提交于
      Access to various CP0 registers via the KVM register access API needs to
      be implementation specific to allow restrictions to be made on changes,
      for example when VZ guest registers aren't present, so move them all
      into trap_emul.c in preparation for VZ.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      654229a0
    • J
      KVM: MIPS: Claim KVM_CAP_READONLY_MEM support · 230c5724
      James Hogan 提交于
      Now that load/store faults due to read only memory regions are treated
      as MMIO accesses it is safe to claim support for read only memory
      regions (KVM_CAP_READONLY_MEM).
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      230c5724
    • J
      KVM: MIPS/MMU: Implement KVM_CAP_SYNC_MMU · 411740f5
      James Hogan 提交于
      Implement the SYNC_MMU capability for KVM MIPS, allowing changes in the
      underlying user host virtual address (HVA) mappings to be promptly
      reflected in the corresponding guest physical address (GPA) mappings.
      
      This allows for several features to work with guest RAM which require
      mappings to be altered or protected, such as copy-on-write, KSM (Kernel
      Samepage Merging), idle page tracking, memory swapping, and guest memory
      ballooning.
      
      There are two main aspects of this change, described below.
      
      The KVM MMU notifier architecture callbacks are implemented so we can be
      notified of changes in the HVA mappings. These arrange for the guest
      physical address (GPA) page tables to be modified and possibly for
      derived mappings (GVA page tables and TLBs) to be flushed.
      
       - kvm_unmap_hva[_range]() - These deal with HVA mappings being removed,
         for example before a copy-on-write takes place, which requires the
         corresponding GPA page table mappings to be removed too.
      
       - kvm_set_spte_hva() - These update a GPA page table entry to match the
         new HVA entry, but must be careful to respect KVM specific
         configuration such as not dirtying a clean guest page which is dirty
         to the host, and write protecting writable pages in read only
         memslots (which will soon be supported).
      
       - kvm[_test]_age_hva() - These update GPA page table entries to be old
         (invalid) so that access can be tracked, making them young again.
      
      The GPA page fault handling (kvm_mips_map_page) is updated to use
      gfn_to_pfn_prot() (which may provide read-only pages), to handle
      asynchronous page table invalidation from MMU notifier callbacks, and to
      handle more cases in the fast path.
      
       - mmu_notifier_seq is used to detect asynchronous page table
         invalidations while we're holding a pfn from gfn_to_pfn_prot()
         outside of kvm->mmu_lock, retrying if invalidations have taken place,
         e.g. a COW or a KSM page merge.
      
       - The fast path (_kvm_mips_map_page_fast) now handles marking old pages
         as young / accessed, and disallowing dirtying of clean pages that
         aren't actually writable (e.g. shared pages that should COW, and
         read-only memory regions when they are enabled in a future patch).
      
       - Due to the use of MMU notifications we no longer need to keep the
         page references after we've updated the GPA page tables.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      411740f5
    • J
      KVM: MIPS: Clean & flush on dirty page logging enable · a1ac9e17
      James Hogan 提交于
      When an existing memory region has dirty page logging enabled, make the
      entire slot clean (read only) so that writes will immediately start
      logging dirty pages (once the dirty bit is transferred from GPA to GVA
      page tables in an upcoming patch).
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      a1ac9e17
    • J
      KVM: MIPS/MMU: Use generic dirty log & protect helper · e88643ba
      James Hogan 提交于
      MIPS hasn't up to this point properly supported dirty page logging, as
      pages in slots with dirty logging enabled aren't made clean, and tlbmod
      exceptions from writes to clean pages have been assumed to be due to
      guest TLB protection and unconditionally passed to the guest.
      
      Use the generic dirty logging helper kvm_get_dirty_log_protect() to
      properly implement kvm_vm_ioctl_get_dirty_log(), similar to how ARM
      does. This uses xchg to clear the dirty bits when reading them, rather
      than wiping them out afterwards with a memset, which would potentially
      wipe recently set bits that weren't caught by kvm_get_dirty_log(). It
      also makes the pages clean again using the
      kvm_arch_mmu_enable_log_dirty_pt_masked() architecture callback so that
      further writes after the shadow memslot is flushed will trigger tlbmod
      exceptions and dirty handling.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      e88643ba
    • J
      KVM: MIPS: Implement kvm_arch_flush_shadow_all/memslot · b6209110
      James Hogan 提交于
      Implement the kvm_arch_flush_shadow_all() and
      kvm_arch_flush_shadow_memslot() KVM functions for MIPS to allow guest
      physical mappings to be safely changed.
      
      The general MIPS KVM code takes care of flushing of GPA page table
      entries. kvm_arch_flush_shadow_all() flushes the whole GPA page table,
      and is always called on the cleanup path so there is no need to acquire
      the kvm->mmu_lock. kvm_arch_flush_shadow_memslot() flushes only the
      range of mappings in the GPA page table corresponding to the slot being
      flushed, and happens when memory regions are moved or deleted.
      
      MIPS KVM implementation callbacks are added for handling the
      implementation specific flushing of mappings derived from the GPA page
      tables. These are implemented for trap_emul.c using
      kvm_flush_remote_tlbs() which should now be functional, and will flush
      the per-VCPU GVA page tables and ASIDS synchronously (before next
      entering guest mode or directly accessing GVA space).
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      b6209110
    • J
      KVM: MIPS: Update vcpu->mode and vcpu->cpu · 4841e0dd
      James Hogan 提交于
      Keep the vcpu->mode and vcpu->cpu variables up to date so that
      kvm_make_all_cpus_request() has a chance of functioning correctly. This
      will soon need to be used for kvm_flush_remote_tlbs().
      
      We can easily update vcpu->cpu when the VCPU context is loaded or saved,
      which will happen when accessing guest context and when the guest is
      scheduled in and out.
      
      We need to be a little careful with vcpu->mode though, as we will in
      future be checking for outstanding VCPU requests, and this must be done
      after the value of IN_GUEST_MODE in vcpu->mode is visible to other CPUs.
      Otherwise the other CPU could fail to trigger an IPI to wait for
      completion dispite the VCPU request not being seen.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      4841e0dd
    • J
      KVM: MIPS/MMU: Convert guest physical map to page table · 06c158c9
      James Hogan 提交于
      Current guest physical memory is mapped to host physical addresses using
      a single linear array (guest_pmap of length guest_pmap_npages). This was
      only really meant to be temporary, and isn't sparse, so its wasteful of
      memory. A small amount of RAM at GPA 0 and a small boot exception vector
      at GPA 0x1fc00000 cannot be represented without a full 128KiB guest_pmap
      allocation (MIPS32 with 16KiB pages), which is one reason why QEMU
      currently runs its boot code at the top of RAM instead of the usual boot
      exception vector address.
      
      Instead use the existing infrastructure for host virtual page table
      management to allocate a page table for guest physical memory too. This
      should be sufficient for now, assuming the size of physical memory
      doesn't exceed the size of virtual memory. It may need extending in
      future to handle XPA (eXtended Physical Addressing) in 32-bit guests, as
      supported by VZ guests on P5600.
      
      Some of this code is based loosely on Cavium's VZ KVM implementation.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      06c158c9
    • J
      KVM: MIPS: Use CP0_BadInstr[P] for emulation · 6a97c775
      James Hogan 提交于
      When exiting from the guest, store the values of the CP0_BadInstr and
      CP0_BadInstrP registers if they exist, which contain the encodings of
      the instructions which caused the last synchronous exception.
      
      When the instruction is needed for emulation, kvm_get_badinstr() and
      kvm_get_badinstrp() are used instead of calling kvm_get_inst() directly,
      to decide whether to read the saved CP0_BadInstr/CP0_BadInstrP registers
      (if they exist), or read the instruction from memory (if not).
      
      The use of these registers should be more robust than using
      kvm_get_inst(), as it actually gives the instruction encoding seen by
      the hardware rather than relying on user accessors after the fact, which
      can be fooled by incoherent icache or a racing code modification. It
      will also work with VZ, where the guest virtual memory isn't directly
      accessible by the host with user accessors.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      6a97c775
    • J
      KVM: MIPS: Improve kvm_get_inst() error return · 122e51d4
      James Hogan 提交于
      Currently kvm_get_inst() returns KVM_INVALID_INST in the event of a
      fault reading the guest instruction. This has the rather arbitrary magic
      value 0xdeadbeef. This API isn't very robust, and in fact 0xdeadbeef is
      a valid MIPS64 instruction encoding, namely "ld t1,-16657(s5)".
      
      Therefore change the kvm_get_inst() API to return 0 or -EFAULT, and to
      return the instruction via a u32 *out argument. We can then drop the
      KVM_INVALID_INST definition entirely.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      122e51d4
    • J
      KVM: MIPS: Drop vm_init() callback · 7a156e9f
      James Hogan 提交于
      Now that the commpage doesn't use wired TLB entries, the per-CPU
      vm_init() callback is the only work done by kvm_mips_init_vm_percpu().
      
      The trap & emulate implementation doesn't actually need to do anything
      from vm_init(), and the future VZ implementation would be better served
      by a kvm_arch_hardware_enable callback anyway.
      
      Therefore drop the vm_init() callback entirely, allowing the
      kvm_mips_init_vm_percpu() function to also be dropped, along with the
      kvm_mips_instance atomic counter.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      7a156e9f
    • J
      KVM: MIPS/MMU: Convert commpage fault handling to page tables · 4c86460c
      James Hogan 提交于
      Now that we have GVA page tables and an optimised TLB refill handler in
      place, convert the handling of commpage faults from the guest kernel to
      fill the GVA page table and invalidate the TLB entry, rather than
      filling the wired TLB entry directly.
      
      For simplicity we no longer use a wired entry for the commpage (refill
      should be much cheaper with the fast-path handler anyway). Since we
      don't need to manipulate the TLB directly any longer, move the function
      from tlb.c to mmu.c. This puts it closer to the similar functions
      handling KSeg0 and TLB mapped page faults from the guest.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      4c86460c
    • J
      KVM: MIPS/MMU: Invalidate stale GVA PTEs on TLBW · aba85929
      James Hogan 提交于
      Implement invalidation of specific pairs of GVA page table entries in
      one or both of the GVA page tables. This is used when existing mappings
      are replaced in the guest TLB by emulated TLBWI/TLBWR instructions. Due
      to the sharing of page tables in the host kernel range, we should be
      careful not to allow host pages to be invalidated.
      
      Add a helper kvm_mips_walk_pgd() which can be used when walking of
      either GPA (future patches) or GVA page tables is needed, optionally
      with allocation of page tables along the way when they don't exist.
      
      GPA page table walking will need to be protected by the kvm->mmu_lock,
      so we also add a small MMU page cache in each KVM VCPU, like that found
      for other architectures but smaller. This allows enough pages to be
      pre-allocated to handle a single fault without holding the lock,
      allowing the helper to run with the lock held without having to handle
      allocation failures.
      
      Using the same mechanism for GVA allows the same code to be used, and
      allows it to use the same cache of allocated pages if the GPA walk
      didn't need to allocate any new tables.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      aba85929
    • J
      KVM: MIPS: Add fast path TLB refill handler · a7cfa7ac
      James Hogan 提交于
      Use functions from the general MIPS TLB exception vector generation code
      (tlbex.c) to construct a fast path TLB refill handler similar to the
      general one, but cut down and capable of preserving K0 and K1.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      a7cfa7ac
    • J
      KVM: MIPS/T&E: Allocate GVA -> HPA page tables · f7f1427d
      James Hogan 提交于
      Allocate GVA -> HPA page tables for guest kernel and guest user mode on
      each VCPU, to allow for fast path TLB refill handling to be added later.
      
      In the process kvm_arch_vcpu_init() needs updating to pass on any error
      from the vcpu_init() callback.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      f7f1427d
    • J
      KVM: MIPS: Wire up vcpu uninit · 630766b3
      James Hogan 提交于
      Wire up a vcpu uninit implementation callback. This will be used for the
      clean up of GVA->HPA page tables.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      630766b3
    • J
      KVM: MIPS: Add vcpu_run() & vcpu_reenter() callbacks · a2c046e4
      James Hogan 提交于
      Add implementation callbacks for entering the guest (vcpu_run()) and
      reentering the guest (vcpu_reenter()), allowing implementation specific
      operations to be performed before entering the guest or after returning
      to the host without cluttering kvm_arch_vcpu_ioctl_run().
      
      This allows the T&E specific lazy user GVA flush to be moved into
      trap_emul.c, along with disabling of the HTW. We also move
      kvm_mips_deliver_interrupts() as VZ will need to restore the guest timer
      state prior to delivering interrupts.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      a2c046e4
    • J
      KVM: MIPS: Remove duplicated ASIDs from vcpu · c550d539
      James Hogan 提交于
      The kvm_vcpu_arch structure contains both mm_structs for allocating MMU
      contexts (primarily the ASID) but it also copies the resulting ASIDs
      into guest_{user,kernel}_asid[] arrays which are referenced from uasm
      generated code.
      
      This duplication doesn't seem to serve any purpose, and it gets in the
      way of generalising the ASID handling across guest kernel/user modes, so
      lets just extract the ASID straight out of the mm_struct on demand, and
      in fact there are convenient cpu_context() and cpu_asid() macros for
      doing so.
      
      To reduce the verbosity of this code we do also add kern_mm and user_mm
      local variables where the kernel and user mm_structs are used.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      c550d539
    • J
      KVM: MIPS: Drop partial KVM_NMI implementation · 00104b41
      James Hogan 提交于
      MIPS incompletely implements the KVM_NMI ioctl to supposedly perform a
      CPU reset, but all it actually does is invalidate the ASIDs. It doesn't
      expose the KVM_CAP_USER_NMI capability which is supposed to indicate the
      presence of the KVM_NMI ioctl, and no user software actually uses it on
      MIPS.
      
      Since this is dead code that would technically need updating for GVA
      page table handling in upcoming patches, remove it now. If we wanted to
      implement NMI injection later it can always be done properly along with
      the KVM_CAP_USER_NMI capability, and if we wanted to implement a proper
      CPU reset it would be better done with a separate ioctl.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      00104b41
  9. 02 2月, 2017 1 次提交