1. 12 5月, 2016 5 次提交
  2. 10 5月, 2016 7 次提交
    • P
      Merge tag 'kvm-s390-next-4.7-2' of... · 6ac0f61f
      Paolo Bonzini 提交于
      Merge tag 'kvm-s390-next-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      KVM: s390: features and fixes for 4.7 part2
      
      - Use hardware provided information about facility bits that do not
        need any hypervisor activitiy
      - Add missing documentation for KVM_CAP_S390_RI
      - Some updates/fixes for handling cpu models and facilities
      6ac0f61f
    • J
      MIPS: KVM: Add missing disable FPU hazard barriers · 4ac33429
      James Hogan 提交于
      Add the necessary hazard barriers after disabling the FPU in
      kvm_lose_fpu(), just to be safe.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4ac33429
    • J
      MIPS: KVM: Fix preemption warning reading FPU capability · 556f2a52
      James Hogan 提交于
      Reading the KVM_CAP_MIPS_FPU capability returns cpu_has_fpu, however
      this uses smp_processor_id() to read the current CPU capabilities (since
      some old MIPS systems could have FPUs present on only a subset of CPUs).
      
      We don't support any such systems, so work around the warning by using
      raw_cpu_has_fpu instead.
      
      We should probably instead claim not to support FPU at all if any one
      CPU is lacking an FPU, but this should do for now.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      556f2a52
    • J
      MIPS: KVM: Fix preemptable kvm_mips_get_*_asid() calls · f049729c
      James Hogan 提交于
      There are a couple of places in KVM fault handling code which implicitly
      use smp_processor_id() via kvm_mips_get_kernel_asid() and
      kvm_mips_get_user_asid() from preemptable context. This is unsafe as a
      preemption could cause the guest kernel ASID to be changed, resulting in
      a host TLB entry being written with the wrong ASID.
      
      Fix by disabling preemption around the kvm_mips_get_*_asid() call and
      the corresponding kvm_mips_host_tlb_write().
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f049729c
    • J
      MIPS: KVM: Fix timer IRQ race when writing CP0_Compare · b45bacd2
      James Hogan 提交于
      Writing CP0_Compare clears the timer interrupt pending bit
      (CP0_Cause.TI), but this wasn't being done atomically. If a timer
      interrupt raced with the write of the guest CP0_Compare, the timer
      interrupt could end up being pending even though the new CP0_Compare is
      nowhere near CP0_Count.
      
      We were already updating the hrtimer expiry with
      kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
      kvm_mips_resume_hrtimer(). Close the race window by expanding out
      kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
      CP0_Compare between the freeze and resume. Since the pending timer
      interrupt should not be cleared when CP0_Compare is written via the KVM
      user API, an ack argument is added to distinguish the source of the
      write.
      
      Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.16.x-
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b45bacd2
    • J
      MIPS: KVM: Fix timer IRQ race when freezing timer · 4355c44f
      James Hogan 提交于
      There's a particularly narrow and subtle race condition when the
      software emulated guest timer is frozen which can allow a guest timer
      interrupt to be missed.
      
      This happens due to the hrtimer expiry being inexact, so very
      occasionally the freeze time will be after the moment when the emulated
      CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
      be generated), but before the moment when the hrtimer is due to expire
      (so no IRQ is generated). The IRQ won't be generated when the timer is
      resumed either, since the resume CP0_Count will already match CP0_Compare.
      
      With VZ guests in particular this is far more likely to happen, since
      the soft timer may be frozen frequently in order to restore the timer
      state to the hardware guest timer. This happens after 5-10 hours of
      guest soak testing, resulting in an overflow in guest kernel timekeeping
      calculations, hanging the guest. A more focussed test case to
      intentionally hit the race (with the help of a new hypcall to cause the
      timer state to migrated between hardware & software) hits the condition
      fairly reliably within around 30 seconds.
      
      Instead of relying purely on the inexact hrtimer expiry to determine
      whether an IRQ should be generated, read the guest CP0_Compare and
      directly check whether the freeze time is before or after it. Only if
      CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
      determine whether the last IRQ has already been generated (which will
      have pushed back the expiry by one timer period).
      
      Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.16.x-
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4355c44f
    • C
      kvm: arm64: Enable hardware updates of the Access Flag for Stage 2 page tables · 06485053
      Catalin Marinas 提交于
      The ARMv8.1 architecture extensions introduce support for hardware
      updates of the access and dirty information in page table entries. With
      VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the
      PTE_AF bit cleared in the stage 2 page table, instead of raising an
      Access Flag fault to EL2 the CPU sets the actual page table entry bit
      (10). To ensure that kernel modifications to the page table do not
      inadvertently revert a bit set by hardware updates, certain Stage 2
      software pte/pmd operations must be performed atomically.
      
      The main user of the AF bit is the kvm_age_hva() mechanism. The
      kvm_age_hva_handler() function performs a "test and clear young" action
      on the pte/pmd. This needs to be atomic in respect of automatic hardware
      updates of the AF bit. Since the AF bit is in the same position for both
      Stage 1 and Stage 2, the patch reuses the existing
      ptep_test_and_clear_young() functionality if
      __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the
      existing pte_young/pte_mkold mechanism is preserved.
      
      The kvm_set_s2pte_readonly() (and the corresponding pmd equivalent) have
      to perform atomic modifications in order to avoid a race with updates of
      the AF bit. The arm64 implementation has been re-written using
      exclusives.
      
      Currently, kvm_set_s2pte_writable() (and pmd equivalent) take a pointer
      argument and modify the pte/pmd in place. However, these functions are
      only used on local variables rather than actual page table entries, so
      it makes more sense to follow the pte_mkwrite() approach for stage 1
      attributes. The change to kvm_s2pte_mkwrite() makes it clear that these
      functions do not modify the actual page table entries.
      
      The (pte|pmd)_mkyoung() uses on Stage 2 entries (setting the AF bit
      explicitly) do not need to be modified since hardware updates of the
      dirty status are not supported by KVM, so there is no possibility of
      losing such information.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      06485053
  3. 09 5月, 2016 9 次提交
  4. 04 5月, 2016 2 次提交
  5. 03 5月, 2016 11 次提交
  6. 29 4月, 2016 2 次提交
  7. 25 4月, 2016 1 次提交
  8. 21 4月, 2016 3 次提交
    • S
      arm64: kvm: Add support for 16K pages · 02e0b760
      Suzuki K Poulose 提交于
      Now that we can handle stage-2 page tables independent
      of the host page table levels, wire up the 16K page
      support.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      02e0b760
    • S
      kvm-arm: Cleanup stage2 pgd handling · 9163ee23
      Suzuki K Poulose 提交于
      Now that we don't have any fake page table levels for arm64,
      cleanup the common code to get rid of the dead code.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      9163ee23
    • S
      kvm: arm64: Get rid of fake page table levels · da04fa04
      Suzuki K Poulose 提交于
      On arm64, the hardware supports concatenation of upto 16 tables,
      at entry level for stage2 translations and we make use that whenever
      possible. This could lead to reduced number of translation levels than
      the normal (stage1 table) table. Also, since the IPA(40bit) is smaller
      than the some of the supported VA_BITS (e.g, 48bit), there could be
      different number of levels in stage-1 vs stage-2 tables. To reuse the
      kernel host page table walker for stage2 we have been using a fake
      software page table level, not known to the hardware. But with 16K
      translations, there could be upto 2 fake software levels (with 48bit VA
      and 40bit IPA), which complicates the code. Hence, we want to get rid of
      the hack.
      
      Now that we have explicit accessors for hyp vs stage2 page tables,
      define the stage2 walker helpers accordingly based on the actual
      table used by the hardware.
      
      Once we know the number of translation levels used by the hardware,
      it is merely a job of defining the helpers based on whether a
      particular level is folded or not, looking at the number of levels.
      
      Some facts before we calculate the translation levels:
      
      1) Smallest page size supported by arm64 is 4K.
      2) The minimum number of bits resolved at any page table level
         is (PAGE_SHIFT - 3) at intermediate levels.
      Both of them implies, minimum number of bits required for a level
      change is 9.
      
      Since we can concatenate upto 16 tables at stage2 entry, the total
      number of page table levels used by the hardware for resolving N bits
      is same as that for (N - 4) bits (with concatenation), as there cannot
      be a level in between (N, N-4) as per the above rules.
      
      Hence, we have
      
       STAGE2_PGTABLE_LEVELS = PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
      
      With the current IPA limit (40bit), for all supported translations
      and VA_BITS, we have the following condition (even for 36bit VA with
      16K page size):
      
       CONFIG_PGTABLE_LEVELS >= STAGE2_PGTABLE_LEVELS.
      
      So, for e.g,  if PUD is present in stage2, it is present in the hyp(host).
      Hence, we fall back to the host definition if we find that a level is not
      folded. Otherwise we redefine it accordingly. A build time check is added
      to make sure the above condition holds. If this condition breaks in future,
      we can rearrange the host level helpers and fix our code easily.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      da04fa04