1. 19 3月, 2018 1 次提交
  2. 16 1月, 2018 1 次提交
    • K
      KVM: arm/arm64: fix HYP ID map extension to 52 bits · 98732d1b
      Kristina Martsenko 提交于
      Commit fa2a8445 incorrectly masks the index of the HYP ID map pgd
      entry, causing a non-VHE kernel to hang during boot. This happens when
      VA_BITS=48 and the ID map text is in 52-bit physical memory. In this
      case we don't need an extra table level but need more entries in the
      top-level table, so we need to map into hyp_pgd and need to use
      __kvm_idmap_ptrs_per_pgd to mask in the extra bits. However,
      __create_hyp_mappings currently masks by PTRS_PER_PGD instead.
      
      Fix it so that we always use __kvm_idmap_ptrs_per_pgd for the HYP ID
      map. This ensures that we use the larger mask for the top-level ID map
      table when it has more entries. In all other cases, PTRS_PER_PGD is used
      as normal.
      
      Fixes: fa2a8445 ("arm64: allow ID map to be extended to 52 bits")
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NKristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      98732d1b
  3. 11 1月, 2018 1 次提交
  4. 08 1月, 2018 5 次提交
  5. 23 12月, 2017 1 次提交
  6. 18 12月, 2017 1 次提交
  7. 05 9月, 2017 1 次提交
  8. 25 7月, 2017 1 次提交
  9. 23 6月, 2017 2 次提交
    • T
      arm/arm64: KVM: add guest SEA support · 621f48e4
      Tyler Baicar 提交于
      Currently external aborts are unsupported by the guest abort
      handling. Add handling for SEAs so that the host kernel reports
      SEAs which occur in the guest kernel.
      
      When an SEA occurs in the guest kernel, the guest exits and is
      routed to kvm_handle_guest_abort(). Prior to this patch, a print
      message of an unsupported FSC would be printed and nothing else
      would happen. With this patch, the code gets routed to the APEI
      handling of SEAs in the host kernel to report the SEA information.
      Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: NChristoffer Dall <cdall@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      621f48e4
    • J
      KVM: arm/arm64: Signal SIGBUS when stage2 discovers hwpoison memory · 196f878a
      James Morse 提交于
      Once we enable ARCH_SUPPORTS_MEMORY_FAILURE on arm64, notifications for
      broken memory can call memory_failure() in mm/memory-failure.c to offline
      pages of memory, possibly signalling user space processes and notifying all
      the in-kernel users.
      
      memory_failure() has two modes, early and late. Early is used by
      machine-managers like Qemu to receive a notification when a memory error is
      notified to the host. These can then be relayed to the guest before the
      affected page is accessed. To enable this, the process must set
      PR_MCE_KILL_EARLY in PR_MCE_KILL_SET using the prctl() syscall.
      
      Once the early notification has been handled, nothing stops the
      machine-manager or guest from accessing the affected page. If the
      machine-manager does this the page will fail to be mapped and SIGBUS will
      be sent. This patch adds the equivalent path for when the guest accesses
      the page, sending SIGBUS to the machine-manager.
      
      These two signals can be distinguished by the machine-manager using their
      si_code: BUS_MCEERR_AO for 'action optional' early notifications, and
      BUS_MCEERR_AR for 'action required' synchronous/late notifications.
      
      Do as x86 does, and deliver the SIGBUS when we discover pfn ==
      KVM_PFN_ERR_HWPOISON. Use the hugepage size as si_addr_lsb if this vma was
      allocated as a hugepage. Transparent hugepages will be split by
      memory_failure() before we see them here.
      
      Cc: Punit Agrawal <punit.agrawal@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      196f878a
  10. 06 6月, 2017 1 次提交
    • M
      KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages · d6dbdd3c
      Marc Zyngier 提交于
      Under memory pressure, we start ageing pages, which amounts to parsing
      the page tables. Since we don't want to allocate any extra level,
      we pass NULL for our private allocation cache. Which means that
      stage2_get_pud() is allowed to fail. This results in the following
      splat:
      
      [ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008
      [ 1520.417741] pgd = ffff810f52fef000
      [ 1520.421201] [00000008] *pgd=0000010f636c5003, *pud=0000010f56f48003, *pmd=0000000000000000
      [ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP
      [ 1520.435156] Modules linked in:
      [ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G        W       4.12.0-rc4-00027-g1885c397eaec #7205
      [ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016
      [ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000
      [ 1520.469666] PC is at stage2_get_pmd+0x34/0x110
      [ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0
      [ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145
      [ 1520.486325] sp : ffff800ce04e33d0
      [ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064
      [ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000
      [ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000
      [ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000
      [ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000
      [ 1520.516274] x19: 0000000058264000 x18: 0000000000000000
      [ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70
      [ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008
      [ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002
      [ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940
      [ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200
      [ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000
      [ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000
      [ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008
      [ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c
      [ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000)
      [...]
      [ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110
      [ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0
      [ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8
      [ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0
      [ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0
      [ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0
      [ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188
      [ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250
      [ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0
      [ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180
      [ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8
      [ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600
      [ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328
      [ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340
      [ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240
      [...]
      
      The trivial fix is to handle this NULL pud value early, rather than
      dereferencing it blindly.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <cdall@linaro.org>
      Signed-off-by: NChristoffer Dall <cdall@linaro.org>
      d6dbdd3c
  11. 16 5月, 2017 2 次提交
  12. 15 5月, 2017 1 次提交
    • S
      kvm: arm/arm64: Fix race in resetting stage2 PGD · 6c0d706b
      Suzuki K Poulose 提交于
      In kvm_free_stage2_pgd() we check the stage2 PGD before holding
      the lock and proceed to take the lock if it is valid. And we unmap
      the page tables, followed by releasing the lock. We reset the PGD
      only after dropping this lock, which could cause a race condition
      where another thread waiting on or even holding the lock, could
      potentially see that the PGD is still valid and proceed to perform
      a stage2 operation and later encounter a NULL PGD.
      
      [223090.242280] Unable to handle kernel NULL pointer dereference at
      virtual address 00000040
      [223090.262330] PC is at unmap_stage2_range+0x8c/0x428
      [223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
      [223090.262531] Call trace:
      [223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428
      [223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c
      [223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104
      [223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc
      [223090.262543] [<ffff0000080a2478>]
      kvm_mmu_notifier_invalidate_page+0x50/0x8c
      [223090.262547] [<ffff0000082274f8>]
      __mmu_notifier_invalidate_page+0x5c/0x84
      [223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0
      [223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0
      [223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4
      [223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac
      [223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac
      [223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0
      [223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290
      [223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac
      [223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4
      [223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254
      [223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538
      [223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4
      [223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c
      [223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8
      [223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c
      [223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c
      [223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774
      [223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac
      [223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648
      [223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768
      [223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4
      [223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4
      [223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c
      
      This patch moves the stage2 PGD manipulation under the lock.
      Reported-by: NAlexander Graf <agraf@suse.de>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Reviewed-by: NChristoffer Dall <cdall@linaro.org>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NChristoffer Dall <cdall@linaro.org>
      6c0d706b
  13. 04 5月, 2017 1 次提交
    • C
      KVM: arm/arm64: Move shared files to virt/kvm/arm · 35d2d5d4
      Christoffer Dall 提交于
      For some time now we have been having a lot of shared functionality
      between the arm and arm64 KVM support in arch/arm, which not only
      required a horrible inter-arch reference from the Makefile in
      arch/arm64/kvm, but also created confusion for newcomers to the code
      base, as was recently seen on the mailing list.
      
      Further, it causes confusion for things like cscope, which needs special
      attention to index specific shared files for arm64 from the arm tree.
      
      Move the shared files into virt/kvm/arm and move the trace points along
      with it.  When moving the tracepoints we have to modify the way the vgic
      creates definitions of the trace points, so we take the chance to
      include the VGIC tracepoints in its very own special vgic trace.h file.
      Signed-off-by: NChristoffer Dall <cdall@linaro.org>
      35d2d5d4
  14. 09 4月, 2017 2 次提交
  15. 04 4月, 2017 1 次提交
  16. 20 3月, 2017 2 次提交
  17. 30 1月, 2017 2 次提交
  18. 09 9月, 2016 1 次提交
    • S
      kvm-arm: Unmap shadow pagetables properly · 293f2936
      Suzuki K Poulose 提交于
      On arm/arm64, we depend on the kvm_unmap_hva* callbacks (via
      mmu_notifiers::invalidate_*) to unmap the stage2 pagetables when
      the userspace buffer gets unmapped. However, when the Hypervisor
      process exits without explicit unmap of the guest buffers, the only
      notifier we get is kvm_arch_flush_shadow_all() (via mmu_notifier::release
      ) which does nothing on arm. Later this causes us to access pages that
      were already released [via exit_mmap() -> unmap_vmas()] when we actually
      get to unmap the stage2 pagetable [via kvm_arch_destroy_vm() ->
      kvm_free_stage2_pgd()]. This triggers crashes with CONFIG_DEBUG_PAGEALLOC,
      which unmaps any free'd pages from the linear map.
      
       [  757.644120] Unable to handle kernel paging request at virtual address
        ffff800661e00000
       [  757.652046] pgd = ffff20000b1a2000
       [  757.655471] [ffff800661e00000] *pgd=00000047fffe3003, *pud=00000047fcd8c003,
        *pmd=00000047fcc7c003, *pte=00e8004661e00712
       [  757.666492] Internal error: Oops: 96000147 [#3] PREEMPT SMP
       [  757.672041] Modules linked in:
       [  757.675100] CPU: 7 PID: 3630 Comm: qemu-system-aar Tainted: G      D
       4.8.0-rc1 #3
       [  757.683240] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board,
        BIOS 3.06.15 Aug 19 2016
       [  757.692938] task: ffff80069cdd3580 task.stack: ffff8006adb7c000
       [  757.698840] PC is at __flush_dcache_area+0x1c/0x40
       [  757.703613] LR is at kvm_flush_dcache_pmd+0x60/0x70
       [  757.708469] pc : [<ffff20000809dbdc>] lr : [<ffff2000080b4a70>] pstate: 20000145
       ...
       [  758.357249] [<ffff20000809dbdc>] __flush_dcache_area+0x1c/0x40
       [  758.363059] [<ffff2000080b6748>] unmap_stage2_range+0x458/0x5f0
       [  758.368954] [<ffff2000080b708c>] kvm_free_stage2_pgd+0x34/0x60
       [  758.374761] [<ffff2000080b2280>] kvm_arch_destroy_vm+0x20/0x68
       [  758.380570] [<ffff2000080aa330>] kvm_put_kvm+0x210/0x358
       [  758.385860] [<ffff2000080aa524>] kvm_vm_release+0x2c/0x40
       [  758.391239] [<ffff2000082ad234>] __fput+0x114/0x2e8
       [  758.396096] [<ffff2000082ad46c>] ____fput+0xc/0x18
       [  758.400869] [<ffff200008104658>] task_work_run+0x108/0x138
       [  758.406332] [<ffff2000080dc8ec>] do_exit+0x48c/0x10e8
       [  758.411363] [<ffff2000080dd5fc>] do_group_exit+0x6c/0x130
       [  758.416739] [<ffff2000080ed924>] get_signal+0x284/0xa18
       [  758.421943] [<ffff20000808a098>] do_signal+0x158/0x860
       [  758.427060] [<ffff20000808aad4>] do_notify_resume+0x6c/0x88
       [  758.432608] [<ffff200008083624>] work_pending+0x10/0x14
       [  758.437812] Code: 9ac32042 8b010001 d1000443 8a230000 (d50b7e20)
      
      This patch fixes the issue by moving the kvm_free_stage2_pgd() to
      kvm_arch_flush_shadow_all().
      
      Cc: <stable@vger.kernel.org> # 3.9+
      Tested-by: NItaru Kitayama <itaru.kitayama@riken.jp>
      Reported-by: NItaru Kitayama <itaru.kitayama@riken.jp>
      Reported-by: NJames Morse <james.morse@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      293f2936
  19. 08 9月, 2016 2 次提交
    • M
      arm/arm64: KVM: Inject virtual abort when guest exits on external abort · 4055710b
      Marc Zyngier 提交于
      If we spot a data abort bearing the ESR_EL2.EA bit set, we know that
      this is an external abort, and that should be punished by the injection
      of an abort.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      4055710b
    • M
      arm/kvm: excise redundant cache maintenance · dcadda14
      Mark Rutland 提交于
      When modifying Stage-2 page tables, we perform cache maintenance to
      account for non-coherent page table walks. However, this is unnecessary,
      as page table walks are guaranteed to be coherent in the presence of the
      virtualization extensions.
      
      Per ARM DDI 0406C.c, section B1.7 ("The Virtualization Extensions"), the
      virtualization extensions mandate the multiprocessing extensions.
      
      Per ARM DDI 0406C.c, section B3.10.1 ("General TLB maintenance
      requirements"), as described in the sub-section titled "TLB maintenance
      operations and the memory order model", this maintenance is not required
      in the presence of the multiprocessing extensions.
      
      Hence, we need not perform this cache maintenance when modifying Stage-2
      entries.
      
      This patch removes the logic for performing the redundant maintenance.
      To ensure visibility and ordering of updates, a dsb(ishst) that was
      otherwise implicit in the maintenance is folded into kvm_set_pmd() and
      kvm_set_pte().
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: kvmarm@lists.cs.columbia.edu
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      dcadda14
  20. 06 9月, 2016 1 次提交
  21. 17 8月, 2016 1 次提交
    • C
      KVM: arm/arm64: Change misleading use of is_error_pfn · 9ac71595
      Christoffer Dall 提交于
      When converting a gfn to a pfn, we call gfn_to_pfn_prot, which returns
      various kinds of error values.  It turns out that is_error_pfn() only
      returns true when the gfn was found in a memory slot and could somehow
      not be used, but it does not return true if the gfn does not belong to
      any memory slot.
      
      Change use to is_error_noslot_pfn() which covers both cases.
      
      Note: Since we already check for kvm_is_error_hva(hva) explicitly in the
      caller of this function while holding the kvm->srcu lock protecting the
      memory slots, this should never be a problem, but nevertheless this
      change is warranted as it shows the intention of the code.
      Reported-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      9ac71595
  22. 04 7月, 2016 6 次提交
  23. 29 6月, 2016 2 次提交
  24. 10 5月, 2016 1 次提交
    • C
      kvm: arm64: Enable hardware updates of the Access Flag for Stage 2 page tables · 06485053
      Catalin Marinas 提交于
      The ARMv8.1 architecture extensions introduce support for hardware
      updates of the access and dirty information in page table entries. With
      VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the
      PTE_AF bit cleared in the stage 2 page table, instead of raising an
      Access Flag fault to EL2 the CPU sets the actual page table entry bit
      (10). To ensure that kernel modifications to the page table do not
      inadvertently revert a bit set by hardware updates, certain Stage 2
      software pte/pmd operations must be performed atomically.
      
      The main user of the AF bit is the kvm_age_hva() mechanism. The
      kvm_age_hva_handler() function performs a "test and clear young" action
      on the pte/pmd. This needs to be atomic in respect of automatic hardware
      updates of the AF bit. Since the AF bit is in the same position for both
      Stage 1 and Stage 2, the patch reuses the existing
      ptep_test_and_clear_young() functionality if
      __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the
      existing pte_young/pte_mkold mechanism is preserved.
      
      The kvm_set_s2pte_readonly() (and the corresponding pmd equivalent) have
      to perform atomic modifications in order to avoid a race with updates of
      the AF bit. The arm64 implementation has been re-written using
      exclusives.
      
      Currently, kvm_set_s2pte_writable() (and pmd equivalent) take a pointer
      argument and modify the pte/pmd in place. However, these functions are
      only used on local variables rather than actual page table entries, so
      it makes more sense to follow the pte_mkwrite() approach for stage 1
      attributes. The change to kvm_s2pte_mkwrite() makes it clear that these
      functions do not modify the actual page table entries.
      
      The (pte|pmd)_mkyoung() uses on Stage 2 entries (setting the AF bit
      explicitly) do not need to be modified since hardware updates of the
      dirty status are not supported by KVM, so there is no possibility of
      losing such information.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      06485053