1. 15 Jun, 2017 1 commit
  2. 08 Jun, 2017 9 commits
  3. 07 Jun, 2017 3 commits
  4. 06 Jun, 2017 2 commits
    • M
      KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages · d6dbdd3c
      Committed by Marc Zyngier
      Under memory pressure, we start ageing pages, which amounts to parsing
      the page tables. Since we don't want to allocate any extra level,
      we pass NULL for our private allocation cache. Which means that
      stage2_get_pud() is allowed to fail. This results in the following
      splat:
      
      [ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008
      [ 1520.417741] pgd = ffff810f52fef000
      [ 1520.421201] [00000008] *pgd=0000010f636c5003, *pud=0000010f56f48003, *pmd=0000000000000000
      [ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP
      [ 1520.435156] Modules linked in:
      [ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G        W       4.12.0-rc4-00027-g1885c397eaec #7205
      [ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016
      [ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000
      [ 1520.469666] PC is at stage2_get_pmd+0x34/0x110
      [ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0
      [ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145
      [ 1520.486325] sp : ffff800ce04e33d0
      [ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064
      [ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000
      [ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000
      [ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000
      [ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000
      [ 1520.516274] x19: 0000000058264000 x18: 0000000000000000
      [ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70
      [ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008
      [ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002
      [ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940
      [ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200
      [ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000
      [ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000
      [ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008
      [ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c
      [ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000)
      [...]
      [ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110
      [ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0
      [ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8
      [ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0
      [ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0
      [ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0
      [ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188
      [ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250
      [ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0
      [ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180
      [ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8
      [ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600
      [ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328
      [ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340
      [ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240
      [...]
      
      The trivial fix is to handle this NULL pud value early, rather than
      dereferencing it blindly.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
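The fix described above, checking for a missing upper-level entry before descending, can be sketched in plain C. This is a minimal user-space illustration, not the kernel's stage2 code: the `sketch_*` types, the table size, and both helper functions are hypothetical stand-ins for the pud/pmd levels.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ENTRIES 4u /* hypothetical table size, for illustration only */

/* Hypothetical two-level table; not the kernel's stage2 types. */
struct sketch_pmd_table { uint64_t pmd[SKETCH_ENTRIES]; };
struct sketch_pgd { struct sketch_pmd_table *pud[SKETCH_ENTRIES]; };

/*
 * Lookup-only walk of the upper level. With no allocation cache,
 * a missing table is reported as NULL instead of being created.
 */
static struct sketch_pmd_table *sketch_get_pud(struct sketch_pgd *pgd,
                                               unsigned int idx)
{
    if (pgd == NULL || idx >= SKETCH_ENTRIES)
        return NULL;
    return pgd->pud[idx];
}

/* Handle the NULL upper level early rather than dereferencing it blindly. */
static uint64_t *sketch_get_pmd(struct sketch_pgd *pgd,
                                unsigned int pud_idx, unsigned int pmd_idx)
{
    struct sketch_pmd_table *pud = sketch_get_pud(pgd, pud_idx);

    if (pud == NULL || pmd_idx >= SKETCH_ENTRIES)
        return NULL; /* nothing mapped here, so nothing to age */
    return &pud->pmd[pmd_idx];
}
```

A caller that only ages pages can treat a NULL result as "not mapped" and move on.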
    • C
      KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction · d68356cc
      Committed by Christoffer Dall
      We used to extract PRIbits from ICH_VTR_EL2, which was the upper field
      in the register word, so a mask wasn't necessary. But as we switched to
      looking at PREbits, which is bits 26 through 28, with the PRIbits field
      being potentially non-zero, we really need to mask off the field value,
      otherwise fun things may happen.
      
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
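The need for the mask can be shown with a small C sketch. It assumes only what the message states, namely that PREbits occupies bits 26 through 28 and that a potentially non-zero field sits above it. The macro and function names are hypothetical, and the helpers return the raw field value rather than any derived bit count.

```c
#include <stdint.h>

#define SKETCH_PREBITS_SHIFT 26u  /* PREbits starts at bit 26 */
#define SKETCH_PREBITS_MASK  0x7u /* three bits wide: 26 through 28 */

/* Shift only: bits from the upper field leak into the result. */
static uint32_t sketch_prebits_unmasked(uint32_t vtr)
{
    return vtr >> SKETCH_PREBITS_SHIFT;
}

/* Shift and mask: only bits 26..28 survive. */
static uint32_t sketch_prebits(uint32_t vtr)
{
    return (vtr >> SKETCH_PREBITS_SHIFT) & SKETCH_PREBITS_MASK;
}
```

With an upper field of 4 and a PREbits field of 5, the unmasked helper yields 37 (binary 100101) while the masked one yields 5.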
  5. 04 Jun, 2017 12 commits
  6. 24 May, 2017 1 commit
  7. 23 May, 2017 3 commits
  8. 18 May, 2017 2 commits
    • C
      KVM: arm/arm64: Hold slots_lock when unregistering kvm io bus devices · fa472fa9
      Committed by Christoffer Dall
      We were not holding the kvm->slots_lock as required when calling
      kvm_io_bus_unregister_dev().
      
      This only affects the error path, but still, let's do our due
      diligence.
      
      Reported-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
    • C
      KVM: arm/arm64: Fix bug when registering redist iodevs · 552c9f47
      Committed by Christoffer Dall
      If userspace creates the VCPUs after initializing the VGIC, then we end
      up in a situation where we trigger a bug in kvm_vcpu_get_idx(), because
      it is called prior to adding the VCPU into the vcpus array on the VM.
      
      There is no tight coupling between the VCPU index and the area of the
      redistributor region used for the VCPU, so we can simply ensure that all
      creations of redistributors are serialized per VM, and increment an
      offset when we successfully add a redistributor.
      
      The vgic_register_redist_iodev() function can be called from two paths.
      The first is vgic_register_all_redist_iodev(), which is called via the
      kvm_vgic_addr() device attribute handler. This path already holds the
      kvm->lock mutex.
      
      The other path is via kvm_vgic_vcpu_init, which is called through a
      longer chain from kvm_vm_ioctl_create_vcpu(), which releases the
      kvm->lock mutex just before calling kvm_arch_vcpu_create(), so we can
      simply take this mutex again later for our purposes.
      
      Fixes: ab6f468c10 ("KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs")
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
      Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
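The approach in this message, serializing registrations per VM and advancing an offset only on success, might look roughly like the following user-space sketch. Everything here is an assumption made for illustration: the `sketch_vgic` structure, the per-VCPU frame size, the capacity limit, and the use of a pthread mutex in place of the kernel's kvm->lock.

```c
#include <pthread.h>
#include <stdint.h>

#define SKETCH_REDIST_SIZE 0x20000u /* hypothetical per-VCPU frame size */

/* Hypothetical per-VM state; not the kernel's vgic structures. */
struct sketch_vgic {
    pthread_mutex_t lock;    /* stands in for kvm->lock */
    uint64_t redist_base;    /* start of the redistributor region */
    unsigned int registered; /* frames handed out so far */
    unsigned int capacity;   /* frames that fit in the region */
};

/*
 * Hands out the next free frame. The offset depends on how many
 * registrations succeeded, not on the VCPU index, and it only
 * advances when the registration succeeds.
 */
static int sketch_register_redist(struct sketch_vgic *vgic, uint64_t *addr_out)
{
    int ret = -1;

    pthread_mutex_lock(&vgic->lock);
    if (vgic->registered < vgic->capacity) {
        *addr_out = vgic->redist_base +
                    (uint64_t)vgic->registered * SKETCH_REDIST_SIZE;
        vgic->registered++;
        ret = 0;
    }
    pthread_mutex_unlock(&vgic->lock);
    return ret;
}
```

Because the counter is read and incremented under the same lock, two callers can never be handed the same frame, regardless of the order in which VCPUs are created.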
  9. 16 May, 2017 4 commits
  10. 15 May, 2017 3 commits
    • Z
      KVM: arm: rename pm_fake handler to trap_raz_wi · 9b619a8f
      Committed by Zhichao Huang
      pm_fake doesn't quite describe what the handler does (ignoring writes
      and returning 0 for reads).
      
      As we're about to use it (a lot) in a different context, rename it
      with an (admittedly cryptic) name that makes sense for all users.
      
      Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
      Reviewed-by: Alex Bennee <alex.bennee@linaro.org>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
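The behaviour the new name describes, read-as-zero and write-ignored, can be sketched as a tiny handler. The structure and function below are hypothetical simplifications written for this illustration; the kernel handler has a different signature and operates on its own parameter types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one trapped register access. */
struct sketch_reg_access {
    bool is_write;   /* true for a guest write, false for a read */
    uint64_t regval; /* value written, or slot for the value read */
};

/*
 * RAZ/WI: reads return 0, writes are dropped.
 * Returns true to signal that the access was handled.
 */
static bool sketch_trap_raz_wi(struct sketch_reg_access *p)
{
    if (!p->is_write)
        p->regval = 0; /* read-as-zero */
    /* write-ignored: nothing is stored anywhere */
    return true;
}
```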
    • Z
      KVM: arm: plug potential guest hardware debug leakage · 661e6b02
      Committed by Zhichao Huang
      Hardware debugging in guests is not currently intercepted, which means
      that a malicious guest can bring down the entire machine by writing
      to the debug registers.
      
      This patch enables trapping of all debug registers, preventing
      guests from accessing them. This includes access to the
      debug mode (DBGDSCR) in the guest world at all times, which could
      otherwise mess with the host state. Reads return 0 and writes are
      ignored (RAZ_WI).
      
      The result is that the guest cannot detect any working hardware-based
      debug support. As debug exceptions are still routed to the guest, normal
      debugging using software-based breakpoints still works.
      
      To support debugging using hardware registers we need to implement a
      debug register aware world switch as well as special trapping for
      registers that may affect the host state.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
    • S
      kvm: arm/arm64: Fix race in resetting stage2 PGD · 6c0d706b
      Committed by Suzuki K Poulose
      In kvm_free_stage2_pgd() we check the stage2 PGD before holding
      the lock and proceed to take the lock if it is valid. We then unmap
      the page tables and release the lock. We reset the PGD only after
      dropping this lock, which could cause a race condition where another
      thread, waiting on or even holding the lock, could see that the PGD
      is still valid, proceed to perform a stage2 operation, and later
      encounter a NULL PGD.
      
      [223090.242280] Unable to handle kernel NULL pointer dereference at virtual address 00000040
      [223090.262330] PC is at unmap_stage2_range+0x8c/0x428
      [223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
      [223090.262531] Call trace:
      [223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428
      [223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c
      [223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104
      [223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc
      [223090.262543] [<ffff0000080a2478>] kvm_mmu_notifier_invalidate_page+0x50/0x8c
      [223090.262547] [<ffff0000082274f8>] __mmu_notifier_invalidate_page+0x5c/0x84
      [223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0
      [223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0
      [223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4
      [223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac
      [223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac
      [223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0
      [223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290
      [223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac
      [223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4
      [223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254
      [223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538
      [223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4
      [223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c
      [223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8
      [223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c
      [223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c
      [223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774
      [223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac
      [223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648
      [223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768
      [223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4
      [223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4
      [223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c
      
      This patch moves the stage2 PGD manipulation under the lock.
      
      Reported-by: Alexander Graf <agraf@suse.de>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Christoffer Dall <cdall@linaro.org>
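The pattern behind this fix, performing both the validity check and the pointer reset inside the same critical section, can be illustrated with a user-space analogue. This is a sketch under stated assumptions, not the kernel code: `sketch_vm`, both functions, and the pthread mutex standing in for the kernel lock are all hypothetical.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical VM state; the mutex stands in for the kernel lock. */
struct sketch_vm {
    pthread_mutex_t lock;
    void *pgd; /* NULL once the tables have been torn down */
};

/* Teardown: check, unmap and reset all happen under the lock. */
static void sketch_free_pgd(struct sketch_vm *vm)
{
    void *pgd = NULL;

    pthread_mutex_lock(&vm->lock);
    if (vm->pgd != NULL) {
        /* unmapping of the whole range would happen here */
        pgd = vm->pgd;
        vm->pgd = NULL; /* reset before anyone else can take the lock */
    }
    pthread_mutex_unlock(&vm->lock);

    free(pgd); /* releasing the memory itself does not need the lock */
}

/* Concurrent user: re-checks the pointer after taking the lock. */
static int sketch_unmap_range(struct sketch_vm *vm)
{
    int done = 0;

    pthread_mutex_lock(&vm->lock);
    if (vm->pgd != NULL)
        done = 1; /* the tables are guaranteed to stay valid in here */
    pthread_mutex_unlock(&vm->lock);
    return done;
}
```

Since the pointer is cleared before the lock is dropped, a thread that acquires the lock afterwards sees NULL and backs off instead of walking freed tables.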