1. 26 Nov, 2014 1 commit
    • arm64: KVM: fix unmapping with 48-bit VAs · 7cbb87d6
      Mark Rutland committed
      Currently if using a 48-bit VA, tearing down the hyp page tables (which
      can happen in the absence of a GICH or GICV resource) results in the
      rather nasty splat below, evidently because we access a table that
      doesn't actually exist.
      
      Commit 38f791a4 (arm64: KVM: Implement 48 VA support for KVM EL2
      and Stage-2) added a pgd_none check to __create_hyp_mappings to account
      for the additional level of tables, but didn't add a corresponding check
      to unmap_range, and this seems to be the source of the problem.
      
      This patch adds the missing pgd_none check, ensuring we don't try to
      access tables that don't exist.
      
      Original splat below:
      
      kvm [1]: Using HYP init bounce page @83fe94a000
      kvm [1]: Cannot obtain GICH resource
      Unable to handle kernel paging request at virtual address ffff7f7fff000000
      pgd = ffff800000770000
      [ffff7f7fff000000] *pgd=0000000000000000
      Internal error: Oops: 96000004 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc2+ #89
      task: ffff8003eb500000 ti: ffff8003eb45c000 task.ti: ffff8003eb45c000
      PC is at unmap_range+0x120/0x580
      LR is at free_hyp_pgds+0xac/0xe4
      pc : [<ffff80000009b768>] lr : [<ffff80000009cad8>] pstate: 80000045
      sp : ffff8003eb45fbf0
      x29: ffff8003eb45fbf0 x28: ffff800000736000
      x27: ffff800000735000 x26: ffff7f7fff000000
      x25: 0000000040000000 x24: ffff8000006f5000
      x23: 0000000000000000 x22: 0000007fffffffff
      x21: 0000800000000000 x20: 0000008000000000
      x19: 0000000000000000 x18: ffff800000648000
      x17: ffff800000537228 x16: 0000000000000000
      x15: 000000000000001f x14: 0000000000000000
      x13: 0000000000000001 x12: 0000000000000020
      x11: 0000000000000062 x10: 0000000000000006
      x9 : 0000000000000000 x8 : 0000000000000063
      x7 : 0000000000000018 x6 : 00000003ff000000
      x5 : ffff800000744188 x4 : 0000000000000001
      x3 : 0000000040000000 x2 : ffff800000000000
      x1 : 0000007fffffffff x0 : 000000003fffffff
      
      Process swapper/0 (pid: 1, stack limit = 0xffff8003eb45c058)
      Stack: (0xffff8003eb45fbf0 to 0xffff8003eb460000)
      fbe0:                                     eb45fcb0 ffff8003 0009cad8 ffff8000
      fc00: 00000000 00000080 00736140 ffff8000 00736000 ffff8000 00000000 00007c80
      fc20: 00000000 00000080 006f5000 ffff8000 00000000 00000080 00743000 ffff8000
      fc40: 00735000 ffff8000 006d3030 ffff8000 006fe7b8 ffff8000 00000000 00000080
      fc60: ffffffff 0000007f fdac1000 ffff8003 fd94b000 ffff8003 fda47000 ffff8003
      fc80: 00502b40 ffff8000 ff000000 ffff7f7f fdec6000 00008003 fdac1630 ffff8003
      fca0: eb45fcb0 ffff8003 ffffffff 0000007f eb45fd00 ffff8003 0009b378 ffff8000
      fcc0: ffffffea 00000000 006fe000 ffff8000 00736728 ffff8000 00736120 ffff8000
      fce0: 00000040 00000000 00743000 ffff8000 006fe7b8 ffff8000 0050cd48 00000000
      fd00: eb45fd60 ffff8003 00096070 ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
      fd20: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00000000 00000000
      fd40: 00000ae0 00000000 006aa25c ffff8000 eb45fd60 ffff8003 0017ca44 00000002
      fd60: eb45fdc0 ffff8003 0009a33c ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
      fd80: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00735000 ffff8000
      fda0: 006d3090 ffff8000 006aa25c ffff8000 00735000 ffff8000 006d3030 ffff8000
      fdc0: eb45fdd0 ffff8003 000814c0 ffff8000 eb45fe50 ffff8003 006aaac4 ffff8000
      fde0: 006ddd90 ffff8000 00000006 00000000 006d3000 ffff8000 00000095 00000000
      fe00: 006a1e90 ffff8000 00735000 ffff8000 006d3000 ffff8000 006aa25c ffff8000
      fe20: 00735000 ffff8000 006d3030 ffff8000 eb45fe50 ffff8003 006fac68 ffff8000
      fe40: 00000006 00000006 fe293ee6 ffff8003 eb45feb0 ffff8003 004f8ee8 ffff8000
      fe60: 004f8ed4 ffff8000 00735000 ffff8000 00000000 00000000 00000000 00000000
      fe80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      fea0: 00000000 00000000 00000000 00000000 00000000 00000000 000843d0 ffff8000
      fec0: 004f8ed4 ffff8000 00000000 00000000 00000000 00000000 00000000 00000000
      fee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      ff00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      ff20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      ff40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000005 00000000
      ffe0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      Call trace:
      [<ffff80000009b768>] unmap_range+0x120/0x580
      [<ffff80000009cad4>] free_hyp_pgds+0xa8/0xe4
      [<ffff80000009b374>] kvm_arch_init+0x268/0x44c
      [<ffff80000009606c>] kvm_init+0x24/0x260
      [<ffff80000009a338>] arm_init+0x18/0x24
      [<ffff8000000814bc>] do_one_initcall+0x88/0x1a0
      [<ffff8000006aaac0>] kernel_init_freeable+0x148/0x1e8
      [<ffff8000004f8ee4>] kernel_init+0x10/0xd4
      Code: 8b000263 92628479 d1000720 eb01001f (f9400340)
      ---[ end trace 3bc230562e926fa4 ]---
      Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Jungseok Lee <jungseoklee85@gmail.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
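The missing check described in this commit can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: `toy_pgd`, `toy_pgd_none` and `toy_unmap_range` are hypothetical names, not the kernel's types or functions, and the two-level layout is a simplification. The point it shows is that the walk must test for an absent top-level entry before dereferencing the table below it.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy two-level table: each top-level slot points to a leaf table,
 * or is NULL when no table was ever allocated for that range. */
#define TOY_PGD_ENTRIES 4
#define TOY_PTE_ENTRIES 8

struct toy_pgd {
    uint64_t *tables[TOY_PGD_ENTRIES];
};

/* Hypothetical stand-in for a pgd_none()-style test. */
static int toy_pgd_none(const struct toy_pgd *pgd, size_t idx)
{
    return pgd->tables[idx] == NULL;
}

/* Clear every populated leaf entry in [start, end) and return how many
 * were cleared. The caller must keep end within
 * TOY_PGD_ENTRIES * TOY_PTE_ENTRIES. */
static size_t toy_unmap_range(struct toy_pgd *pgd, size_t start, size_t end)
{
    size_t cleared = 0;

    for (size_t addr = start; addr < end; addr++) {
        size_t idx = addr / TOY_PTE_ENTRIES;

        if (toy_pgd_none(pgd, idx)) {
            /* No table here: jump to the last slot of this top-level
             * entry so the loop increment lands on the next one. */
            addr = (idx + 1) * TOY_PTE_ENTRIES - 1;
            continue;
        }

        uint64_t *pte = &pgd->tables[idx][addr % TOY_PTE_ENTRIES];
        if (*pte != 0) {
            *pte = 0;
            cleared++;
        }
    }
    return cleared;
}
```

Without the `toy_pgd_none` test, the walk would dereference a NULL leaf table, which is the toy analogue of the fault in the splat above.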
  2. 15 Oct, 2014 1 commit
  3. 14 Oct, 2014 2 commits
    • arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE · c3058d5d
      Christoffer Dall committed
      When creating or moving a memslot, make sure the IPA space is within the
      addressable range of the guest.  Otherwise, user space can create too
      large a memslot and KVM would try to access potentially unallocated page
      table entries when inserting entries in the Stage-2 page tables.
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
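A range check of the kind this commit describes might look like the sketch below. The 40-bit size and the names `TOY_PHYS_SIZE` and `toy_memslot_fits` are assumptions made for illustration; the actual kernel check may be written differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: a 40-bit guest physical (IPA) space. */
#define TOY_PHYS_SHIFT 40
#define TOY_PHYS_SIZE  (UINT64_C(1) << TOY_PHYS_SHIFT)

/* Return true when [base_ipa, base_ipa + size) lies entirely inside the
 * addressable IPA space. The comparison is arranged so that
 * base_ipa + size is never computed and cannot overflow. */
static bool toy_memslot_fits(uint64_t base_ipa, uint64_t size)
{
    return base_ipa < TOY_PHYS_SIZE && size <= TOY_PHYS_SIZE - base_ipa;
}
```

Rejecting a slot up front, rather than clamping it, keeps later page-table code from ever seeing an address beyond the range its tables were sized for.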
    • arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2 · 38f791a4
      Christoffer Dall committed
      This patch adds the necessary support for all host kernel PGSIZE and
      VA_SPACE configuration options for both EL2 and the Stage-2 page tables.
      
      However, for 40-bit and 42-bit PARange systems, the architecture mandates
      that VTCR_EL2.SL0 is at most 1, resulting in fewer levels of Stage-2
      page tables than levels of host kernel page tables.  At the same time,
      for systems with a PARange > 42 bits, we limit the IPA range by always
      setting VTCR_EL2.T0SZ to 24.
      
      To solve the situation with different levels of page tables for Stage-2
      translation than the host kernel page tables, we allocate a dummy PGD
      with pointers to our actual initial level Stage-2 page table, in order
      for us to reuse the kernel pgtable manipulation primitives.  Reproducing
      all these in KVM does not look pretty and unnecessarily complicates the
      32-bit side.
      
      Systems with a PARange < 40bits are not yet supported.
      
       [ I have reworked this patch from its original form submitted by
         Jungseok to take the architecture constraints into consideration.
         There were too many changes from the original patch for me to
         preserve the authorship.  Thanks to Catalin Marinas for his help in
         figuring out a good solution to this challenge.  I have also fixed
         various bugs and missing error code handling from the original
         patch. - Christoffer ]
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
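The effect of fixing VTCR_EL2.T0SZ at 24 can be worked out directly: the translation input size is 64 minus T0SZ bits, so the IPA range is 40 bits, or 1 TiB. A small sketch of that arithmetic, with hypothetical helper names:

```c
#include <stdint.h>

/* Number of input address bits for a given T0SZ value.
 * Meaningful here for t0sz in 1..63. */
static unsigned toy_ipa_bits(unsigned t0sz)
{
    return 64u - t0sz;
}

/* Size in bytes of the address range that T0SZ selects. */
static uint64_t toy_ipa_size(unsigned t0sz)
{
    return UINT64_C(1) << toy_ipa_bits(t0sz);
}
```

With `t0sz` set to 24, `toy_ipa_bits` gives 40 and `toy_ipa_size` gives 2^40 bytes.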
  4. 13 Oct, 2014 1 commit
  5. 10 Oct, 2014 3 commits
  6. 26 Sep, 2014 1 commit
  7. 11 Sep, 2014 1 commit
  8. 28 Aug, 2014 1 commit
  9. 11 Jul, 2014 3 commits
  10. 28 Apr, 2014 1 commit
    • arm: KVM: fix possible misalignment of PGDs and bounce page · 5d4e08c4
      Mark Salter committed
      The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate
      a bounce page (if hypervisor init code crosses page boundary) and
      hypervisor PGDs. The problem is that kalloc() does not guarantee
      the proper alignment. In the case of the bounce page, the page sized
      buffer allocated may also cross a page boundary negating the purpose
      and leading to a hang during kvm initialization. Likewise the PGDs
      allocated may not meet the minimum alignment requirements of the
      underlying MMU. This patch uses __get_free_page() to guarantee the
      worst case alignment needs of the bounce page and PGDs on both arm
      and arm64.
      
      Cc: <stable@vger.kernel.org> # 3.10+
      Signed-off-by: Mark Salter <msalter@redhat.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
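To see why a page-sized buffer without page alignment defeats the purpose of a bounce page, consider this small check. It is a sketch only: the 4 KiB page size and the name `toy_crosses_page` are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages. */
#define TOY_PAGE_SIZE UINT64_C(4096)

/* Return true when the buffer [addr, addr + len) spans more than one
 * page, i.e. its first and last bytes fall in different pages. */
static bool toy_crosses_page(uint64_t addr, uint64_t len)
{
    if (len == 0)
        return false;
    return (addr / TOY_PAGE_SIZE) != ((addr + len - 1) / TOY_PAGE_SIZE);
}
```

A page-sized buffer starting at 0x1000 stays within one page, while the same buffer starting at 0x1800 straddles two. An allocator that hands out whole pages always produces the first case.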
  11. 03 Mar, 2014 4 commits
  12. 09 Jan, 2014 1 commit
    • arm/arm64: KVM: relax the requirements of VMA alignment for THP · 136d737f
      Marc Zyngier committed
      The THP code in KVM/ARM is a bit restrictive in not allowing a THP
      to be used if the VMA is not 2MB aligned. Actually, it is not so much
      the VMA that matters, but the associated memslot:
      
      A process can perfectly mmap a region with no particular alignment
      restriction, and then pass a 2MB aligned address to KVM. In this
      case, KVM will only use this 2MB aligned region, and will ignore
      the range between vma->vm_start and memslot->userspace_addr.
      
      It can also choose to place this memslot at whatever alignment it
      wants in the IPA space. In the end, what matters is the relative
      alignment of the user space and IPA mappings with respect to a
      2M page. They absolutely must be the same if you want to use THP.
      
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
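The "relative alignment" condition in this message can be expressed as a comparison of offsets within a 2MB block. The sketch below assumes a 2MB huge page size and uses the hypothetical name `toy_thp_compatible`; it is not the kernel's actual test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 2MB huge pages. */
#define TOY_PMD_SIZE (UINT64_C(1) << 21)

/* A huge page can back a mapping only if the user space address and the
 * IPA sit at the same offset within their respective 2MB blocks. */
static bool toy_thp_compatible(uint64_t userspace_addr, uint64_t ipa)
{
    uint64_t mask = TOY_PMD_SIZE - 1;

    return (userspace_addr & mask) == (ipa & mask);
}
```

Note that neither address has to be 2MB aligned on its own: two addresses that are both 4kB past a 2MB boundary still qualify, whereas an aligned IPA paired with an unaligned user space address does not.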
  13. 12 Dec, 2013 1 commit
  14. 17 Nov, 2013 1 commit
  15. 18 Oct, 2013 2 commits
  16. 14 Aug, 2013 1 commit
  17. 08 Aug, 2013 2 commits
    • arm64: KVM: fix 2-level page tables unmapping · 979acd5e
      Marc Zyngier committed
      When using 64kB pages, we only have two levels of page tables,
      meaning that PGD, PUD and PMD are fused. In this case, trying
      to refcount PUDs and PMDs independently is a complete disaster,
      as they are the same.
      
      We manage to get it right for the allocation (stage2_set_pte uses
      {pmd,pud}_none), but the unmapping path clears both pud and pmd
      refcounts, which fails spectacularly with 2-level page tables.
      
      The fix is to avoid calling clear_pud_entry when both the pmd and
      pud pages are empty. For this, and instead of introducing another
      pud_empty function, consolidate both pte_empty and pmd_empty into
      page_empty (the code is actually identical) and use that to also
      test the validity of the pud.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • ARM: KVM: Fix unaligned unmap_range leak · d3840b26
      Christoffer Dall committed
      The unmap_range function did not properly cover the case when the start
      address was not aligned to PMD_SIZE or PUD_SIZE and an entire pte table
      or pmd table was cleared, causing us to leak memory when incrementing
      the addr.
      
      The fix is to always move onto the next page table entry boundary
      instead of adding the full size of the VA range covered by the
      corresponding table level entry.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
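The difference between "adding the full size" and "moving onto the next boundary" is easiest to see with numbers. The helper below is a sketch with a hypothetical name, and it assumes `size` is a power of two.

```c
#include <stdint.h>

/* Return the first address above addr that is aligned to size.
 * Assumes size is a power of two. */
static uint64_t toy_next_boundary(uint64_t addr, uint64_t size)
{
    return (addr + size) & ~(size - 1);
}
```

With a 2MB table span and an unaligned start of 0x201000, `toy_next_boundary` yields 0x400000, the start of the next table. Simply adding the span would yield 0x401000, so the first 0x1000 bytes covered by the next table would be skipped and, in this reading of the commit, never cleaned up.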
  18. 27 Jun, 2013 1 commit
  19. 03 Jun, 2013 1 commit
  20. 29 Apr, 2013 6 commits
    • ARM: KVM: perform HYP initilization for hotplugged CPUs · d157f4a5
      Marc Zyngier committed
      Now that we have the necessary infrastructure to boot a hotplugged CPU
      at any point in time, wire a CPU notifier that will perform the HYP
      init for the incoming CPU.
      
      Note that this depends on the platform code and/or firmware to boot the
      incoming CPU with HYP mode enabled and return to the kernel by following
      the normal boot path (HYP stub installed).
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
    • ARM: KVM: switch to a dual-step HYP init code · 5a677ce0
      Marc Zyngier committed
      Our HYP init code suffers from two major design issues:
      - it cannot support CPU hotplug, as we tear down the idmap very early
      - it cannot perform a TLB invalidation when switching from init to
        runtime mappings, as pages are manipulated from PL1 exclusively
      
      The hotplug problem mandates that we keep two sets of page tables
      (boot and runtime). The TLB problem mandates that we're able to
      transition from one PGD to another while in HYP, invalidating the TLBs
      in the process.
      
      To be able to do this, we need to share a page between the two page
      tables. A page that will have the same VA in both configurations. All we
      need is a VA that has the following properties:
      - This VA can't be used to represent a kernel mapping.
      - This VA will not conflict with the physical address of the kernel text
      
      The vectors page seems to satisfy this requirement:
      - The kernel never maps anything else there
      - The kernel text being copied at the beginning of the physical memory,
        it is unlikely to use the last 64kB (I doubt we'll ever support KVM
        on a system with something like 4MB of RAM, but patches are very
        welcome).
      
      Let's call this VA the trampoline VA.
      
      Now, we map our init page at 3 locations:
      - idmap in the boot pgd
      - trampoline VA in the boot pgd
      - trampoline VA in the runtime pgd
      
      The init scenario is now the following:
      - We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
        runtime stack, runtime vectors
      - Enable the MMU with the boot pgd
      - Jump to a target into the trampoline page (remember, this is the same
        physical page!)
      - Now switch to the runtime pgd (same VA, and still the same physical
        page!)
      - Invalidate TLBs
      - Set stack and vectors
      - Profit! (or eret, if you only care about the code).
      
      Note that we keep the boot mapping permanently (it is not strictly an
      idmap anymore) to allow for CPU hotplug in later patches.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
    • ARM: KVM: rework HYP page table freeing · 4f728276
      Marc Zyngier committed
      There is no point in freeing HYP page tables differently from Stage-2.
      They now have the same requirements, and should be dealt with the same way.
      
      Promote unmap_stage2_range to be The One True Way, and get rid of a number
      of nasty bugs in the process (good thing we never actually called free_hyp_pmds
      before...).
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
    • ARM: KVM: move to a KVM provided HYP idmap · 2fb41059
      Marc Zyngier committed
      After the HYP page table rework, it is pretty easy to let the KVM
      code provide its own idmap, rather than expecting the kernel to
      provide it. It actually takes less code to do so.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
    • ARM: KVM: fix HYP mapping limitations around zero · 3562c76d
      Marc Zyngier committed
      The current code for creating HYP mapping doesn't like to wrap
      around zero, which prevents mapping anything into the last
      page of the virtual address space.
      
      It doesn't take much effort to remove this limitation, making
      the code more consistent with the rest of the kernel in the process.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
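One common way to make a range walk tolerate an end address that wraps to zero is to compare `boundary - 1` against `end - 1` and to loop until the address equals `end`, rather than testing `addr < end`. The sketch below shows that idiom on 32-bit addresses. It is an illustration with hypothetical names and an assumed 2MB step, not necessarily how this commit implements the fix.

```c
#include <stdint.h>

/* Assumed for illustration: the walk advances in 2MB steps. */
#define TOY_CHUNK UINT32_C(0x200000)

/* Next stop for the walk: the next chunk boundary, or end if that comes
 * first. Subtracting 1 from both sides makes end == 0 (the address just
 * past the top of a 32-bit space) compare as the largest value. */
static uint32_t toy_addr_end(uint32_t addr, uint32_t end)
{
    uint32_t boundary = (addr + TOY_CHUNK) & ~(TOY_CHUNK - 1);

    return (boundary - 1 < end - 1) ? boundary : end;
}

/* Count the steps needed to cover [start, end). Requires start != end;
 * end may be 0 to mean "up to the top of the address space". */
static unsigned toy_count_chunks(uint32_t start, uint32_t end)
{
    unsigned n = 0;
    uint32_t addr = start;

    do {
        addr = toy_addr_end(addr, end);
        n++;
    } while (addr != end);

    return n;
}
```

A loop written as `while (addr < end)` would never run when `end` is 0, which is the kind of limitation around the last page that the message describes.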
    • ARM: KVM: simplify HYP mapping population · 6060df84
      Marc Zyngier committed
      The way we populate HYP mappings is a bit convoluted, to say the least.
      Passing a pointer around to keep track of the current PFN is quite
      odd, and we end up having two different PTE accessors for no good
      reason.
      
      Simplify the whole thing by unifying the two PTE accessors, passing
      a pgprot_t around, and moving the various validity checks to the
      upper layers.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
  21. 07 Mar, 2013 5 commits