1. 14 11月, 2016 1 次提交
  2. 05 11月, 2016 1 次提交
    • M
      arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU · 94d0e598
      Marc Zyngier 提交于
      Architecturally, TLBs are private to the (physical) CPU they're
      associated with. But when multiple vcpus from the same VM are
      being multiplexed on the same CPU, the TLBs are not private
      to the vcpus (and are actually shared across the VMID).
      
      Let's consider the following scenario:
      
      - vcpu-0 maps PA to VA
      - vcpu-1 maps PA' to VA
      
      If run on the same physical CPU, vcpu-1 can hit TLB entries generated
      by vcpu-0 accesses, and access the wrong physical page.
      
      The solution to this is to keep a per-VM map of which vcpu ran last
      on each given physical CPU, and invalidate local TLBs when switching
      to a different vcpu from the same VM.
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      94d0e598
  3. 22 10月, 2016 1 次提交
    • M
      arm/arm64: KVM: Map the BSS at HYP · c8ea0395
      Marc Zyngier 提交于
      When used with a compiler that doesn't implement "asm goto"
      (such as the AArch64 port of GCC 4.8), jump labels generate a
      memory access to find out about the value of the key (instead
      of just patching the code). The key itself is likely to be
      stored in the BSS.
      
      This is perfectly fine, except that we don't map the BSS at HYP,
      leading to an exploding kernel at the first access. The obvious
      fix is simply to map the BSS there (which should have been done
      a long while ago, but hey...).
      Reported-by: NEric Auger <eric.auger@redhat.com>
      Tested-by: NEric Auger <eric.auger@redhat.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      c8ea0395
  4. 22 9月, 2016 1 次提交
  5. 16 9月, 2016 1 次提交
  6. 09 9月, 2016 1 次提交
    • S
      kvm-arm: Unmap shadow pagetables properly · 293f2936
      Suzuki K Poulose 提交于
      On arm/arm64, we depend on the kvm_unmap_hva* callbacks (via
      mmu_notifiers::invalidate_*) to unmap the stage2 pagetables when
      the userspace buffer gets unmapped. However, when the Hypervisor
      process exits without explicit unmap of the guest buffers, the only
      notifier we get is kvm_arch_flush_shadow_all() (via mmu_notifier::release
      ) which does nothing on arm. Later this causes us to access pages that
      were already released [via exit_mmap() -> unmap_vmas()] when we actually
      get to unmap the stage2 pagetable [via kvm_arch_destroy_vm() ->
      kvm_free_stage2_pgd()]. This triggers crashes with CONFIG_DEBUG_PAGEALLOC,
      which unmaps any free'd pages from the linear map.
      
       [  757.644120] Unable to handle kernel paging request at virtual address
        ffff800661e00000
       [  757.652046] pgd = ffff20000b1a2000
       [  757.655471] [ffff800661e00000] *pgd=00000047fffe3003, *pud=00000047fcd8c003,
        *pmd=00000047fcc7c003, *pte=00e8004661e00712
       [  757.666492] Internal error: Oops: 96000147 [#3] PREEMPT SMP
       [  757.672041] Modules linked in:
       [  757.675100] CPU: 7 PID: 3630 Comm: qemu-system-aar Tainted: G      D
       4.8.0-rc1 #3
       [  757.683240] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board,
        BIOS 3.06.15 Aug 19 2016
       [  757.692938] task: ffff80069cdd3580 task.stack: ffff8006adb7c000
       [  757.698840] PC is at __flush_dcache_area+0x1c/0x40
       [  757.703613] LR is at kvm_flush_dcache_pmd+0x60/0x70
       [  757.708469] pc : [<ffff20000809dbdc>] lr : [<ffff2000080b4a70>] pstate: 20000145
       ...
       [  758.357249] [<ffff20000809dbdc>] __flush_dcache_area+0x1c/0x40
       [  758.363059] [<ffff2000080b6748>] unmap_stage2_range+0x458/0x5f0
       [  758.368954] [<ffff2000080b708c>] kvm_free_stage2_pgd+0x34/0x60
       [  758.374761] [<ffff2000080b2280>] kvm_arch_destroy_vm+0x20/0x68
       [  758.380570] [<ffff2000080aa330>] kvm_put_kvm+0x210/0x358
       [  758.385860] [<ffff2000080aa524>] kvm_vm_release+0x2c/0x40
       [  758.391239] [<ffff2000082ad234>] __fput+0x114/0x2e8
       [  758.396096] [<ffff2000082ad46c>] ____fput+0xc/0x18
       [  758.400869] [<ffff200008104658>] task_work_run+0x108/0x138
       [  758.406332] [<ffff2000080dc8ec>] do_exit+0x48c/0x10e8
       [  758.411363] [<ffff2000080dd5fc>] do_group_exit+0x6c/0x130
       [  758.416739] [<ffff2000080ed924>] get_signal+0x284/0xa18
       [  758.421943] [<ffff20000808a098>] do_signal+0x158/0x860
       [  758.427060] [<ffff20000808aad4>] do_notify_resume+0x6c/0x88
       [  758.432608] [<ffff200008083624>] work_pending+0x10/0x14
       [  758.437812] Code: 9ac32042 8b010001 d1000443 8a230000 (d50b7e20)
      
      This patch fixes the issue by moving the kvm_free_stage2_pgd() to
      kvm_arch_flush_shadow_all().
      
      Cc: <stable@vger.kernel.org> # 3.9+
      Tested-by: NItaru Kitayama <itaru.kitayama@riken.jp>
      Reported-by: NItaru Kitayama <itaru.kitayama@riken.jp>
      Reported-by: NJames Morse <james.morse@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      293f2936
  7. 12 8月, 2016 1 次提交
  8. 19 7月, 2016 2 次提交
  9. 04 7月, 2016 3 次提交
  10. 01 7月, 2016 1 次提交
  11. 29 6月, 2016 3 次提交
  12. 27 6月, 2016 1 次提交
    • J
      KVM: arm/arm64: Stop leaking vcpu pid references · 591d215a
      James Morse 提交于
      kvm provides kvm_vcpu_uninit(), which amongst other things, releases the
      last reference to the struct pid of the task that was last running the vcpu.
      
      On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool,
      then killing it with SIGKILL results (after some considerable time) in:
      > cat /sys/kernel/debug/kmemleak
      > unreferenced object 0xffff80007d5ea080 (size 128):
      >  comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s)
      >  hex dump (first 32 bytes):
      >    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      >    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      >  backtrace:
      >    [<ffff8000001b30ec>] create_object+0xfc/0x278
      >    [<ffff80000071da34>] kmemleak_alloc+0x34/0x70
      >    [<ffff80000019fa2c>] kmem_cache_alloc+0x16c/0x1d8
      >    [<ffff8000000d0474>] alloc_pid+0x34/0x4d0
      >    [<ffff8000000b5674>] copy_process.isra.6+0x79c/0x1338
      >    [<ffff8000000b633c>] _do_fork+0x74/0x320
      >    [<ffff8000000b66b0>] SyS_clone+0x18/0x20
      >    [<ffff800000085cb0>] el0_svc_naked+0x24/0x28
      >    [<ffffffffffffffff>] 0xffffffffffffffff
      
      On x86 kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm(),
      on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free().
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Fixes: 749cf76c ("KVM: ARM: Initial skeleton to compile KVM support")
      Cc: <stable@vger.kernel.org> # 3.10+
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      591d215a
  13. 14 6月, 2016 1 次提交
  14. 20 5月, 2016 3 次提交
    • C
      KVM: arm/arm64: vgic-new: Synchronize changes to active state · 35a2d585
      Christoffer Dall 提交于
      When modifying the active state of an interrupt via the MMIO interface,
      we should ensure that the write has the intended effect.
      
      If a guest sets an interrupt to active, but that interrupt is already
      flushed into a list register on a running VCPU, then that VCPU will
      write the active state back into the struct vgic_irq upon returning from
      the guest and syncing its state.  This is a non-benign race, because the
      guest can observe that an interrupt is not active, and it can have a
      reasonable expectations that other VCPUs will not ack any IRQs, and then
      set the state to active, and expect it to stay that way.  Currently we
      are not honoring this case.
      
      Thefore, change both the SACTIVE and CACTIVE mmio handlers to stop the
      world, change the irq state, potentially queue the irq if we're setting
      it to active, and then continue.
      
      We take this chance to slightly optimize these functions by not stopping
      the world when touching private interrupts where there is inherently no
      possible race.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      35a2d585
    • C
      KVM: arm/arm64: Provide functionality to pause and resume a guest · b13216cf
      Christoffer Dall 提交于
      For some rare corner cases in our VGIC emulation later we have to stop
      the guest to make sure the VGIC state is consistent.
      Provide the necessary framework to pause and resume a guest.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      b13216cf
    • C
      KVM: arm/arm64: Move timer IRQ map to latest possible time · 41a54482
      Christoffer Dall 提交于
      We are about to modify the VGIC to allocate all data structures
      dynamically and store mapped IRQ information on a per-IRQ struct, which
      is indeed allocated dynamically at init time.
      
      Therefore, we cannot record the mapped IRQ info from the timer at timer
      reset time like it's done now, because VCPU reset happens before timer
      init.
      
      A possible later time to do this is on the first run of a per VCPU, it
      just requires us to move the enable state to be a per-VCPU state and do
      the lookup of the physical IRQ number when we are about to run the VCPU.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      41a54482
  15. 28 4月, 2016 1 次提交
    • A
      arm64: kvm: allows kvm cpu hotplug · 67f69197
      AKASHI Takahiro 提交于
      The current kvm implementation on arm64 does cpu-specific initialization
      at system boot, and has no way to gracefully shutdown a core in terms of
      kvm. This prevents kexec from rebooting the system at EL2.
      
      This patch adds a cpu tear-down function and also puts an existing cpu-init
      code into a separate function, kvm_arch_hardware_disable() and
      kvm_arch_hardware_enable() respectively.
      We don't need the arm64 specific cpu hotplug hook any more.
      
      Since this patch modifies common code between arm and arm64, one stub
      definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
      compilation errors.
      Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      [Rebase, added separate VHE init/exit path, changed resets use of
       kvm_call_hyp() to the __version, en/disabled hardware in init_subsystems(),
       added icache maintenance to __kvm_hyp_reset() and removed lr restore, removed
       guest-enter after teardown handling]
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      67f69197
  16. 21 4月, 2016 1 次提交
  17. 06 4月, 2016 1 次提交
  18. 31 3月, 2016 1 次提交
  19. 21 3月, 2016 1 次提交
  20. 01 3月, 2016 8 次提交
  21. 25 2月, 2016 1 次提交
    • M
      KVM: Use simple waitqueue for vcpu->wq · 8577370f
      Marcelo Tosatti 提交于
      The problem:
      
      On -rt, an emulated LAPIC timer instances has the following path:
      
      1) hard interrupt
      2) ksoftirqd is scheduled
      3) ksoftirqd wakes up vcpu thread
      4) vcpu thread is scheduled
      
      This extra context switch introduces unnecessary latency in the
      LAPIC path for a KVM guest.
      
      The solution:
      
      Allow waking up vcpu thread from hardirq context,
      thus avoiding the need for ksoftirqd to be scheduled.
      
      Normal waitqueues make use of spinlocks, which on -RT
      are sleepable locks. Therefore, waking up a waitqueue
      waiter involves locking a sleeping lock, which
      is not allowed from hard interrupt context.
      
      cyclictest command line:
      
      This patch reduces the average latency in my tests from 14us to 11us.
      
      Daniel writes:
      Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency
      benchmark on mainline. The test was run 1000 times on
      tip/sched/core 4.4.0-rc8-01134-g0905f04e:
      
        ./x86-run x86/tscdeadline_latency.flat -cpu host
      
      with idle=poll.
      
      The test seems not to deliver really stable numbers though most of
      them are smaller. Paolo write:
      
      "Anything above ~10000 cycles means that the host went to C1 or
      lower---the number means more or less nothing in that case.
      
      The mean shows an improvement indeed."
      
      Before:
      
                     min             max         mean           std
      count  1000.000000     1000.000000  1000.000000   1000.000000
      mean   5162.596000  2019270.084000  5824.491541  20681.645558
      std      75.431231   622607.723969    89.575700   6492.272062
      min    4466.000000    23928.000000  5537.926500    585.864966
      25%    5163.000000  1613252.750000  5790.132275  16683.745433
      50%    5175.000000  2281919.000000  5834.654000  23151.990026
      75%    5190.000000  2382865.750000  5861.412950  24148.206168
      max    5228.000000  4175158.000000  6254.827300  46481.048691
      
      After
                     min            max         mean           std
      count  1000.000000     1000.00000  1000.000000   1000.000000
      mean   5143.511000  2076886.10300  5813.312474  21207.357565
      std      77.668322   610413.09583    86.541500   6331.915127
      min    4427.000000    25103.00000  5529.756600    559.187707
      25%    5148.000000  1691272.75000  5784.889825  17473.518244
      50%    5160.000000  2308328.50000  5832.025000  23464.837068
      75%    5172.000000  2393037.75000  5853.177675  24223.969976
      max    5222.000000  3922458.00000  6186.720500  42520.379830
      
      [Patch was originaly based on the swait implementation found in the -rt
       tree. Daniel ported it to mainline's version and gathered the
       benchmark numbers for tscdeadline_latency test.]
      Signed-off-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-rt-users@vger.kernel.org
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      8577370f
  22. 19 2月, 2016 1 次提交
  23. 18 12月, 2015 2 次提交
  24. 14 12月, 2015 2 次提交