1. 14 Jun, 2016 16 commits
    • J
      MIPS: KVM: Restore host EBase from ebase variable · 878edf01
      James Hogan committed
      The host kernel's exception vector base address is currently saved in
      the VCPU structure at creation time, and restored on a guest exit.
      However it doesn't change and can already be easily accessed from the
      'ebase' variable (arch/mips/kernel/traps.c), so drop the host_ebase
      member of kvm_vcpu_arch, export the 'ebase' variable to modules and load
      from there instead.
      
      This does result in a single extra instruction (lui) on the guest exit
      path, but simplifies the code a bit and removes the redundant storage of
      the host exception base address.
      
      Credit for the idea goes to Cavium's VZ KVM implementation.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      878edf01
    • J
      MIPS: KVM: Drop unused hpa0/hpa1 args from function · 26ee17ff
      James Hogan committed
      The function kvm_mips_handle_mapped_seg_tlb_fault() has two completely
      unused pointer arguments, hpa0 and hpa1, for which all users always pass
      NULL.
      
      Drop these two arguments and update the callers.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      26ee17ff
    • J
      MIPS: KVM: Simplify even/odd TLB handling · 021df206
      James Hogan committed
      When handling TLB faults in the guest KSeg0 region, a pair of physical
      addresses are read from the guest physical address map. However that
      process is rather convoluted with an if/then/else statement. Simplify it
      to just clear the lowest bit for the even entry and set the lowest bit
      for the odd entry.
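      
      A minimal sketch of the even/odd selection described above. The names
      pfn_pair, pair_pfns and guest_pmap are illustrative assumptions, not
      identifiers taken from the patch:
      
      ```c
      #include <stdint.h>

      /* Hypothetical container for the even/odd pair of map entries. */
      struct pfn_pair {
          uint64_t even;
          uint64_t odd;
      };

      /*
       * Hypothetical helper: given a guest frame number, read the pair of
       * entries from a flat guest physical map without any if/else.
       */
      static struct pfn_pair pair_pfns(const uint64_t *guest_pmap, uint64_t gfn)
      {
          struct pfn_pair p;

          p.even = guest_pmap[gfn & ~1ull]; /* clear bit 0: even entry */
          p.odd  = guest_pmap[gfn | 1ull];  /* set bit 0: odd entry */
          return p;
      }
      ```
      
      With this form, an even and an odd frame number from the same pair both
      select the same two entries.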
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      021df206
    • J
      MIPS: KVM: Don't indirect KVM functions · 9befad23
      James Hogan committed
      Several KVM module functions are indirected so that they can be accessed
      from tlb.c which is statically built into the kernel. This is no longer
      necessary as the relevant bits of code have moved into mmu.c which is
      part of the KVM module, so drop the indirections.
      
      Note: is_error_pfn() is defined inline in kvm_host.h, so didn't actually
      require the KVM module to be loaded for it to work anyway.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9befad23
    • J
      MIPS: KVM: Move non-TLB handling code out of tlb.c · 403015b3
      James Hogan committed
      Various functions in tlb.c perform higher level MMU handling, but don't
      strictly need to be statically built into the kernel as they don't
      directly manipulate TLB entries. Move these functions out into a
      separate mmu.c which will be built into the KVM kernel module. This
      allows them to directly reference KVM functions in the KVM kernel module
      in future.
      
      Module exports of these functions have been removed, since they aren't
      needed outside of KVM.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      403015b3
    • J
      MIPS: KVM: Make various Cause variables 32-bit · 31cf7498
      James Hogan committed
      The CP0 Cause register is passed around in KVM quite a bit, often as an
      unsigned long, even though it is always 32 bits long.
      
      Resize it to u32 throughout MIPS KVM.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      31cf7498
    • J
      MIPS: KVM: Convert code to kernel sized types · 8cffd197
      James Hogan committed
      Convert the MIPS KVM C code to use standard kernel sized types (e.g.
      u32) instead of inttypes.h style ones (e.g. uint32_t) or other types as
      appropriate.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8cffd197
    • J
      MIPS: KVM: Convert headers to kernel sized types · bdb7ed86
      James Hogan committed
      Convert the MIPS kvm_host.h structs, function declaration prototypes and
      associated definition prototypes to use standard kernel sized types
      (e.g. u32) instead of inttypes.h style ones (e.g. uint32_t).
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bdb7ed86
    • J
      MIPS: KVM: Drop unused kvm_mips_sync_icache() · 2193c713
      James Hogan committed
      The function kvm_mips_sync_icache() is unused, so let's remove it.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2193c713
    • J
      MIPS: KVM: Drop unused host_cp0_entryhi · e4e94c0f
      James Hogan committed
      The host EntryHi in the KVM VCPU context is virtually unused. It gets
      stored on exceptions, but is only ever used in a kvm_debug() when a TLB
      miss occurs.
      
      Drop it entirely, removing that information from the kvm_debug output.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e4e94c0f
    • J
      MIPS: KVM: Drop unused guest_inst from kvm_vcpu_arch · d40dd9e8
      James Hogan committed
      The MIPS kvm_vcpu_arch::guest_inst isn't used, so drop it from the
      struct and drop its asm-offsets definition.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d40dd9e8
    • P
      Merge branch 'kvm-mips-fixes' into HEAD · b11c3f59
      Paolo Bonzini committed
      Merge MIPS patches destined to both 4.7 and kvm/next, to avoid
      unnecessary conflicts.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b11c3f59
    • J
      MIPS: KVM: Fix CACHE triggered exception emulation · 6df82a7b
      James Hogan committed
      When emulating TLB miss / invalid exceptions during CACHE instruction
      emulation, be sure to set up the correct PC and host_cp0_badvaddr state
      for the kvm_mips_emulate_tlb*_ld() functions to pick up for guest EPC
      and BadVAddr.
      
      PC needs to be rewound otherwise the guest EPC will end up pointing at
      the next instruction after the faulting CACHE instruction.
      
      host_cp0_badvaddr must be set because guest CACHE instructions trap with
      a Coprocessor Unusable exception, which doesn't update the host BadVAddr
      as a TLB exception would.
      
      This doesn't tend to get hit when dynamic translation of emulated
      instructions is enabled, since only the first execution of each CACHE
      instruction actually goes through this code path, with subsequent
      executions hitting the SYNCI instruction that it gets replaced with.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6df82a7b
    • J
      MIPS: KVM: Don't unwind PC when emulating CACHE · cc81e948
      James Hogan committed
      When a CACHE instruction is emulated by kvm_mips_emulate_cache(), the PC
      is first updated to point to the next instruction, and afterwards it
      falls through the "dont_update_pc" label, which rewinds the PC back to
      its original address.
      
      This works when dynamic translation of emulated instructions is enabled,
      since the CACHE instruction is replaced with a SYNCI which works without
      trapping, however when dynamic translation is disabled the guest hangs
      on CACHE instructions as they always trap and are never stepped over.
      
      Roughly swap the meanings of the "done" and "dont_update_pc" labels to
      match kvm_mips_emulate_CP0(), so that "done" will roll back the PC on
      failure, and "dont_update_pc" won't change PC at all (for the sake of
      exceptions that have already modified the PC).
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cc81e948
    • J
      MIPS: KVM: Include bit 31 in segment matches · 7f5a1ddc
      James Hogan committed
      When faulting guest addresses are matched against guest segments with
      the KVM_GUEST_KSEGX() macro, change the mask to 0xe0000000 so as to
      include bit 31.
      
      This is mainly for safety's sake, as it prevents a rogue BadVAddr in the
      host kseg2/kseg3 segments (e.g. 0xC*******) after a TLB exception from
      matching the guest kseg0 segment (e.g. 0x4*******), triggering an
      internal KVM error instead of allowing the corresponding guest kseg0
      page to be mapped into the host vmalloc space.
      
      Such a rogue BadVAddr was observed to happen with the host MIPS kernel
      running under QEMU with KVM built as a module, due to a not entirely
      transparent optimisation in the QEMU TLB handling. This has already been
      worked around properly in a previous commit.
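      
      The effect of including bit 31 can be illustrated with a small sketch.
      The value 0x60000000 for the previous mask is an assumption made for
      illustration (the message only names the new mask), and in_guest_kseg0
      is a hypothetical helper rather than the KVM_GUEST_KSEGX() macro itself:
      
      ```c
      #include <stdint.h>

      #define GUEST_KSEG0      0x40000000u /* guest kseg0 base (0x4*******) */
      #define OLD_SEGMENT_MASK 0x60000000u /* assumed old mask: ignores bit 31 */
      #define NEW_SEGMENT_MASK 0xe0000000u /* mask from the commit: includes bit 31 */

      /* Hypothetical helper: does the faulting address fall in guest kseg0? */
      static int in_guest_kseg0(uint32_t badvaddr, uint32_t mask)
      {
          return (badvaddr & mask) == GUEST_KSEG0;
      }
      ```
      
      Under these assumptions a host address such as 0xC0001000 matches guest
      kseg0 with the narrower mask but not with 0xe0000000, while a genuine
      guest kseg0 address such as 0x40001000 matches with either.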
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7f5a1ddc
    • J
      MIPS: KVM: Fix modular KVM under QEMU · 797179bc
      James Hogan committed
      Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
      get a TLB refill exception in it when KVM is built as a module.
      
      This was observed to happen with the host MIPS kernel running under
      QEMU, due to a not entirely transparent optimisation in the QEMU TLB
      handling where TLB entries replaced with TLBWR are copied to a separate
      part of the TLB array. Code in those pages continues to be executable,
      but those mappings persist only until the next ASID switch, even if they
      are marked global.
      
      An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
      switching to the guest exception base. Subsequent TLB mapped kernel
      instructions just prior to switching to the guest trigger a TLB refill
      exception, which enters the guest exception handlers without updating
      EPC. This appears as a guest triggered TLB refill on a host kernel
      mapped (host KSeg2) address, which is not handled correctly as user
      (guest) mode accesses to kernel (host) segments always generate address
      error exceptions.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      797179bc
  2. 03 Jun, 2016 5 commits
    • K
      kvm/x86: remove unnecessary header file inclusion · dca4d728
      Kai Huang committed
      arch/x86/kvm/iommu.c includes <linux/intel-iommu.h> and <linux/dmar.h>, both
      of which are unnecessary and in fact incorrect here, as they are Intel specific.
      
      KVM still builds on x86 after removing these inclusions.
      Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      dca4d728
    • P
      KVM: x86: protect KVM_CREATE_PIT/KVM_CREATE_PIT2 with kvm->lock · 250715a6
      Paolo Bonzini committed
      The syzkaller folks reported a NULL pointer dereference that seems
      to be caused by a race between KVM_CREATE_IRQCHIP and KVM_CREATE_PIT2.
      The former takes kvm->lock (except when registering the devices,
      which needs kvm->slots_lock); the latter takes kvm->slots_lock only.
      Change KVM_CREATE_PIT2 to follow the same model as KVM_CREATE_IRQCHIP.
      
      Testcase:
      
          #include <pthread.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
          #include <stdint.h>
          #include <string.h>
          #include <stdlib.h>
          #include <sys/syscall.h>
          #include <unistd.h>
      
          long r[23];
      
          void* thr1(void* arg)
          {
              struct kvm_pit_config pitcfg = { .flags = 4 };
              switch ((long)arg) {
              case 0: r[2]  = open("/dev/kvm", O_RDONLY|O_ASYNC);    break;
              case 1: r[3]  = ioctl(r[2], KVM_CREATE_VM, 0);         break;
              case 2: r[4]  = ioctl(r[3], KVM_CREATE_IRQCHIP, 0);    break;
              case 3: r[22] = ioctl(r[3], KVM_CREATE_PIT2, &pitcfg); break;
              }
              return 0;
          }
      
          int main(int argc, char **argv)
          {
              long i;
              pthread_t th[4];
      
              memset(r, -1, sizeof(r));
              for (i = 0; i < 4; i++) {
                  pthread_create(&th[i], 0, thr1, (void*)i);
                  if (argc > 1 && rand()%2) usleep(rand()%1000);
              }
              usleep(20000);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      250715a6
    • P
      KVM: x86: rename process_smi to enter_smm, process_smi_request to process_smi · ee2cd4b7
      Paolo Bonzini committed
      Make the function names more similar between KVM_REQ_NMI and KVM_REQ_SMI.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      ee2cd4b7
    • P
      KVM: x86: avoid simultaneous queueing of both IRQ and SMI · c43203ca
      Paolo Bonzini committed
      If the processor exits to KVM while delivering an interrupt,
      the hypervisor then requeues the interrupt for the next vmentry.
      Trying to enter SMM in this same window causes KVM to enter non-root
      mode in emulated SMM (i.e. with IF=0) and with a request to
      inject an IRQ (i.e. with a valid VM-entry interrupt info field).
      This is invalid guest state (SDM 26.3.1.4 "Check on Guest RIP
      and RFLAGS") and the processor fails vmentry.
      
      The fix is to defer the injection from KVM_REQ_SMI to KVM_REQ_EVENT,
      like we already do for e.g. NMIs.  This patch doesn't change the
      name of the process_smi function so that it can be applied to
      stable releases.  The next patch will modify the names so that
      process_nmi and process_smi handle respectively KVM_REQ_NMI and
      KVM_REQ_SMI.
      
      This is especially common with Windows, probably due to the
      self-IPI trick that it uses to deliver deferred procedure
      calls (DPCs).
      Reported-by: Laszlo Ersek <lersek@redhat.com>
      Reported-by: Michał Zegan <webczat_200@poczta.onet.pl>
      Fixes: 64d60670
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      c43203ca
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 4340fa55
      Linus Torvalds committed
      Pull KVM fixes from Radim Krčmář:
       "ARM:
         - two fixes for 4.6 vgic [Christoffer] (cc stable)
      
         - six fixes for 4.7 vgic [Marc]
      
        x86:
         - six fixes from syzkaller reports [Paolo] (two of them cc stable)
      
         - allow OS X to boot [Dmitry]
      
         - don't trust compilers [Nadav]"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS
        KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID
        KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi
        KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number
        KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID
        kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR
        KVM: Handle MSR_IA32_PERF_CTL
        KVM: x86: avoid write-tearing of TDP
        KVM: arm/arm64: vgic-new: Removel harmful BUG_ON
        arm64: KVM: vgic-v3: Relax synchronization when SRE==1
        arm64: KVM: vgic-v3: Prevent the guest from messing with ICC_SRE_EL1
        arm64: KVM: Make ICC_SRE_EL1 access return the configured SRE value
        KVM: arm/arm64: vgic-v3: Always resample level interrupts
        KVM: arm/arm64: vgic-v2: Always resample level interrupts
        KVM: arm/arm64: vgic-v3: Clear all dirty LRs
        KVM: arm/arm64: vgic-v2: Clear all dirty LRs
      4340fa55
  3. 02 Jun, 2016 12 commits
    • P
      KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS · d14bdb55
      Paolo Bonzini committed
      MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
      any of bits 63:32.  However, this is not detected at KVM_SET_DEBUGREGS
      time, and the next KVM_RUN oopses:
      
         general protection fault: 0000 [#1] SMP
         CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
         Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
         [...]
         Call Trace:
          [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm]
          [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
          [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
          [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
          [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
         RIP  [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40
          RSP <ffff88005836bd50>
      
      Testcase (beautified/reduced from syzkaller output):
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[8];
      
          int main()
          {
              struct kvm_debugregs dr = { 0 };
      
              r[2] = open("/dev/kvm", O_RDONLY);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
      
              memcpy(&dr,
                     "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72"
                     "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8"
                     "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9"
                     "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb",
                     48);
              r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr);
              r[6] = ioctl(r[4], KVM_RUN, 0);
          }
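      
      A hedged sketch of the kind of check that would reject such values at
      KVM_SET_DEBUGREGS time instead of faulting at the next KVM_RUN.
      check_debugregs is a hypothetical helper written for illustration, not
      the function changed by the patch:
      
      ```c
      #include <errno.h>
      #include <stdint.h>

      /*
       * Hypothetical validation: a MOV to DR6 or DR7 raises #GP if any of
       * bits 63:32 is set, so refuse such values up front.
       */
      static int check_debugregs(uint64_t dr6, uint64_t dr7)
      {
          const uint64_t high_bits = 0xffffffff00000000ull;

          if ((dr6 & high_bits) || (dr7 & high_bits))
              return -EINVAL;
          return 0;
      }
      ```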
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      d14bdb55
    • P
      KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID · f8c1b85b
      Paolo Bonzini committed
      This causes an ugly dmesg splat.  Beautified syzkaller testcase:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <sys/ioctl.h>
          #include <fcntl.h>
          #include <linux/kvm.h>
      
          long r[8];
      
          int main()
          {
              struct kvm_irq_routing ir = { 0 };
              r[2] = open("/dev/kvm", O_RDWR);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_SET_GSI_ROUTING, &ir);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      f8c1b85b
    • P
      KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi · c622a3c2
      Paolo Bonzini committed
      Found by syzkaller:
      
          BUG: unable to handle kernel NULL pointer dereference at 0000000000000120
          IP: [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm]
          PGD 6f80b067 PUD b6535067 PMD 0
          Oops: 0000 [#1] SMP
          CPU: 3 PID: 4988 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
          [...]
          Call Trace:
           [<ffffffffa0795f62>] irqfd_update+0x32/0xc0 [kvm]
           [<ffffffffa0796c7c>] kvm_irqfd+0x3dc/0x5b0 [kvm]
           [<ffffffffa07943f4>] kvm_vm_ioctl+0x164/0x6f0 [kvm]
           [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
           [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
           [<ffffffff817a1062>] tracesys_phase2+0x84/0x89
          Code: b5 71 a7 e0 5b 41 5c 41 5d 5d f3 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 10 2e 00 00 31 c0 48 89 e5 <39> 91 20 01 00 00 76 6a 48 63 d2 48 8b 94 d1 28 01 00 00 48 85
          RIP  [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm]
           RSP <ffff8800926cbca8>
          CR2: 0000000000000120
      
      Testcase:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[26];
      
          int main()
          {
              memset(r, -1, sizeof(r));
              r[2] = open("/dev/kvm", 0);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
      
              struct kvm_irqfd ifd;
              ifd.fd = syscall(SYS_eventfd2, 5, 0);
              ifd.gsi = 3;
              ifd.flags = 2;
              ifd.resamplefd = ifd.fd;
              r[25] = ioctl(r[3], KVM_IRQFD, &ifd);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      c622a3c2
    • P
      KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number · 78e546c8
      Paolo Bonzini committed
      This cannot be returned by KVM_GET_VCPU_EVENTS, so it is okay to return
      EINVAL.  It causes a WARN from exception_type:
      
          WARNING: CPU: 3 PID: 16732 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]()
          CPU: 3 PID: 16732 Comm: a.out Tainted: G        W       4.4.6-300.fc23.x86_64 #1
          Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
           0000000000000286 000000006308a48b ffff8800bec7fcf8 ffffffff813b542e
           0000000000000000 ffffffffa0966496 ffff8800bec7fd30 ffffffff810a40f2
           ffff8800552a8000 0000000000000000 00000000002c267c 0000000000000001
          Call Trace:
           [<ffffffff813b542e>] dump_stack+0x63/0x85
           [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
           [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
           [<ffffffffa0924809>] exception_type+0x49/0x50 [kvm]
           [<ffffffffa0934622>] kvm_arch_vcpu_ioctl_run+0x10a2/0x14e0 [kvm]
           [<ffffffffa091c04d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
           [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
           [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
           [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
          ---[ end trace b1a0391266848f50 ]---
      
      Testcase (beautified/reduced from syzkaller output):
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>
      
          long r[31];
      
          int main()
          {
              memset(r, -1, sizeof(r));
              r[2] = open("/dev/kvm", O_RDONLY);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[7] = ioctl(r[3], KVM_CREATE_VCPU, 0);
      
              struct kvm_vcpu_events ve = {
                      .exception.injected = 1,
                      .exception.nr = 0xd4
              };
              r[27] = ioctl(r[7], KVM_SET_VCPU_EVENTS, &ve);
              r[30] = ioctl(r[7], KVM_RUN, 0);
              return 0;
          }
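      
      One way to express such a check, as a sketch only. It assumes that x86
      architectural exceptions occupy vectors 0 to 31; check_injected_exception
      is a hypothetical name, and the real patch may reject additional cases:
      
      ```c
      #include <errno.h>
      #include <stdint.h>

      /*
       * Hypothetical validation: an injected exception must use one of the
       * architectural exception vectors (0..31, by assumption), otherwise
       * the request is refused with -EINVAL.
       */
      static int check_injected_exception(int injected, uint8_t nr)
      {
          if (injected && nr > 31)
              return -EINVAL;
          return 0;
      }
      ```
      
      With this check the vector 0xd4 used in the testcase would be refused
      when KVM_SET_VCPU_EVENTS is called, before KVM_RUN can reach the WARN.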
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      78e546c8
    • P
      KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID · 83676e92
      Paolo Bonzini committed
      This causes an ugly dmesg splat.  Beautified syzkaller testcase:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <sys/ioctl.h>
          #include <fcntl.h>
          #include <linux/kvm.h>
      
          long r[8];
      
          int main()
          {
              struct kvm_cpuid2 c = { 0 };
              r[2] = open("/dev/kvm", O_RDWR);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_CREATE_VCPU, 0x8);
              r[7] = ioctl(r[4], KVM_SET_CPUID, &c);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      83676e92
    • P
      kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR · b21629da
      Paolo Bonzini committed
      Found by syzkaller:
      
          WARNING: CPU: 3 PID: 15175 at arch/x86/kvm/x86.c:7705 __x86_set_memory_region+0x1dc/0x1f0 [kvm]()
          CPU: 3 PID: 15175 Comm: a.out Tainted: G        W       4.4.6-300.fc23.x86_64 #1
          Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
           0000000000000286 00000000950899a7 ffff88011ab3fbf0 ffffffff813b542e
           0000000000000000 ffffffffa0966496 ffff88011ab3fc28 ffffffff810a40f2
           00000000000001fd 0000000000003000 ffff88014fc50000 0000000000000000
          Call Trace:
           [<ffffffff813b542e>] dump_stack+0x63/0x85
           [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
           [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
           [<ffffffffa09251cc>] __x86_set_memory_region+0x1dc/0x1f0 [kvm]
           [<ffffffffa092521b>] x86_set_memory_region+0x3b/0x60 [kvm]
           [<ffffffffa09bb61c>] vmx_set_tss_addr+0x3c/0x150 [kvm_intel]
           [<ffffffffa092f4d4>] kvm_arch_vm_ioctl+0x654/0xbc0 [kvm]
           [<ffffffffa091d31a>] kvm_vm_ioctl+0x9a/0x6f0 [kvm]
           [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
           [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
           [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
      
      Testcase:
      
          #include <unistd.h>
          #include <sys/ioctl.h>
          #include <fcntl.h>
          #include <string.h>
          #include <linux/kvm.h>
      
          long r[8];
      
          int main()
          {
              memset(r, -1, sizeof(r));
              r[2] = open("/dev/kvm", O_RDONLY|O_TRUNC);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
              r[5] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
              r[7] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      b21629da
    • D
      KVM: Handle MSR_IA32_PERF_CTL · 0c2df2a1
      Dmitry Bilunov committed
      Intel CPUs with the Turbo Boost feature implement an MSR to provide a
      control interface via rdmsr/wrmsr instructions. One could detect the
      presence of this feature by issuing one of these instructions and
      handling the #GP exception which is generated in case the referenced MSR
      is not implemented by the CPU.
      
      KVM's vCPU model behaves exactly as a real CPU in this case by injecting
      a fault when MSR_IA32_PERF_CTL (which KVM does not support) is accessed.
      However, some operating systems use this register during an early boot
      stage in which their kernel is not capable of handling #GP correctly,
      causing #DF and finally a triple fault, effectively resetting the vCPU.
      
      This patch implements a dummy handler for MSR_IA32_PERF_CTL to avoid the
      crashes.
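      
      A toy model of what such a dummy handler might look like. The MSR index
      0x199 is the architectural number of IA32_PERF_CTL; toy_wrmsr and
      toy_rdmsr are hypothetical names that do not mirror KVM's internal MSR
      interface, and the choice to read back 0 is an assumption:
      
      ```c
      #include <stdint.h>

      #define MSR_IA32_PERF_CTL 0x199u

      /* Returns 0 if the write is handled, 1 if the caller should inject #GP. */
      static int toy_wrmsr(uint32_t msr, uint64_t data)
      {
          (void)data; /* the dummy handler discards the written value */

          switch (msr) {
          case MSR_IA32_PERF_CTL:
              return 0;
          default:
              return 1;
          }
      }

      /* Returns 0 and stores a value if handled, 1 if the caller should inject #GP. */
      static int toy_rdmsr(uint32_t msr, uint64_t *data)
      {
          switch (msr) {
          case MSR_IA32_PERF_CTL:
              *data = 0; /* assumed: reads return zero */
              return 0;
          default:
              return 1;
          }
      }
      ```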
      Signed-off-by: Dmitry Bilunov <kmeaw@yandex-team.ru>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      0c2df2a1
    • N
      KVM: x86: avoid write-tearing of TDP · b19ee2ff
      Nadav Amit committed
      In theory, nothing prevents the compiler from write-tearing PTEs, or
      splitting PTE writes. These partially-modified PTEs can be fetched by other
      cores and cause mayhem. I have not really encountered such a case in
      real life, but it does seem possible.
      
      For example, the compiler may try to do something creative for
      kvm_set_pte_rmapp() and perform multiple writes to the PTE.
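      
      A minimal user-space sketch of the usual remedy: performing the store
      through a volatile-qualified pointer, in the spirit of the kernel's
      WRITE_ONCE(), so the compiler emits it as one access of the full width.
      write_pte_once is a hypothetical helper, not the code from the patch,
      and whether the resulting store is atomic still depends on the target:
      
      ```c
      #include <stdint.h>

      /*
       * Hypothetical helper: store a 64-bit PTE value with a single volatile
       * access, which discourages the compiler from splitting or repeating
       * the write.
       */
      static inline void write_pte_once(uint64_t *ptep, uint64_t val)
      {
          *(volatile uint64_t *)ptep = val;
      }
      ```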
      Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      b19ee2ff
    • R
      Merge tag 'kvm-arm-for-v4.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm · 13e98fd1
      Radim Krčmář committed
      KVM/ARM Fixes for v4.7-rc2
      
      Fixes for the vgic: two of the patches address a bug introduced in v4.6,
      while the rest are for the new vgic.
      13e98fd1
    • M
      KVM: arm/arm64: vgic-new: Remove harmful BUG_ON · 05fb05a6
      Marc Zyngier committed
      When changing the active bit from an MMIO trap, we decide to
      explode if the intid is that of a private interrupt.
      
      This flawed logic comes from the fact that we were assuming that
      kvm_vcpu_kick() as called by kvm_arm_halt_vcpu() would not return before
      the called vcpu responded, but this is not the case, so we need to
      perform this wait even for private interrupts.
      
      Dropping the BUG_ON seems like the right thing to do.
      
       [ Commit message tweaked by Christoffer ]
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      05fb05a6
    • L
      Merge tag 'pinctrl-v4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 719af93a
      Linus Torvalds committed
      Pull pin control fixes from Linus Walleij:
       "Here are three pin control fixes for v4.7.  Not much, and just driver
        fixes:
      
         - add device tree matches to MAINTAINERS
      
         - inversion bug in the Nomadik driver
      
         - dual edge handling bug in the mediatek driver"
      
      * tag 'pinctrl-v4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: mediatek: fix dual-edge code defect
        MAINTAINERS: Add file patterns for pinctrl device tree bindings
        pinctrl: nomadik: fix inversion of gpio direction
      719af93a
    • L
      Merge tag 'dma-buf-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf · ebb8cb2b
      Linus Torvalds committed
      Pull dma-buf updates from Sumit Semwal:
      
       - use of vma_pages instead of explicit computation
      
       - DocBook and headerdoc updates for dma-buf
      
      * tag 'dma-buf-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf:
        dma-buf: use vma_pages()
        fence: add missing descriptions for fence
        doc: update/fixup dma-buf related DocBook
        reservation: add headerdoc comments
        dma-buf: headerdoc fixes
      ebb8cb2b
  4. 01 Jun, 2016 7 commits
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 6b15d665
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix negative error code usage in ATM layer, from Stefan Hajnoczi.
      
       2) If CONFIG_SYSCTL is disabled, the default TTL is not initialized
          properly.  From Ezequiel Garcia.
      
       3) Missing spinlock init in mvneta driver, from Gregory CLEMENT.
      
       4) Missing unlocks in hwmb error paths, also from Gregory CLEMENT.
      
       5) Fix deadlock on team->lock when propagating features, from Ivan
          Vecera.
      
       6) Work around buffer offset hw bug in alx chips, from Feng Tang.
      
       7) Fix double listing of SCTP entries in sctp_diag dumps, from Xin
          Long.
      
       8) Various statistics bug fixes in mlx4 from Eric Dumazet.
      
       9) Fix some randconfig build errors wrt fou ipv6 from Arnd Bergmann.
      
      10) All of l2tp was namespace aware, but the ipv6 support code was not
          doing so.  From Shmulik Ladkani.
      
      11) Handle on-stack hrtimers properly in pktgen, from Guenter Roeck.
      
      12) Propagate MAC changes properly through VLAN devices, from Mike
          Manning.
      
      13) Fix memory leak in bnx2x_init_one(), from Vitaly Kuznetsov.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
        sfc: Track RPS flow IDs per channel instead of per function
        usbnet: smsc95xx: fix link detection for disabled autonegotiation
        virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv
        bnx2x: avoid leaking memory on bnx2x_init_one() failures
        fou: fix IPv6 Kconfig options
        openvswitch: update checksum in {push,pop}_mpls
        sctp: sctp_diag should dump sctp socket type
        net: fec: update dirty_tx even if no skb
        vlan: Propagate MAC address to VLANs
        atm: iphase: off by one in rx_pkt()
        atm: firestream: add more reserved strings
        vxlan: Accept user specified MTU value when create new vxlan link
        net: pktgen: Call destroy_hrtimer_on_stack()
        timer: Export destroy_hrtimer_on_stack()
        net: l2tp: Make l2tp_ip6 namespace aware
        Documentation: ip-sysctl.txt: clarify secure_redirects
        sfc: use flow dissector helpers for aRFS
        ieee802154: fix logic error in ieee802154_llsec_parse_dev_addr
        net: nps_enet: Disable interrupts before napi reschedule
        net/lapb: use %*ph to dump buffers
        ...
      6b15d665
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 58c1f995
      Linus Torvalds committed
      Pull sparc fixes from David Miller:
       "sparc64 mmu context allocation and trap return bug fixes"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Fix return from trap window fill crashes.
        sparc: Harden signal return frame checks.
        sparc64: Take ctx_alloc_lock properly in hugetlb_setup().
      58c1f995
    • J
      sfc: Track RPS flow IDs per channel instead of per function · faf8dcc1
      Jon Cooper committed
      Otherwise we get confused when two flows on different channels get the
      same flow ID.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      faf8dcc1
    • C
      usbnet: smsc95xx: fix link detection for disabled autonegotiation · d69d1694
      Christoph Fritz committed
      To detect link status up/down for connections where autonegotiation is
      explicitly disabled, we don't get an irq but need to poll the status
      register for link up/down detection.
      This patch adds a workqueue to poll for link status.
      Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d69d1694
    • W
      virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv · f00e35e2
      wangyunjian committed
      The functions virtnet_open() and virtnet_probe() may both execute
      try_fill_recv() at the same time. The VQ in virtqueue_add() is not
      properly protected, and a BUG_ON is triggered when virtio_net.ko is
      being removed.
      Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f00e35e2
    • V
      bnx2x: avoid leaking memory on bnx2x_init_one() failures · bae5499c
      Vitaly Kuznetsov committed
      bnx2x_init_bp() allocates memory with bnx2x_alloc_mem_bp() so if we
      fail later in bnx2x_init_one() we need to free this memory
      with bnx2x_free_mem_bp() to avoid leakages. E.g. I'm observing memory
      leaks reported by kmemleak when a failure (unrelated) happens in
      bnx2x_vfpf_acquire().
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bae5499c
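The leak fix described in the bnx2x commit above follows the usual kernel pattern of unwinding on a late failure. The sketch below models that pattern in user space; every name in it (`toy_dev`, `toy_alloc_mem`, `toy_free_mem`, `toy_init_one`, `toy_live_allocs`) is hypothetical and merely stands in for the driver's allocation, free, and probe routines.

```c
#include <stdlib.h>

struct toy_dev {
    void *mem;
};

/* Counts outstanding allocations so a leak is observable. */
static int toy_live_allocs;

static int toy_alloc_mem(struct toy_dev *d)
{
    d->mem = malloc(64);
    if (!d->mem)
        return -1;
    toy_live_allocs++;
    return 0;
}

static void toy_free_mem(struct toy_dev *d)
{
    free(d->mem);
    d->mem = NULL;
    toy_live_allocs--;
}

/* fail_late simulates an unrelated failure after the allocation. */
static int toy_init_one(struct toy_dev *d, int fail_late)
{
    int rc = toy_alloc_mem(d);

    if (rc)
        return rc;

    if (fail_late) {
        rc = -1;
        goto err_free; /* without this unwind the memory would leak */
    }
    return 0;

err_free:
    toy_free_mem(d);
    return rc;
}
```

Calling `toy_init_one(&dev, 1)` returns an error and leaves `toy_live_allocs` at zero, which is the property a tool like kmemleak would check for in the real driver.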
    • A
      fou: fix IPv6 Kconfig options · 95e4daa8
      Arnd Bergmann committed
      The Kconfig options I added to work around broken compilation ended
      up screwing up things more, as I used the wrong symbol to control
      compilation of the file, resulting in IPv6 fou support never being built
      into the kernel.
      
      Changing CONFIG_NET_FOU_IPV6_TUNNELS to CONFIG_IPV6_FOU fixes that
      problem: I had renamed the symbol in one location but not the other,
      and as the file is never used by other kernel code, this did not
      lead to a build failure that I would have caught.
      
      After that fix, another issue with the same patch becomes obvious, as we
      'select INET6_TUNNEL', which is related to IPV6_TUNNEL, but not the same,
      and this can still cause the original build failure when IPV6_TUNNEL is
      not built-in but IPV6_FOU is. The fix is equally trivial, we just need
      to select the right symbol.
      
      I have successfully built 350 randconfig kernels with this patch
      and verified that the driver is now being built.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reported-by: Valentin Rothberg <valentinrothberg@gmail.com>
      Fixes: fabb13db ("fou: add Kconfig options for IPv6 support")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95e4daa8