1. 15 12月, 2017 1 次提交
    • L
      KVM/x86: Check input paging mode when cs.l is set · f2981033
      Lan Tianyu 提交于
      Reported by syzkaller:
          WARNING: CPU: 0 PID: 27962 at arch/x86/kvm/emulate.c:5631 x86_emulate_insn+0x557/0x15f0 [kvm]
          Modules linked in: kvm_intel kvm [last unloaded: kvm]
          CPU: 0 PID: 27962 Comm: syz-executor Tainted: G    B   W        4.15.0-rc2-next-20171208+ #32
          Hardware name: Intel Corporation S1200SP/S1200SP, BIOS S1200SP.86B.01.03.0006.040720161253 04/07/2016
          RIP: 0010:x86_emulate_insn+0x557/0x15f0 [kvm]
          RSP: 0018:ffff8807234476d0 EFLAGS: 00010282
          RAX: 0000000000000000 RBX: ffff88072d0237a0 RCX: ffffffffa0065c4d
          RDX: 1ffff100e5a046f9 RSI: 0000000000000003 RDI: ffff88072d0237c8
          RBP: ffff880723447728 R08: ffff88072d020000 R09: ffffffffa008d240
          R10: 0000000000000002 R11: ffffed00e7d87db3 R12: ffff88072d0237c8
          R13: ffff88072d023870 R14: ffff88072d0238c2 R15: ffffffffa008d080
          FS:  00007f8a68666700(0000) GS:ffff880802200000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000002009506c CR3: 000000071fec4005 CR4: 00000000003626f0
          Call Trace:
           x86_emulate_instruction+0x3bc/0xb70 [kvm]
           ? reexecute_instruction.part.162+0x130/0x130 [kvm]
           vmx_handle_exit+0x46d/0x14f0 [kvm_intel]
           ? trace_event_raw_event_kvm_entry+0xe7/0x150 [kvm]
           ? handle_vmfunc+0x2f0/0x2f0 [kvm_intel]
           ? wait_lapic_expire+0x25/0x270 [kvm]
           vcpu_enter_guest+0x720/0x1ef0 [kvm]
           ...
      
      When CS.L is set, vcpu should run in the 64 bit paging mode.
      Current kvm set_sregs function doesn't have such check when
      userspace inputs sreg values. This will lead unexpected behavior.
      This patch is to add checks for CS.L, EFER.LME, EFER.LMA and
      CR4.PAE when get SREG inputs from userspace in order to avoid
      unexpected behavior.
      Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: NTianyu Lan <tianyu.lan@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f2981033
  2. 14 12月, 2017 3 次提交
    • P
      kvm: x86: fix WARN due to uninitialized guest FPU state · 5663d8f9
      Peter Xu 提交于
      ------------[ cut here ]------------
       Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers.
       WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90
       CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G    B      OE    4.15.0-rc2+ #10
       RIP: 0010:ex_handler_fprestore+0x88/0x90
       Call Trace:
        fixup_exception+0x4e/0x60
        do_general_protection+0xff/0x270
        general_protection+0x22/0x30
       RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm]
       RSP: 0018:ffff8803d5627810 EFLAGS: 00010246
        kvm_vcpu_reset+0x3b4/0x3c0 [kvm]
        kvm_apic_accept_events+0x1c0/0x240 [kvm]
        kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm]
        kvm_vcpu_ioctl+0x479/0x880 [kvm]
        do_vfs_ioctl+0x142/0x9a0
        SyS_ioctl+0x74/0x80
        do_syscall_64+0x15f/0x600
      
      where kvm_put_guest_fpu is called without a prior kvm_load_guest_fpu.
      To fix it, move kvm_load_guest_fpu to the very beginning of
      kvm_arch_vcpu_ioctl_run.
      
      Cc: stable@vger.kernel.org
      Fixes: f775b13eSigned-off-by: NPeter Xu <peterx@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      5663d8f9
    • W
      KVM: X86: Fix load RFLAGS w/o the fixed bit · d73235d1
      Wanpeng Li 提交于
       *** Guest State ***
       CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7
       CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffe871
       CR3 = 0x00000000fffbc000
       RSP = 0x0000000000000000  RIP = 0x0000000000000000
       RFLAGS=0x00000000         DR7 = 0x0000000000000400
              ^^^^^^^^^^
      
      The failed vmentry is triggered by the following testcase when ept=Y:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[5];
          int main()
          {
          	r[2] = open("/dev/kvm", O_RDONLY);
          	r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
          	r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
          	struct kvm_regs regs = {
          		.rflags = 0,
          	};
          	ioctl(r[4], KVM_SET_REGS, &regs);
          	ioctl(r[4], KVM_RUN, 0);
          }
      
      X86 RFLAGS bit 1 is fixed set, userspace can simply clearing bit 1
      of RFLAGS with KVM_SET_REGS ioctl which results in vmentry fails.
      This patch fixes it by oring X86_EFLAGS_FIXED during ioctl.
      
      Cc: stable@vger.kernel.org
      Suggested-by: NJim Mattson <jmattson@google.com>
      Reviewed-by: NDavid Hildenbrand <david@redhat.com>
      Reviewed-by: NQuan Xu <quan.xu0@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d73235d1
    • W
      KVM: MMU: Fix infinite loop when there is no available mmu page · ed52870f
      Wanpeng Li 提交于
      The below test case can cause infinite loop in kvm when ept=0.
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[5];
          int main()
          {
          	r[2] = open("/dev/kvm", O_RDONLY);
          	r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
          	r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
          	ioctl(r[4], KVM_RUN, 0);
          }
      
      It doesn't setup the memory regions, mmu_alloc_shadow/direct_roots() in
      kvm return 1 when kvm fails to allocate root page table which can result
      in beblow infinite loop:
      
          vcpu_run() {
          	for (;;) {
      	    	r = vcpu_enter_guest()::kvm_mmu_reload() returns 1
      	    	if (r <= 0)
      	    		break;
      	    	if (need_resched())
      	    		cond_resched();
            }
          }
      
      This patch fixes it by returning -ENOSPC when there is no available kvm mmu
      page for root page table.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 26eeb53c (KVM: MMU: Bail out immediately if there is no available mmu page)
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ed52870f
  3. 09 12月, 2017 2 次提交
  4. 07 12月, 2017 10 次提交
    • A
      ARM: omap2: hide omap3_save_secure_ram on non-OMAP3 builds · 863204cf
      Arnd Bergmann 提交于
      In configurations without CONFIG_OMAP3 but with secure RAM support,
      we now run into a link failure:
      
      arch/arm/mach-omap2/omap-secure.o: In function `omap3_save_secure_ram':
      omap-secure.c:(.text+0x130): undefined reference to `save_secure_ram_context'
      
      The omap3_save_secure_ram() function is only called from the OMAP34xx
      power management code, so we can simply hide that function in the
      appropriate #ifdef.
      
      Fixes: d09220a8 ("ARM: OMAP2+: Fix SRAM virt to phys translation for save_secure_ram_context")
      Acked-by: NTony Lindgren <tony@atomide.com>
      Tested-by: NDan Murphy <dmurphy@ti.com>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      863204cf
    • R
      arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv · c5bbf358
      Rob Herring 提交于
      "usb-nop-xceiv" is using the phy binding, but is missing #phy-cells
      property. This is probably because the binding was the precursor to the phy
      binding.
      
      Fixes the following warning in nspire dts files:
      
      Warning (phys_property): Missing property '#phy-cells' in node ...
      Signed-off-by: NRob Herring <robh@kernel.org>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      c5bbf358
    • H
      s390: fix compat system call table · e779498d
      Heiko Carstens 提交于
      When wiring up the socket system calls the compat entries were
      incorrectly set. Not all of them point to the corresponding compat
      wrapper functions, which clear the upper 33 bits of user space
      pointers, like it is required.
      
      Fixes: 977108f8 ("s390: wire up separate socketcalls system calls")
      Cc: <stable@vger.kernel.org> # v4.3+
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      e779498d
    • A
      x86/vdso: Change time() prototype to match __vdso_time() · 88edb57d
      Arnd Bergmann 提交于
      gcc-8 warns that time() is an alias for __vdso_time() but the two
      have different prototypes:
      
        arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t *)' {aka 'int(long int *)'} and 'time_t(time_t *)' {aka 'long int(long int *)'} [-Werror=attribute-alias]
         int time(time_t *t)
             ^~~~
        arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here
      
      I could not figure out whether this is intentional, but I see that
      changing it to return time_t avoids the warning.
      
      Returning 'int' from time() is also a bit questionable, as it causes an
      overflow in y2038 even on 64-bit architectures that use a 64-bit time_t
      type. On 32-bit architecture with 64-bit time_t, time() should always
      be implement by the C library by calling a (to be added) clock_gettime()
      variant that takes a sufficiently wide argument.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      88edb57d
    • D
      arm64/sve: Avoid dereference of dead task_struct in KVM guest entry · cb968afc
      Dave Martin 提交于
      When deciding whether to invalidate FPSIMD state cached in the cpu,
      the backend function sve_flush_cpu_state() attempts to dereference
      __this_cpu_read(fpsimd_last_state).  However, this is not safe:
      there is no guarantee that this task_struct pointer is still valid,
      because the task could have exited in the meantime.
      
      This means that we need another means to get the appropriate value
      of TIF_SVE for the associated task.
      
      This patch solves this issue by adding a cached copy of the TIF_SVE
      flag in fpsimd_last_state, which we can check without dereferencing
      the task pointer.
      
      In particular, although this patch is not a KVM fix per se, this
      means that this check is now done safely in the KVM world switch
      path (which is currently the only user of this code).
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      cb968afc
    • C
      x86: Fix Sparse warnings about non-static functions · d553d03f
      Colin Ian King 提交于
      Functions x86_vector_debug_show(), uv_handle_nmi() and uv_nmi_setup_common()
      are local to the source and do not need to be in global scope, so make them
      static.
      
      Fixes up various sparse warnings.
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Acked-by: NMike Travis <mike.travis@hpe.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Kosina <trivial@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <russ.anderson@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-janitors@vger.kernel.org
      Cc: travis@sgi.com
      Link: http://lkml.kernel.org/r/20171206173358.24388-1-colin.king@canonical.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d553d03f
    • W
      arm64: SW PAN: Update saved ttbr0 value on enter_lazy_tlb · d96cc49b
      Will Deacon 提交于
      enter_lazy_tlb is called when a kernel thread rides on the back of
      another mm, due to a context switch or an explicit call to unuse_mm
      where a call to switch_mm is elided.
      
      In these cases, it's important to keep the saved ttbr value up to date
      with the active mm, otherwise we can end up with a stale value which
      points to a potentially freed page table.
      
      This patch implements enter_lazy_tlb for arm64, so that the saved ttbr0
      is kept up-to-date with the active mm for kernel threads.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Vinayak Menon <vinmenon@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Fixes: 39bc88e5 ("arm64: Disable TTBR0_EL1 during normal kernel execution")
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Reported-by: NVinayak Menon <vinmenon@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d96cc49b
    • W
      arm64: SW PAN: Point saved ttbr0 at the zero page when switching to init_mm · 0adbdfde
      Will Deacon 提交于
      update_saved_ttbr0 mandates that mm->pgd is not swapper, since swapper
      contains kernel mappings and should never be installed into ttbr0. However,
      this means that callers must avoid passing the init_mm to update_saved_ttbr0
      which in turn can cause the saved ttbr0 value to be out-of-date in the context
      of the idle thread. For example, EFI runtime services may leave the saved ttbr0
      pointing at the EFI page table, and kernel threads may end up with stale
      references to freed page tables.
      
      This patch changes update_saved_ttbr0 so that the init_mm points the saved
      ttbr0 value to the empty zero page, which always exists and never contains
      valid translations. EFI and switch can then call into update_saved_ttbr0
      unconditionally.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Vinayak Menon <vinmenon@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Fixes: 39bc88e5 ("arm64: Disable TTBR0_EL1 during normal kernel execution")
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Reported-by: NVinayak Menon <vinmenon@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      0adbdfde
    • D
      arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu. · 8884b7bd
      Dave Martin 提交于
      There is currently some duplicate logic to associate current's
      FPSIMD context with the cpu when loading FPSIMD state into the cpu
      regs.
      
      Subsequent patches will update that logic, so in order to ensure it
      only needs to be done in one place, this patch factors the relevant
      code out into a new function fpsimd_bind_to_cpu().
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8884b7bd
    • D
      arm64: fpsimd: Prevent registers leaking from dead tasks · 071b6d4a
      Dave Martin 提交于
      Currently, loading of a task's fpsimd state into the CPU registers
      is skipped if that task's state is already present in the registers
      of that CPU.
      
      However, the code relies on the struct fpsimd_state * (and by
      extension struct task_struct *) to unambiguously identify a task.
      
      There is a particular case in which this doesn't work reliably:
      when a task exits, its task_struct may be recycled to describe a
      new task.
      
      Consider the following scenario:
      
       1) Task P loads its fpsimd state onto cpu C.
              per_cpu(fpsimd_last_state, C) := P;
              P->thread.fpsimd_state.cpu := C;
      
       2) Task X is scheduled onto C and loads its fpsimd state on C.
              per_cpu(fpsimd_last_state, C) := X;
              X->thread.fpsimd_state.cpu := C;
      
       3) X exits, causing X's task_struct to be freed.
      
       4) P forks a new child T, which obtains X's recycled task_struct.
      	T == X.
      	T->thread.fpsimd_state.cpu == C (inherited from P).
      
       5) T is scheduled on C.
      	T's fpsimd state is not loaded, because
      	per_cpu(fpsimd_last_state, C) == T (== X) &&
      	T->thread.fpsimd_state.cpu == C.
      
              (This is the check performed by fpsimd_thread_switch().)
      
      So, T gets X's registers because the last registers loaded onto C
      were those of X, in (2).
      
      This patch fixes the problem by ensuring that the sched-in check
      fails in (5): fpsimd_flush_task_state(T) is called when T is
      forked, so that T->thread.fpsimd_state.cpu == C cannot be true.
      This relies on the fact that T is not schedulable until after
      copy_thread() completes.
      
      Once T's fpsimd state has been loaded on some CPU C there may still
      be other cpus D for which per_cpu(fpsimd_last_state, D) ==
      &X->thread.fpsimd_state.  But D is necessarily != C in this case,
      and the check in (5) must fail.
      
      An alternative fix would be to do refcounting on task_struct.  This
      would result in each CPU holding a reference to the last task whose
      fpsimd state was loaded there.  It's not clear whether this is
      preferable, and it involves higher overhead than the fix proposed
      in this patch.  It would also move all the task_struct freeing
      work into the context switch critical section, or otherwise some
      deferred cleanup mechanism would need to be introduced, neither of
      which seems obviously justified.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 005f78cd ("arm64: defer reloading a task's FPSIMD state to userland resume")
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      [will: word-smithed the comment so it makes more sense]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      071b6d4a
  5. 06 12月, 2017 19 次提交
  6. 05 12月, 2017 5 次提交