1. 16 Apr, 2019 1 commit
    • KVM: x86: Load SMRAM in a single shot when leaving SMM · ed19321f
      Sean Christopherson committed
      RSM emulation is currently broken on VMX when the interrupted guest has
      CR4.VMXE=1.  Rather than dance around the issue of HF_SMM_MASK being set
      when loading SMSTATE into architectural state, ideally RSM emulation
      itself would be reworked to clear HF_SMM_MASK prior to loading non-SMM
      architectural state.
      
      Ostensibly, the only motivation for having HF_SMM_MASK set throughout
      the loading of state from the SMRAM save state area is so that the
      memory accesses from GET_SMSTATE() are tagged with role.smm.  Load
      all of the SMRAM save state area from guest memory at the beginning of
      RSM emulation, and load state from the buffer instead of reading guest
      memory one-by-one.
      
      This paves the way for clearing HF_SMM_MASK prior to loading state,
      and also aligns RSM with the enter_smm() behavior, which fills a
      buffer and writes SMRAM save state in a single go.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
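The buffering approach described in this commit can be illustrated with a minimal, self-contained sketch. It is not the KVM code: `struct guest_mem`, `read_smstate()`, `smstate_get_u32()`, the 512-byte size, and the field offsets are all hypothetical names and values chosen for illustration. The point is only that guest memory is read once, and every later field access comes from the local buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed size of the save state area for this sketch. */
#define SMSTATE_SIZE 512

/* Hypothetical stand-in for guest memory. */
struct guest_mem {
    const uint8_t *base;
    size_t len;
};

/* Copy the whole save state area out of guest memory in one go.
 * Returns 0 on success, -1 if the area does not fit in guest memory. */
static int read_smstate(const struct guest_mem *mem, size_t off,
                        uint8_t buf[SMSTATE_SIZE])
{
    if (off > mem->len || mem->len - off < SMSTATE_SIZE)
        return -1;
    memcpy(buf, mem->base + off, SMSTATE_SIZE);
    return 0;
}

/* Read a 32-bit field (host byte order) from the local buffer.
 * No guest memory access happens here. */
static uint32_t smstate_get_u32(const uint8_t buf[SMSTATE_SIZE], size_t off)
{
    uint32_t v;

    assert(off + sizeof(v) <= SMSTATE_SIZE);
    memcpy(&v, buf + off, sizeof(v));
    return v;
}
```

Because the fields are read from the copy, whatever mode flags guest memory accesses are tagged with matter only for the single bulk read at the start.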
  2. 06 Jan, 2019 1 commit
    • jump_label: move 'asm goto' support test to Kconfig · e9666d10
      Masahiro Yamada committed
      Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".
      
      The jump label is controlled by HAVE_JUMP_LABEL, which is defined
      like this:
      
        #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
        # define HAVE_JUMP_LABEL
        #endif
      
      We can improve this by testing 'asm goto' support in Kconfig, then
      make JUMP_LABEL depend on CC_HAS_ASM_GOTO.
      
      The ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
      match the real kernel capability.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
  3. 29 Oct, 2018 1 commit
  4. 28 Sep, 2018 1 commit
  5. 06 Aug, 2018 1 commit
  6. 12 Jun, 2018 2 commits
  7. 15 May, 2018 1 commit
  8. 04 Apr, 2018 1 commit
  9. 17 Mar, 2018 2 commits
  10. 25 Jan, 2018 1 commit
  11. 21 Dec, 2017 1 commit
  12. 14 Dec, 2017 3 commits
  13. 06 Dec, 2017 1 commit
  14. 17 Nov, 2017 2 commits
    • KVM: x86: fix em_fxstor() sleeping while in atomic · 4d772cb8
      David Hildenbrand committed
      Commit 9d643f63 ("KVM: x86: avoid large stack allocations in
      em_fxrstor") optimized the stack size, but introduced a guest memory access
      which might sleep while in atomic context.
      
      Fix it by introducing, again, a second fxregs_state. Try to avoid
      large stacks by using noinline. Add some helpful comments.
      
      Reported by syzbot:
      
      in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109
      2 locks held by syzkaller879109/2909:
        #0:  (&vcpu->mutex){+.+.}, at: [<ffffffff8106222c>] vcpu_load+0x1c/0x70
      arch/x86/kvm/../../../virt/kvm/kvm_main.c:154
        #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_enter_guest
      arch/x86/kvm/x86.c:6983 [inline]
        #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_run
      arch/x86/kvm/x86.c:7061 [inline]
        #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>]
      kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222
      CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted 4.13.0-rc4-next-20170811
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:16 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:52
        ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014
        __might_sleep+0x95/0x190 kernel/sched/core.c:5967
        __might_fault+0xab/0x1d0 mm/memory.c:4383
        __copy_from_user include/linux/uaccess.h:71 [inline]
        __kvm_read_guest_page+0x58/0xa0
      arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771
        kvm_vcpu_read_guest_page+0x44/0x60
      arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791
        kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407
        kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466
        segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819
        em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022
        x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471
        x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698
        kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854
        handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400
        vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718
        vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline]
        vcpu_run arch/x86/kvm/x86.c:7061 [inline]
        kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222
        kvm_vcpu_ioctl+0x64c/0x1010 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591
        vfs_ioctl fs/ioctl.c:45 [inline]
        do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
        SYSC_ioctl fs/ioctl.c:700 [inline]
        SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
        entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x437fc9
      RSP: 002b:00007ffc7b4d5ab8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000437fc9
      RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
      RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000020ae8000
      R10: 0000000000009120 R11: 0000000000000206 R12: 0000000000000000
      R13: 0000000000000004 R14: 0000000000000004 R15: 0000000020077000
      
      Fixes: 9d643f63 ("KVM: x86: avoid large stack allocations in em_fxrstor")
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • KVM: X86: Fix operand/address-size during instruction decoding · 3853be26
      Wanpeng Li committed
      Pedro reported:
        During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
        instruction under KVM produces different results on both memory and the SP
        register depending on whether EPT support is enabled. With EPT the SP is
        reduced by 4 bytes (and the written value is 0-padded) but without EPT support
        it is only reduced by 2 bytes. The difference can be observed when the CS.DB
        field is 1 (32-bit) but not when it's 0 (16-bit).
      
      The internal segment descriptor cache exists even in real/vm8086 mode, so
      CS.D should also be respected during instruction decoding, rather than
      relying only on the mode's default operand/address size and the 66H/67H
      prefixes. This patch fixes it by also adjusting the operand/address size
      according to CS.D.
      Reported-by: Pedro Fonseca <pfonseca@cs.washington.edu>
      Tested-by: Pedro Fonseca <pfonseca@cs.washington.edu>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Pedro Fonseca <pfonseca@cs.washington.edu>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
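The effect Pedro describes can be modeled with a small sketch. This is a simplified illustration and not the decoder in emulate.c: the function names are hypothetical, long mode is ignored, and the stack address size (which really depends on SS.B) is left out. It only shows how CS.D picks the default operand size, how a 66H prefix toggles it, and how that changes the amount by which a PUSH lowers SP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default operand size in bytes outside long mode: CS.D=1 selects 32-bit. */
static unsigned default_op_bytes(bool cs_d)
{
    return cs_d ? 4 : 2;
}

/* A 66H operand-size prefix switches between the 2-byte and 4-byte sizes. */
static unsigned effective_op_bytes(bool cs_d, bool prefix_66)
{
    unsigned def = default_op_bytes(cs_d);

    assert(def == 2 || def == 4);
    return prefix_66 ? (def ^ 6) : def;   /* 2 ^ 6 == 4, 4 ^ 6 == 2 */
}

/* Simplified model of PUSH: SP drops by the effective operand size. */
static uint32_t sp_after_push(uint32_t sp, bool cs_d, bool prefix_66)
{
    return sp - effective_op_bytes(cs_d, prefix_66);
}
```

With CS.D=1 and no prefix, the model lowers SP by 4, matching the EPT behavior in the report. Ignoring CS.D gives the 2-byte result seen without EPT.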
  15. 12 Oct, 2017 1 commit
    • KVM: x86: introduce ISA specific SMM entry/exit callbacks · 0234bf88
      Ladi Prosek committed
      Entering and exiting SMM may require ISA specific handling under certain
      circumstances. This commit adds two new callbacks with empty implementations.
      Actual functionality will be added in following commits.
      
      * pre_enter_smm() is to be called when injecting an SMM, before any
        SMM related vcpu state has been changed
      * pre_leave_smm() is to be called when emulating the RSM instruction,
        when the vcpu is in real mode and before any SMM related vcpu state
        has been restored
      Signed-off-by: Ladi Prosek <lprosek@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  16. 05 Oct, 2017 1 commit
  17. 23 Sep, 2017 1 commit
    • x86/asm: Fix inline asm call constraints for Clang · f5caf621
      Josh Poimboeuf committed
      For inline asm statements which have a CALL instruction, we list the
      stack pointer as a constraint to convince GCC to ensure the frame
      pointer is set up first:
      
        static inline void foo()
        {
      	register void *__sp asm(_ASM_SP);
      	asm("call bar" : "+r" (__sp));
        }
      
      Unfortunately, that pattern causes Clang to corrupt the stack pointer.
      
      The fix is easy: convert the stack pointer register variable to a global
      variable.
      
      It should be noted that the end result is different based on the GCC
      version.  With GCC 6.4, this patch has exactly the same result as
      before:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9820389		9491555		8816046		8516940
       after	9820389		9491555		8816046		8516940
      
      With GCC 7.2, however, GCC's behavior has changed.  It now changes its
      behavior based on the conversion of the register variable to a global.
      That somehow convinces it to *always* set up the frame pointer before
      inserting *any* inline asm.  (Therefore, listing the variable as an
      output constraint is a no-op and is no longer necessary.)  It's a bit
      overkill, but the performance impact should be negligible.  And in fact,
      there's a nice improvement with frame pointers disabled:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9796316		9468236		9076191		8790305
       after	9796957		9464267		9076381		8785949
      
      So in summary, while listing the stack pointer as an output constraint
      is no longer necessary for newer versions of GCC, it's still needed for
      older versions.
      Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  18. 19 Sep, 2017 1 commit
  19. 25 Aug, 2017 3 commits
  20. 30 Jun, 2017 1 commit
  21. 22 Jun, 2017 1 commit
    • KVM: x86: fix singlestepping over syscall · c8401dda
      Paolo Bonzini committed
      TF is handled a bit differently for syscall and sysret, compared
      to the other instructions: TF is checked after the instruction completes,
      so that the OS can disable #DB at a syscall by adding TF to FMASK.
      When the sysret is executed the #DB is taken "as if" the syscall insn
      just completed.
      
      KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
      Fix the behavior, otherwise you could get #DB on a user stack which is not
      nice.  This does not affect Linux guests, as they use an IST or task gate
      for #DB.
      
      This fixes CVE-2017-7518.
      
      Cc: stable@vger.kernel.org
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
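The difference in when TF is sampled can be sketched as follows. This is an illustrative model only, not the KVM fix itself: `struct cpu`, `enum insn_kind`, and `emulate_and_check_tf()` are hypothetical, and the model assumes that syscall clears every flag set in FMASK. For ordinary instructions the single-step decision uses TF as it was before the instruction ran. For syscall it uses TF after FMASK has been applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define X86_EFLAGS_TF (1u << 8)   /* trap flag, bit 8 of EFLAGS */

enum insn_kind { INSN_OTHER, INSN_SYSCALL };

/* Hypothetical, heavily reduced CPU state for this sketch. */
struct cpu {
    uint32_t eflags;
    uint32_t fmask;   /* flags that syscall clears on entry */
};

/* Emulate one instruction and report whether a single-step #DB is due. */
static bool emulate_and_check_tf(struct cpu *c, enum insn_kind kind)
{
    bool tf;

    assert(c != NULL);
    tf = (c->eflags & X86_EFLAGS_TF) != 0;   /* sampled before execution */

    if (kind == INSN_SYSCALL) {
        c->eflags &= ~c->fmask;              /* syscall masks flags with FMASK */
        tf = (c->eflags & X86_EFLAGS_TF) != 0;   /* re-sample after masking */
    }
    return tf;
}
```

With TF included in FMASK, the model raises no #DB at the syscall, which is the behavior the commit message says an OS relies on.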
  22. 01 Jun, 2017 1 commit
    • KVM: x86: avoid large stack allocations in em_fxrstor · 9d643f63
      Nick Desaulniers committed
      em_fxrstor previously called fxrstor_fixup.  Both created instances of
      struct fxregs_state on the stack, which triggered the warning:
      
      arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
      in function
            'em_fxrstor' [-Wframe-larger-than=]
      static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
                 ^
      with CONFIG_FRAME_WARN set to 1024.
      
      This patch does the fixup in em_fxrstor now, avoiding one additional
      struct fxregs_state, and now fxrstor_fixup can be removed as it has no
      other call sites.
      
      Further, the calculation for offsets into xmm_space can be shared
      between em_fxrstor and em_fxsave.
      Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
      [Clean up calculation of offsets and fix it for 64-bit mode. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  23. 20 May, 2017 1 commit
  24. 27 Apr, 2017 1 commit
    • KVM: x86: fix emulation of RSM and IRET instructions · 6ed071f0
      Ladi Prosek committed
      On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
      on hflags is reverted later on in x86_emulate_instruction where hflags are
      overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
      as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.
      
      Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
      an instruction is emulated, this commit deletes emul_flags altogether and
      makes the emulator access vcpu->arch.hflags using two new accessors. This
      way all changes, on the emulator side as well as in functions called from
      the emulator and accessing vcpu state with emul_to_vcpu, are preserved.
      
      More details on the bug and its manifestation with Windows and OVMF:
      
        It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
        I believe that the SMM part explains why we started seeing this only with
        OVMF.
      
        KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
        the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
        later on in x86_emulate_instruction we overwrite arch.hflags with
        ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
        The AMD-specific hflag of interest here is HF_NMI_MASK.
      
        When rebooting the system, Windows sends an NMI IPI to all but the current
        cpu to shut them down. Only after all of them are parked in HLT will the
        initiating cpu finish the restart. If NMI is masked, other cpus never get
        the memo and the initiating cpu spins forever, waiting for
        hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.
      
      Fixes: a584539b ("KVM: x86: pass the whole hflags field to emulator and back")
      Signed-off-by: Ladi Prosek <lprosek@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
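The lost update described in this commit can be reproduced in a few lines. The sketch below is a model under stated assumptions, not the kernel code: `struct vcpu` is reduced to a single field, the bit positions of `HF_NMI_MASK` and `HF_SMM_MASK` are illustrative, and `get_hflags()`/`set_hflags()` are hypothetical names for the two accessors, which the commit message does not name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions, not taken from the kernel headers. */
#define HF_NMI_MASK (1u << 3)
#define HF_SMM_MASK (1u << 6)

struct vcpu {
    uint32_t hflags;
};

/* Helper that changes vcpu state directly, as set_nmi_mask does on AMD. */
static void set_nmi_mask(struct vcpu *v, int masked)
{
    if (masked)
        v->hflags |= HF_NMI_MASK;
    else
        v->hflags &= ~HF_NMI_MASK;
}

/* Old pattern: the emulator keeps a private copy and writes it back. */
static void emulate_rsm_with_copy(struct vcpu *v)
{
    uint32_t emul_flags = v->hflags;   /* snapshot taken before emulation */

    set_nmi_mask(v, 0);                /* unmask NMI when leaving SMM */
    emul_flags &= ~HF_SMM_MASK;        /* emulator-side change on the copy */
    v->hflags = emul_flags;            /* write-back reverts set_nmi_mask */
}

/* New pattern: every access goes through accessors on the vcpu field. */
static uint32_t get_hflags(const struct vcpu *v) { return v->hflags; }
static void set_hflags(struct vcpu *v, uint32_t f) { v->hflags = f; }

static void emulate_rsm_with_accessors(struct vcpu *v)
{
    assert(v != NULL);
    set_nmi_mask(v, 0);
    set_hflags(v, get_hflags(v) & ~HF_SMM_MASK);
}
```

In the copy-based variant HF_NMI_MASK stays set after RSM, which corresponds to the masked NMI that leaves the other vcpus unparked during a Windows reboot.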
  25. 21 Apr, 2017 1 commit
  26. 12 Jan, 2017 2 commits
  27. 09 Jan, 2017 1 commit
  28. 25 Nov, 2016 1 commit
    • KVM: x86: drop error recovery in em_jmp_far and em_ret_far · 2117d539
      Radim Krčmář committed
      em_jmp_far and em_ret_far assumed that setting IP can only fail in 64-bit
      mode, but syzkaller proved otherwise (and the SDM agrees).
      The code segment was restored upon failure, but it was left uninitialized
      outside of long mode, which could lead to a leak of host kernel stack.
      We could have fixed that by always saving and restoring the CS, but we
      take a simpler approach and just break any guest that manages to fail,
      as the error recovery is error-prone and modern CPUs don't need the
      emulator for this.
      
      Found by syzkaller:
      
        WARNING: CPU: 2 PID: 3668 at arch/x86/kvm/emulate.c:2217 em_ret_far+0x428/0x480
        Kernel panic - not syncing: panic_on_warn set ...
      
        CPU: 2 PID: 3668 Comm: syz-executor Not tainted 4.9.0-rc4+ #49
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
         [...]
        Call Trace:
         [...] __dump_stack lib/dump_stack.c:15
         [...] dump_stack+0xb3/0x118 lib/dump_stack.c:51
         [...] panic+0x1b7/0x3a3 kernel/panic.c:179
         [...] __warn+0x1c4/0x1e0 kernel/panic.c:542
         [...] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
         [...] em_ret_far+0x428/0x480 arch/x86/kvm/emulate.c:2217
         [...] em_ret_far_imm+0x17/0x70 arch/x86/kvm/emulate.c:2227
         [...] x86_emulate_insn+0x87a/0x3730 arch/x86/kvm/emulate.c:5294
         [...] x86_emulate_instruction+0x520/0x1ba0 arch/x86/kvm/x86.c:5545
         [...] emulate_instruction arch/x86/include/asm/kvm_host.h:1116
         [...] complete_emulated_io arch/x86/kvm/x86.c:6870
         [...] complete_emulated_mmio+0x4e9/0x710 arch/x86/kvm/x86.c:6934
         [...] kvm_arch_vcpu_ioctl_run+0x3b7a/0x5a90 arch/x86/kvm/x86.c:6978
         [...] kvm_vcpu_ioctl+0x61e/0xdd0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2557
         [...] vfs_ioctl fs/ioctl.c:43
         [...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679
         [...] SYSC_ioctl fs/ioctl.c:694
         [...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685
         [...] entry_SYSCALL_64_fastpath+0x1f/0xc2
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Fixes: d1442d85 ("KVM: x86: Handle errors when RIP is set during far jumps")
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
  29. 17 Nov, 2016 4 commits