1. 19 6月, 2019 1 次提交
  2. 16 4月, 2019 5 次提交
    • S
      KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels · b68f3cc7
      Sean Christopherson 提交于
      Invoking the 64-bit variation on a 32-bit kenrel will crash the guest,
      trigger a WARN, and/or lead to a buffer overrun in the host, e.g.
      rsm_load_state_64() writes r8-r15 unconditionally, but enum kvm_reg and
      thus x86_emulate_ctxt._regs only define r8-r15 for CONFIG_X86_64.
      
      KVM allows userspace to report long mode support via CPUID, even though
      the guest is all but guaranteed to crash if it actually tries to enable
      long mode.  But, a pure 32-bit guest that is ignorant of long mode will
      happily plod along.
      
      SMM complicates things as 64-bit CPUs use a different SMRAM save state
      area.  KVM handles this correctly for 64-bit kernels, e.g. uses the
      legacy save state map if userspace has hid long mode from the guest,
      but doesn't fare well when userspace reports long mode support on a
      32-bit host kernel (32-bit KVM doesn't support 64-bit guests).
      
      Since the alternative is to crash the guest, e.g. by not loading state
      or explicitly requesting shutdown, unconditionally use the legacy SMRAM
      save state map for 32-bit KVM.  If a guest has managed to get far enough
      to handle SMIs when running under a weird/buggy userspace hypervisor,
      then don't deliberately crash the guest since there are no downsides
      (from KVM's perspective) to allow it to continue running.
      
      Fixes: 660a5d51 ("KVM: x86: save/load state on SMM switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b68f3cc7
    • S
      KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU · 8f4dc2e7
      Sean Christopherson 提交于
      Neither AMD nor Intel CPUs have an EFER field in the legacy SMRAM save
      state area, i.e. don't save/restore EFER across SMM transitions.  KVM
      somewhat models this, e.g. doesn't clear EFER on entry to SMM if the
      guest doesn't support long mode.  But during RSM, KVM unconditionally
      clears EFER so that it can get back to pure 32-bit mode in order to
      start loading CRs with their actual non-SMM values.
      
      Clear EFER only when it will be written when loading the non-SMM state
      so as to preserve bits that can theoretically be set on 32-bit vCPUs,
      e.g. KVM always emulates EFER_SCE.
      
      And because CR4.PAE is cleared only to play nice with EFER, wrap that
      code in the long mode check as well.  Note, this may result in a
      compiler warning about cr4 being consumed uninitialized.  Re-read CR4
      even though it's technically unnecessary, as doing so allows for more
      readable code and RSM emulation is not a performance critical path.
      
      Fixes: 660a5d51 ("KVM: x86: save/load state on SMM switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      8f4dc2e7
    • S
      KVM: x86: clear SMM flags before loading state while leaving SMM · 9ec19493
      Sean Christopherson 提交于
      RSM emulation is currently broken on VMX when the interrupted guest has
      CR4.VMXE=1.  Stop dancing around the issue of HF_SMM_MASK being set when
      loading SMSTATE into architectural state, e.g. by toggling it for
      problematic flows, and simply clear HF_SMM_MASK prior to loading
      architectural state (from SMRAM save state area).
      Reported-by: NJon Doron <arilou@gmail.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Fixes: 5bea5123 ("KVM: VMX: check nested state and CR4.VMXE against SMM")
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Tested-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9ec19493
    • S
      KVM: x86: Open code kvm_set_hflags · c5833c7a
      Sean Christopherson 提交于
      Prepare for clearing HF_SMM_MASK prior to loading state from the SMRAM
      save state map, i.e. kvm_smm_changed() needs to be called after state
      has been loaded and so cannot be done automatically when setting
      hflags from RSM.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      c5833c7a
    • S
      KVM: x86: Load SMRAM in a single shot when leaving SMM · ed19321f
      Sean Christopherson 提交于
      RSM emulation is currently broken on VMX when the interrupted guest has
      CR4.VMXE=1.  Rather than dance around the issue of HF_SMM_MASK being set
      when loading SMSTATE into architectural state, ideally RSM emulation
      itself would be reworked to clear HF_SMM_MASK prior to loading non-SMM
      architectural state.
      
      Ostensibly, the only motivation for having HF_SMM_MASK set throughout
      the loading of state from the SMRAM save state area is so that the
      memory accesses from GET_SMSTATE() are tagged with role.smm.  Load
      all of the SMRAM save state area from guest memory at the beginning of
      RSM emulation, and load state from the buffer instead of reading guest
      memory one-by-one.
      
      This paves the way for clearing HF_SMM_MASK prior to loading state,
      and also aligns RSM with the enter_smm() behavior, which fills a
      buffer and writes SMRAM save state in a single go.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ed19321f
  3. 06 1月, 2019 1 次提交
    • M
      jump_label: move 'asm goto' support test to Kconfig · e9666d10
      Masahiro Yamada 提交于
      Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".
      
      The jump label is controlled by HAVE_JUMP_LABEL, which is defined
      like this:
      
        #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
        # define HAVE_JUMP_LABEL
        #endif
      
      We can improve this by testing 'asm goto' support in Kconfig, then
      make JUMP_LABEL depend on CC_HAS_ASM_GOTO.
      
      Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
      match to the real kernel capability.
      Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: NSedat Dilek <sedat.dilek@gmail.com>
      e9666d10
  4. 29 10月, 2018 1 次提交
  5. 28 9月, 2018 1 次提交
  6. 06 8月, 2018 1 次提交
  7. 12 6月, 2018 2 次提交
  8. 15 5月, 2018 1 次提交
  9. 04 4月, 2018 1 次提交
  10. 17 3月, 2018 2 次提交
  11. 25 1月, 2018 1 次提交
  12. 21 12月, 2017 1 次提交
  13. 14 12月, 2017 3 次提交
  14. 06 12月, 2017 1 次提交
  15. 17 11月, 2017 2 次提交
    • D
      KVM: x86: fix em_fxstor() sleeping while in atomic · 4d772cb8
      David Hildenbrand 提交于
      Commit 9d643f63 ("KVM: x86: avoid large stack allocations in
      em_fxrstor") optimize the stack size, but introduced a guest memory access
      which might sleep while in atomic.
      
      Fix it by introducing, again, a second fxregs_state. Try to avoid
      large stacks by using noinline. Add some helpful comments.
      
      Reported by syzbot:
      
      in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109
      2 locks held by syzkaller879109/2909:
        #0:  (&vcpu->mutex){+.+.}, at: [<ffffffff8106222c>] vcpu_load+0x1c/0x70
      arch/x86/kvm/../../../virt/kvm/kvm_main.c:154
        #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_enter_guest
      arch/x86/kvm/x86.c:6983 [inline]
        #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_run
      arch/x86/kvm/x86.c:7061 [inline]
        #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>]
      kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222
      CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted 4.13.0-rc4-next-20170811
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:16 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:52
        ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014
        __might_sleep+0x95/0x190 kernel/sched/core.c:5967
        __might_fault+0xab/0x1d0 mm/memory.c:4383
        __copy_from_user include/linux/uaccess.h:71 [inline]
        __kvm_read_guest_page+0x58/0xa0
      arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771
        kvm_vcpu_read_guest_page+0x44/0x60
      arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791
        kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407
        kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466
        segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819
        em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022
        x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471
        x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698
        kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854
        handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400
        vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718
        vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline]
        vcpu_run arch/x86/kvm/x86.c:7061 [inline]
        kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222
        kvm_vcpu_ioctl+0x64c/0x1010 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591
        vfs_ioctl fs/ioctl.c:45 [inline]
        do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
        SYSC_ioctl fs/ioctl.c:700 [inline]
        SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
        entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x437fc9
      RSP: 002b:00007ffc7b4d5ab8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000437fc9
      RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
      RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000020ae8000
      R10: 0000000000009120 R11: 0000000000000206 R12: 0000000000000000
      R13: 0000000000000004 R14: 0000000000000004 R15: 0000000020077000
      
      Fixes: 9d643f63 ("KVM: x86: avoid large stack allocations in em_fxrstor")
      Signed-off-by: NDavid Hildenbrand <david@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      4d772cb8
    • W
      KVM: X86: Fix operand/address-size during instruction decoding · 3853be26
      Wanpeng Li 提交于
      Pedro reported:
        During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
        instruction under KVM produces different results on both memory and the SP
        register depending on whether EPT support is enabled. With EPT the SP is
        reduced by 4 bytes (and the written value is 0-padded) but without EPT support
        it is only reduced by 2 bytes. The difference can be observed when the CS.DB
        field is 1 (32-bit) but not when it's 0 (16-bit).
      
      The internal segment descriptor cache exist even in real/vm8096 mode. The CS.D
      also should be respected instead of just default operand/address-size/66H
      prefix/67H prefix during instruction decoding. This patch fixes it by also
      adjusting operand/address-size according to CS.D.
      Reported-by: NPedro Fonseca <pfonseca@cs.washington.edu>
      Tested-by: NPedro Fonseca <pfonseca@cs.washington.edu>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Pedro Fonseca <pfonseca@cs.washington.edu>
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      3853be26
  16. 12 10月, 2017 1 次提交
    • L
      KVM: x86: introduce ISA specific SMM entry/exit callbacks · 0234bf88
      Ladi Prosek 提交于
      Entering and exiting SMM may require ISA specific handling under certain
      circumstances. This commit adds two new callbacks with empty implementations.
      Actual functionality will be added in following commits.
      
      * pre_enter_smm() is to be called when injecting an SMM, before any
        SMM related vcpu state has been changed
      * pre_leave_smm() is to be called when emulating the RSM instruction,
        when the vcpu is in real mode and before any SMM related vcpu state
        has been restored
      Signed-off-by: NLadi Prosek <lprosek@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0234bf88
  17. 05 10月, 2017 1 次提交
  18. 23 9月, 2017 1 次提交
    • J
      x86/asm: Fix inline asm call constraints for Clang · f5caf621
      Josh Poimboeuf 提交于
      For inline asm statements which have a CALL instruction, we list the
      stack pointer as a constraint to convince GCC to ensure the frame
      pointer is set up first:
      
        static inline void foo()
        {
      	register void *__sp asm(_ASM_SP);
      	asm("call bar" : "+r" (__sp))
        }
      
      Unfortunately, that pattern causes Clang to corrupt the stack pointer.
      
      The fix is easy: convert the stack pointer register variable to a global
      variable.
      
      It should be noted that the end result is different based on the GCC
      version.  With GCC 6.4, this patch has exactly the same result as
      before:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9820389		9491555		8816046		8516940
       after	9820389		9491555		8816046		8516940
      
      With GCC 7.2, however, GCC's behavior has changed.  It now changes its
      behavior based on the conversion of the register variable to a global.
      That somehow convinces it to *always* set up the frame pointer before
      inserting *any* inline asm.  (Therefore, listing the variable as an
      output constraint is a no-op and is no longer necessary.)  It's a bit
      overkill, but the performance impact should be negligible.  And in fact,
      there's a nice improvement with frame pointers disabled:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9796316		9468236		9076191		8790305
       after	9796957		9464267		9076381		8785949
      
      So in summary, while listing the stack pointer as an output constraint
      is no longer necessary for newer versions of GCC, it's still needed for
      older versions.
      Suggested-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: NMatthias Kaehlcke <mka@chromium.org>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f5caf621
  19. 19 9月, 2017 1 次提交
  20. 25 8月, 2017 3 次提交
  21. 30 6月, 2017 1 次提交
  22. 22 6月, 2017 1 次提交
    • P
      KVM: x86: fix singlestepping over syscall · c8401dda
      Paolo Bonzini 提交于
      TF is handled a bit differently for syscall and sysret, compared
      to the other instructions: TF is checked after the instruction completes,
      so that the OS can disable #DB at a syscall by adding TF to FMASK.
      When the sysret is executed the #DB is taken "as if" the syscall insn
      just completed.
      
      KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
      Fix the behavior, otherwise you could get #DB on a user stack which is not
      nice.  This does not affect Linux guests, as they use an IST or task gate
      for #DB.
      
      This fixes CVE-2017-7518.
      
      Cc: stable@vger.kernel.org
      Reported-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      c8401dda
  23. 01 6月, 2017 1 次提交
    • N
      KVM: x86: avoid large stack allocations in em_fxrstor · 9d643f63
      Nick Desaulniers 提交于
      em_fxstor previously called fxstor_fixup.  Both created instances of
      struct fxregs_state on the stack, which triggered the warning:
      
      arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
      in function
            'em_fxrstor' [-Wframe-larger-than=]
      static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
                 ^
      with CONFIG_FRAME_WARN set to 1024.
      
      This patch does the fixup in em_fxstor now, avoiding one additional
      struct fxregs_state, and now fxstor_fixup can be removed as it has no
      other call sites.
      
      Further, the calculation for offsets into xmm_space can be shared
      between em_fxstor and em_fxsave.
      Signed-off-by: NNick Desaulniers <nick.desaulniers@gmail.com>
      [Clean up calculation of offsets and fix it for 64-bit mode. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9d643f63
  24. 20 5月, 2017 1 次提交
  25. 27 4月, 2017 1 次提交
    • L
      KVM: x86: fix emulation of RSM and IRET instructions · 6ed071f0
      Ladi Prosek 提交于
      On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
      on hflags is reverted later on in x86_emulate_instruction where hflags are
      overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
      as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.
      
      Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
      an instruction is emulated, this commit deletes emul_flags altogether and
      makes the emulator access vcpu->arch.hflags using two new accessors. This
      way all changes, on the emulator side as well as in functions called from
      the emulator and accessing vcpu state with emul_to_vcpu, are preserved.
      
      More details on the bug and its manifestation with Windows and OVMF:
      
        It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
        I believe that the SMM part explains why we started seeing this only with
        OVMF.
      
        KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
        the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
        later on in x86_emulate_instruction we overwrite arch.hflags with
        ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
        The AMD-specific hflag of interest here is HF_NMI_MASK.
      
        When rebooting the system, Windows sends an NMI IPI to all but the current
        cpu to shut them down. Only after all of them are parked in HLT will the
        initiating cpu finish the restart. If NMI is masked, other cpus never get
        the memo and the initiating cpu spins forever, waiting for
        hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.
      
      Fixes: a584539b ("KVM: x86: pass the whole hflags field to emulator and back")
      Signed-off-by: NLadi Prosek <lprosek@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      6ed071f0
  26. 21 4月, 2017 1 次提交
  27. 12 1月, 2017 2 次提交
  28. 09 1月, 2017 1 次提交