1. 26 Sep, 2011 40 commits
    • KVM: PPC: Implement H_CEDE hcall for book3s_hv in real-mode code · 19ccb76a
      Paul Mackerras committed
      With a KVM guest operating in SMT4 mode (i.e. 4 hardware threads per
      core), whenever a CPU goes idle, we have to pull all the other
      hardware threads in the core out of the guest, because the H_CEDE
      hcall is handled in the kernel.  This is inefficient.
      
      This adds code to book3s_hv_rmhandlers.S to handle the H_CEDE hcall
      in real mode.  When a guest vcpu does an H_CEDE hcall, we now only
      exit to the kernel if all the other vcpus in the same core are also
      idle.  Otherwise we mark this vcpu as napping, save state that could
      be lost in nap mode (mainly GPRs and FPRs), and execute the nap
      instruction.  When the thread wakes up, because of a decrementer or
      external interrupt, we come back in at kvm_start_guest (from the
      system reset interrupt vector), find the `napping' flag set in the
      paca, and go to the resume path.
      
      This has some other ramifications.  First, when starting a core, we
      now start all the threads, both those that are immediately runnable and
      those that are idle.  This is so that we don't have to pull all the
      threads out of the guest when an idle thread gets a decrementer interrupt
      and wants to start running.  In fact the idle threads will all start
      with the H_CEDE hcall returning; being idle they will just do another
      H_CEDE immediately and go to nap mode.
      
      This required some changes to kvmppc_run_core() and kvmppc_run_vcpu().
      These functions have been restructured to make them simpler and clearer.
      We introduce a level of indirection in the wait queue that gets woken
      when external and decrementer interrupts get generated for a vcpu, so
      that we can have the 4 vcpus in a vcore using the same wait queue.
      We need this because the 4 vcpus are being handled by one thread.
      
      Secondly, when we need to exit from the guest to the kernel, we now
      have to generate an IPI for any napping threads, because an HDEC
      interrupt doesn't wake up a napping thread.
      
      Thirdly, we now need to be able to handle virtual external interrupts
      and decrementer interrupts becoming pending while a thread is napping,
      and deliver those interrupts to the guest when the thread wakes.
      This is done in kvmppc_cede_reentry, just before fast_guest_return.
      
      Finally, since we are not using the generic kvm_vcpu_block for book3s_hv,
      and hence not calling kvm_arch_vcpu_runnable, we can remove the #ifdef
      from kvm_arch_vcpu_runnable.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
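The H_CEDE decision described in the commit above can be sketched in plain C. This is only an illustration under stated assumptions: the real handling lives in real-mode assembly in book3s_hv_rmhandlers.S, and `struct vcore_sketch`, `enum cede_action` and `handle_cede()` are hypothetical names invented here.

```c
#include <stdbool.h>

#define THREADS_PER_CORE 4  /* SMT4, as in the commit message */

/* Hypothetical per-core bookkeeping; not the kernel's own structure. */
struct vcore_sketch {
    bool napping[THREADS_PER_CORE];  /* true if that thread's vcpu is idle */
};

enum cede_action { CEDE_NAP, CEDE_EXIT_TO_KERNEL };

/* Decide what hardware thread `self` does when its vcpu issues H_CEDE. */
enum cede_action handle_cede(struct vcore_sketch *vc, int self)
{
    for (int t = 0; t < THREADS_PER_CORE; t++) {
        if (t == self)
            continue;
        if (!vc->napping[t]) {
            /* Another vcpu in the core is still busy: mark ourselves
             * as napping and stay in the guest context. */
            vc->napping[self] = true;
            return CEDE_NAP;
        }
    }
    /* Every other vcpu in the core is idle: exit to the kernel. */
    return CEDE_EXIT_TO_KERNEL;
}
```

The sketch leaves out the state saving (GPRs, FPRs) and the wake-up path through kvm_start_guest that the commit message describes.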
    • KVM: PPC: book3s_pr: Simplify transitions between virtual and real mode · 02143947
      Paul Mackerras committed
      This simplifies the way that the book3s_pr makes the transition to
      real mode when entering the guest.  We now call kvmppc_entry_trampoline
      (renamed from kvmppc_rmcall) in the base kernel using a normal function
      call instead of doing an indirect call through a pointer in the vcpu.
      If kvm is a module, the module loader takes care of generating a
      trampoline as it does for other calls to functions outside the module.
      
      kvmppc_entry_trampoline then disables interrupts and jumps to
      kvmppc_handler_trampoline_enter in real mode using an rfi[d].
      That then uses the link register as the address to return to
      (potentially in module space) when the guest exits.
      
      This also simplifies the way that we call the Linux interrupt handler
      when we exit the guest due to an external, decrementer or performance
      monitor interrupt.  Instead of turning on the MMU, then deciding that
      we need to call the Linux handler and turning the MMU back off again,
      we now go straight to the handler at the point where we would turn the
      MMU on.  The handler will then return to the virtual-mode code
      (potentially in the module).
      
      Along the way, this moves the setting and clearing of the HID5 DCBZ32
      bit into real-mode interrupts-off code, and also makes sure that
      we clear the MSR[RI] bit before loading values into SRR0/1.
      
      The net result is that we no longer need any code addresses to be
      stored in vcpu->arch.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Assemble book3s{,_hv}_rmhandlers.S separately · 177339d7
      Paul Mackerras committed
      This makes arch/powerpc/kvm/book3s_rmhandlers.S and
      arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as
      separate compilation units rather than having them #included in
      arch/powerpc/kernel/exceptions-64s.S.  We no longer have any
      conditional branches between the exception prologs in
      exceptions-64s.S and the KVM handlers, so there is no need to
      keep their contents close together in the vmlinux image.
      
      In their current location, they are using up part of the limited
      space between the first-level interrupt handlers and the firmware
      NMI data area at offset 0x7000, and with some kernel configurations
      this area will overflow (e.g. allyesconfig), leading to an
      "attempt to .org backwards" error when compiling exceptions-64s.S.
      
      Moving them out requires that we add some #includes that the
      book3s_{,hv_}rmhandlers.S code was previously getting implicitly
      via exceptions-64s.S.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Add sanity checking to vcpu_run · af8f38b3
      Alexander Graf committed
      There are multiple features in PowerPC KVM that can now be enabled
      depending on the user's wishes. Some of the combinations don't make
      sense or don't work though.
      
      So this patch adds a way to check if the executing environment would
      actually be able to run the guest properly. It also adds sanity
      checks if PVR is set (should always be true given the current code
      flow), if PAPR is only used with book3s_64 where it works and that
      HV KVM is only used in PAPR mode.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Enable the PAPR CAP for Book3S · 930b412a
      Alexander Graf committed
      Now that Book3S PR mode can also run PAPR guests, we can add a PAPR cap and
      enable it for all Book3S targets. Enabling that CAP switches KVM into PAPR
      mode.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Support SC1 hypercalls for PAPR in PR mode · a668f2bd
      Alexander Graf committed
      PAPR defines hypercalls as SC1 instructions. Using these, the guest modifies
      page tables and does other privileged operations that it wouldn't be allowed
      to do in supervisor mode.
      
      This patch adds support for PR KVM to trap these instructions and route them
      through the same PAPR hypercall interface that we already use for HV style
      KVM.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Stub emulate CFAR and PURR SPRs · aacf9aa3
      Alexander Graf committed
      Recent Linux versions use the CFAR and PURR SPRs, but don't really care about
      their contents (yet). So for now, we can simply return 0 when the guest wants
      to read them.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Add PAPR hypercall code for PR mode · 0254f074
      Alexander Graf committed
      When running a PAPR guest, we need to handle a few hypercalls in kernel space,
      most prominently the page table invalidation (to sync the shadows).
      
      So this patch adds handling for a few PAPR hypercalls to PR mode KVM. I tried
      to share the code with HV mode, but it ended up being a lot easier this way
      around, as the two differ too much in those details.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      
      ---
      
      v1 -> v2:
      
        - whitespace fix
    • KVM: PPC: Add support for explicit HIOR setting · a15bd354
      Alexander Graf committed
      Until now, we always set HIOR based on the PVR, but this is just wrong.
      Instead, we should be setting HIOR explicitly, so user space can decide
      what the initial HIOR value is - just like on real hardware.
      
      We keep the old PVR based way around for backwards compatibility, but
      once user space uses the SREGS based method, we drop the PVR logic.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Read out syscall instruction on trap · 77e675ad
      Alexander Graf committed
      We have a few traps where we cache the instruction that caused the trap
      for analysis later on. Since we now need to be able to distinguish
      between SC 0 and SC 1 system calls and the only way to find out which
      is which is by looking at the instruction, we also read out the instruction
      causing the system call.
      Signed-off-by: Alexander Graf <agraf@suse.de>
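To illustrate how SC 0 and SC 1 can be told apart from the cached instruction word, here is a minimal sketch. It assumes the standard PowerPC encoding of `sc`, where the LEV field sits just above the low five bits; `sc_lev()` and `is_sc1()` are hypothetical helper names, not functions from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PPC_INST_SC0 0x44000002u  /* sc 0: LEV field is 0 */
#define PPC_INST_SC1 0x44000022u  /* sc 1: LEV field is 1 */

/* Extract the 7-bit LEV field from an sc instruction word. */
unsigned int sc_lev(uint32_t inst)
{
    return (inst >> 5) & 0x7f;
}

/* True if the cached instruction is the "sc 1" form used for hypercalls. */
bool is_sc1(uint32_t last_inst)
{
    return last_inst == PPC_INST_SC1;
}
```

A trap handler would call something like `is_sc1()` on the instruction it read out at trap time, then route the call either to the normal syscall path or to the hypercall path.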
    • KVM: PPC: Interpret SDR1 as HVA in PAPR mode · 04fcc11b
      Alexander Graf committed
      When running a PAPR guest, the guest is not allowed to set SDR1 - instead
      the HTAB information is held in internal hypervisor structures. But all of
      our current code relies on SDR1 and walking the HTAB like on real hardware.
      
      So in order to not be too intrusive, we simply set SDR1 to the HTAB we hold
      in host memory. That way we can keep the HTAB in user space, but use it from
      kernel space to map the guest.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Check privilege level on SPRs · 317a8fa3
      Alexander Graf committed
      We have 3 privilege levels: problem state, supervisor state and hypervisor
      state. Each of them can access different SPRs, so we need to check on every
      SPR if it's accessible in the respective mode.
      Signed-off-by: Alexander Graf <agraf@suse.de>
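A per-SPR privilege check along these lines might look like the sketch below. It is an assumption-laden illustration, not the patch itself: `spr_allowed_sketch()` and its boolean parameters are hypothetical, and the PAPR rule is inferred from the papr_enabled commit in this series, which says the guest's privilege drops from hypervisor to supervisor.

```c
#include <stdbool.h>

/* The three privilege levels named in the commit message. */
enum priv_level {
    PRIV_PROBLEM = 0,  /* problem (user) state */
    PRIV_SUPER   = 1,  /* supervisor state */
    PRIV_HYPER   = 2,  /* hypervisor state */
};

/*
 * Return true if a guest in the given state may access an SPR that
 * requires `level`.  `papr_guest` and `guest_in_problem_state` are
 * hypothetical stand-ins for per-vcpu state.
 */
bool spr_allowed_sketch(bool papr_guest, bool guest_in_problem_state,
                        enum priv_level level)
{
    /* Assumed: a PAPR guest never gets hypervisor-level SPRs. */
    if (papr_guest && level > PRIV_SUPER)
        return false;
    /* Problem state is limited to problem-state SPRs. */
    if (guest_in_problem_state && level > PRIV_PROBLEM)
        return false;
    return true;
}
```

Each emulated SPR access would pass the minimum level that SPR requires and fail the emulation when the check returns false.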
    • KVM: PPC: Add papr_enabled flag · 9432ba60
      Alexander Graf committed
      When running a PAPR guest, some things change. The privilege level drops
      from hypervisor to supervisor, SDR1 gets treated differently and we interpret
      hypercalls. For bisectability's sake, add the flag now, but only enable it when
      all the support code is there.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: move compute_tlbie_rb to book3s common header · db507c30
      Alexander Graf committed
      We need the compute_tlbie_rb in _pr and _hv implementations for papr
      soon, so let's move it over to a common header file that both
      implementations can leverage.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: Restore missing powerpc API docs · 36442687
      Avi Kivity committed
      Commit 371fefd6 lost a doc hunk somehow, restore it.
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: APIC: avoid instruction emulation for EOI writes · 58fbbf26
      Kevin Tian committed
      Instruction emulation for EOI writes can be skipped, since a sane
      guest simply uses MOV instead of string operations. This is a nice
      improvement when the guest doesn't support x2apic or Hyper-V EOI
      support.
      
      For a single VM, a ~8% bandwidth improvement is observed
      (7.4Gbps->8Gbps), by saving ~5% of cycles from EOI emulation.
      Signed-off-by: Kevin Tian <kevin.tian@intel.com>
      <Based on earlier work from>:
      Signed-off-by: Eddie Dong <eddie.dong@intel.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: SVM: Fix TSC MSR read in nested SVM · 45133eca
      Nadav Har'El committed
      When the TSC MSR is read by an L2 guest (when L1 allowed this MSR to be
      read without exit), we need to return L2's notion of the TSC, not L1's.
      
      The current code incorrectly returned L1 TSC, because svm_get_msr() was also
      used in x86.c where this was assumed, but now that these places call the new
      svm_read_l1_tsc(), the MSR read can be fixed.
      Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
      Tested-by: Joerg Roedel <joerg.roedel@amd.com>
      Acked-by: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: nVMX: Fix nested VMX TSC emulation · 27fc51b2
      Nadav Har'El committed
      This patch fixes two corner cases in nested (L2) handling of TSC-related
      issues:
      
      1. Somewhat surprisingly, according to the Intel spec, if L1 allows WRMSR to
      the TSC MSR without an exit, then this should set L1's TSC value itself - not
      offset by vmcs12.TSC_OFFSET (like was wrongly done in the previous code).
      
      2. Allow L1 to disable the TSC_OFFSETING control, and then correctly ignore
      the vmcs12.TSC_OFFSET.
      Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: L1 TSC handling · d5c1785d
      Nadav Har'El committed
      KVM assumed in several places that reading the TSC MSR returns the value for
      L1. This is incorrect, because when L2 is running, the correct TSC read exit
      emulation is to return L2's value.
      
      We therefore add a new x86_ops function, read_l1_tsc, to use in places that
      specifically need to read the L1 TSC, NOT the TSC of the current level of
      guest.
      
      Note that one change, of one line in kvm_arch_vcpu_load, is made redundant
      by a different patch sent by Zachary Amsden (and not yet applied):
      kvm_arch_vcpu_load() should not read the guest TSC, and if it didn't, of
      course we didn't have to change the call of kvm_get_msr() to read_l1_tsc().
      
      [avi: moved callback to kvm_x86_ops tsc block]
      Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
      Acked-by: Zachary Amsden <zamsden@gmail.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
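The distinction between "the L1 TSC" and "the TSC of the current guest level" can be sketched as follows. This is a simplified model with assumed names: `struct tsc_state_sketch` and both functions are hypothetical, and the real callback reads its offsets from VMCS/VMCB state rather than from a plain struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the values involved. */
struct tsc_state_sketch {
    uint64_t host_tsc;   /* raw host TSC reading */
    int64_t  l1_offset;  /* offset the host applies for L1 */
    int64_t  l2_offset;  /* extra offset L1 applies for L2 */
    bool     in_l2;      /* true while the nested guest is running */
};

/* Always the L1 view, regardless of which level is currently running. */
uint64_t read_l1_tsc_sketch(const struct tsc_state_sketch *s)
{
    return s->host_tsc + (uint64_t)s->l1_offset;
}

/* The view of whichever guest level is currently running. */
uint64_t read_current_guest_tsc_sketch(const struct tsc_state_sketch *s)
{
    uint64_t l1 = read_l1_tsc_sketch(s);

    return s->in_l2 ? l1 + (uint64_t)s->l2_offset : l1;
}
```

Code that needs the L1 value while L2 is running should use the first helper, which is the role the commit assigns to the new read_l1_tsc callback.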
    • KVM: nVMX: Document 'nested' parameter · e1a72ae2
      Sasha Levin committed
      Add documentation of the new 'nested' parameter to
      'Documentation/kernel-parameters.txt'.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Nadav Har'El <nyh@il.ibm.com>
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: MMU: Fix SMEP failure during fetch · cd46868c
      Yang, Wei Y committed
      This patch fixes a kvm-unit-tests hang and an incorrectly set
      PT_ACCESSED_MASK bit in the case of an SMEP fault.  The code updated
      'eperm' after the variable was checked.
      Signed-off-by: Yang, Wei <wei.y.yang@intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: MMU: Do not unconditionally read PDPTE from guest memory · e4e517b4
      Avi Kivity committed
      Architecturally, PDPTEs are cached in the PDPTRs when CR3 is reloaded.
      On SVM, it is not possible to implement this, but on VMX this is possible
      and was indeed implemented until nested SVM changed this to unconditionally
      read PDPTEs dynamically.  This has a noticeable impact when running PAE guests.
      
      Fix by changing the MMU to read PDPTRs from the cache, falling back to
      reading from memory for the nested MMU.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Tested-by: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: VMX: trivial: use BUG_ON · cf3ace79
      Julia Lawall committed
      Use BUG_ON(x) rather than if(x) BUG();
      
      The semantic patch that fixes this problem is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@ identifier x; @@
      -if (x) BUG();
      +BUG_ON(x);
      
      @@ identifier x; @@
      -if (!x) BUG();
      +BUG_ON(!x);
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: x86: report valid microcode update ID · 742bc670
      Marcelo Tosatti committed
      Windows Server 2008 SP2 checked build with smp > 1 BSOD's during
      boot due to lack of microcode update:
      
      *** Assertion failed: The system BIOS on this machine does not properly
      support the processor.  The system BIOS did not load any microcode update.
      A BIOS containing the latest microcode update is needed for system reliability.
      (CurrentUpdateRevision != 0)
      ***   Source File: d:\longhorn\base\hals\update\intelupd\update.c, line 440
      
      Report a non-zero microcode update signature to make it happy.
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: x86 emulator: Make x86_decode_insn() return proper macros · 1d2887e2
      Takuya Yoshikawa committed
      Return EMULATION_OK/FAILED consistently.  Also treat instruction fetch
      errors, not restricted to X86EMUL_UNHANDLEABLE, as EMULATION_FAILED;
      although this cannot happen in practice, the current logic will continue
      the emulation even if the decoder fails to fetch the instruction.
      Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: x86 emulator: Let compiler know insn_fetch() rarely fails · 7d88bb48
      Takuya Yoshikawa committed
      Fetching the instruction which was to be executed by the guest cannot
      fail normally.  So the compiler should always predict that it will succeed.
      Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
      Signed-off-by: Avi Kivity <avi@redhat.com>
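A branch-prediction hint of the kind this commit describes can be sketched in userspace C. The sketch assumes a GCC- or Clang-compatible compiler for `__builtin_expect`; `struct fetch_ctx` and `fetch_byte()` are hypothetical stand-ins and not the emulator's own insn_fetch() macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Same idea as the kernel's unlikely(); defined here to stay self-contained. */
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif

#define FETCH_OK   0
#define FETCH_FAIL 1

/* Hypothetical fetch window over already-read instruction bytes. */
struct fetch_ctx {
    const uint8_t *buf;
    size_t len;
    size_t pos;
};

/* Fetch one byte; the failure branch is marked as the cold path. */
int fetch_byte(struct fetch_ctx *c, uint8_t *out)
{
    if (unlikely(c->pos >= c->len))
        return FETCH_FAIL;
    *out = c->buf[c->pos++];
    return FETCH_OK;
}
```

The hint changes only code layout and static prediction, not behavior: the failure path still works, it is just moved out of the hot path.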
    • KVM: x86 emulator: Drop _size argument from insn_fetch() · e85a1085
      Takuya Yoshikawa committed
      _type is enough to know the size.
      Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: x86 emulator: Use ctxt->_eip directly in do_insn_fetch_byte() · 807941b1
      Takuya Yoshikawa committed
      Instead of passing ctxt->_eip from insn_fetch() call sites, get it from
      ctxt in do_insn_fetch_byte().  This is done by replacing the argument
      _eip of insn_fetch() with _ctxt, which should be better than letting the
      macro use ctxt silently in its body.
      
      Though this changes the place where ctxt->_eip is incremented from
      insn_fetch() to do_insn_fetch_byte(), this does not have any real
      effect.
      Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: Intelligent device lookup on I/O bus · 743eeb0b
      Sasha Levin committed
      Currently the method of dealing with an IO operation on a bus (PIO/MMIO)
      is to call the read or write callback for each device registered
      on the bus until we find a device which handles it.
      
      Since the number of devices on a bus can be significant due to ioeventfds
      and coalesced MMIO zones, this leads to a lot of overhead on each IO
      operation.
      
      Instead of registering devices, we now register ranges which point to
      a device. Lookup is done using an efficient bsearch instead of a linear
      search.
      
      A performance test compared the exit count per second with
      200 ioeventfds created on one byte while the guest continuously accessed a
      different byte (triggering usermode exits).
      Before the patch the guest achieved 259k exits per second; after the
      patch it achieves 274k exits per second.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
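The range-based lookup can be illustrated with the C standard library's `bsearch()`. This is a userspace sketch under stated assumptions: `struct io_range`, `range_cmp()` and `find_range()` are hypothetical names, and the array is assumed to be sorted by start address with no overlapping ranges.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical registered range pointing at a device. */
struct io_range {
    uint64_t addr;    /* first address covered */
    uint64_t len;     /* number of bytes covered */
    int      dev_id;  /* stand-in for a device pointer */
};

/* bsearch comparator: the key is an address, the element is a range. */
int range_cmp(const void *key, const void *elem)
{
    uint64_t addr = *(const uint64_t *)key;
    const struct io_range *r = elem;

    if (addr < r->addr)
        return -1;
    if (addr >= r->addr + r->len)
        return 1;
    return 0;  /* addr falls inside this range */
}

/* Return the range containing addr, or NULL if no device claims it. */
const struct io_range *find_range(const struct io_range *ranges, size_t n,
                                  uint64_t addr)
{
    return bsearch(&addr, ranges, n, sizeof(*ranges), range_cmp);
}
```

With N registered ranges the lookup costs O(log N) comparisons instead of calling up to N device callbacks, which is where the gain with many ioeventfds comes from.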
    • KVM: Use __print_symbolic() for vmexit tracepoints · 0d460ffc
      Stefan Hajnoczi committed
      The vmexit tracepoints format the exit_reason to make it human-readable.
      Since the exit_reason depends on the instruction set (vmx or svm),
      formatting is handled with ftrace_print_symbols_seq() by referring to
      the appropriate exit reason table.
      
      However, the ftrace_print_symbols_seq() function is not meant to be used
      directly in tracepoints since it does not export the formatting table
      which userspace tools like trace-cmd and perf use to format traces.
      
      In practice perf dies when formatting vmexit-related events and
      trace-cmd falls back to printing the numeric value (with extra
      formatting code in the kvm plugin to paper over this limitation).  Other
      userspace consumers of vmexit-related tracepoints would be in similar
      trouble.
      
      To avoid significant changes to the kvm_exit tracepoint, this patch
      moves the vmx and svm exit reason tables into arch/x86/kvm/trace.h and
      selects the right table with __print_symbolic() depending on the
      instruction set.  Note that __print_symbolic() is designed for exporting
      the formatting table to userspace and allows trace-cmd and perf to work.
      Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: Record instruction set in all vmexit tracepoints · e097e5ff
      Stefan Hajnoczi committed
      The kvm_exit tracepoint recently added the isa argument to aid decoding
      exit_reason.  The semantics of exit_reason depend on the instruction set
      (vmx or svm) and the isa argument allows traces to be analyzed on other
      machines.
      
      Add the isa argument to kvm_nested_vmexit and kvm_nested_vmexit_inject
      so these tracepoints can also be self-describing.
      Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: Really fix HV_X64_MSR_APIC_ASSIST_PAGE · d1613ad5
      Mike Waychison committed
      Commit 0945d4b228 tried to fix the get_msr path for the
      HV_X64_MSR_APIC_ASSIST_PAGE msr, but was poorly tested.  We should be
      returning 0 if the read succeeded, and passing the value back to the
      caller via the pdata out argument, not returning the value directly.
      Signed-off-by: Mike Waychison <mikew@google.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
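The calling convention this fix restores (status in the return value, data through the out pointer) can be sketched like this. All names are hypothetical: `struct vcpu_sketch`, `SKETCH_MSR_ASSIST_PAGE` (a placeholder index, not the real MSR number) and `get_msr_sketch()` are invented for illustration.

```c
#include <stdint.h>

/* Placeholder MSR index for illustration; not the real MSR number. */
#define SKETCH_MSR_ASSIST_PAGE 1u

/* Hypothetical per-vcpu state holding the guest-visible MSR value. */
struct vcpu_sketch {
    uint64_t assist_page;
};

/*
 * Return 0 on success and store the value through pdata.
 * Return nonzero for an MSR this sketch does not handle.
 */
int get_msr_sketch(const struct vcpu_sketch *v, uint32_t msr, uint64_t *pdata)
{
    switch (msr) {
    case SKETCH_MSR_ASSIST_PAGE:
        *pdata = v->assist_page;  /* value goes out via the pointer */
        return 0;                 /* return value is only a status */
    default:
        return 1;
    }
}
```

Returning the value directly, as the earlier attempt did, would make any nonzero register content look like a failure to the caller.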
    • KVM: x86: get_msr support for HV_X64_MSR_APIC_ASSIST_PAGE · 14fa67ee
      Mike Waychison committed
      "get" support for the HV_X64_MSR_APIC_ASSIST_PAGE msr was missing, even
      though it is explicitly enumerated as something the vmm should save in
      msrs_to_save and reported to userland via the KVM_GET_MSR_INDEX_LIST
      ioctl.
      
      Add "get" support for HV_X64_MSR_APIC_ASSIST_PAGE.  We simply return the
      guest visible value of this register, which seems to be correct as a set
      on the register is validated for us already.
      Signed-off-by: Mike Waychison <mikew@google.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: Make coalesced mmio use a device per zone · 2b3c246a
      Sasha Levin committed
      This patch changes coalesced mmio to create one mmio device per
      zone instead of handling all zones in one device.
      
      Doing so enables us to take advantage of existing locking and prevents
      a race condition between coalesced mmio registration/unregistration
      and lookups.
      Suggested-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: x86: Raise the hard VCPU count limit · 8c3ba334
      Sasha Levin committed
      The patch raises the hard limit of VCPU count to 254.
      
      This will allow developers to easily work on scalability
      and will allow users to test high VCPU setups easily without
      patching the kernel.
      
      To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
      now returns the recommended VCPU limit (which is still 64) - this
      should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
      returns the hard limit which is now 254.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Suggested-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: MMIO: Lock coalesced device when checking for available entry · c298125f
      Sasha Levin committed
      Move the check for available entries to within the spinlock.
      This allows working with a larger number of VCPUs and reduces premature
      exits when using a large number of VCPUs.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • KVM: x86: cleanup the code of read/write emulation · 22388a3c
      Xiao Guangrong committed
      Use the read/write operation abstraction to remove duplicated code.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: x86: abstract the operation for read/write emulation · 77d197b2
      Xiao Guangrong committed
      The read emulation and write emulation operations are very similar, so we
      can abstract them; a later patch uses this abstraction to clean up the
      duplicated code.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: x86: fix broken read emulation spans a page boundary · ca7d58f3
      Xiao Guangrong committed
      If the range spans a page boundary, the mmio access can be broken; fix it the
      same way as write emulation.
      
      Also, we already have the guest physical address, so use it to read guest data
      directly and avoid walking the guest page table again.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
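Handling an access that crosses a page boundary usually means splitting it at the boundary and emulating each piece separately. The sketch below shows only the split computation, assuming 4 KiB pages; `PAGE_SIZE_SKETCH` and `first_chunk_len()` are hypothetical names and not code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u  /* assumed page size for this sketch */

/*
 * For an access of `bytes` bytes starting at `addr`, return how many
 * bytes lie in the first page.  If the result is smaller than `bytes`,
 * the access spans a page boundary and the remainder must be handled
 * as a second access starting at addr + result.
 */
size_t first_chunk_len(uint64_t addr, size_t bytes)
{
    size_t to_page_end =
        PAGE_SIZE_SKETCH - (size_t)(addr & (PAGE_SIZE_SKETCH - 1));

    return bytes < to_page_end ? bytes : to_page_end;
}
```

For example, a 4-byte read at 0xffe yields a first chunk of 2 bytes, with the remaining 2 bytes read from the start of the next page.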
    • KVM: x86 emulator: fix Src2CL decode · 9be3be1f
      Avi Kivity committed
      Src2CL decode (used for double width shifts) erroneously decodes only bit 3
      of %rcx, instead of bits 7:0.
      
      Fix by decoding %cl in its entirety.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
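The difference between masking bit 3 and masking bits 7:0 is easy to see in a small sketch. The function names are hypothetical, and the masks are inferred from the commit message's description rather than copied from the patch.

```c
#include <stdint.h>

/* Buggy decode as described: keeps only bit 3 of %rcx. */
uint64_t src2cl_buggy(uint64_t rcx)
{
    return rcx & 0x8;
}

/* Fixed decode: keeps all of %cl, i.e. bits 7:0 of %rcx. */
uint64_t src2cl_fixed(uint64_t rcx)
{
    return rcx & 0xff;
}
```

With %rcx = 0x1205 the buggy version yields a shift count of 0, while the fixed version yields 5, which is the value of %cl.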