1. 24 Feb, 2016 1 commit
  2. 16 Jan, 2016 1 commit
    • kvm: rename pfn_t to kvm_pfn_t · ba049e93
      Dan Williams committed
      To date, we have implemented two I/O usage models for persistent memory,
      PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
      userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
      to be the target of direct-i/o.  It allows userspace to coordinate
      DMA/RDMA from/to persistent memory.
      
      The implementation leverages the ZONE_DEVICE mm-zone that went into
      4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
      and dynamically mapped by a device driver.  The pmem driver, after
      mapping a persistent memory range into the system memmap via
      devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
      page-backed pmem-pfns via flags in the new pfn_t type.
      
      The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the
      resulting pte(s) inserted into the process page tables with a new
      _PAGE_DEVMAP flag.  Later, when get_user_pages() is walking ptes it keys
      off _PAGE_DEVMAP to pin the device hosting the page range active.
      Finally, get_page() and put_page() are modified to take references
      against the device driver established page mapping.
      
      The need for "struct page" for persistent memory in turn requires
      memory capacity to store the memmap array.  Given that the memmap array
      for a large pool of persistent memory may exhaust available DRAM,
      introduce a mechanism to allocate the memmap from persistent memory.  The new
      "struct vmem_altmap *" parameter to devm_memremap_pages() enables
      arch_add_memory() to use reserved pmem capacity rather than the page
      allocator.
      
      This patch (of 18):
      
      The core has developed a need for a "pfn_t" type [1].  Move the existing
      pfn_t in KVM to kvm_pfn_t [2].
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html

      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ba049e93
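The flag-carrying pfn_t described in the message above can be sketched in user-space C as follows. This is a minimal illustration, not the kernel's definition: the names pfn_t, kvm_pfn_t, PFN_DEV and PFN_MAP come from the commit message, while the bit positions, the mask width and the helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the top four bits of a 64-bit value carry flags. */
#define PFN_FLAG_BITS  4
#define PFN_FLAGS_MASK (~0ULL << (64 - PFN_FLAG_BITS))
#define PFN_DEV (1ULL << 61) /* pfn refers to device memory (assumed bit) */
#define PFN_MAP (1ULL << 60) /* a struct page backs this pfn (assumed bit) */

/* Wrapping the value in a struct keeps it from mixing with plain integers. */
typedef struct { uint64_t val; } pfn_t;

/* KVM's plain-integer frame number keeps its meaning under a new name. */
typedef uint64_t kvm_pfn_t;

/* Combine a raw frame number with flags (hypothetical helper). */
static inline pfn_t pfn_to_pfn_t(uint64_t pfn, uint64_t flags)
{
	pfn_t p = { .val = (pfn & ~PFN_FLAGS_MASK) | (flags & PFN_FLAGS_MASK) };
	return p;
}

/* Strip the flags and recover the raw frame number (hypothetical helper). */
static inline uint64_t pfn_t_to_pfn(pfn_t p)
{
	return p.val & ~PFN_FLAGS_MASK;
}

/* Page-backed device memory needs both PFN_DEV and PFN_MAP. */
static inline bool pfn_t_devmap(pfn_t p)
{
	const uint64_t want = PFN_DEV | PFN_MAP;
	return (p.val & want) == want;
}
```

With a struct-wrapped pfn_t in the core, the older integer typedef in KVM would clash by name, which is why the rename to kvm_pfn_t is a prerequisite for the rest of the series.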
  3. 12 Jan, 2016 1 commit
  4. 17 Dec, 2015 4 commits
    • KVM: vmx: detect mismatched size in VMCS read/write · 8a86aea9
      Paolo Bonzini committed
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ---
      	I am sending this as RFC because the error messages it produces are
      	very ugly.  Because of inlining, the original line is lost.  The
      	alternative is to change vmcs_read/write/checkXX into macros, but
      	then you need to have a single huge BUILD_BUG_ON or BUILD_BUG_ON_MSG
      	because multiple BUILD_BUG_ON* with the same __LINE__ are not
      	supported well.
      8a86aea9
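A compile-time size check of the kind the commit title describes can be sketched as below. The sketch relies on the architectural VMCS field encoding, in which bits 14:13 give the field width; the macro and function names (VMCS_WIDTH, vmcs_accessor_ok) are hypothetical, and the real patch uses BUILD_BUG_ON-style checks inside the kernel accessors rather than C11 _Static_assert. The sketch also ignores the "high half" access type in bit 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Width codes stored in bits 14:13 of a VMCS field encoding. */
enum vmcs_width { VMCS_W16 = 0, VMCS_W64 = 1, VMCS_W32 = 2, VMCS_WNAT = 3 };

#define VMCS_WIDTH(field) (((field) >> 13) & 0x3u)

/* Two field encodings mentioned in the neighbouring commits. */
#define POSTED_INTR_NV 0x0002u /* 16-bit control field */
#define GUEST_CR3      0x6802u /* natural-width guest-state field */

/* A mismatch on a constant field can be rejected at build time. */
_Static_assert(VMCS_WIDTH(POSTED_INTR_NV) == VMCS_W16,
	       "POSTED_INTR_NV needs a 16-bit accessor");
_Static_assert(VMCS_WIDTH(GUEST_CR3) == VMCS_WNAT,
	       "GUEST_CR3 needs a natural-width accessor");

/* Run-time form of the same check (hypothetical helper). */
static inline bool vmcs_accessor_ok(uint32_t field, enum vmcs_width accessor)
{
	return VMCS_WIDTH(field) == (unsigned int)accessor;
}
```

Because the check sits in a single expression per accessor, a failing build points at the accessor rather than at the call site, which matches the concern about ugly error messages raised in the RFC note.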
    • KVM: VMX: fix read/write sizes of VMCS fields in dump_vmcs · 845c5b40
      Paolo Bonzini committed
      dump_vmcs was not printing the high parts of several 64-bit fields on
      32-bit kernels.  Separate from the previous one to make the patches
      easier to review.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      845c5b40
    • KVM: VMX: fix read/write sizes of VMCS fields · f3531054
      Paolo Bonzini committed
      In theory this should have broken EPT on 32-bit kernels (due to
      reading the high part of the natural-width field GUEST_CR3).  It is
      unclear whether no one noticed or the processor behaves differently
      from the documentation.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f3531054
    • KVM: VMX: fix the writing POSTED_INTR_NV · 0bcf261c
      Li RongQing committed
      POSTED_INTR_NV is a 16-bit field, so it should not be written with the
      64-bit write function.
      
      [ 5311.676074] vmwrite error: reg 3 value 0 (err 12)
        [ 5311.680001] CPU: 49 PID: 4240 Comm: qemu-system-i38 Tainted: G I 4.1.13-WR8.0.0.0_standard #1
        [ 5311.689343] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
        [ 5311.699550] 00000000 00000000 e69a7e1c c1950de1 00000000 e69a7e38 fafcff45 fafebd24
        [ 5311.706924] 00000003 00000000 0000000c b6a06dfa e69a7e40 fafcff79 e69a7eb0 fafd5f57
        [ 5311.714296] e69a7ec0 c1080600 00000000 00000001 c0e18018 000001be 00000000 00000b43
        [ 5311.721651] Call Trace:
        [ 5311.722942] [<c1950de1>] dump_stack+0x4b/0x75
        [ 5311.726467] [<fafcff45>] vmwrite_error+0x35/0x40 [kvm_intel]
        [ 5311.731444] [<fafcff79>] vmcs_writel+0x29/0x30 [kvm_intel]
        [ 5311.736228] [<fafd5f57>] vmx_create_vcpu+0x337/0xb90 [kvm_intel]
        [ 5311.741600] [<c1080600>] ? dequeue_task_fair+0x2e0/0xf60
        [ 5311.746197] [<faf3b9ca>] kvm_arch_vcpu_create+0x3a/0x70 [kvm]
        [ 5311.751278] [<faf29e9d>] kvm_vm_ioctl+0x14d/0x640 [kvm]
        [ 5311.755771] [<c1129d44>] ? free_pages_prepare+0x1a4/0x2d0
        [ 5311.760455] [<c13e2842>] ? debug_smp_processor_id+0x12/0x20
        [ 5311.765333] [<c10793be>] ? sched_move_task+0xbe/0x170
        [ 5311.769621] [<c11752b3>] ? kmem_cache_free+0x213/0x230
        [ 5311.774016] [<faf29d50>] ? kvm_set_memory_region+0x60/0x60 [kvm]
        [ 5311.779379] [<c1199fa2>] do_vfs_ioctl+0x2e2/0x500
        [ 5311.783285] [<c11752b3>] ? kmem_cache_free+0x213/0x230
        [ 5311.787677] [<c104dc73>] ? __mmdrop+0x63/0xd0
        [ 5311.791196] [<c104dc73>] ? __mmdrop+0x63/0xd0
        [ 5311.794712] [<c104dc73>] ? __mmdrop+0x63/0xd0
        [ 5311.798234] [<c11a2ed7>] ? __fget+0x57/0x90
        [ 5311.801559] [<c11a2f72>] ? __fget_light+0x22/0x50
        [ 5311.805464] [<c119a240>] SyS_ioctl+0x80/0x90
        [ 5311.808885] [<c1957d30>] sysenter_do_call+0x12/0x12
        [ 5312.059280] kvm: zapping shadow pages for mmio generation wraparound
        [ 5313.678415] kvm [4231]: vcpu0 disabled perfctr wrmsr: 0xc2 data 0xffff
        [ 5313.726518] kvm [4231]: vcpu0 unhandled rdmsr: 0x570
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Cc: Yang Zhang <yang.z.zhang@Intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      0bcf261c
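One plausible reading of the "reg 3" in the log above is that a 64-bit write on a 32-bit kernel is split into two writes, to the field and to field + 1, and that the second write lands on an encoding that does not exist for a 16-bit field. The sketch below models that assumption in user space; fake_vmwrite and the two write helpers are hypothetical stand-ins, and the split behaviour is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

#define POSTED_INTR_NV 0x0002u /* 16-bit VMCS field encoding */
#define MAX_LOG 4

static uint32_t written_fields[MAX_LOG];
static int nr_written;

/* Stand-in for the VMWRITE instruction: only records the field encoding. */
static inline void fake_vmwrite(uint32_t field, uint32_t value)
{
	(void)value;
	if (nr_written < MAX_LOG)
		written_fields[nr_written++] = field;
}

/* 16-bit write: exactly one VMWRITE to the field itself. */
static inline void write16(uint32_t field, uint16_t value)
{
	fake_vmwrite(field, value);
}

/* Assumed 32-bit-kernel behaviour: low half to field, high half to field + 1. */
static inline void write64_on_32bit(uint32_t field, uint64_t value)
{
	fake_vmwrite(field, (uint32_t)value);
	fake_vmwrite(field + 1, (uint32_t)(value >> 32));
}
```

Under this model, writing POSTED_INTR_NV (encoding 2) with the 64-bit helper touches encoding 3 as well, whereas the 16-bit helper touches only encoding 2.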
  5. 14 Dec, 2015 1 commit
  6. 11 Dec, 2015 1 commit
  7. 26 Nov, 2015 2 commits
    • kvm/x86: per-vcpu apicv deactivation support · d62caabb
      Andrey Smetanin committed
      The decision on whether to use hardware APIC virtualization used to be
      taken globally, based on the availability of the feature in the CPU
      and the value of a module parameter.
      
      However, under certain circumstances we want to control it on per-vcpu
      basis.  In particular, when the userspace activates HyperV synthetic
      interrupt controller (SynIC), APICv has to be disabled as it's
      incompatible with SynIC auto-EOI behavior.
      
      To achieve that, introduce 'apicv_active' flag on struct
      kvm_vcpu_arch, and kvm_vcpu_deactivate_apicv() function to turn APICv
      off.  The flag is initialized based on the module parameter and CPU
      capability, and consulted whenever an APICv-specific action is
      performed.
      Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
      Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Gleb Natapov <gleb@kernel.org>
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Roman Kagan <rkagan@virtuozzo.com>
      CC: Denis V. Lunev <den@openvz.org>
      CC: qemu-devel@nongnu.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d62caabb
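The per-vcpu flag described above can be sketched in a few lines of user-space C. Only the names apicv_active and kvm_vcpu_deactivate_apicv come from the commit message; the struct, the init helper and the SynIC helper are hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the relevant part of struct kvm_vcpu_arch. */
struct vcpu_arch {
	bool apicv_active;
};

/* Initial value: module parameter AND CPU capability, as the message says. */
static inline void vcpu_init_apicv(struct vcpu_arch *arch,
				   bool enable_apicv_param, bool cpu_has_apicv)
{
	arch->apicv_active = enable_apicv_param && cpu_has_apicv;
}

/* Models kvm_vcpu_deactivate_apicv(): turn APICv off for this vcpu only. */
static inline void vcpu_deactivate_apicv(struct vcpu_arch *arch)
{
	arch->apicv_active = false;
}

/* Hypothetical caller: activating SynIC forces APICv off on that vcpu. */
static inline void vcpu_activate_synic(struct vcpu_arch *arch)
{
	vcpu_deactivate_apicv(arch);
}
```

Other vcpus keep their own flag, so deactivating APICv for one vcpu does not change the decision taken for the rest.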
    • kvm/x86: split ioapic-handled and EOI exit bitmaps · 6308630b
      Andrey Smetanin committed
      The function to determine if the vector is handled by ioapic used to
      rely on the fact that only ioapic-handled vectors were set up to
      cause vmexits when virtual apic was in use.
      
      We're going to break this assumption when introducing Hyper-V
      synthetic interrupts: they may need to cause vmexits too.
      
      To achieve that, introduce a new bitmap dedicated specifically for
      ioapic-handled vectors, and populate EOI exit bitmap from it for now.
      Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
      Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Gleb Natapov <gleb@kernel.org>
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Roman Kagan <rkagan@virtuozzo.com>
      CC: Denis V. Lunev <den@openvz.org>
      CC: qemu-devel@nongnu.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6308630b
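The split between the two bitmaps can be illustrated with the following user-space sketch. The struct layout and all helper names are assumptions made for the example; the commit message only states that a dedicated bitmap for ioapic-handled vectors is introduced and that the EOI exit bitmap is populated from it for now.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_VECTORS   256
#define BITMAP_WORDS (NR_VECTORS / 64)

/* Hypothetical per-vcpu state holding both bitmaps. */
struct vcpu_bitmaps {
	uint64_t ioapic_handled_vectors[BITMAP_WORDS];
	uint64_t eoi_exit_bitmap[BITMAP_WORDS];
};

static inline void vec_set(uint64_t *map, unsigned int vec)
{
	map[vec / 64] |= 1ULL << (vec % 64);
}

static inline bool vec_test(const uint64_t *map, unsigned int vec)
{
	return (map[vec / 64] >> (vec % 64)) & 1;
}

/* "Is this vector handled by the ioapic?" now asks the dedicated bitmap. */
static inline bool vector_handled_by_ioapic(const struct vcpu_bitmaps *b,
					    unsigned int vec)
{
	return vec_test(b->ioapic_handled_vectors, vec);
}

/* For now the EOI exit bitmap is simply a copy of the ioapic bitmap. */
static inline void load_eoi_exit_bitmap(struct vcpu_bitmaps *b)
{
	memcpy(b->eoi_exit_bitmap, b->ioapic_handled_vectors,
	       sizeof(b->eoi_exit_bitmap));
}
```

Once the bitmaps are separate, an extra vector can be added to the EOI exit bitmap (as synthetic interrupts may later require) without that vector being mistaken for an ioapic-handled one.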
  8. 25 Nov, 2015 1 commit
    • KVM: nVMX: remove incorrect vpid check in nested invvpid emulation · b2467e74
      Haozhong Zhang committed
      This patch removes the vpid check when emulating nested invvpid
      instruction of type all-contexts invalidation. The existing code is
      incorrect because:
       (1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate
           Translations Based on VPID", invvpid instruction does not check
           vpid in the invvpid descriptor when its type is all-contexts
           invalidation.
       (2) According to the same document, invvpid of type all-contexts
           invalidation does not require an active VMCS, so
           get_vmcs12() in the existing code may result in a NULL-pointer
           dereference. In practice, it can crash both KVM itself and L1
           hypervisors that use invvpid (e.g. Xen).
      Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b2467e74
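The two points above can be sketched as a small dispatch function in which the all-contexts case consults neither the VPID operand nor the current VMCS. The type codes follow the INVVPID instruction definition; the struct, the result codes and emulate_invvpid itself are hypothetical, and the other invalidation types are deliberately not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* INVVPID type operand values. */
enum invvpid_type {
	INVVPID_INDIVIDUAL_ADDR   = 0,
	INVVPID_SINGLE_CONTEXT    = 1,
	INVVPID_ALL_CONTEXT       = 2,
	INVVPID_SINGLE_NON_GLOBAL = 3,
};

/* Hypothetical stand-in for the L1-provided VMCS (may be absent). */
struct vmcs12 {
	uint16_t virtual_processor_id;
};

enum emu_result { EMU_FLUSH_ALL, EMU_NOT_MODELLED };

/*
 * Sketch of the fixed behaviour: for all-contexts invalidation, neither
 * the vpid operand nor the (possibly NULL) vmcs12 pointer is touched.
 */
static inline enum emu_result emulate_invvpid(enum invvpid_type type,
					      uint16_t vpid,
					      const struct vmcs12 *current)
{
	(void)vpid;
	(void)current;

	switch (type) {
	case INVVPID_ALL_CONTEXT:
		return EMU_FLUSH_ALL;
	default:
		return EMU_NOT_MODELLED; /* other types are out of scope here */
	}
}
```

Calling the function with a NULL vmcs12 and an arbitrary vpid is safe for the all-contexts type, which is the situation the removed check could not handle.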
  9. 10 Nov, 2015 10 commits
  10. 05 Nov, 2015 1 commit
    • KVM: VMX: Fix commit which broke PML · a3eaa864
      Kai Huang committed
      I found that PML has been broken since the commit below:
      
      	commit feda805f
      	Author: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      	Date:   Wed Sep 9 14:05:55 2015 +0800
      
      	KVM: VMX: unify SECONDARY_VM_EXEC_CONTROL update
      
      	Unify the update in vmx_cpuid_update()
      
      	Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      	[Rewrite to use vmcs_set_secondary_exec_control. - Paolo]
      	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      
      The reason is that, in the above commit, vmx_cpuid_update calls
      vmx_secondary_exec_control, which currently clears the
      SECONDARY_EXEC_ENABLE_PML bit unconditionally (because PML is enabled when
      the vcpu is created). Therefore, if vmx_cpuid_update is called after the
      vcpu is created, PML is disabled unexpectedly while the log-dirty code still
      assumes PML is in use.
      
      Fix this by clearing SECONDARY_EXEC_ENABLE_PML in vmx_secondary_exec_control
      only when PML is not supported or not enabled (!enable_pml). This is more
      reasonable, as PML is currently either always enabled or always disabled.
      With this change, explicitly updating SECONDARY_EXEC_ENABLE_PML in
      vmx_enable{disable}_pml is no longer needed, so also rename
      vmx_enable{disable}_pml to vmx_create{destroy}_pml_buffer.
      
      Fixes: feda805f
      Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
      [While at it, change a wrong ASSERT to an "if".  The condition can happen
       if creating the VCPU fails with ENOMEM. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a3eaa864
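The fix described above amounts to making the PML bit a function of global state instead of clearing it on every recomputation. A minimal user-space sketch, assuming a hypothetical secondary_exec_control() helper that takes the capability and the enable_pml setting as parameters (the bit value is the architectural "enable PML" secondary control, bit 17):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECONDARY_EXEC_ENABLE_PML 0x00020000u

/*
 * Hypothetical model of the recomputation: the PML bit is cleared only
 * when PML is unsupported or disabled, so recomputing the control after
 * vcpu creation no longer switches PML off behind the log-dirty code.
 */
static inline uint32_t secondary_exec_control(uint32_t base,
					      bool cpu_has_pml,
					      bool enable_pml)
{
	uint32_t ctl = base;

	if (!cpu_has_pml || !enable_pml)
		ctl &= ~SECONDARY_EXEC_ENABLE_PML;
	return ctl;
}
```

Because the result depends only on its inputs, calling the helper again from a CPUID update yields the same PML setting that was chosen at vcpu creation.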
  11. 04 Nov, 2015 1 commit
  12. 19 Oct, 2015 1 commit
    • kvm: x86: zero EFER on INIT · 5690891b
      Paolo Bonzini committed
      Not zeroing EFER means that a 32-bit firmware cannot enter paging mode
      without clearing EFER.LME first (which it should not know about).
      Yang Zhang from Intel confirmed that the manual is wrong and EFER is
      cleared to zero on INIT.
      
      Fixes: d28bc9dd
      Cc: stable@vger.kernel.org
      Cc: Yang Z Zhang <yang.z.zhang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5690891b
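The firmware problem described above can be modelled with a short sketch. It assumes the architectural rule that enabling paging with EFER.LME set while CR4.PAE is clear raises a fault; the register struct and both helpers are hypothetical, and only the "EFER is zero after INIT" behaviour is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFER_LME (1ULL << 8)  /* long mode enable */
#define EFER_LMA (1ULL << 10) /* long mode active */

/* Hypothetical slice of vcpu register state. */
struct vcpu_regs {
	uint64_t efer;
};

/* Fixed behaviour: INIT clears EFER completely. */
static inline void vcpu_init_event(struct vcpu_regs *regs)
{
	regs->efer = 0;
}

/* Setting CR0.PG faults if LME is set but PAE paging is not enabled. */
static inline bool enabling_paging_faults(uint64_t efer, bool cr4_pae)
{
	return (efer & EFER_LME) && !cr4_pae;
}
```

A 32-bit firmware that turns on plain 32-bit paging after INIT therefore succeeds only if EFER.LME was cleared for it, which is what zeroing EFER on INIT provides.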
  13. 16 Oct, 2015 3 commits
  14. 14 Oct, 2015 3 commits
  15. 01 Oct, 2015 9 commits