1. 08 Sep, 2016 13 commits
  2. 19 Aug, 2016 4 commits
  3. 18 Aug, 2016 3 commits
    • kvm: nVMX: fix nested tsc scaling · c95ba92a
      Peter Feiner committed
      When the host supported TSC scaling, L2 would use a TSC multiplier of
      0, which causes a VM entry failure. Now L2's TSC uses the same
      multiplier as L1.
      Signed-off-by: Peter Feiner <pfeiner@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
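The scaling described above can be sketched in a few lines. This is a minimal illustration, not the KVM code: it assumes a fixed-point multiplier with 48 fractional bits (the format VMX uses), and `scale_tsc` and `l2_tsc_multiplier` are hypothetical names. It relies on `unsigned __int128`, a GCC/Clang extension on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 48 fractional bits, as in the VMX TSC multiplier format. */
#define TSC_RATIO_FRAC_BITS 48
#define TSC_RATIO_DEFAULT   (1ULL << TSC_RATIO_FRAC_BITS)  /* ratio of 1.0 */

/* Scale a raw TSC value by a fixed-point ratio: (tsc * ratio) >> 48. */
uint64_t scale_tsc(uint64_t tsc, uint64_t ratio)
{
    /* The 128-bit intermediate keeps the product from overflowing. */
    return (uint64_t)(((unsigned __int128)tsc * ratio) >> TSC_RATIO_FRAC_BITS);
}

/* Hypothetical helper: choose the multiplier to program for L2. */
uint64_t l2_tsc_multiplier(uint64_t l1_multiplier)
{
    /* A zero multiplier is what made VM entry fail; reuse L1's value. */
    assert(l1_multiplier != 0);
    return l1_multiplier;
}
```

With a ratio of `TSC_RATIO_DEFAULT` the TSC passes through unchanged, and half that ratio halves it.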
    • KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write · dccbfcf5
      Radim Krčmář committed
      If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the
      write with vmcs02 as the current VMCS.
      This incorrectly applies modifications intended for vmcs01 to vmcs02,
      and L2 can use that to gain access to L0's x2APIC registers by disabling
      virtualized x2APIC while using an MSR bitmap that assumes it is enabled.
      
      Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the
      current VMCS.  An alternative solution would temporarily make vmcs01 the
      current VMCS, but it requires more care.
      
      Fixes: 8d14695f ("x86, apicv: add virtual x2apic support")
      Reported-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
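The postponement can be modelled as a pending flag that is applied when vmcs01 becomes current again. The sketch below is a toy model under that assumption; the struct, its fields, and both function names are hypothetical and do not mirror the KVM data structures.

```c
#include <assert.h>
#include <stdbool.h>

enum active_vmcs { VMCS01, VMCS02 };

struct vcpu_model {
    enum active_vmcs current;  /* which VMCS is loaded right now */
    bool x2apic_mode_vmcs01;   /* setting that belongs to vmcs01 */
    bool update_pending;       /* change deferred while vmcs02 is current */
    bool pending_value;
};

/* Apply the change now if vmcs01 is current, otherwise remember it. */
void set_virtual_x2apic_mode(struct vcpu_model *v, bool enable)
{
    assert(v);
    if (v->current == VMCS02) {
        v->update_pending = true;
        v->pending_value = enable;
        return;
    }
    v->x2apic_mode_vmcs01 = enable;
}

/* Switching back to vmcs01 flushes any deferred change. */
void switch_to_vmcs01(struct vcpu_model *v)
{
    assert(v);
    v->current = VMCS01;
    if (v->update_pending) {
        v->update_pending = false;
        v->x2apic_mode_vmcs01 = v->pending_value;
    }
}
```

The point of the design is that vmcs02 is never touched by a write meant for vmcs01; the cost is one extra check on the switch back.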
    • KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC · d048c098
      Radim Krčmář committed
      An MSR bitmap can be used to avoid a VM exit (interception) on guest MSR
      accesses.  In some configurations of VMX controls, the guest can even
      directly access the host's x2APIC MSRs.  See SDM 29.5 VIRTUALIZING MSR-BASED
      APIC ACCESSES.
      
      L2 could read all of L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI.
      To do so, L1 would first trick KVM into disabling all possible interceptions
      by enabling APICv features and would then turn those features off;
      nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would
      not intercept MSRs whose interception had previously been disabled, even
      though passing them through was not safe with the new configuration.
      
      Correctly re-enabling interceptions is not enough, as a second bug would
      still allow L1+L2 to access the host's MSRs: the MSR bitmap was shared by
      all VMCSs, so L1 could trigger a race to get the desired combination of
      MSR bitmap and VMX controls.
      
      This fix allocates an MSR bitmap for every L1 VCPU, allows only safe
      x2APIC MSRs from L1's MSR bitmap, and disables MSR bitmaps if they would
      have to intercept everything anyway.
      
      Fixes: 3af18d9c ("KVM: nVMX: Prepare for using hardware MSR bitmap")
      Reported-by: Jim Mattson <jmattson@google.com>
      Suggested-by: Wincy Van <fanwenyi0529@gmail.com>
      Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
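The first bug, a merge that only ever disables interceptions, can be shown with a toy bitmap. The sketch below is not the KVM implementation: it assumes one bit per MSR with 1 meaning "intercept", an 8-byte bitmap, and hypothetical function names. An MSR is passed through only when both L0 and L1 allow it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model: one bit per MSR, 1 = intercept (VM exit), 0 = pass through. */
#define BITMAP_BYTES 8

/* Buggy style: only clears bits, so stale pass-through bits survive. */
void merge_clear_only(uint8_t *merged, const uint8_t *l0, const uint8_t *l1)
{
    assert(merged && l0 && l1);
    for (size_t i = 0; i < BITMAP_BYTES; i++)
        merged[i] &= (uint8_t)(l0[i] | l1[i]);
}

/* Fixed style: start from "intercept everything" on every merge. */
void merge_rebuild(uint8_t *merged, const uint8_t *l0, const uint8_t *l1)
{
    assert(merged && l0 && l1);
    memset(merged, 0xff, BITMAP_BYTES);
    for (size_t i = 0; i < BITMAP_BYTES; i++)
        merged[i] &= (uint8_t)(l0[i] | l1[i]);
}
```

If L0 first allows pass-through and later requires interception, `merge_clear_only` keeps the stale pass-through bits while `merge_rebuild` restores the interception.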
  4. 17 Aug, 2016 4 commits
  5. 13 Aug, 2016 7 commits
    • h8300: Add missing include file to asm/io.h · 2b05980d
      Guenter Roeck committed
      h8300 builds fail with
      
      arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’
      arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’
      arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’
      
      and many related errors.
      
      Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix")
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    • unicore32: mm: Add missing parameter to arch_vma_access_permitted · 783011b1
      Guenter Roeck committed
      unicore32 fails to compile with the following errors.
      
      mm/memory.c: In function ‘__handle_mm_fault’:
      mm/memory.c:3381: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      mm/gup.c: In function ‘check_vma_flags’:
      mm/gup.c:456: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      mm/gup.c: In function ‘vma_permits_fault’:
      mm/gup.c:640: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      
      Fixes: d61172b4 ("mm/core, x86/mm/pkeys: Differentiate instruction fetches")
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
    • arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO · 53fb45d3
      Masahiro Yamada committed
      When CONFIG_LOCALVERSION_AUTO is disabled, the version string is
      just a tag name (or with a '+' appended if HEAD is not a tagged
      commit).
      
      During development (and especially when git-bisecting), a longer
      version string helps to identify the commit we are running.
      
      This is a default y option, so drop the unset to enable it.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: defconfig: add options for virtualization and containers · 2323439f
      Riku Voipio committed
      Enable options commonly needed by popular virtualization
      and container applications. Use modules when possible to
      avoid too much overhead for users not interested.
      
      - add the namespace and cgroup options needed
      - add seccomp (optional, but it enhances QEMU etc.)
      - bridge, nat, veth, macvtap and multicast for routing
        guests and containers
      - btrfs and overlayfs modules for container COW backends
      - while here, make fuse a module instead of built-in
      
      Generated with make savedefconfig, dropping unrelated spurious
      change hunks while committing. bloat-o-meter old-vmlinux vmlinux:
      
      add/remove: 905/390 grow/shrink: 767/229 up/down: 183513/-94861 (88652)
      ....
      Total: Before=10515408, After=10604060, chg +0.84%
      Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: hibernate: handle allocation failures · dfbca61a
      Mark Rutland committed
      In create_safe_exec_page(), we create a copy of the hibernate exit text,
      along with some page tables to map this via TTBR0. We then install the
      new tables in TTBR0.
      
      In swsusp_arch_resume() we call create_safe_exec_page() before trying a
      number of operations which may fail (e.g. copying the linear map page
      tables). If these fail, we bail out of swsusp_arch_resume() and return
      an error code, but leave TTBR0 as-is. Subsequently, the core hibernate
      code will call free_basic_memory_bitmaps(), which will free all of the
      memory allocations we made, including the page tables installed in
      TTBR0.
      
      Thus, we may have TTBR0 pointing at dangling freed memory for some
      period of time. If the hibernate attempt was triggered by a user
      requesting a hibernate test via the reboot syscall, we may return to
      userspace with the clobbered TTBR0 value.
      
      Avoid these issues by reorganising swsusp_arch_resume() such that we
      have no failure paths after create_safe_exec_page(). We also add a check
      that the zero page allocation succeeded, matching what we have for other
      allocations.
      
      Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: James Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 4.7+
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
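The reorganisation follows a general pattern: run every step that can fail first, and perform the hard-to-undo step last. The sketch below models that ordering only. It is not the arm64 code; the `alloc_fn` hook, the struct, and the `tables_installed` flag (standing in for "new tables are live in TTBR0") are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook so that failures can be injected.
 * It must return memory that free() accepts, or NULL. */
typedef void *(*alloc_fn)(size_t size);

struct resume_ctx {
    void *zero_page;        /* models the zero page allocation */
    void *linear_map_copy;  /* models the copied linear map tables */
    void *exec_page;        /* models the safe exec page and its tables */
    bool tables_installed;  /* models "new tables are live in TTBR0" */
};

/* Example allocator that always fails. */
void *alloc_fail(size_t size)
{
    (void)size;
    return NULL;
}

/* Every fallible step runs before the tables are installed. */
int resume_prepare(struct resume_ctx *c, alloc_fn alloc)
{
    assert(c && alloc);
    *c = (struct resume_ctx){0};

    c->zero_page = alloc(4096);
    if (!c->zero_page)
        return -1;
    c->linear_map_copy = alloc(4096);
    if (!c->linear_map_copy)
        return -1;
    c->exec_page = alloc(4096);
    if (!c->exec_page)
        return -1;

    /* Nothing below this line can fail. */
    c->tables_installed = true;
    return 0;
}

/* Releases whatever was allocated; safe after a failed prepare. */
void resume_cleanup(struct resume_ctx *c)
{
    assert(c);
    free(c->zero_page);
    free(c->linear_map_copy);
    free(c->exec_page);
    *c = (struct resume_ctx){0};
}
```

Because the install step is last, an error return never leaves the model pointing at memory that the caller is about to free.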
    • arm64: hibernate: avoid potential TLB conflict · 0194e760
      Mark Rutland committed
      In create_safe_exec_page we install a set of global mappings in TTBR0,
      then subsequently invalidate TLBs. While TTBR0 points at the zero page,
      and the TLBs should be free of stale global entries, we may have stale
      ASID-tagged entries (e.g. from the EFI runtime services mappings) for
      the same VAs. Per the ARM ARM these ASID-tagged entries may conflict
      with newly-allocated global entries, and we must follow a
      Break-Before-Make approach to avoid issues resulting from this.
      
      This patch reworks create_safe_exec_page to invalidate TLBs while the
      zero page is still in place, ensuring that there are no potential
      conflicts when the new TTBR0 value is installed. As a single CPU is
      online while this code executes, we do not need to perform broadcast TLB
      maintenance, and can call local_flush_tlb_all(), which also subsumes
      some barriers. The remaining assembly is converted to use write_sysreg()
      and isb().
      
      Other than this, we safely manipulate TTBRs in the hibernate dance. The
      code we install as part of the new TTBR0 mapping (the hibernated
      kernel's swsusp_arch_suspend_exit) installs a zero page into TTBR1,
      invalidates TLBs, then installs its preferred value. Upon being restored
      to the middle of swsusp_arch_suspend, the new image will call
      __cpu_suspend_exit, which will call cpu_uninstall_idmap, installing the
      zero page in TTBR0 and invalidating all TLB entries.
      
      Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: James Morse <james.morse@arm.com>
      Tested-by: James Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 4.7+
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: Handle el1 synchronous instruction aborts cleanly · 9adeb8e7
      Laura Abbott committed
      Executing from a non-executable area gives an ugly message:
      
      lkdtm: Performing direct entry EXEC_RODATA
      lkdtm: attempting ok execution at ffff0000084c0e08
      lkdtm: attempting bad execution at ffff000008880700
      Bad mode in Synchronous Abort handler detected on CPU2, code 0x8400000e -- IABT (current EL)
      CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
      Hardware name: linux,dummy-virt (DT)
      task: ffff800077e35780 ti: ffff800077970000 task.ti: ffff800077970000
      PC is at lkdtm_rodata_do_nothing+0x0/0x8
      LR is at execute_location+0x74/0x88
      
      The 'IABT (current EL)' indicates the error but it's a bit cryptic
      without knowledge of the ARM ARM. There is also no indication of the
      specific address which triggered the fault. The increase in kernel
      page permissions makes hitting this case more likely as well.
      Handling the case in the vectors gives a much more familiar looking
      error message:
      
      lkdtm: Performing direct entry EXEC_RODATA
      lkdtm: attempting ok execution at ffff0000084c0840
      lkdtm: attempting bad execution at ffff000008880680
      Unable to handle kernel paging request at virtual address ffff000008880680
      pgd = ffff8000089b2000
      [ffff000008880680] *pgd=00000000489b4003, *pud=0000000048904003, *pmd=0000000000000000
      Internal error: Oops: 8400000e [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
      Hardware name: linux,dummy-virt (DT)
      task: ffff800077f9f080 ti: ffff800008a1c000 task.ti: ffff800008a1c000
      PC is at lkdtm_rodata_do_nothing+0x0/0x8
      LR is at execute_location+0x74/0x88
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  6. 12 Aug, 2016 9 commits
    • MIPS: KVM: Propagate kseg0/mapped tlb fault errors · 9b731bcf
      James Hogan committed
      Propagate errors from kvm_mips_handle_kseg0_tlb_fault() and
      kvm_mips_handle_mapped_seg_tlb_fault(), usually triggering an internal
      error since they normally indicate the guest accessed bad physical
      memory or the commpage in an unexpected way.
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • MIPS: KVM: Fix gfn range check in kseg0 tlb faults · 0741f52d
      James Hogan committed
      Two consecutive gfns are loaded into host TLB, so ensure the range check
      isn't off by one if guest_pmap_npages is odd.
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
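The off-by-one is easy to see in isolation. The sketch below assumes that the two gfns form an even/odd pair (`gfn & ~1` and `gfn | 1`), as MIPS TLB entries map pairs of pages; the function names are hypothetical. Checking only the faulting gfn misses the case where the odd half of the pair falls just past the end of an odd-sized map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Naive check: validates only the faulting gfn. */
bool gfn_in_range_naive(uint64_t gfn, uint64_t npages)
{
    return gfn < npages;
}

/* Pair-aware check: the odd gfn of the even/odd pair must be in range too. */
bool gfn_pair_in_range(uint64_t gfn, uint64_t npages)
{
    return (gfn | 1) < npages;
}
```

With `npages` equal to 5, gfn 4 passes the naive check, but its partner gfn 5 is out of range, so the pair-aware check rejects it.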
    • MIPS: KVM: Add missing gfn range check · 8985d503
      James Hogan committed
      kvm_mips_handle_mapped_seg_tlb_fault() calculates the guest frame number
      based on the guest TLB EntryLo values; however, it is not range checked
      to ensure it lies within the guest_pmap. If the physical memory the
      guest refers to is out of range, then dump the guest TLB and emit an
      internal error.
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • MIPS: KVM: Fix mapped fault broken commpage handling · c604cffa
      James Hogan committed
      kvm_mips_handle_mapped_seg_tlb_fault() appears to map the guest page at
      virtual address 0 to PFN 0 if the guest has created its own mapping
      there. The intention is unclear, but it may have been an attempt to
      protect the zero page from being mapped to anything but the comm page in
      code paths you wouldn't expect from genuine commpage accesses (guest
      kernel mode cache instructions on that address, hitting trapping
      instructions when executing from that address with a coincidental TLB
      eviction during the KVM handling, and guest user mode accesses to that
      address).
      
      Fix this to check for mappings exactly at KVM_GUEST_COMMPAGE_ADDR (it
      may not be at address 0 since commit 42aa12e7 ("MIPS: KVM: Move
      commpage so 0x0 is unmapped")), and set the corresponding EntryLo to be
      interpreted as 0 (invalid).
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • KVM: Protect device ops->create and list_add with kvm->lock · a28ebea2
      Christoffer Dall committed
      KVM devices were manipulating list data structures without any form of
      synchronization, and some implementations of the create operations also
      suffered from a lack of synchronization.
      
      Now that we've split the xics create operation into create and init, we
      can hold the kvm->lock mutex while calling the create operation and when
      manipulating the devices list.
      
      The error path in the generic code gets slightly ugly because we have to
      take the mutex again and delete the device from the list, but holding
      the mutex during anon_inode_getfd or releasing/locking the mutex in the
      common non-error path seemed wrong.
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
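The locking shape described above, including the slightly awkward error path, can be sketched in user space. This is a model under stated assumptions, not the KVM code: a pthread mutex stands in for kvm->lock, a singly linked list stands in for the devices list, and `publish` is a hypothetical stand-in for the later step that may fail and must not run under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct device {
    struct device *next;
    int id;
};

struct vm {
    pthread_mutex_t lock;    /* stands in for kvm->lock */
    struct device *devices;  /* head of the device list */
};

typedef int (*step_fn)(struct device *dev);

/* Example callbacks used to exercise both paths. */
int step_ok(struct device *dev)   { (void)dev; return 0; }
int step_fail(struct device *dev) { (void)dev; return -1; }

/* Unlink dev from the list; the caller must hold vm->lock. */
static void unlink_device(struct vm *vm, struct device *dev)
{
    for (struct device **p = &vm->devices; *p; p = &(*p)->next) {
        if (*p == dev) {
            *p = dev->next;
            dev->next = NULL;
            return;
        }
    }
}

int vm_add_device(struct vm *vm, struct device *dev,
                  step_fn create, step_fn publish)
{
    int ret;

    assert(vm && dev && create && publish);

    /* create and the list insertion both happen under the lock. */
    pthread_mutex_lock(&vm->lock);
    ret = create(dev);
    if (ret) {
        pthread_mutex_unlock(&vm->lock);
        return ret;
    }
    dev->next = vm->devices;
    vm->devices = dev;
    pthread_mutex_unlock(&vm->lock);

    /* The later step runs unlocked; on failure, retake the lock to undo. */
    ret = publish(dev);
    if (ret) {
        pthread_mutex_lock(&vm->lock);
        unlink_device(vm, dev);
        pthread_mutex_unlock(&vm->lock);
    }
    return ret;
}
```

The common path takes the lock once; only the error path pays for a second acquisition, which matches the trade-off the commit message describes.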
    • KVM: PPC: Move xics_debugfs_init out of create · 023e9fdd
      Christoffer Dall committed
      As we are about to hold the kvm->lock during the create operation on KVM
      devices, we should move the call to xics_debugfs_init into its own
      function, since holding a mutex over extended amounts of time might not
      be a good idea.
      
      Introduce an init operation on the kvm_device_ops struct which cannot
      fail and call this, if configured, after the device has been created.
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
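An optional, infallible init hook next to a fallible create hook might look like the sketch below. The struct layout and all names here are hypothetical and much simpler than the real kvm_device_ops; the only point is the calling convention: create may fail, init is called afterwards only if it is set, and init has no return value.

```c
#include <assert.h>
#include <stddef.h>

struct device {
    int created;
    int debug_ready;
};

struct device_ops {
    int (*create)(struct device *dev);  /* may fail */
    void (*init)(struct device *dev);   /* optional; cannot fail */
};

/* Example callbacks. */
int example_create(struct device *dev) { dev->created = 1; return 0; }
void example_init(struct device *dev)  { dev->debug_ready = 1; }

int device_create(const struct device_ops *ops, struct device *dev)
{
    int ret;

    assert(ops && ops->create && dev);

    ret = ops->create(dev);  /* the part intended to run under the lock */
    if (ret)
        return ret;

    if (ops->init)           /* slow setup, run after create succeeded */
        ops->init(dev);
    return 0;
}
```

Making init return void means the caller needs no extra error path once the device exists.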
    • KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed · aca411a4
      Julius Niedworok committed
      When triggering KVM_RUN without a user memory region being mapped
      (KVM_SET_USER_MEMORY_REGION), a validity intercept occurs. This can
      happen if the user memory region was not mapped initially or if it
      was unmapped after the vcpu was initialized. The function
      kvm_s390_handle_requests checks for the KVM_REQ_MMU_RELOAD bit. The
      check function always clears this bit. If gmap_mprotect_notify
      returned an error code, the mapping failed, but the KVM_REQ_MMU_RELOAD
      bit was no longer set. So the next time kvm_s390_handle_requests was
      called, execution would fall through the check for
      KVM_REQ_MMU_RELOAD. The bit needs to be set again if
      gmap_mprotect_notify returns an error code. Setting the bit again with
      kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) fixes the bug.
      Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
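The interaction between a test-and-clear check and a failing handler can be reproduced with a small model. Everything below is a sketch with hypothetical names (`REQ_MMU_RELOAD`, `struct vcpu_model`, `map_prefix_fn`); it is not the s390 code, but it shows why the request has to be raised again when the mapping fails.

```c
#include <assert.h>
#include <stdbool.h>

#define REQ_MMU_RELOAD 0u  /* hypothetical request bit number */

struct vcpu_model {
    unsigned long requests;
};

void make_request(unsigned int req, struct vcpu_model *v)
{
    v->requests |= 1UL << req;
}

/* Test-and-clear: reports whether the bit was set and always clears it. */
bool check_request(unsigned int req, struct vcpu_model *v)
{
    unsigned long mask = 1UL << req;

    if (!(v->requests & mask))
        return false;
    v->requests &= ~mask;
    return true;
}

/* Hypothetical stand-in for the step that maps the prefix pages. */
typedef int (*map_prefix_fn)(void);

int map_prefix_ok(void)   { return 0; }
int map_prefix_fail(void) { return -1; }

int handle_requests(struct vcpu_model *v, map_prefix_fn map_prefix)
{
    assert(v && map_prefix);

    if (check_request(REQ_MMU_RELOAD, v)) {
        int rc = map_prefix();

        if (rc) {
            /* The check cleared the bit; raise it again so that the
             * next call retries instead of skipping the mapping. */
            make_request(REQ_MMU_RELOAD, v);
            return rc;
        }
    }
    return 0;
}
```

Without the `make_request` call in the error path, a second `handle_requests` call would return 0 without ever mapping the prefix.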
    • KVM: s390: set the prefix initially properly · 75a4615c
      Julius Niedworok committed
      When KVM_RUN is triggered on a VCPU without an initial reset, a
      validity intercept occurs.
      Setting the prefix will set the KVM_REQ_MMU_RELOAD bit initially,
      thus preventing the bug.
      Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    • perf/x86/intel/uncore: Add enable_box for client MSR uncore · 95f3be79
      Kan Liang committed
      There are bug reports about miscounting uncore counters on some
      client machines like Sandybridge, Broadwell and Skylake. It is
      very likely to be observed on idle systems.
      
      This is caused by a hardware issue: PERF_GLOBAL_CTL can be
      cleared after Package C7, and then nothing will be counted.
      The related erratum (HSD 158) can be found in:
      
        www.intel.com/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf
      
      This patch tries to work around this issue by re-enabling PERF_GLOBAL_CTL
      in ->enable_box(). The workaround does not cover all cases. It helps for new
      events after returning from C7, but it cannot prevent C7, so a counter that
      is already active will still miscount.
      
      There is no drawback in leaving it enabled, so it does not need
      disable_box() here.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1470925874-59943-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
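The workaround and its limits can be modelled with a plain variable in place of the hardware register. The sketch below is a toy under stated assumptions: `global_ctl` stands in for the global control MSR, the enable bit position is hypothetical, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GLOBAL_CTL_EN (1ULL << 29)  /* hypothetical enable bit position */

struct uncore_box_model {
    uint64_t global_ctl;  /* stands in for the global control register */
};

/* Models the erratum: the register is cleared after Package C7. */
void package_c7_model(struct uncore_box_model *box)
{
    box->global_ctl = 0;
}

/* Models ->enable_box(): set the global enable bit unconditionally. */
void enable_box_model(struct uncore_box_model *box)
{
    box->global_ctl |= GLOBAL_CTL_EN;
}

bool counting_enabled(const struct uncore_box_model *box)
{
    return (box->global_ctl & GLOBAL_CTL_EN) != 0;
}
```

In this model, counting stops after `package_c7_model` and resumes only once `enable_box_model` runs again, which is why the workaround helps new events but not a counter that was already active.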