1. 19 Jul, 2018 1 commit
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 47f7dc4b
      Committed by Linus Torvalds
      Pull kvm fixes from Paolo Bonzini:
       "Miscellaneous bugfixes, plus a small patchlet related to Spectre v2"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvmclock: fix TSC calibration for nested guests
        KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabled
        KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer
        KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.
        x86/kvmclock: set pvti_cpu0_va after enabling kvmclock
        x86/kvm/Kconfig: Ensure CRYPTO_DEV_CCP_DD state at minimum matches KVM_AMD
        kvm: nVMX: Restore exit qual for VM-entry failure due to MSR loading
        x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks
        KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
  2. 18 Jul, 2018 5 commits
    • kvmclock: fix TSC calibration for nested guests · e10f7805
      Committed by Peng Hao
      Inside a nested guest, access to hardware can be slow enough that
      tsc_read_refs always returns ULLONG_MAX, causing tsc_refine_calibration_work
      to be called periodically and the nested guest to spend a lot of time
      reading the ACPI timer.
      
      However, if the TSC frequency is available from the pvclock page,
      we can just set X86_FEATURE_TSC_KNOWN_FREQ and avoid the recalibration
      ('refine') operation.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
      [Commit message rewritten. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
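The idea in the kvmclock commit above can be sketched as a small userspace model. This is not the kernel code: `struct cpu_model`, `kvmclock_model_init()` and `refine_calibration_model()` are hypothetical names, and the boolean field only stands in for X86_FEATURE_TSC_KNOWN_FREQ. The sketch assumes that a frequency of 0 means "not available from the pvclock page".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU feature state the commit talks about. */
struct cpu_model {
    bool tsc_known_freq;   /* models X86_FEATURE_TSC_KNOWN_FREQ */
    unsigned refine_runs;  /* how many times the slow refine step ran */
};

/* Models guest setup: if the pvclock page supplies a TSC frequency
 * (assumed here: non-zero kHz value), mark the frequency as known. */
static void kvmclock_model_init(struct cpu_model *cpu, uint64_t pvclock_tsc_khz)
{
    assert(cpu != NULL);
    if (pvclock_tsc_khz != 0)
        cpu->tsc_known_freq = true;
}

/* Models one scheduling of the refine work. Returns true if the slow
 * hardware-timer path had to run, false if it was skipped. */
static bool refine_calibration_model(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (cpu->tsc_known_freq)
        return false;      /* frequency already trusted, nothing to refine */
    cpu->refine_runs++;
    return true;
}
```

With the flag set, the model never enters the refine path, which is the behaviour the commit message describes for nested guests.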
    • KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabled · 2307af1c
      Committed by Liran Alon
      When eVMCS is enabled, all VMCS allocated to be used by KVM are marked
      with revision_id of KVM_EVMCS_VERSION instead of revision_id reported
      by MSR_IA32_VMX_BASIC.
      
      However, even though not explicitly documented by TLFS, VMXArea passed
      as VMXON argument should still be marked with revision_id reported by
      physical CPU.
      
      This issue was found with the following setup:
      * L0 = KVM, which exposes eVMCS to its L1 guest.
      * L1 = KVM, which consumes the eVMCS reported by L0.
      This setup caused the following to occur:
      1) L1 executes hardware_enable().
      2) hardware_enable() calls kvm_cpu_vmxon() to execute VMXON.
      3) L0 intercepts L1's VMXON and executes handle_vmon(), which notes
      vmxarea->revision_id != VMCS12_REVISION and therefore fails with
      nested_vmx_failInvalid(), which sets RFLAGS.CF.
      4) L1's kvm_cpu_vmxon() doesn't check RFLAGS.CF for failure, and therefore
      hardware_enable() continues as usual.
      5) L1's hardware_enable() then calls ept_sync_global(), which executes
      INVEPT.
      6) L0 intercepts INVEPT and executes handle_invept(), which notes
      !vmx->nested.vmxon and thus raises a #UD to L1.
      7) The raised #UD caused L1 to panic.
      Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Cc: stable@vger.kernel.org
      Fixes: 773e8a04
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
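The seven-step failure sequence in the eVMCS commit above can be traced with a small userspace model. Everything here is an illustrative assumption: the `MODEL_*` constants are placeholder values rather than the kernel's, and the three functions only mimic the decisions attributed to handle_vmon(), handle_invept() and hardware_enable() in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values, NOT the kernel's constants. */
#define MODEL_REPORTED_REVISION 0x1234u /* revision L0 reports to L1 */
#define MODEL_EVMCS_VERSION     0x1u    /* revision used for eVMCS regions */

struct l0_model {
    bool vmxon;  /* models vmx->nested.vmxon */
};

/* Models L0's VMXON intercept. Returns the carry flag seen by L1:
 * true means the instruction failed (VMfailInvalid). */
static bool l0_handle_vmxon(struct l0_model *l0, uint32_t vmxarea_revision)
{
    assert(l0 != NULL);
    if (vmxarea_revision != MODEL_REPORTED_REVISION)
        return true;   /* step 3: revision mismatch, CF set */
    l0->vmxon = true;
    return false;
}

/* Models L0's INVEPT intercept. Returns true if #UD is raised to L1. */
static bool l0_handle_invept(const struct l0_model *l0)
{
    assert(l0 != NULL);
    return !l0->vmxon; /* step 6: INVEPT outside VMX operation */
}

/* Models L1's hardware_enable(). Returns true if L1 receives a #UD. */
static bool l1_hardware_enable(struct l0_model *l0, uint32_t vmxarea_revision)
{
    /* step 4: the carry flag from VMXON is deliberately ignored */
    (void)l0_handle_vmxon(l0, vmxarea_revision);
    return l0_handle_invept(l0); /* step 5: ept_sync_global() -> INVEPT */
}
```

In this model, marking the VMXON region with `MODEL_EVMCS_VERSION` ends in a #UD, while marking it with the revision L0 reports does not, which mirrors the fix described in the commit title.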
    • KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer · 9432a317
      Committed by Paolo Bonzini
      A comment warning against this bug is there, but the code is not doing what
      the comment says.  Therefore it is possible that an EPOLLHUP races against
      irq_bypass_register_consumer.  The EPOLLHUP handler schedules irqfd_shutdown,
      and if that runs soon enough, you get a use-after-free.
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
    • KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel. · b5020a8e
      Committed by Lan Tianyu
      Syzbot reports crashes in kvm_irqfd_assign(), caused by a use-after-free
      when kvm_irqfd_assign() and kvm_irqfd_deassign() run in parallel
      for one specific eventfd. When the assign path hasn't finished but the irqfd
      has been added to the kvm->irqfds.items list, another thread may deassign the
      eventfd and free the struct kvm_kernel_irqfd. The assign path then uses
      the struct kvm_kernel_irqfd that has been freed by the deassign path. To avoid
      this issue, keep the irqfd under kvm->irq_srcu protection after the irqfd
      has been added to the kvm->irqfds.items list, and call synchronize_srcu()
      in irqfd_shutdown() to make sure that the irqfd has been fully initialized in
      the assign path.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Tianyu Lan <tianyu.lan@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Mark HI and TASKLET softirq synchronous · 3c53776e
      Committed by Linus Torvalds
      Way back in 4.9, we committed 4cd13c21 ("softirq: Let ksoftirqd do
      its job"), and ever since we've had small nagging issues with it.  For
      example, we've had:
      
        1ff68820 ("watchdog: core: make sure the watchdog_worker is not deferred")
        8d5755b3 ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
        217f6974 ("net: busy-poll: allow preemption in sk_busy_loop()")
      
      all of which worked around some of the effects of that commit.
      
      The DVB people have also complained that the commit causes excessive USB
      URB latencies, which seems to be due to the USB code using tasklets to
      schedule USB traffic.  This seems to be an issue mainly when already
      living on the edge, but waiting for ksoftirqd to handle it really does
      seem to cause excessive latencies.
      
      Now Hanna Hawa reports that this issue isn't just limited to USB URB and
      DVB, but also causes timeout problems for the Marvell SoC team:
      
       "I'm facing kernel panic issue while running raid 5 on sata disks
        connected to Macchiatobin (Marvell community board with Armada-8040
        SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine
        and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2)
        uses a tasklet to clean the done descriptors from the queue"
      
      The latency problem causes a panic:
      
        mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
        Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction
      
      We've discussed simply just reverting the original commit entirely, and
      also much more involved solutions (with per-softirq threads etc).  This
      patch is intentionally stupid and fairly limited, because the issue
      still remains, and the other solutions either got sidetracked or had
      other issues.
      
      We should probably also consider the timer softirqs to be synchronous
      and not be delayed to ksoftirqd (since they were the issue with the
      earlier watchdog problems), but that should be done as a separate patch.
      This does only the tasklet cases.
      Reported-and-tested-by: Hanna Hawa <hannah@marvell.com>
      Reported-and-tested-by: Josef Griebichler <griebichler.josef@gmx.at>
      Reported-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
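The policy described in the softirq commit above, where HI and TASKLET softirqs are never left waiting for ksoftirqd, can be sketched as a bitmask check. This is a userspace model under stated assumptions: the enum below is a shortened, hypothetical list of softirq numbers, and `defer_to_ksoftirqd()` is an illustrative name, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Shortened, hypothetical softirq numbering for illustration only. */
enum softirq_model {
    MODEL_HI = 0,
    MODEL_TIMER,
    MODEL_NET_RX,
    MODEL_TASKLET,
};

/* Softirqs that should be handled synchronously in this model. */
#define MODEL_NOW_MASK ((1ul << MODEL_HI) | (1ul << MODEL_TASKLET))

/* Decides whether pending softirqs may be left to the ksoftirqd thread.
 * Returns false as soon as any "now" softirq is pending, even when the
 * thread is already running. */
static bool defer_to_ksoftirqd(unsigned long pending, bool ksoftirqd_running)
{
    assert(pending != 0); /* only meaningful when something is pending */
    if (pending & MODEL_NOW_MASK)
        return false;
    return ksoftirqd_running;
}
```

The timer softirq is intentionally left out of the mask here, matching the commit message's note that timers would be handled in a separate patch.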
  3. 17 Jul, 2018 3 commits
    • Merge tag 'pinctrl-v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 30b06abf
      Committed by Linus Torvalds
      Pull pin control fixes from Linus Walleij:
      
       - A slew of driver fixes for Mediatek mt7622
      
       - Fix a direction inversion bug in the Ingenic driver
      
       - Fix unsupported drive strength setting on the PFC r8a77970
      
       - Off by one and NULL dereference fixes in the NSP driver
      
      * tag 'pinctrl-v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: nsp: Fix potential NULL dereference
        pinctrl: nsp: off by ones in nsp_pinmux_enable()
        pinctrl: sh-pfc: r8a77970: remove SH_PFC_PIN_CFG_DRIVE_STRENGTH flag
        pinctrl: ingenic: Fix inverted direction for < JZ4770
        pinctrl: mt7622: fix a kernel panic when gpio-hog is being applied
        pinctrl: mt7622: stop using the deprecated pinctrl_add_gpio_range
        pinctrl: mt7622: fix that pinctrl_claim_hogs cannot work
        pinctrl: mt7622: fix initialization sequence between eint and gpiochip
        pinctrl: mt7622: fix error path on failing at groups building
    • Merge tag 'drm-fixes-2018-07-16-1' of git://anongit.freedesktop.org/drm/drm · 706bf68b
      Committed by Linus Torvalds
      Pull drm fixes from Dave Airlie:
      
       - two AGP fixes in here
      
       - a bunch of mostly amdgpu fixes
      
       - sun4i build fix
      
       - two armada fixes
      
       - some tegra fixes
      
       - one i915 core and one i915 gvt fix
      
      * tag 'drm-fixes-2018-07-16-1' of git://anongit.freedesktop.org/drm/drm:
        drm/amdgpu/pp/smu7: use a local variable for toc indexing
        amd/dc/dce100: On dce100, set clocks to 0 on suspend
        drm/amd/display: Convert 10kHz clks from PPLib into kHz for Vega
        drm/amdgpu: Verify root PD is mapped into kernel address space (v4)
        drm/amd/display: fix invalid function table override
        drm/amdgpu: Reserve VM root shared fence slot for command submission (v3)
        Revert "drm/amd/display: Don't return ddc result and read_bytes in same return value"
        char: amd64-agp: Use 64-bit arithmetic instead of 32-bit
        char: agp: Change return type to vm_fault_t
        drm/i915: Fix hotplug irq ack on i965/g4x
        drm/armada: fix irq handling
        drm/armada: fix colorkey mode property
        drm/tegra: Fix comparison operator for buffer size
        gpu: host1x: Check whether size of unpin isn't 0
        gpu: host1x: Skip IOMMU initialization if firewall is enabled
        drm/sun4i: link in front-end code if needed
        drm/i915/gvt: update vreg on inhibit context lri command
    • mm: don't do zero_resv_unavail if memmap is not allocated · d1b47a7c
      Committed by Pavel Tatashin
      Moving zero_resv_unavail() before memmap_init_zone() caused a regression on
      x86-32.
      
      The cause is that we access struct pages before they are allocated when
      CONFIG_FLAT_NODE_MEM_MAP is used.
      
      free_area_init_nodes()
        zero_resv_unavail()
          mm_zero_struct_page(pfn_to_page(pfn)); <- struct page is not alloced
        free_area_init_node()
          if CONFIG_FLAT_NODE_MEM_MAP
            alloc_node_mem_map()
              memblock_virt_alloc_node_nopanic() <- struct page alloced here
      
      On the other hand memblock_virt_alloc_node_nopanic() zeroes all the memory
      that it returns, so we do not need to do zero_resv_unavail() here.
      
      Fixes: e181ae0c ("mm: zero unavailable pages before memmap init")
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Tested-by: Matt Hart <matt@mattface.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
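The ordering problem in the mm commit above can be illustrated with a userspace model. The names `struct node_model`, `alloc_node_mem_map_model()` and `zero_resv_unavail_model()` are hypothetical, `calloc()` stands in for an allocator that returns zeroed memory, and the runtime `flat_node_mem_map` flag is only an assumption used to mimic the CONFIG_FLAT_NODE_MEM_MAP case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct page_model {
    unsigned long flags;
};

struct node_model {
    struct page_model *mem_map; /* NULL until the node map is allocated */
    size_t nr_pages;
};

/* Models the late allocation of the node's struct page array.
 * calloc() returns zeroed memory, so no extra zeroing is needed. */
static bool alloc_node_mem_map_model(struct node_model *node, size_t nr_pages)
{
    assert(node != NULL);
    node->mem_map = calloc(nr_pages, sizeof(*node->mem_map));
    node->nr_pages = node->mem_map ? nr_pages : 0;
    return node->mem_map != NULL;
}

/* Models the guarded zeroing step. Returns the number of pages zeroed,
 * and touches nothing when the flat node map is in use or the map has
 * not been allocated yet. */
static size_t zero_resv_unavail_model(struct node_model *node,
                                      bool flat_node_mem_map)
{
    assert(node != NULL);
    if (flat_node_mem_map || node->mem_map == NULL)
        return 0;
    memset(node->mem_map, 0, node->nr_pages * sizeof(*node->mem_map));
    return node->nr_pages;
}
```

Calling the zeroing step before allocation is a no-op in this model instead of a write through an unallocated pointer, and the later allocation already hands back zeroed pages.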
  4. 16 Jul, 2018 8 commits
  5. 15 Jul, 2018 23 commits