1. 09 Dec, 2021 3 commits
  2. 08 Dec, 2021 5 commits
  3. 05 Dec, 2021 5 commits
    • T
      KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure · ad5b3532
      Tom Lendacky committed
      Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
      exit code or exit parameters fails.
      
      The VMGEXIT instruction can be issued from userspace, even though
      userspace (likely) can't update the GHCB. To prevent userspace from being
      able to kill the guest, return an error through the GHCB when validation
      fails rather than terminating the guest. For cases where the GHCB can't be
      updated (e.g. the GHCB can't be mapped, etc.), just return back to the
      guest.
      
      The new error codes are documented in the latest update to the GHCB
      specification.
      
      Fixes: 291bd20d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <b57280b5562893e2616257ac9c2d4525a9aeeb42.1638471124.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • S
      KVM: SEV: Fall back to vmalloc for SEV-ES scratch area if necessary · a655276a
      Sean Christopherson committed
      Use kvzalloc() to allocate KVM's buffer for SEV-ES's GHCB scratch area so
      that KVM falls back to __vmalloc() if physically contiguous memory isn't
      available.  The buffer is purely a KVM software construct, i.e. there's
      no need for it to be physically contiguous.
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109222350.2266045-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • S
      KVM: SEV: Return appropriate error codes if SEV-ES scratch setup fails · 75236f5f
      Sean Christopherson committed
      Return appropriate error codes if setting up the GHCB scratch area for an
      SEV-ES guest fails.  In particular, returning -EINVAL instead of -ENOMEM
      when allocating the kernel buffer could be confusing as userspace would
      likely suspect a guest issue.
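
The distinction between the two error codes can be sketched in userspace C. This is only an illustration of the idea, not the KVM code: the function name `setup_scratch`, the limit `SCRATCH_MAX_LEN`, and the use of `calloc()` are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical limit for the illustration; not taken from the kernel. */
#define SCRATCH_MAX_LEN 4096u

/*
 * Returns 0 on success, -EINVAL when the guest-supplied length is bad,
 * and -ENOMEM when the host-side buffer cannot be allocated.
 */
static int setup_scratch(uint64_t guest_len, void **buf)
{
	assert(buf != NULL);
	*buf = NULL;

	if (guest_len == 0 || guest_len > SCRATCH_MAX_LEN)
		return -EINVAL;		/* the guest's request is at fault */

	*buf = calloc(1, (size_t)guest_len);
	if (*buf == NULL)
		return -ENOMEM;		/* the host ran out of memory */

	return 0;
}
```

A caller that sees -ENOMEM knows to look at host memory pressure, while -EINVAL points at the request itself.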
      
      Fixes: 8f423a80 ("KVM: SVM: Support MMIO for an SEV-ES guest")
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109222350.2266045-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • H
      parisc: Mark cr16 CPU clocksource unstable on all SMP machines · afdb4a5b
      Helge Deller committed
      In commit c8c37359 ("parisc: Enhance detection of synchronous cr16
      clocksources") I assumed that CPUs on the same physical core are synchronous.
      While booting up the kernel on two different C8000 machines, one with a
      dual-core PA8800 and one with a dual-core PA8900 CPU, this turned out to be
      wrong. The symptom was that I saw a jump in the internal clocks printed to the
      syslog and strange overall behaviour.  On machines which have 4 cores (2
      dual-cores) the problem isn't visible, because the current logic already marked
      the cr16 clocksource unstable in this case.
      
      This patch now marks the cr16 interval timers unstable if we have more than one
      CPU in the system, and it fixes this issue.
      
      Fixes: c8c37359 ("parisc: Enhance detection of synchronous cr16 clocksources")
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v5.15+
    • H
      parisc: Fix "make install" on newer debian releases · 0f9fee4c
      Helge Deller committed
      On newer debian releases the debian-provided "installkernel" script is
      installed in /usr/sbin. Fix the kernel install.sh script to look for the
      script in this directory as well.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v3.13+
  4. 04 Dec, 2021 4 commits
  5. 03 Dec, 2021 2 commits
    • J
      x86/64/mm: Map all kernel memory into trampoline_pgd · 51523ed1
      Joerg Roedel committed
      The trampoline_pgd only maps the 0xfffffff000000000-0xffffffffffffffff
      range of kernel memory (with 4-level paging). This range contains the
      kernel's text+data+bss mappings and the module mapping space but not the
      direct mapping and the vmalloc area.
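
To see why these regions land in different top-level entries, here is a minimal sketch of the PGD index calculation for 4-level paging. The shift and table size are the standard x86-64 values; the helper is written for userspace and is not the kernel's own definition.

```c
#include <assert.h>
#include <stdint.h>

/* 4-level paging: each PGD entry spans 2^39 bytes (512 GiB). */
#define PGDIR_SHIFT	39
#define PTRS_PER_PGD	512

/* Index of the top-level page-table entry that covers vaddr. */
static unsigned int pgd_index(uint64_t vaddr)
{
	return (unsigned int)((vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}
```

Assuming the default (non-KASLR) region bases from the kernel's x86-64 memory-map documentation, the direct mapping base 0xffff888000000000 falls in entry 273, the vmalloc base 0xffffc90000000000 in entry 402, and the kernel text base 0xffffffff80000000 in entry 511. A page table that populates only the last entry therefore cannot reach the direct mapping or vmalloc.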
      
      This is enough to get the application processors out of real-mode, but
      for code that switches back to real-mode the trampoline_pgd is missing
      important parts of the address space. For example, consider this code
      from arch/x86/kernel/reboot.c, function machine_real_restart() for a
      64-bit kernel:
      
        #ifdef CONFIG_X86_32
        	load_cr3(initial_page_table);
        #else
        	write_cr3(real_mode_header->trampoline_pgd);
      
        	/* Exiting long mode will fail if CR4.PCIDE is set. */
        	if (boot_cpu_has(X86_FEATURE_PCID))
        		cr4_clear_bits(X86_CR4_PCIDE);
        #endif
      
        	/* Jump to the identity-mapped low memory code */
        #ifdef CONFIG_X86_32
        	asm volatile("jmpl *%0" : :
        		     "rm" (real_mode_header->machine_real_restart_asm),
        		     "a" (type));
        #else
        	asm volatile("ljmpl *%0" : :
        		     "m" (real_mode_header->machine_real_restart_asm),
        		     "D" (type));
        #endif
      
      The code switches to the trampoline_pgd, which unmaps the direct mapping
      and also the kernel stack. The call to cr4_clear_bits() will find no
      stack and crash the machine. The real_mode_header pointer below points
      into the direct mapping, and dereferencing it also causes a crash.
      
      The only reason this does not always crash is that kernel mappings are
      global and the CR3 switch does not flush those mappings. But if these
      mappings are not in the TLB already, the above code will crash before it
      can jump to the real-mode stub.
      
      Extend the trampoline_pgd to contain all kernel mappings to prevent
      these crashes and to make code which runs on this page-table more
      robust.
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20211202153226.22946-5-joro@8bytes.org
    • H
      s390: update defconfigs · 3c088b1e
      Heiko Carstens committed
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
  6. 02 Dec, 2021 8 commits
    • M
      arm64: ftrace: add missing BTIs · 35b6b28e
      Mark Rutland committed
      When branch target identifiers are in use, code reachable via an
      indirect branch requires a BTI landing pad at the branch target site.
      
      When building FTRACE_WITH_REGS atop patchable-function-entry, we miss
      BTIs at the start of the `ftrace_caller` and `ftrace_regs_caller`
      trampolines, and when these are called from a module via a PLT (which
      will use a `BR X16`), we will encounter a BTI failure, e.g.
      
      | # insmod lkdtm.ko
      | lkdtm: No crash points registered, enable through debugfs
      | # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      | # cat /sys/kernel/debug/provoke-crash/DIRECT
      | Unhandled 64-bit el1h sync exception on CPU0, ESR 0x34000001 -- BTI
      | CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3
      | Hardware name: linux,dummy-virt (DT)
      | pstate: 60400405 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=jc)
      | pc : ftrace_caller+0x0/0x3c
      | lr : lkdtm_debugfs_open+0xc/0x20 [lkdtm]
      | sp : ffff800012e43b00
      | x29: ffff800012e43b00 x28: 0000000000000000 x27: ffff800012e43c88
      | x26: 0000000000000000 x25: 0000000000000000 x24: ffff0000c171f200
      | x23: ffff0000c27b1e00 x22: ffff0000c2265240 x21: ffff0000c23c8c30
      | x20: ffff8000090ba380 x19: 0000000000000000 x18: 0000000000000000
      | x17: 0000000000000000 x16: ffff80001002bb4c x15: 0000000000000000
      | x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000900ff0
      | x11: ffff0000c4166310 x10: ffff800012e43b00 x9 : ffff8000104f2384
      | x8 : 0000000000000001 x7 : 0000000000000000 x6 : 000000000000003f
      | x5 : 0000000000000040 x4 : ffff800012e43af0 x3 : 0000000000000001
      | x2 : ffff8000090b0000 x1 : ffff0000c171f200 x0 : ffff0000c23c8c30
      | Kernel panic - not syncing: Unhandled exception
      | CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3
      | Hardware name: linux,dummy-virt (DT)
      | Call trace:
      |  dump_backtrace+0x0/0x1a4
      |  show_stack+0x24/0x30
      |  dump_stack_lvl+0x68/0x84
      |  dump_stack+0x1c/0x38
      |  panic+0x168/0x360
      |  arm64_exit_nmi.isra.0+0x0/0x80
      |  el1h_64_sync_handler+0x68/0xd4
      |  el1h_64_sync+0x78/0x7c
      |  ftrace_caller+0x0/0x3c
      |  do_dentry_open+0x134/0x3b0
      |  vfs_open+0x38/0x44
      |  path_openat+0x89c/0xe40
      |  do_filp_open+0x8c/0x13c
      |  do_sys_openat2+0xbc/0x174
      |  __arm64_sys_openat+0x6c/0xbc
      |  invoke_syscall+0x50/0x120
      |  el0_svc_common.constprop.0+0xdc/0x100
      |  do_el0_svc+0x84/0xa0
      |  el0_svc+0x28/0x80
      |  el0t_64_sync_handler+0xa8/0x130
      |  el0t_64_sync+0x1a0/0x1a4
      | SMP: stopping secondary CPUs
      | Kernel Offset: disabled
      | CPU features: 0x0,00000f42,da660c5f
      | Memory Limit: none
      | ---[ end Kernel panic - not syncing: Unhandled exception ]---
      
      Fix this by adding the required `BTI C`, as we only require these to be
      reachable via BL for direct calls or BR X16/X17 for PLTs. For now, these
      are open-coded in the function prologue, matching the style of the
      `__hwasan_tag_mismatch` trampoline.
      
      In future we may wish to consider adding a new SYM_CODE_START_*()
      variant which has an implicit BTI.
      
      When ftrace is built atop mcount, the trampolines are marked with
      SYM_FUNC_START(), and so get an implicit BTI. We may need to change
      these over to SYM_CODE_START() in future for RELIABLE_STACKTRACE, in
      case we need to apply special care around the return address being
      rewritten.
      
      Fixes: 97fed779 ("arm64: bti: Provide Kconfig for kernel mode BTI")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211129135709.2274019-1-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
    • M
      arm64: kexec: use __pa_symbol(empty_zero_page) · 2f218324
      Mark Rutland committed
      In machine_kexec_post_load() we use __pa() on `empty_zero_page`, so that
      we can use the physical address during arm64_relocate_new_kernel() to
      switch TTBR1 to a new set of tables. While `empty_zero_page` is part of
      the old kernel, we won't clobber it until after this switch, so using it
      is benign.
      
      However, `empty_zero_page` is part of the kernel image rather than a
      linear map address, so it is not correct to use __pa(x), and we should
      instead use __pa_symbol(x) or __pa(lm_alias(x)). Otherwise, when the
      kernel is built with DEBUG_VIRTUAL, we'll encounter splats as below, as
      I've seen when fuzzing v5.16-rc3 with Syzkaller:
      
      | ------------[ cut here ]------------
      | virt_to_phys used for non-linear address: 000000008492561a (empty_zero_page+0x0/0x1000)
      | WARNING: CPU: 3 PID: 11492 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      | CPU: 3 PID: 11492 Comm: syz-executor.0 Not tainted 5.16.0-rc3-00001-g48bd452a045c #1
      | Hardware name: linux,dummy-virt (DT)
      | pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      | pc : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      | lr : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      | sp : ffff80001af17bb0
      | x29: ffff80001af17bb0 x28: ffff1cc65207b400 x27: ffffb7828730b120
      | x26: 0000000000000e11 x25: 0000000000000000 x24: 0000000000000001
      | x23: ffffb7828963e000 x22: ffffb78289644000 x21: 0000600000000000
      | x20: 000000000000002d x19: 0000b78289644000 x18: 0000000000000000
      | x17: 74706d6528206131 x16: 3635323934383030 x15: 303030303030203a
      | x14: 1ffff000035e2eb8 x13: ffff6398d53f4f0f x12: 1fffe398d53f4f0e
      | x11: 1fffe398d53f4f0e x10: ffff6398d53f4f0e x9 : ffffb7827c6f76dc
      | x8 : ffff1cc6a9fa7877 x7 : 0000000000000001 x6 : ffff6398d53f4f0f
      | x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff1cc66f2a99c0
      | x2 : 0000000000040000 x1 : d7ce7775b09b5d00 x0 : 0000000000000000
      | Call trace:
      |  __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      |  machine_kexec_post_load+0x284/0x670 arch/arm64/kernel/machine_kexec.c:150
      |  do_kexec_load+0x570/0x670 kernel/kexec.c:155
      |  __do_sys_kexec_load kernel/kexec.c:250 [inline]
      |  __se_sys_kexec_load kernel/kexec.c:231 [inline]
      |  __arm64_sys_kexec_load+0x1d8/0x268 kernel/kexec.c:231
      |  __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
      |  invoke_syscall+0x90/0x2e0 arch/arm64/kernel/syscall.c:52
      |  el0_svc_common.constprop.2+0x1e4/0x2f8 arch/arm64/kernel/syscall.c:142
      |  do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:181
      |  el0_svc+0x60/0x248 arch/arm64/kernel/entry-common.c:603
      |  el0t_64_sync_handler+0x90/0xb8 arch/arm64/kernel/entry-common.c:621
      |  el0t_64_sync+0x180/0x184 arch/arm64/kernel/entry.S:572
      | irq event stamp: 2428
      | hardirqs last  enabled at (2427): [<ffffb7827c6f2308>] __up_console_sem+0xf0/0x118 kernel/printk/printk.c:255
      | hardirqs last disabled at (2428): [<ffffb7828223df98>] el1_dbg+0x28/0x80 arch/arm64/kernel/entry-common.c:375
      | softirqs last  enabled at (2424): [<ffffb7827c411c00>] softirq_handle_end kernel/softirq.c:401 [inline]
      | softirqs last  enabled at (2424): [<ffffb7827c411c00>] __do_softirq+0xa28/0x11e4 kernel/softirq.c:587
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] invoke_softirq kernel/softirq.c:439 [inline]
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] __irq_exit_rcu kernel/softirq.c:636 [inline]
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] irq_exit_rcu+0x53c/0x688 kernel/softirq.c:648
      | ---[ end trace 0ca578534e7ca938 ]---
      
      With or without DEBUG_VIRTUAL __pa() will fall back to __kimg_to_phys()
      for non-linear addresses, and will happen to do the right thing in this
      case, even with the warning. But we should not depend upon this, and to
      keep the warning useful we should fix this case.
      
      Fix this issue by using __pa_symbol(), which handles kernel image
      addresses (and checks its input is a kernel image address). This matches
      what we do elsewhere, e.g. in arch/arm64/include/asm/pgtable.h:
      
      | #define ZERO_PAGE(vaddr)       phys_to_page(__pa_symbol(empty_zero_page))
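
The difference between the two translations can be modeled in a few lines of userspace C. Everything here is a made-up toy: the address ranges, the physical bases, and the names `model_pa()` and `model_pa_symbol()` are assumptions for illustration and do not match the arm64 layout or macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All constants below are invented for the illustration. */
#define LINEAR_BASE	0xffff000000000000ULL	/* modeled linear map */
#define LINEAR_SIZE	0x0000000100000000ULL
#define KIMG_BASE	0xffff800010000000ULL	/* modeled kernel image */
#define KIMG_SIZE	0x0000000002000000ULL
#define PHYS_BASE	0x0000000040000000ULL	/* backs LINEAR_BASE */
#define KIMG_PHYS	0x0000000048000000ULL	/* backs KIMG_BASE */

/* Unsigned wraparound makes addresses below the base compare as out of range. */
static bool is_linear(uint64_t va) { return va - LINEAR_BASE < LINEAR_SIZE; }
static bool is_kimg(uint64_t va)   { return va - KIMG_BASE < KIMG_SIZE; }

static uint64_t kimg_to_phys(uint64_t va) { return va - KIMG_BASE + KIMG_PHYS; }

static unsigned int warnings;	/* counts modeled debug warnings */

/* Linear-map translation with a debug check and a kernel-image fallback. */
static uint64_t model_pa(uint64_t va)
{
	if (is_linear(va))
		return va - LINEAR_BASE + PHYS_BASE;
	warnings++;		/* right answer, but the caller misused the API */
	return kimg_to_phys(va);
}

/* Kernel-image translation that insists on a kernel-image address. */
static uint64_t model_pa_symbol(uint64_t va)
{
	assert(is_kimg(va));
	return kimg_to_phys(va);
}
```

In this model both helpers return the same physical address for a kernel-image symbol, but only `model_pa()` bumps the warning counter, which mirrors why the commit treats the old call as working by accident.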
      
      Fixes: 3744b528 ("arm64: kexec: install a copy of the linear-map")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
      Link: https://lore.kernel.org/r/20211130121849.3319010-1-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
    • S
      KVM: x86/mmu: Retry page fault if root is invalidated by memslot update · a955cad8
      Sean Christopherson committed
      Bail from the page fault handler if the root shadow page was obsoleted by
      a memslot update.  Do the check _after_ acquiring mmu_lock, as the TDP MMU
      doesn't rely on the memslot/MMU generation, and instead relies on the
      root being explicitly marked invalid by kvm_mmu_zap_all_fast(), which takes
      mmu_lock for write.
      
      For the TDP MMU, inserting a SPTE into an obsolete root can leak a SP if
      kvm_tdp_mmu_zap_invalidated_roots() has already zapped the SP, i.e. has
      moved past the gfn associated with the SP.
      
      For other MMUs, the resulting behavior is far more convoluted, though
      unlikely to be truly problematic.  Installing SPs/SPTEs into the obsolete
      root isn't directly problematic, as the obsolete root will be unloaded
      and dropped before the vCPU re-enters the guest.  But because the legacy
      MMU tracks shadow pages by their role, any SP created by the fault can
      be reused in the new post-reload root.  Again, that _shouldn't_ be
      problematic as any leaf child SPTEs will be created for the current/valid
      memslot generation, and kvm_mmu_get_page() will not reuse child SPs from
      the old generation as they will be flagged as obsolete.  But, given that
      continuing with the fault is pointless (the root will be unloaded), apply
      the check to all MMUs.
      
      Fixes: b7cccd39 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
      Cc: stable@vger.kernel.org
      Cc: Ben Gardon <bgardon@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211120045046.3940942-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • D
      KVM: VMX: Set failure code in prepare_vmcs02() · bfbb307c
      Dan Carpenter committed
      The error paths in the prepare_vmcs02() function are supposed to set
      *entry_failure_code but this path does not.  It leads to using an
      uninitialized variable in the caller.
      
      Fixes: 71f73470 ("KVM: nVMX: Load GUEST_IA32_PERF_GLOBAL_CTRL MSR on VM-Entry")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Message-Id: <20211130125337.GB24578@kili>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • P
      KVM: ensure APICv is considered inactive if there is no APIC · ef8b4b72
      Paolo Bonzini committed
      kvm_vcpu_apicv_active() returns false if a virtual machine has no in-kernel
      local APIC, however kvm_apicv_activated might still be true if there are
      no reasons to disable APICv; in fact it is quite likely that there is none
      because APICv is inhibited by specific configurations of the local APIC
      and those configurations cannot be programmed.  This triggers a WARN:
      
         WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu));
      
      To avoid this, introduce another cause for APICv inhibition, namely the
      absence of an in-kernel local APIC.  This cause is enabled by default,
      and is dropped by either KVM_CREATE_IRQCHIP or the enabling of
      KVM_CAP_IRQCHIP_SPLIT.
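
The approach can be sketched with a plain bitmask of inhibit reasons. This is a toy model with hypothetical names (`struct vm`, `INHIBIT_ABSENT`, `vm_create_irqchip()`), not the KVM data structures; it only shows how a default-set "absent" reason keeps the two predicates in agreement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inhibit reasons; each one is a bit position. */
enum inhibit_reason { INHIBIT_ABSENT = 0, INHIBIT_OTHER = 1 };

struct vm {
	unsigned long inhibit_reasons;	/* non-zero means APICv is inhibited */
	bool has_irqchip;		/* in-kernel local APIC present */
};

/* A new VM starts with the "no in-kernel APIC" reason set. */
static void vm_init(struct vm *vm)
{
	vm->inhibit_reasons = 1UL << INHIBIT_ABSENT;
	vm->has_irqchip = false;
}

/* Creating the irqchip drops that reason. */
static void vm_create_irqchip(struct vm *vm)
{
	vm->has_irqchip = true;
	vm->inhibit_reasons &= ~(1UL << INHIBIT_ABSENT);
}

static bool apicv_activated(const struct vm *vm)
{
	return vm->inhibit_reasons == 0;
}

static bool vcpu_apicv_active(const struct vm *vm)
{
	return vm->has_irqchip && apicv_activated(vm);
}
```

With the default reason in place, the VM-wide and per-vCPU views agree both before and after the irqchip is created.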
      Reported-by: Ignat Korchagin <ignat@cloudflare.com>
      Fixes: ee49a893 ("KVM: x86: Move SVM's APICv sanity check to common x86", 2021-10-22)
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Tested-by: Ignat Korchagin <ignat@cloudflare.com>
      Message-Id: <20211130123746.293379-1-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • L
      KVM: x86/pmu: Fix reserved bits for AMD PerfEvtSeln register · cb1d220d
      Like Xu committed
      If we run the following perf command in an AMD Milan guest:
      
        perf stat \
        -e cpu/event=0x1d0/ \
        -e cpu/event=0x1c7/ \
        -e cpu/umask=0x1f,event=0x18e/ \
        -e cpu/umask=0x7,event=0x18e/ \
        -e cpu/umask=0x18,event=0x18e/ \
        ./workload
      
      dmesg will report a #GP warning from an unchecked MSR access
      error on MSR_F15H_PERF_CTLx.
      
      This is because according to APM (Revision: 4.03) Figure 13-7,
      the bits [35:32] of the AMD PerfEvtSeln register are part of the
      event select encoding, which extends the EVENT_SELECT field
      from 8 bits to 12 bits.
      
      Opportunistically update pmu->reserved_bits for reserved bit 19.
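
A short sketch of the encoding shows why an event such as 0x18e trips a mask that treats the upper half of the register as reserved. The field positions follow the description above (event select in bits [7:0] and [35:32], unit mask in bits [15:8]); the two mask constants are illustrative assumptions, not the values used by KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a PerfEvtSel value from a 12-bit event select and an 8-bit unit mask. */
static uint64_t amd_encode(unsigned int event, unsigned int umask)
{
	return (uint64_t)(event & 0xff) |
	       ((uint64_t)(umask & 0xff) << 8) |
	       ((uint64_t)((event >> 8) & 0xf) << 32);
}

/* Recover the 12-bit event select: low 8 bits plus bits [35:32]. */
static unsigned int amd_event_select(uint64_t ctl)
{
	return (unsigned int)((ctl & 0xff) | (((ctl >> 32) & 0xf) << 8));
}

/* Illustrative masks only; not the kernel's reserved_bits values. */
#define NARROW_RESERVED	0xffffffff00000000ULL	/* all of [63:32] reserved */
#define WIDE_RESERVED	(0xfffffff000000000ULL | (1ULL << 19))

static bool rejected(uint64_t ctl, uint64_t reserved)
{
	return (ctl & reserved) != 0;
}
```

For `event=0x18e, umask=0x1f` the encoded value has bit 32 set, so the narrow mask rejects it while the wide mask accepts it; a value with bit 19 set is rejected by the wide mask.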
      Reported-by: Jim Mattson <jmattson@google.com>
      Fixes: ca724305 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM")
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211118130320.95997-1-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • F
      x86/tsc: Disable clocksource watchdog for TSC on qualified platorms · b50db709
      Feng Tang committed
      There are cases that the TSC clocksource is wrongly judged as unstable by
      the clocksource watchdog mechanism which tries to validate the TSC against
      HPET, PM_TIMER or jiffies. While there is hardly a general reliable way to
      check the validity of a watchdog, Thomas Gleixner proposed [1]:
      
      "I'm inclined to lift that requirement when the CPU has:
      
          1) X86_FEATURE_CONSTANT_TSC
          2) X86_FEATURE_NONSTOP_TSC
          3) X86_FEATURE_NONSTOP_TSC_S3
          4) X86_FEATURE_TSC_ADJUST
          5) At max. 4 sockets
      
       After two decades of horrors we're finally at a point where TSC seems
       to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
       was really key as we can now detect even small modifications reliably
       and the important point is that we can cure them as well (not pretty
       but better than all other options)."
      
      As feature #3 X86_FEATURE_NONSTOP_TSC_S3 only exists on several generations
      of Atom processors, and is always coupled with X86_FEATURE_CONSTANT_TSC
      and X86_FEATURE_NONSTOP_TSC, skip checking it, and also be more defensive
      to use maximal 2 sockets.
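
The resulting qualification rule, as described in this message, reduces to a simple predicate. The struct and function names below are hypothetical and the sketch is a userspace model, not the code added to tsc_init().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the platform properties named in the message. */
struct tsc_platform {
	bool constant_tsc;	/* X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;	/* X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;	/* X86_FEATURE_TSC_ADJUST */
	unsigned int sockets;	/* number of CPU sockets */
};

/* True when the watchdog check on the TSC can be skipped. */
static bool tsc_skip_watchdog(const struct tsc_platform *p)
{
	return p->constant_tsc && p->nonstop_tsc && p->tsc_adjust &&
	       p->sockets <= 2;
}
```

X86_FEATURE_NONSTOP_TSC_S3 is left out on purpose, matching the reasoning above, and the socket limit is the more defensive value of 2 rather than 4.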
      
      The check is done inside tsc_init() before registering 'tsc-early' and
      'tsc' clocksources, as there were cases that both of them had been
      wrongly judged as unreliable.
      
      For more background of tsc/watchdog, there is a good summary in [2]
      
      [tglx] Update vs. jiffies:
      
        On systems where the only remaining clocksource aside of TSC is jiffies
        there is no way to make this work because that creates a circular
        dependency. Jiffies accuracy depends on not missing a periodic timer
        interrupt, which is not guaranteed. That could be detected by TSC, but as
        TSC is not trusted this cannot be compensated. The consequence is a
        circulus vitiosus which results in shutting down TSC and falling back to
        the jiffies clocksource which is even more unreliable.
      
      [1]. https://lore.kernel.org/lkml/87eekfk8bd.fsf@nanos.tec.linutronix.de/
      [2]. https://lore.kernel.org/lkml/87a6pimt1f.ffs@nanos.tec.linutronix.de/
      
      [ tglx: Refine comment and amend changelog ]
      
      Fixes: 6e3cd952 ("x86/hpet: Use another crystalball to evaluate HPET usability")
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Paul E. McKenney" <paulmck@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20211117023751.24190-2-feng.tang@intel.com
    • F
      x86/tsc: Add a timer to make sure TSC_adjust is always checked · c7719e79
      Feng Tang committed
      The TSC_ADJUST register is checked every time a CPU enters idle state, but
      Thomas Gleixner mentioned there is still a caveat that a system won't enter
      idle [1], either because it's too busy or configured purposely to not enter
      idle.
      
      Set up a periodic timer (every 10 minutes) to make sure the check
      happens on a regular basis.
      
      [1] https://lore.kernel.org/lkml/875z286xtk.fsf@nanos.tec.linutronix.de/
      
      Fixes: 6e3cd952 ("x86/hpet: Use another crystalball to evaluate HPET usability")
      Requested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Paul E. McKenney" <paulmck@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20211117023751.24190-1-feng.tang@intel.com
  7. 01 Dec, 2021 4 commits
  8. 30 Nov, 2021 9 commits