Commits · 826b7373d1cc294273bd533a87658469398d944e · openeuler / Kernel

10 May, 2022 6 commits

KVM: x86: Forcibly leave nested virt when SMM state is toggled · 826b7373

Committed by Sean Christopherson on May 10, 2022

stable inclusion
from stable-v5.10.97
commit 080dbe7e9b86a0392d8dffc00d9971792afc121f
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=080dbe7e9b86a0392d8dffc00d9971792afc121f

--------------------------------

commit f7e57078 upstream.

Forcibly leave nested virtualization operation if userspace toggles SMM
state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS.  If userspace
forces the vCPU out of SMM while it's post-VMXON and then injects an SMI,
vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both
vmxon=false and smm.vmxon=false, but all other nVMX state allocated.

Don't attempt to gracefully handle the transition as (a) most transitions
are nonsensical, e.g. forcing SMM while L2 is running, (b) there isn't
sufficient information to handle all transitions, e.g. SVM wants access
to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede
KVM_SET_NESTED_STATE during state restore as the latter disallows putting
the vCPU into L2 if SMM is active, and disallows tagging the vCPU as
being post-VMXON in SMM if SMM is not active.
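
The forced-leave behaviour described above can be pictured with a small toy
model. This is only a sketch under stated assumptions: toy_vcpu,
toy_leave_nested() and toy_set_smm() are hypothetical names invented for
illustration and are not the KVM structures or functions touched by the patch.

```c
#include <stdbool.h>

/* Toy model of the vCPU state involved; all field names are hypothetical. */
struct toy_vcpu {
    bool in_smm;        /* vCPU is in System Management Mode */
    bool guest_mode;    /* vCPU is running a nested (L2) guest */
    bool nested_state;  /* nested-virt state is currently allocated */
};

/* Drop all nested state unconditionally, i.e. a "forced leave". */
void toy_leave_nested(struct toy_vcpu *v)
{
    v->guest_mode = false;
    v->nested_state = false;
}

/* Apply a userspace-requested SMM state; tear down nested state on a toggle. */
void toy_set_smm(struct toy_vcpu *v, bool smm)
{
    if (v->in_smm != smm) {
        toy_leave_nested(v);
        v->in_smm = smm;
    }
}
```

In this model, a request that does not change the SMM state leaves the nested
state alone, while any real toggle discards it instead of trying to translate
it.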

Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX
due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond
just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU
in an architecturally impossible state.

  WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
  WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
  Modules linked in:
  CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
  RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
  Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00
  Call Trace:
   <TASK>
   kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123
   kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline]
   kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460
   kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline]
   kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676
   kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline]
   kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250
   kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273
   __fput+0x286/0x9f0 fs/file_table.c:311
   task_work_run+0xdd/0x1a0 kernel/task_work.c:164
   exit_task_work include/linux/task_work.h:32 [inline]
   do_exit+0xb29/0x2a30 kernel/exit.c:806
   do_group_exit+0xd2/0x2f0 kernel/exit.c:935
   get_signal+0x4b0/0x28c0 kernel/signal.c:2862
   arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
   handle_signal_work kernel/entry/common.c:148 [inline]
   exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
   exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
   __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
   syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
   do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   </TASK>

Cc: stable@vger.kernel.org
Reported-by: syzbot+8112db3ab20e70d50c31@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220125220358.2091737-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Backported-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/MCE/AMD: Allow thresholding interface updates after init · 29e9ffc7

Committed by Yazen Ghannam on May 10, 2022

stable inclusion
from stable-v5.10.96
commit 08f090bb9b6951a510437ef26ad78ffb3ee17142
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=08f090bb9b6951a510437ef26ad78ffb3ee17142

--------------------------------

commit 1f52b0ab upstream.

Changes to the AMD Thresholding sysfs code prevent sysfs writes from
updating the underlying registers once CPU init is completed, i.e. once
"threshold_banks" is set.

Allow the registers to be updated if the thresholding interface is
already initialized or if in the init path. Use the "set_lvt_off" value
to indicate if running in the init path, since this value is only set
during init.
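
As a rough illustration, the rule above can be read as a simple predicate. The
function and parameter names below are hypothetical and only model the
decision; they are not the kernel's actual code.

```c
#include <stdbool.h>

/*
 * Hypothetical model: register updates are allowed when the thresholding
 * interface is already initialized, or when running in the init path
 * (signalled here by set_lvt_off being set).
 */
bool toy_may_update_threshold_regs(bool banks_initialized, bool set_lvt_off)
{
    return banks_initialized || set_lvt_off;
}
```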

Fixes: a037f3ca ("x86/mce/amd: Make threshold bank setting hotplug robust")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220117161328.19148-1-yazen.ghannam@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSS · 83e122dd

Committed by Like Xu on May 10, 2022

stable inclusion
from stable-v5.10.96
commit e92cac1dd803aca5bc326ec22bdcd4f56855d7ce
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e92cac1dd803aca5bc326ec22bdcd4f56855d7ce

--------------------------------

commit 4c282e51 upstream.

Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the
size in bytes of the XSAVE area is affected by the states enabled in XSS.
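
A minimal sketch of why the update is needed, using a toy vCPU model. All names
are hypothetical, and the size formula (a fixed 576-byte base plus 64 bytes per
enabled state component) is an assumption made only so that the example has
something to compute; real component sizes differ.

```c
#include <stdint.h>

/* Toy vCPU: the cached size stands in for a runtime CPUID field. */
struct toy_xsave_vcpu {
    uint64_t xcr0;
    uint64_t xss;
    uint32_t cpuid_xsave_size;
};

/* Assumed formula: 576-byte base plus 64 bytes per enabled component. */
uint32_t toy_xsave_size(uint64_t features)
{
    uint32_t size = 576;

    for (; features; features &= features - 1)
        size += 64;
    return size;
}

/* Recompute the cached value from the states enabled in XCR0 and XSS. */
void toy_update_cpuid_runtime(struct toy_xsave_vcpu *v)
{
    v->cpuid_xsave_size = toy_xsave_size(v->xcr0 | v->xss);
}

/* Writing XSS must refresh the cached size, or the guest sees a stale one. */
void toy_write_xss(struct toy_xsave_vcpu *v, uint64_t data)
{
    v->xss = data;
    toy_update_cpuid_runtime(v);
}
```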

Fixes: 20300099 ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable@vger.kernel.org
Signed-off-by: Like Xu <likexu@tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220126172226.2298529-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

perf/x86/intel/uncore: Fix CAS_COUNT_WRITE issue for ICX · 0442358b

Committed by Zhengjun Xing on May 10, 2022

stable inclusion
from stable-v5.10.96
commit 7a32d17fb73a607dcb0797cdd6edbccd76fa059a
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7a32d17fb73a607dcb0797cdd6edbccd76fa059a

--------------------------------

commit 96fd2e89 upstream.

A user recently reported a perf issue on the ICX platform: when testing with
the perf event "uncore_imc_x/cas_count_write", the write bandwidth is always
very small (only 0.38MB/s). This is caused by the wrong "umask" for the
"cas_count_write" event. On double-checking, "cas_count_read" turned out to
be wrong as well.

The public document for ICX uncore:

3rd Gen Intel® Xeon® Processor Scalable Family, Codename Ice Lake, Uncore
Performance Monitoring Reference Manual, Revision 1.00, May 2021

In 2.4.7, it defines the Unit Masks for CAS_COUNT:
RD b00001111
WR b00110000

So correct both "cas_count_read" and "cas_count_write" for ICX.

Old settings:
 hswep_uncore_imc_events
	INTEL_UNCORE_EVENT_DESC(cas_count_read,  "event=0x04,umask=0x03")
	INTEL_UNCORE_EVENT_DESC(cas_count_write, "event=0x04,umask=0x0c")

New settings:
 snr_uncore_imc_events
	INTEL_UNCORE_EVENT_DESC(cas_count_read,  "event=0x04,umask=0x0f")
	INTEL_UNCORE_EVENT_DESC(cas_count_write, "event=0x04,umask=0x30")
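
The binary unit masks quoted from the manual map directly onto the hex values
in the new event descriptors. The helper below is a hypothetical illustration
of that conversion, not code from the patch.

```c
#include <stdint.h>

/* Parse a string of '0'/'1' characters (most significant bit first). */
uint8_t toy_umask_from_bits(const char *bits)
{
    uint8_t v = 0;

    for (; *bits == '0' || *bits == '1'; bits++)
        v = (uint8_t)((v << 1) | (*bits - '0'));
    return v;
}
```

With this helper, "00001111" (RD) evaluates to 0x0f and "00110000" (WR) to
0x30, matching the umask values in the new settings.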

Fixes: 2b3b76b5 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20211223144826.841267-1-zhengjun.xing@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Revert "KVM: SVM: avoid infinite loop on NPF from bad address" · feeadf29

Committed by Sean Christopherson on May 10, 2022

stable inclusion
from stable-v5.10.96
commit a2c8e1d9e41b7d916257653d3bbe36418c4e7b88
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a2c8e1d9e41b7d916257653d3bbe36418c4e7b88

--------------------------------

commit 31c25585 upstream.

Revert a completely broken check on an "invalid" RIP in SVM's workaround
for the DecodeAssists SMAP errata.  kvm_vcpu_gfn_to_memslot() obviously
expects a gfn, i.e. operates in the guest physical address space, whereas
RIP is a virtual (not even linear) address.  The "fix" worked for the
problematic KVM selftest because the test identity mapped RIP.

Fully revert the hack instead of trying to translate RIP to a GPA, as the
non-SEV case is now handled earlier, and KVM cannot access guest page
tables to translate RIP.

This reverts commit e72436bc.

Fixes: e72436bc ("KVM: SVM: avoid infinite loop on NPF from bad address")
Reported-by: Liam Merwick <liam.merwick@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Message-Id: <20220120010719.711476-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

KVM: x86/mmu: Fix write-protection of PTs mapped by the TDP MMU · 34011a9a

Committed by David Matlack on May 10, 2022

stable inclusion
from stable-v5.10.95
commit a447d7f786ec925d1c23f6509255f43ffc2ddffe
bugzilla: https://gitee.com/openeuler/kernel/issues/I55EDV

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a447d7f786ec925d1c23f6509255f43ffc2ddffe

--------------------------------

commit 7c8a4742 upstream.

When the TDP MMU is write-protecting GFNs for page table protection (as
opposed to for dirty logging, or due to the HVA not being writable), it
checks if the SPTE is already write-protected and if so skips modifying
the SPTE and the TLB flush.

This behavior is incorrect because it fails to check if the SPTE
is write-protected for page table protection, i.e. fails to check
that MMU-writable is '0'.  If the SPTE was write-protected for dirty
logging but not page table protection, the SPTE could locklessly be made
writable, and vCPUs could still be running with writable mappings cached
in their TLB.

Fix this by only skipping setting the SPTE if the SPTE is already
write-protected *and* MMU-writable is already clear.  Technically,
checking only MMU-writable would suffice; a SPTE cannot be writable
without MMU-writable being set.  But check both to be paranoid and
because it arguably yields more readable code.
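
The difference between the old and the fixed skip condition can be sketched
with two bits in a toy SPTE. The bit positions and function names below are
hypothetical and chosen only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define TOY_WRITABLE      (UINT64_C(1) << 1)   /* hardware-writable bit */
#define TOY_MMU_WRITABLE  (UINT64_C(1) << 58)  /* software MMU-writable bit */

/* Old check: skip as soon as the hardware-writable bit is clear. */
bool toy_can_skip_old(uint64_t spte)
{
    return !(spte & TOY_WRITABLE);
}

/* Fixed check: skip only if both bits are already clear. */
bool toy_can_skip_fixed(uint64_t spte)
{
    return !(spte & (TOY_WRITABLE | TOY_MMU_WRITABLE));
}

/* Write-protect for page table protection: clear both bits. */
uint64_t toy_write_protect(uint64_t spte)
{
    return spte & ~(TOY_WRITABLE | TOY_MMU_WRITABLE);
}
```

An SPTE that was write-protected only for dirty logging still has
TOY_MMU_WRITABLE set: the old check would skip it, while the fixed check goes
on to clear the bit.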

Fixes: 46044f72 ("kvm: x86/mmu: Support write protection for nesting in tdp MMU")
Cc: stable@vger.kernel.org
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220113233020.3986005-2-dmatlack@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

28 April, 2022 1 commit

Revert "clocksource: Reduce clocksource-skew threshold" · 81d82781

Committed by Zheng Zengkai on April 28, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=cacc6c30e3eb7c452132ee5b273e248d2f263323

--------------------------------

This reverts commit 270507d8.
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

27 April, 2022 21 commits

x86: KVM: Fixed the bug that WAITmax cannot be updated in real time · 414a578b

Committed by liangtian on April 27, 2022

virt inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I53PTV?from=project-issue
CVE: NA

-----------------------------------------------------

Since the reset function is in the kvm_intel module instead of the kvm
module, it cannot be found in place of the weak-attribute function in
kvm_main.c, so st_max on x86 is never refreshed.
The solution is to define the reset function in x86.c under the kvm module.
Signed-off-by: liangtian <liangtian13@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT · 1fa7251b

Committed by Josh Poimboeuf on April 27, 2022

stable inclusion
from stable-v5.10.105
commit d04937ae94903087279e4a016b7741cdee59d521
category: bugfix
bugzilla: 186453 https://gitee.com/src-openeuler/kernel/issues/I50WBM
CVE: CVE-2022-0001

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d04937ae9490

--------------------------------

commit 0de05d05 upstream.

The commit

44a3918c ("x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting")

added a warning for the "eIBRS + unprivileged eBPF" combination, which
has been shown to be vulnerable against Spectre v2 BHB-based attacks.

However, there's no warning about the "eIBRS + LFENCE retpoline +
unprivileged eBPF" combo. The LFENCE adds more protection by shortening
the speculation window after a mispredicted branch. That makes an attack
significantly more difficult, even with unprivileged eBPF. So at least
for now the logic doesn't warn about that combination.

But if you then add SMT into the mix, the SMT attack angle weakens the
effectiveness of the LFENCE considerably.

So extend the "eIBRS + unprivileged eBPF" warning to also include the
"eIBRS + LFENCE + unprivileged eBPF + SMT" case.

[ bp: Massage commit message. ]
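
The resulting warning condition can be sketched as a predicate over the
selected mitigation, the unprivileged eBPF setting and SMT. The enum and
function names are hypothetical and only model the reasoning in this message.

```c
#include <stdbool.h>

/* Hypothetical mitigation modes for this sketch. */
enum toy_mitigation {
    TOY_EIBRS,
    TOY_EIBRS_LFENCE,
    TOY_EIBRS_RETPOLINE,
};

/* Return true if a warning about unprivileged eBPF should be printed. */
bool toy_should_warn(enum toy_mitigation m, bool unpriv_ebpf, bool smt)
{
    if (!unpriv_ebpf)
        return false;

    switch (m) {
    case TOY_EIBRS:
        return true;    /* eIBRS + unprivileged eBPF */
    case TOY_EIBRS_LFENCE:
        return smt;     /* LFENCE helps, unless SMT is in the mix */
    default:
        return false;   /* not covered by the warnings described here */
    }
}
```
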
Suggested-by: Alyssa Milburn <alyssa.milburn@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/speculation: Warn about Spectre v2 LFENCE mitigation · 0d3bfece

Committed by Josh Poimboeuf on April 27, 2022

stable inclusion
from stable-v5.10.105
commit cc9e3e55bde71b2fac1494f503d5ffc560c7fb8d
category: bugfix
bugzilla: 186453 https://gitee.com/src-openeuler/kernel/issues/I50WBM
CVE: CVE-2022-0001

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=cc9e3e55bde7

--------------------------------

commit eafd987d upstream.

With:

  f8a66d60 ("x86,bugs: Unconditionally allow spectre_v2=retpoline,amd")

it became possible to enable the LFENCE "retpoline" on Intel. However,
Intel doesn't recommend it, as it has some weaknesses compared to
retpoline.

Now AMD doesn't recommend it either.

It can still be left available as a cmdline option. It's faster than
retpoline but is weaker in certain scenarios -- particularly SMT, but
even non-SMT may be vulnerable in some cases.

So just unconditionally warn if the user requests it on the cmdline.

  [ bp: Massage commit message. ]
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/speculation: Use generic retpoline by default on AMD · a0c71f13

Committed by Kim Phillips on April 27, 2022

stable inclusion
from stable-v5.10.105
commit 2fdf67a1d215574c31b1a716f80fa0fdccd401d7
category: bugfix
bugzilla: 186453 https://gitee.com/src-openeuler/kernel/issues/I50WBM
CVE: CVE-2022-0001

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2fdf67a1d215

--------------------------------

commit 244d00b5 upstream.

AMD retpoline may be susceptible to speculation. The speculation
execution window for an incorrect indirect branch prediction using
LFENCE/JMP sequence may potentially be large enough to allow
exploitation using Spectre V2.

By default, don't use retpoline,lfence on AMD.  Instead, use the
generic retpoline.
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting · 131f862c

Committed by Josh Poimboeuf on April 27, 2022

stable inclusion
from stable-v5.10.105
commit afc2d635b5e18e2b33116d8e121ee149882e33eb
category: bugfix
bugzilla: 186453 https://gitee.com/src-openeuler/kernel/issues/I50WBM
CVE: CVE-2022-0001

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=afc2d635b5e1

--------------------------------

commit 44a3918c upstream.

With unprivileged eBPF enabled, eIBRS (without retpoline) is vulnerable
to Spectre v2 BHB-based attacks.

When both are enabled, print a warning message and report it in the
'spectre_v2' sysfs vulnerabilities file.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
[fllinden@amazon.com: backported to 5.10]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/speculation: Add eIBRS + Retpoline options · 5db42e7e

Committed by Peter Zijlstra on April 27, 2022

stable inclusion
from stable-v5.10.105
commit a6a119d647ad1f73067d3cffb43104df3f920bcc
category: bugfix
bugzilla: 186453 https://gitee.com/src-openeuler/kernel/issues/I50WBM
CVE: CVE-2022-0001

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a6a119d647ad

--------------------------------

commit 1e19da85 upstream.

Thanks to the chaps at VUsec it is now clear that eIBRS is not
sufficient, therefore allow enabling of retpolines along with eIBRS.

Add spectre_v2=eibrs, spectre_v2=eibrs,lfence and
spectre_v2=eibrs,retpoline options to explicitly pick your preferred
means of mitigation.

Since there are new mitigations, there are also user-visible changes in
/sys/devices/system/cpu/vulnerabilities/spectre_v2 to reflect these
new mitigations.

  [ bp: Massage commit message, trim error messages,
    do more precise eIBRS mode checking. ]
Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Patrick Colp <patrick.colp@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE · 1876e00c

Committed by Peter Zijlstra (Intel) on April 27, 2022

stable inclusion
from stable-v5.10.105
commit f38774bb6e231d647d40ceeb8ddf9082eabde667
category: bugfix
bugzilla: 186453 https://gitee.com/src-openeuler/kernel/issues/I50WBM
CVE: CVE-2022-0001

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f38774bb6e23

--------------------------------

commit d45476d9 upstream.

The RETPOLINE_AMD name is unfortunate since it isn't necessarily
AMD only, in fact Hygon also uses it. Furthermore it will likely be
sufficient for some Intel processors. Therefore rename the thing to
RETPOLINE_LFENCE to better describe what it is.

Add the spectre_v2=retpoline,lfence option as an alias to
spectre_v2=retpoline,amd to preserve existing setups. However, the output
of /sys/devices/system/cpu/vulnerabilities/spectre_v2 will be changed.
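
The aliasing described above can be pictured as two option strings that
resolve to the same command. The lookup below is a hypothetical sketch with
invented names, not the kernel's command-line parser.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical command values for this sketch. */
enum toy_v2_cmd {
    TOY_CMD_UNKNOWN,
    TOY_CMD_RETPOLINE_LFENCE,
};

/* Map the part after "spectre_v2=" to a command. */
enum toy_v2_cmd toy_parse_spectre_v2(const char *arg)
{
    static const struct {
        const char *name;
        enum toy_v2_cmd cmd;
    } opts[] = {
        { "retpoline,lfence", TOY_CMD_RETPOLINE_LFENCE },
        { "retpoline,amd",    TOY_CMD_RETPOLINE_LFENCE }, /* legacy alias */
    };
    size_t i;

    for (i = 0; i < sizeof(opts) / sizeof(opts[0]); i++)
        if (strcmp(arg, opts[i].name) == 0)
            return opts[i].cmd;
    return TOY_CMD_UNKNOWN;
}
```

Keeping the old string in the table is what preserves existing setups: both
spellings select the same mitigation, and only the reported name changes.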

  [ bp: Fix typos, massage. ]
Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
[fllinden@amazon.com: backported to 5.10]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86,bugs: Unconditionally allow spectre_v2=retpoline,amd · 6441243d

Committed by Peter Zijlstra on April 27, 2022

stable inclusion
from stable-v5.10.105
commit 206cfe2dac3ed79bcd1c759f05400593a5f55488
category: bugfix
bugzilla: 186453 https://gitee.com/src-openeuler/kernel/issues/I50WBM
CVE: CVE-2022-0001

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=206cfe2dac3e

--------------------------------

commit f8a66d60 upstream.

Currently Linux prevents usage of retpoline,amd on !AMD hardware, this
is unfriendly and gets in the way of testing. Remove this restriction.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120310.487348118@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

x86/kbuild: Enable CONFIG_KALLSYMS_ALL=y in the defconfigs · 7ab8c064

Committed by Ingo Molnar on April 27, 2022

stable inclusion
from stable-v5.10.94
commit d240b08d8ac4e85909f2d90e573688131e8f9284
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d240b08d8ac4e85909f2d90e573688131e8f9284

--------------------------------

[ Upstream commit b6aa86cf ]

Most distro kernels have this option enabled, to improve debug output.

Lockdep also selects it.

Enable this in the defconfig kernel as well, to make it more
representative of what people are using on x86.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/YdTn7gssoMVDMgMw@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

 Conflicts:
	arch/x86/configs/i386_defconfig
	arch/x86/configs/x86_64_defconfig
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

um: registers: Rename function names to avoid conflicts and build problems · ca0a384d

Committed by Randy Dunlap on April 27, 2022

stable inclusion
from stable-v5.10.94
commit 756a7188b277f10b807e6e7321ccf8b929cc6e4a
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=756a7188b277f10b807e6e7321ccf8b929cc6e4a

--------------------------------

[ Upstream commit 077b7320 ]

The function names init_registers() and restore_registers() are used
in several net/ethernet/ and gpu/drm/ drivers for other purposes (not
calls to UML functions), so rename them.

This fixes multiple build errors.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: linux-um@lists.infradead.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

x86/mce: Mark mce_read_aux() noinstr · 246b0275

Committed by Borislav Petkov on April 27, 2022

stable inclusion
from stable-v5.10.94
commit 8c72de32ff134f48115591b9ea2bb03c1bbd3804
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8c72de32ff134f48115591b9ea2bb03c1bbd3804

--------------------------------

[ Upstream commit db6c996d ]

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0x681: call to mce_read_aux() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-10-bp@alien8.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

x86/mce: Mark mce_end() noinstr · 423629d9

Committed by Borislav Petkov on April 27, 2022

stable inclusion
from stable-v5.10.94
commit 1ad3e60f1fec185d11196028136e60e8e3009b37
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1ad3e60f1fec185d11196028136e60e8e3009b37

--------------------------------

[ Upstream commit b4813539 ]

It is called by the #MC handler which is noinstr.

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0xbd6: call to memset() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-9-bp@alien8.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

x86/mce: Mark mce_panic() noinstr · b5da0c18

Committed by Borislav Petkov on April 27, 2022

stable inclusion
from stable-v5.10.94
commit f21ca973b43fb23416bd89dc267aa51249c20afb
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f21ca973b43fb23416bd89dc267aa51249c20afb

--------------------------------

[ Upstream commit 3c7ce80a ]

And allow instrumentation inside it because it does calls to other
facilities which will not be tagged noinstr.

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0xc73: call to mce_panic() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-8-bp@alien8.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

x86/mce: Allow instrumentation during task work queueing · 696cc432

Committed by Borislav Petkov on April 27, 2022

stable inclusion
from stable-v5.10.94
commit de360d94438688fd29e548a79abb9ee6ecd4de0f
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=de360d94438688fd29e548a79abb9ee6ecd4de0f

--------------------------------

[ Upstream commit 4fbce464 ]

Fixes

  vmlinux.o: warning: objtool: do_machine_check()+0xdb1: call to queue_task_work() leaves .noinstr.text section
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208111343.8130-6-bp@alien8.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

x86/mm: Flush global TLB when switching to trampoline page-table · 4d158161

Committed by Joerg Roedel on April 27, 2022

stable inclusion
from stable-v5.10.94
commit e61aa46d0f27bd460080ccd244296d1944b9813e
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e61aa46d0f27bd460080ccd244296d1944b9813e

--------------------------------

[ Upstream commit 71d5049b ]

Move the switching code into a function so that it can be re-used and
add a global TLB flush. This makes sure that usage of memory which is
not mapped in the trampoline page-table is reliably caught.

Also move the clearing of CR4.PCIDE before the CR3 switch because the
cr4_clear_bits() function will access data not mapped into the
trampoline page-table.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211202153226.22946-4-joro@8bytes.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

clocksource: Reduce clocksource-skew threshold · 270507d8

Committed by Paul E. McKenney on April 27, 2022

stable inclusion
from stable-v5.10.94
commit cacc6c30e3eb7c452132ee5b273e248d2f263323
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=cacc6c30e3eb7c452132ee5b273e248d2f263323

--------------------------------

[ Upstream commit 2e27e793 ]

Currently, WATCHDOG_THRESHOLD is set to detect a 62.5-millisecond skew in
a 500-millisecond WATCHDOG_INTERVAL. This requires that clocks be skewed
by more than 12.5% in order to be marked unstable. Except that a clock
that is skewed by that much is probably destroying unsuspecting software
right and left. And given that there are now checks for false-positive
skews due to delays between reading the two clocks, it should be possible
to greatly decrease WATCHDOG_THRESHOLD, at least for fine-grained clocks
such as TSC.

Therefore, add a new uncertainty_margin field to the clocksource structure
that contains the maximum uncertainty in nanoseconds for the corresponding
clock. This field may be initialized manually, as it is for
clocksource_tsc_early and clocksource_jiffies, which is copied to
refined_jiffies. If the field is not initialized manually, it will be
computed at clock-registry time as the period of the clock in question
based on the scale and freq parameters to the __clocksource_update_freq_scale()
function. If either of those two parameters is zero, the
tens-of-milliseconds WATCHDOG_THRESHOLD is used as a cowardly alternative
to dividing by zero. No matter how the uncertainty_margin field is
calculated, it is bounded below by twice WATCHDOG_MAX_SKEW, that is, by 100
microseconds.

Note that manually initialized uncertainty_margin fields are not adjusted,
but there is a WARN_ON_ONCE() that triggers if any such field is less than
twice WATCHDOG_MAX_SKEW. This WARN_ON_ONCE() is intended to discourage
production use of the one-nanosecond uncertainty_margin values that are
used to test the clock-skew code itself.

The actual clock-skew check uses the sum of the uncertainty_margin fields
of the two clocksource structures being compared. Integer overflow is
avoided because the largest computed value of the uncertainty_margin
fields is one billion (10^9), and double that value fits into an
unsigned int. However, if someone manually specifies (say) UINT_MAX,
they will get what they deserve.
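
The margin computation and the skew check described above can be sketched in
userspace C as follows. This is a minimal illustration, not the kernel code:
the helper names compute_uncertainty_margin() and skew_exceeds_margin() are
hypothetical, and the 1/32-second fallback is an assumed stand-in for the
"tens-of-milliseconds" WATCHDOG_THRESHOLD.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000U
#define NSEC_PER_SEC  1000000000U

/* From the text: the floor is twice WATCHDOG_MAX_SKEW, i.e. 100 microseconds. */
#define WATCHDOG_MAX_SKEW (50U * NSEC_PER_USEC)
/* Assumption: a tens-of-milliseconds fallback of 1/32 second. */
#define FALLBACK_MARGIN_NS (NSEC_PER_SEC >> 5)

/* Hypothetical helper: returns the uncertainty margin in nanoseconds. */
uint32_t compute_uncertainty_margin(uint32_t manual_ns, uint32_t scale,
				    uint32_t freq)
{
	uint64_t ticks_per_sec;
	uint32_t margin;

	if (manual_ns)		/* manually initialized values are not adjusted */
		return manual_ns;
	if (!scale || !freq)	/* fall back rather than divide by zero */
		return FALLBACK_MARGIN_NS;

	ticks_per_sec = (uint64_t)scale * freq;
	/* Clock period in nanoseconds; at most 10^9 when scale * freq == 1. */
	margin = (uint32_t)(NSEC_PER_SEC / ticks_per_sec);
	if (margin < 2 * WATCHDOG_MAX_SKEW)
		margin = 2 * WATCHDOG_MAX_SKEW;
	return margin;
}

/* Hypothetical helper: compares the two intervals against the summed margins. */
bool skew_exceeds_margin(int64_t wd_delta_ns, int64_t cs_delta_ns,
			 uint32_t wd_margin_ns, uint32_t cs_margin_ns)
{
	/* Two computed margins sum to at most 2 * 10^9, which fits in 32 bits. */
	uint32_t allowed = wd_margin_ns + cs_margin_ns;
	int64_t diff = cs_delta_ns - wd_delta_ns;

	if (diff < 0)
		diff = -diff;
	return (uint64_t)diff > allowed;
}
```

With a 2.4 GHz clock expressed as scale=1000 and freq=2400000, the period
rounds down to zero nanoseconds, so the 100-microsecond floor applies.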

Note that the refined_jiffies uncertainty_margin field is initialized to
TICK_NSEC, which means that skew checks involving this clocksource will
be sufficiently forgiving. In a similar vein, the clocksource_tsc_early
uncertainty_margin field is initialized to 32*NSEC_PER_MSEC, which
replicates the current behavior and allows custom setting if needed
in order to address the rare skews detected for this clocksource in
current mainline.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210527190124.440372-4-paulmck@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

270507d8

x86/mce/inject: Avoid out-of-bounds write when setting flags · c6f9076c

Committed by Zhang Zixun on Apr 27, 2022

stable inclusion
from stable-v5.10.94
commit 595e1ec55b307d232f8672ccbe6c84089b277b43
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=595e1ec55b307d232f8672ccbe6c84089b277b43

--------------------------------

[ Upstream commit de768416 ]

A contrived zero-length write, for example, by using write(2):

  ...
  ret = write(fd, str, 0);
  ...

to the "flags" file causes:

  BUG: KASAN: stack-out-of-bounds in flags_write
  Write of size 1 at addr ffff888019be7ddf by task writefile/3787

  CPU: 4 PID: 3787 Comm: writefile Not tainted 5.16.0-rc7+ #12
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014

due to accessing buf one char before its start.

Prevent such out-of-bounds access.
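
A minimal userspace sketch of the kind of length check that prevents this
access. The function name parse_flags_input(), the MAX_FLAG_OPT_SIZE value,
and the memcpy() standing in for the copy from user space are assumptions made
for illustration; this is not the patched kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_FLAG_OPT_SIZE 4	/* assumed buffer size for this sketch */

/* Hypothetical helper: copies cnt bytes and replaces the last one with NUL. */
long parse_flags_input(const char *src, size_t cnt, char *out, size_t out_size)
{
	char buf[MAX_FLAG_OPT_SIZE];

	assert(out_size >= MAX_FLAG_OPT_SIZE);

	/* Reject empty and oversized writes before touching buf. */
	if (cnt == 0 || cnt > MAX_FLAG_OPT_SIZE)
		return -EINVAL;

	memcpy(buf, src, cnt);
	/* Without the check, cnt == 0 would address the byte just before buf. */
	buf[cnt - 1] = '\0';
	memcpy(out, buf, cnt);
	return (long)cnt;
}
```

A zero-length call such as parse_flags_input(str, 0, out, sizeof(out)) now
returns -EINVAL instead of writing outside the stack buffer.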

  [ bp: Productize into a proper patch. Link below is the next best
    thing because the original mail didn't get archived on lore. ]

Fixes: 0451d14d ("EDAC, mce_amd_inj: Modify flags attribute to use string arguments")
Signed-off-by: Zhang Zixun <zhang133010@icloud.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/linux-edac/YcnePfF1OOqoQwrX@zn.tnic/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

c6f9076c

x86/boot/compressed: Move CLANG_FLAGS to beginning of KBUILD_CFLAGS · 7dc43407

Committed by Nathan Chancellor on Apr 27, 2022

stable inclusion
from stable-v5.10.94
commit aea5302d9ddc8c9f637393c63d824f45026e906e
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=aea5302d9ddc8c9f637393c63d824f45026e906e

--------------------------------

[ Upstream commit 5fe392ff ]

When cross compiling i386_defconfig on an arm64 host with clang, there
are a few instances of '-Waddress-of-packed-member' and
'-Wgnu-variable-sized-type-not-at-end' in arch/x86/boot/compressed/,
which should both be disabled with the cc-disable-warning calls in that
directory's Makefile, which indicates that cc-disable-warning is failing
at the point of testing these flags.

The cc-disable-warning calls fail because at the point that the flags
are tested, KBUILD_CFLAGS has '-march=i386' without $(CLANG_FLAGS),
which has the '--target=' flag to tell clang what architecture it is
targeting. Without the '--target=' flag, the host architecture (arm64)
is used and i386 is not a valid value for '-march=' in that case. This
error can be seen by adding some logging to try-run:

  clang-14: error: the clang compiler does not support '-march=i386'

Invoking the compiler has to succeed prior to calling cc-option or
cc-disable-warning in order to accurately test whether or not the flag
is supported; if it doesn't, the requested flag can never be added to
the compiler flags. Move $(CLANG_FLAGS) to the beginning of KBUILD_CFLAGS
so that any new flags that might be added in the future can be
accurately tested.

Fixes: d5cbd80e ("x86/boot: Add $(CLANG_FLAGS) to compressed KBUILD_CFLAGS")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211222163040.1961481-1-nathan@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

7dc43407

x86/uaccess: Move variable into switch case statement · 6406aa42

Committed by Kees Cook on Apr 27, 2022

stable inclusion
from stable-v5.10.94
commit d21b47c607379c50924f961ea45cdb7702bf8007
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d21b47c607379c50924f961ea45cdb7702bf8007

--------------------------------

[ Upstream commit 61646ca8 ]

When building with automatic stack variable initialization, GCC 12
complains about variables defined outside of switch case statements.
Move the variable into the case that uses it, which silences the warning:

./arch/x86/include/asm/uaccess.h:317:23: warning: statement will never be executed [-Wswitch-unreachable]
  317 |         unsigned char x_u8__; \
      |                       ^~~~~~
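
A declaration placed between the opening of a switch and its first case label
is never reached, so the compiler has nowhere to emit the automatic
initialization. The sketch below shows the corrected shape in plain C. The
function load_small() and its variable names are hypothetical and only
illustrate the pattern; they are not the uaccess macro itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Shape that draws the warning (kept as a comment so the sketch builds
 * cleanly):
 *
 *	switch (size) {
 *		unsigned char x_u8;	declared before the first case label
 *	case 1:
 *		...
 *	}
 */
int load_small(const void *p, size_t size)
{
	switch (size) {
	case 1: {
		unsigned char x_u8;	/* declared in the only case that uses it */

		memcpy(&x_u8, p, sizeof(x_u8));
		return x_u8;
	}
	case 2: {
		unsigned short x_u16;

		memcpy(&x_u16, p, sizeof(x_u16));
		return x_u16;
	}
	default:
		return -1;	/* unsupported size */
	}
}
```

Scoping each temporary to its own case block keeps the declaration on an
executed path, which is what allows the initialization to be inserted.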

Fixes: 865c50e1 ("x86/uaccess: utilize CONFIG_CC_HAS_ASM_GOTO_OUTPUT")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211209043456.1377875-1-keescook@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

6406aa42

x86/gpu: Reserve stolen memory for first integrated Intel GPU · 07b971da

Committed by Lucas De Marchi on Apr 27, 2022

stable inclusion
from stable-v5.10.94
commit 98259dd54e8e0b22400bfe858569423ee4f031f3
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=98259dd54e8e0b22400bfe858569423ee4f031f3

--------------------------------

commit 9c494ca4 upstream.

"Stolen memory" is memory set aside for use by an Intel integrated GPU.
The intel_graphics_quirks() early quirk reserves this memory when it is
called for a GPU that appears in the intel_early_ids[] table of integrated
GPUs.

Previously intel_graphics_quirks() was marked as QFLAG_APPLY_ONCE, so it
was called only for the first Intel GPU found.  If a discrete GPU happened
to be enumerated first, intel_graphics_quirks() was called for it but not
for any integrated GPU found later.  Therefore, stolen memory for such an
integrated GPU was never reserved.

For example, this problem occurs in this Alderlake-P (integrated) + DG2
(discrete) topology where the DG2 is found first, but stolen memory is
associated with the integrated GPU:

  - 00:01.0 Bridge
    `- 03:00.0 DG2 discrete GPU
  - 00:02.0 Integrated GPU (with stolen memory)

Remove the QFLAG_APPLY_ONCE flag and call intel_graphics_quirks() for every
Intel GPU.  Reserve stolen memory for the first GPU that appears in
intel_early_ids[].
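
The resulting control flow can be sketched in userspace C: the quirk runs for
every GPU, skips devices that are not in the integrated-GPU table, and reserves
only once. The device IDs, the table contents, and the function names below are
hypothetical placeholders, not values from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device IDs standing in for entries of intel_early_ids[]. */
#define FAKE_INTEGRATED_ID 0x1111
#define FAKE_DISCRETE_ID   0x2222

static const uint16_t integrated_ids[] = { FAKE_INTEGRATED_ID };
static bool stolen_reserved;	/* set once the first integrated GPU is handled */

static bool is_integrated(uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(integrated_ids) / sizeof(integrated_ids[0]); i++)
		if (integrated_ids[i] == device)
			return true;
	return false;
}

/* Called for every GPU in enumeration order; returns true if it reserved. */
bool graphics_quirk(uint16_t device)
{
	if (stolen_reserved)		/* already handled an integrated GPU */
		return false;
	if (!is_integrated(device))	/* discrete GPU: no stolen memory */
		return false;

	stolen_reserved = true;		/* the real quirk reserves memory here */
	return true;
}
```

In the Alderlake-P + DG2 topology above, the call for the DG2 returns without
reserving, and the later call for 00:02.0 performs the reservation.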

[bhelgaas: commit log, add code comment, squash in
https://lore.kernel.org/r/20220118190558.2ququ4vdfjuahicm@ldmartin-desk2]
Link: https://lore.kernel.org/r/20220114002843.2083382-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

07b971da

KVM: VMX: switch blocked_vcpu_on_cpu_lock to raw spinlock · ea6515b7

Committed by Marcelo Tosatti on Apr 27, 2022

stable inclusion
from stable-v5.10.94
commit aa1346113c752783f585d1d08627cfa38aa14e47
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=aa1346113c752783f585d1d08627cfa38aa14e47

--------------------------------

commit 5f02ef74 upstream.

blocked_vcpu_on_cpu_lock is taken from hard interrupt context
(pi_wakeup_handler), therefore it cannot sleep.

Switch it to a raw spinlock.

Fixes:

[41297.066254] BUG: scheduling while atomic: CPU 0/KVM/635218/0x00010001
[41297.066323] Preemption disabled at:
[41297.066324] [<ffffffff902ee47f>] irq_enter_rcu+0xf/0x60
[41297.066339] Call Trace:
[41297.066342]  <IRQ>
[41297.066346]  dump_stack_lvl+0x34/0x44
[41297.066353]  ? irq_enter_rcu+0xf/0x60
[41297.066356]  __schedule_bug.cold+0x7d/0x8b
[41297.066361]  __schedule+0x439/0x5b0
[41297.066365]  ? task_blocks_on_rt_mutex.constprop.0.isra.0+0x1b0/0x440
[41297.066369]  schedule_rtlock+0x1e/0x40
[41297.066371]  rtlock_slowlock_locked+0xf1/0x260
[41297.066374]  rt_spin_lock+0x3b/0x60
[41297.066378]  pi_wakeup_handler+0x31/0x90 [kvm_intel]
[41297.066388]  sysvec_kvm_posted_intr_wakeup_ipi+0x9d/0xd0
[41297.066392]  </IRQ>
[41297.066392]  asm_sysvec_kvm_posted_intr_wakeup_ipi+0x12/0x20
...
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

ea6515b7

19 Apr, 2022 3 commits

KVM: x86: remove PMU FIXED_CTR3 from msrs_to_save_all · a0e48d8a

Committed by Wei Wang on Apr 19, 2022

stable inclusion
from stable-v5.10.93
commit 4c7fb4d519e599bb69581d80fbfc1392cbea5fea
bugzilla: 186204 https://gitee.com/openeuler/kernel/issues/I5311N

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4c7fb4d519e599bb69581d80fbfc1392cbea5fea

--------------------------------

commit 9fb12fe5 upstream.

Fixed counter 3 is used for the Topdown metrics, which have not been
enabled for KVM guests. Userspace access to it fails because it is not
included in get_fixed_pmc(). This breaks KVM selftests on ICX+ machines,
which have this counter.

To reproduce on ICX+ machines, run ./state_test, which reports:

==== Test Assertion Failure ====
lib/x86_64/processor.c:1078: r == nmsrs
pid=4564 tid=4564 - Argument list too long
1  0x000000000040b1b9: vcpu_save_state at processor.c:1077
2  0x0000000000402478: main at state_test.c:209 (discriminator 6)
3  0x00007fbe21ed5f92: ?? ??:0
4  0x000000000040264d: _start at ??:?
 Unexpected result from KVM_GET_MSRS, r: 17 (failed MSR was 0x30c)

With this patch, the test passes.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Message-Id: <20211217124934.32893-1-wei.w.wang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes: e2ada66e ("kvm: x86: Add Intel PMU MSRs to msrs_to_save[]")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

a0e48d8a

KVM: x86: Register Processor Trace interrupt hook iff PT enabled in guest · 72420054

Committed by Sean Christopherson on Apr 19, 2022

stable inclusion
from stable-v5.10.93
commit 413b427f5fff5d658c2605ca889d6b13b88efd0c
bugzilla: 186204 https://gitee.com/openeuler/kernel/issues/I5311N

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=413b427f5fff5d658c2605ca889d6b13b88efd0c

--------------------------------

commit f4b027c5 upstream.

Override the Processor Trace (PT) interrupt handler for guest mode if and
only if PT is configured for host+guest mode, i.e. is being used
independently by both host and guest.  If PT is configured for system
mode, the host fully controls PT and must handle all events.

Fixes: 8479e04e ("KVM: x86: Inject PMI for KVM guest")
Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: Artem Kashkanov <artem.kashkanov@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211111020738.2512932-4-seanjc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

72420054

perf: Protect perf_guest_cbs with RCU · 6482c46d

Committed by Sean Christopherson on Apr 19, 2022

stable inclusion
from stable-v5.10.93
commit 723acd75a062f7630ed9149733a47d4158f5dbdf
bugzilla: 186204 https://gitee.com/openeuler/kernel/issues/I5311N

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=723acd75a062f7630ed9149733a47d4158f5dbdf

--------------------------------

commit ff083a2d upstream.

Protect perf_guest_cbs with RCU to fix multiple possible errors.  Luckily,
all paths that read perf_guest_cbs already require RCU protection, e.g. to
protect the callback chains, so only the direct perf_guest_cbs touchpoints
need to be modified.

Bug #1 is a simple lack of WRITE_ONCE/READ_ONCE behavior to ensure
perf_guest_cbs isn't reloaded between a !NULL check and a dereference.
Fixed via the READ_ONCE() in rcu_dereference().

Bug #2 is that on weakly-ordered architectures, updates to the callbacks
themselves are not guaranteed to be visible before the pointer is made
visible to readers.  Fixed by the smp_store_release() in
rcu_assign_pointer() when the new pointer is non-NULL.

Bug #3 is that, because the callbacks are global, it's possible for
readers to run in parallel with an unregister, and thus a module
implementing the callbacks can be unloaded while readers are in flight,
resulting in a use-after-free.  Fixed by a synchronize_rcu() call when
unregistering callbacks.
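
The first two fixes can be approximated in userspace with C11 atomics. This is
only an analogy under stated assumptions: an acquire load stands in for
rcu_dereference(), a release store stands in for rcu_assign_pointer(), and the
grace period provided by synchronize_rcu() (the fix for bug #3) is not modeled.
All type, function, and constant names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct guest_cbs {
	int (*is_in_guest)(void);
	int (*is_user_mode)(void);
};

enum { FLAGS_HOST = 0, FLAGS_GUEST_USER = 1, FLAGS_GUEST_KERNEL = 2 };

static _Atomic(struct guest_cbs *) guest_cbs_ptr;

/* Release store: the callbacks' contents become visible before the pointer. */
void register_cbs(struct guest_cbs *cbs)
{
	atomic_store_explicit(&guest_cbs_ptr, cbs, memory_order_release);
}

/* The kernel would wait for a grace period after this store; omitted here. */
void unregister_cbs(void)
{
	atomic_store_explicit(&guest_cbs_ptr, NULL, memory_order_relaxed);
}

/* Load the pointer exactly once, then check and dereference the local copy. */
int misc_flags(void)
{
	struct guest_cbs *cbs =
		atomic_load_explicit(&guest_cbs_ptr, memory_order_acquire);

	if (cbs && cbs->is_in_guest())
		return cbs->is_user_mode() ? FLAGS_GUEST_USER : FLAGS_GUEST_KERNEL;
	return FLAGS_HOST;
}

/* Demo callbacks so the sketch can be exercised. */
int demo_is_in_guest(void) { return 1; }
int demo_is_user_mode(void) { return 1; }
struct guest_cbs demo_cbs = { demo_is_in_guest, demo_is_user_mode };
```

Because misc_flags() works on a single local snapshot, a concurrent
unregister_cbs() cannot turn the pointer into NULL between the check and the
calls, which is the failure described as bug #1.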

Bug #1 escaped notice because it's extremely unlikely a compiler will
reload perf_guest_cbs in this sequence.  perf_guest_cbs does get reloaded
for future derefs, e.g. for ->is_user_mode(), but the ->is_in_guest()
guard all but guarantees the consumer will win the race, e.g. to nullify
perf_guest_cbs, KVM has to completely exit the guest and tear down
all VMs before KVM starts its module unload / unregister sequence.  This
also makes it all but impossible to encounter bug #3.

Bug #2 has not been a problem because all architectures that register
callbacks are strongly ordered and/or have a static set of callbacks.

But with help, unloading kvm_intel can trigger bug #1, e.g. wrapping
perf_guest_cbs with READ_ONCE in perf_misc_flags() while spamming
kvm_intel module load/unload leads to:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP
  CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:perf_misc_flags+0x1c/0x70
  Call Trace:
   perf_prepare_sample+0x53/0x6b0
   perf_event_output_forward+0x67/0x160
   __perf_event_overflow+0x52/0xf0
   handle_pmi_common+0x207/0x300
   intel_pmu_handle_irq+0xcf/0x410
   perf_event_nmi_handler+0x28/0x50
   nmi_handle+0xc7/0x260
   default_do_nmi+0x6b/0x170
   exc_nmi+0x103/0x130
   asm_exc_nmi+0x76/0xbf

Fixes: 39447b38 ("perf: Enhance perf to allow for guest statistic collection from host")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211111020738.2512932-2-seanjc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>

6482c46d

12 Apr, 2022 1 commit

config: enable CONFIG_MEMCG_MEMFS_INFO by default · 50911496

Committed by Liu Shixin on Apr 07, 2022

hulk inclusion
category: feature
bugzilla: 186182, https://gitee.com/openeuler/kernel/issues/I4UOJI
CVE: NA

--------------------------------

Enable CONFIG_MEMCG_MEMFS_INFO by default.
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

50911496

27 Mar, 2022 2 commits

net/spnic: Remove spnic driver. · 52cb1ead

Committed by Yanling Song on Mar 27, 2022

Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZR0O
CVE: NA

----------------------------------

There are some issues with the driver that cannot be fixed now.
The driver does not meet the LTS quality requirements
of openEuler, so remove it.
Signed-off-by: Yanling Song <songyl@ramaxel.com>
Reviewed-by: Yang Gan <yanggan@ramaxel.com>
Acked-by: Xie Xiuqi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
52cb1ead

SCSI: spfc: remove SPFC driver · 4a36a4fb

Committed by Yun Xu on Mar 27, 2022

Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZR0O
CVE: NA

------------------------------------

There are some issues with the driver that cannot be fixed now.
The driver does not meet the LTS quality requirements of
openEuler, so remove it.
Signed-off-by: Yun Xu <xuyun@ramaxel.com>
Signed-off-by: Yanling Song <songyl@ramaxel.com>
Reviewed-by: Yun Xu <xuyun@ramaxel.com>
Acked-by: Xie Xiuqi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

4a36a4fb

08 Mar, 2022 1 commit

efi: Fix efi_find_mirror redefine in x86 · 5f406a26

Committed by Ma Wupeng on Mar 08, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S
CVE: NA

----------------------------------------------

Commit cc3d801f ("efi: Make efi_find_mirror() public") added the
efi_find_mirror() definition to linux/efi.h but forgot to
drop the one in arch/x86/include/asm/efi.h. Remove it.

Fixes: cc3d801f ("efi: Make efi_find_mirror() public")
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

5f406a26

02 Mar, 2022 2 commits

drivers: hooks: add bonding driver vendor hooks · 9a4a0a86

Committed by Wei Yongjun on Mar 02, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4UV43
CVE: NA

---------------------------

Allow vendor modules to attach bonding driver hooks. This
patch introduces the vendor_bond_check_dev_link hook.

Usage:

  static void vendor_foo(void *data, const struct bonding *bond,
  		       const struct slave *slave, int *state)
  {
  	pr_info("%s\n", __func__);
  }

  static int __init vendor_bond_init(void)
  {
  	return register_trace_vendor_bond_check_dev_link(&vendor_foo, NULL);
  }

  static void __exit vendor_bond_exit(void)
  {
  	unregister_trace_vendor_bond_check_dev_link(&vendor_foo, NULL);
  }

  module_init(vendor_bond_init);
  module_exit(vendor_bond_exit);
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Zhang Jialin <zhangjialin11@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

9a4a0a86

configs: enable CONFIG_INTEL_IDXD · 143fbc82

Committed by fuyufan on Mar 02, 2022

euler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4URME
CVE: NA

--------------------------------

Enable CONFIG_INTEL_IDXD in openeuler_defconfig for x86.
Support Intel Data Accelerators on Xeon hardware.
Signed-off-by: fuyufan <fuyufan@huawei.com>
Reviewed-by: Kai Liu <kai.liu@suse.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

143fbc82

23 Feb, 2022 3 commits

efi: Make efi_find_mirror() public · cc3d801f

Committed by Ma Wupeng on Feb 23, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4PM01
CVE: NA

--------------------------------

Commit b05b9f5f ("x86, mirror: x86 enabling - find mirrored memory
ranges") introduced the efi_find_mirror() function on x86. To reuse
the API, make it public in preparation for arm64 support of mirrored
memory.
Co-developed-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

cc3d801f

efi: Make efi_print_memmap() public · 2e5602aa

Committed by Ma Wupeng on Feb 23, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4PM01
CVE: NA

--------------------------------

Make efi_print_memmap() public in preparation for adding fake memory
support for architectures with EFI support, e.g. arm64.
Co-developed-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

2e5602aa

configs: enable CONFIG_NTB_INTEL · c37700f3

Committed by Chao Liu on Feb 23, 2022

euler inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4UP2Q
CVE: NA

--------------------------------

Enable CONFIG_NTB_INTEL in openeuler_defconfig for x86.
Support Intel NTB on capable Xeon and Atom hardware.
Signed-off-by: Chao Liu <liuchao173@huawei.com>
Reviewed-by: Kai Liu <kai.liu@suse.com>
Reviewed-by: Liu Sirui <liusirui@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

c37700f3

openeuler / Kernel: synced successfully more than 1 year ago
