提交 fb48f2f0 编写于 作者: A Alexey Kardashevskiy 提交者: Zheng Zengkai

KVM: PPC: Fix clearing never mapped TCEs in realmode

stable inclusion
from stable-5.10.67
commit d320c1b2e7282de155d21b6f22cfc39e99ee3410
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d320c1b2e7282de155d21b6f22cfc39e99ee3410

--------------------------------

[ Upstream commit 1d78dfde ]

Since commit e1a1ef84 ("KVM: PPC: Book3S: Allocate guest TCEs on
demand too"), pages for TCE tables for KVM guests are allocated only
when needed. This allows skipping any update when clearing TCEs. This
works mostly fine as TCE updates are handled when the MMU is enabled.
The realmode handlers fail with H_TOO_HARD when pages are not yet
allocated, except when clearing a TCE in which case KVM prints a warning
and proceeds to dereference a NULL pointer, which crashes the host OS.

This has not been caught so far as the change in commit e1a1ef84 is
reasonably new, and POWER9 runs mostly radix which does not use realmode
handlers. With hash, the default TCE table is memset() by QEMU when the
machine is reset which triggers page faults and the KVM TCE device's
kvm_spapr_tce_fault() handles those with MMU on. And the huge DMA
windows are not cleared by VMs which instead successfully create a DMA
window big enough to map the VM memory 1:1 and then VMs just map
everything without clearing.

This started crashing now as commit 381ceda8 ("powerpc/pseries/iommu:
Make use of DDW for indirect mapping") added a mode when a dymanic DMA
window not big enough to map the VM memory 1:1 but it is used anyway,
and the VM now is the first (i.e. not QEMU) to clear a just created
table. Note that upstream QEMU needs to be modified to trigger the VM to
trigger the host OS crash.

This replaces WARN_ON_ONCE_RM() with a check and return, and adds
another warning if TCE is not being cleared.

Fixes: e1a1ef84 ("KVM: PPC: Book3S: Allocate guest TCEs on demand too")
Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210827040706.517652-1-aik@ozlabs.ruSigned-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
上级 6ea49400
...@@ -173,10 +173,13 @@ static void kvmppc_rm_tce_put(struct kvmppc_spapr_tce_table *stt, ...@@ -173,10 +173,13 @@ static void kvmppc_rm_tce_put(struct kvmppc_spapr_tce_table *stt,
idx -= stt->offset; idx -= stt->offset;
page = stt->pages[idx / TCES_PER_PAGE]; page = stt->pages[idx / TCES_PER_PAGE];
/* /*
* page must not be NULL in real mode, * kvmppc_rm_ioba_validate() allows pages not be allocated if TCE is
* kvmppc_rm_ioba_validate() must have taken care of this. * being cleared, otherwise it returns H_TOO_HARD and we skip this.
*/ */
WARN_ON_ONCE_RM(!page); if (!page) {
WARN_ON_ONCE_RM(tce != 0);
return;
}
tbl = kvmppc_page_address(page); tbl = kvmppc_page_address(page);
tbl[idx % TCES_PER_PAGE] = tce; tbl[idx % TCES_PER_PAGE] = tce;
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册