- Aug 18, 2021 (2 commits)
-
Committed by Dan Carpenter

stable inclusion from stable-v5.10.44 commit be23c4af3d8a1b986fe9b43b8966797653a76ca4
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=341
CVE: NA

--------------------------------

[ Upstream commit 1dde47a6 ]

We spotted a bug recently during a review where a driver was unregistering a bus that wasn't registered, which would trigger this BUG_ON(). Let's handle that situation more gracefully, and just print a warning and return.

Reported-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: wangqing <wangqing@uniontech.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
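The resulting behaviour is roughly the warn-and-return pattern sketched below, at the top of the teardown path instead of a hard BUG_ON(). This is a sketch of the idea, not the verbatim upstream diff.

    /*
     * Sketch only: an unregister call on a bus that was never registered now
     * warns once and bails out instead of crashing the kernel.
     */
    #include <linux/phy.h>

    void mdiobus_unregister(struct mii_bus *bus)
    {
            if (WARN_ON_ONCE(bus->state != MDIOBUS_REGISTERED))
                    return;
            bus->state = MDIOBUS_UNREGISTERED;

            /* ... normal teardown of the devices on the bus continues here ... */
    }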
-
Committed by Dinghao Liu

stable inclusion from stable-v5.10.44 commit 2f523cd4a9311cba629facc7d353eabbd492bd5b
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=345
CVE: NA

--------------------------------

[ Upstream commit 07adc022 ]

When cdns3_gadget_start() fails, a pairing PM usage counter decrement is needed to keep the counter balanced.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20210412054908.7975-1-dinghao.liu@zju.edu.cn
Signed-off-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Xu Zehui <zehuixu@whu.edu.cn>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
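The general shape of such a fix is sketched below: the error path of the start routine drops the runtime PM reference taken earlier, so the usage counter stays balanced. Only the pm_runtime_* helpers are real kernel APIs here; the other names are placeholders, not the actual cdns3 code.

    /* Sketch of the balanced get/put pattern; my_gadget_start() is a placeholder. */
    #include <linux/device.h>
    #include <linux/pm_runtime.h>

    static int my_gadget_start(struct device *dev)
    {
            return 0;       /* placeholder for the driver-specific start routine */
    }

    static int my_udc_start(struct device *dev)
    {
            int ret;

            ret = pm_runtime_get_sync(dev);         /* usage counter +1 */
            if (ret < 0) {
                    pm_runtime_put_noidle(dev);     /* undo the increment on failure */
                    return ret;
            }

            ret = my_gadget_start(dev);
            if (ret) {
                    /* The kind of pairing decrement the patch adds. */
                    pm_runtime_put_sync(dev);       /* usage counter -1 */
                    return ret;
            }

            return 0;
    }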
-
- Jul 30, 2021 (1 commit)
-
Committed by Eric Sandeen

stable inclusion from stable-5.10.52 commit 174c34d9cda1b5818419b8f5a332ced10755e52f
bugzilla: 331
CVE: CVE-2021-33909

---------------------------------------------------------------

commit 8cae8cd8 upstream.

There is no reasonable need for a buffer larger than this, and it avoids int overflow pitfalls.

Fixes: 058504ed ("fs/seq_file: fallback to vmalloc allocation")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Qualys Security Advisory <qsa@qualys.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
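A sketch of the clamp, modeled on seq_file's buffer allocation helper; the exact limit and GFP flags in fs/seq_file.c may differ from this sketch.

    /*
     * Sketch of the idea: cap seq_file buffer allocations so that "double the
     * buffer and retry" growth can never overflow an int. MAX_RW_COUNT
     * (INT_MAX & PAGE_MASK) is already the most a single read will transfer.
     */
    #include <linux/fs.h>       /* MAX_RW_COUNT */
    #include <linux/mm.h>       /* kvmalloc() */

    static void *seq_buf_alloc(unsigned long size)
    {
            if (unlikely(size > MAX_RW_COUNT))
                    return NULL;

            return kvmalloc(size, GFP_KERNEL);
    }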
-
- Jul 21, 2021 (14 commits)
-
Committed by Shijie Luo

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Pmem is slower than DRAM, so allocate the pages that hold the page structs for pmem from the pmem node's peer DRAM node, to accelerate access to the pmem page structs.

Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Fan Du

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Each CPU socket can have one DRAM and one PMEM node; we call them "peer nodes". Migration between DRAM and PMEM will by default happen between peer nodes.

This is a temporary solution. With multiple memory layers, a node can have both promotion and demotion targets instead of a single peer node. User space may also be able to infer promotion/demotion targets based on future HMAT info.

Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Add a /proc/sys/vm/hugepage_nocache_copy switch. Set it to 1 to copy hugepages with the non-temporal movnt SSE instruction if the CPU supports it; set it to 0 to copy hugepages as usual.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
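For illustration, a 0/1 switch like this is typically exposed through a ctl_table entry clamped with proc_dointvec_minmax. The sketch below shows the usual wiring only; the names and registration path are assumptions, not the actual etmem patch.

    /* Hypothetical sketch of wiring up a 0/1 sysctl switch. */
    #include <linux/errno.h>
    #include <linux/init.h>
    #include <linux/sysctl.h>

    static int sysctl_hugepage_nocache_copy;

    static struct ctl_table hugepage_copy_table[] = {
            {
                    .procname     = "hugepage_nocache_copy",
                    .data         = &sysctl_hugepage_nocache_copy,
                    .maxlen       = sizeof(int),
                    .mode         = 0644,
                    .proc_handler = proc_dointvec_minmax,
                    .extra1       = SYSCTL_ZERO,    /* only 0 or 1 accepted */
                    .extra2       = SYSCTL_ONE,
            },
            { }
    };

    static int __init hugepage_copy_sysctl_init(void)
    {
            if (!register_sysctl("vm", hugepage_copy_table))
                    return -ENOMEM;
            return 0;
    }
    late_initcall(hugepage_copy_sysctl_init);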
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Add a /proc/sys/kernel/hugepage_pmem_allocall switch. Set it to 1 to allow all memory in pmem to be allocated for hugepages; set it to 0 (default) to keep hugepage allocation limited by the zone watermark as usual.

Add a /proc/sys/kernel/hugepage_mig_noalloc switch. Set it to 1 to forbid allocating new hugepages during hugepage migration when the destination node runs out of hugepages; set it to 0 (default) to allow hugepage allocation during migration as usual.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Fan Du

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

A user-space migration daemon can check /sys/bus/node/devices/nodeX/type for the node type. Software can interrogate the node type, together with node distance, to pick a desirable target node for migration.

  grep -r . /sys/devices/system/node/*/type
  /sys/devices/system/node/node0/type:dram
  /sys/devices/system/node/node1/type:dram
  /sys/devices/system/node/node2/type:pmem
  /sys/devices/system/node/node3/type:pmem

Along with the next patch, which exports `peer_node`, the migration daemon can easily find the memory type of the current node and the target node in case of migration.

  grep -r . /sys/devices/system/node/*/peer_node
  /sys/devices/system/node/node0/peer_node:2
  /sys/devices/system/node/node1/peer_node:3
  /sys/devices/system/node/node2/peer_node:0
  /sys/devices/system/node/node3/peer_node:1

Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

The dax_kmem driver exports pmem as a NUMA node. This patch records which nodes consist of persistent memory for further use.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Add a callback in pte_hole during walk_page_range so that the user can scan pages that have no page table backing.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Currently we call cond_resched only after scanning a full memslot. If we scan a huge memslot, it takes a long time before cond_resched is reached. So call cond_resched after scanning walk_step bytes of memory instead.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

KVM shadow pages may be freed while etmem_scan is walking the EPT page table. Hold mmu_lock when walking the EPT page table to avoid a use-after-free. To avoid holding mmu_lock for too long, a walk_step module parameter is added to control the lock holding time.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

The scan/swap modules and etmem access the exported file operations without protection. A kernel crash can be triggered as follows:
1. insert the scan/swap module.
2. etmem checks whether the exported file operations are set.
3. remove the scan/swap module.
4. etmem calls the previously checked file operation.
5. kernel crash happens.

Fix this as follows: the scan/swap module sets and clears the operations with a lock held; etmem in the kernel calls try_module_get with the lock held; etmem then calls the read/open/release/ioctl callbacks without the lock held, but with the module reference taken.

Another concurrent access situation is that open on idle_pages and swap_pages succeeds even without the scan/swap module inserted. If the scan/swap module is inserted after the open, subsequent open/read/close calls will use the exported file operations set by scan/swap. This may also trigger a kernel crash as follows:
1. open idle_pages or swap_pages
2. modprobe the scan/swap module
3. close idle_pages or swap_pages (module_put is called without a matching try_module_get)
4. modprobe -r the scan/swap module: the delete_module syscall path (try_stop_module -> try_release_module_ref) finds an invalid module reference count and reports a BUG_ON for ret < 0.

Fix this by only returning the file successfully when the scan/swap module is inserted.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
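The open/release half of that scheme can be sketched as follows; the lock and variable names are made up for illustration and are not the actual etmem code.

    /* Hypothetical sketch of the pin-on-open scheme described above. */
    #include <linux/module.h>
    #include <linux/mutex.h>

    static DEFINE_MUTEX(etmem_lock);
    static struct module *scan_owner;   /* set/cleared by the scan module under etmem_lock */

    /* open(): fail unless the scan module is loaded, and pin it while the file is open. */
    static struct module *etmem_scan_pin(void)
    {
            struct module *owner = NULL;

            mutex_lock(&etmem_lock);
            if (scan_owner && try_module_get(scan_owner))
                    owner = scan_owner;
            mutex_unlock(&etmem_lock);

            return owner;   /* NULL => open() returns an error instead of succeeding */
    }

    /* release(): drop the reference taken at open time. */
    static void etmem_scan_unpin(struct module *owner)
    {
            module_put(owner);
    }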
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Free pic before returning from vm_idle_read in the etmem scan code.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Before this patch, etmem_scan fails if the VM and the host use different page table levels. This patch supports scanning a 4-level EPT while 5-level paging is enabled on the host.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

1. Add a hugetlb_entry callback to report hugetlb pages.
2. Try to walk the host page table when the EPT entry is not present.
3. Add SCAN_AS_HUGE to report EPT pages at PMD level, as a host hugetlb page may be split into 4k EPT pages in the VM.
4. Add SCAN_IGN_HOST so the user can ignore accesses from the host.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kemeng Shi

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

Support an ioctl for etmem scan to set the scan flags.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reviewed-by: louhongxiang <louhongxiang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
- Jun 28, 2021 (23 commits)
-
Committed by David Ahern

stable inclusion from stable-5.10.43 commit d17d47da59f726dc4c87caebda3a50333d7e2fd3
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=326
CVE: NA

--------------------------------

commit 7a6b1ab7 upstream.

IFF_POINTOPOINT interfaces use NUD_NOARP entries for IPv6. It's possible to fill up the neighbour table with enough entries that it will overflow for valid connections after that.

This behaviour is more prevalent after commit 58956317 ("neighbor: Improve garbage collection") is applied, as it prevents removal of entries that are not NUD_FAILED unless they are more than 5s old.

Fixes: 58956317 ("neighbor: Improve garbage collection")
Reported-by: Kasper Dupont <kasperd@gjkwv.06.feb.2021.kasperd.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: qingyouzijin <2645753614@qq.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Roger Pau Monne

stable inclusion from stable-5.10.43 commit 6b53db8c4c14b4e7256f058d202908b54a7b85b4
bugzilla: 109284
CVE: NA

--------------------------------

commit 107866a8 upstream.

Do this in order to prevent the task from being freed if the thread returns (which can be triggered by the frontend) before the call to kthread_stop done as part of the backend teardown. Not taking the reference will lead to a use-after-free in that scenario.

Such a reference was taken before but dropped as part of the rework done in 2ac061ce. Reintroduce the reference taking and add a comment this time explaining why it's needed.

This is XSA-374 / CVE-2021-28691.

Fixes: 2ac061ce ('xen/netback: cleanup init and deinit code')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
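The pattern being reintroduced looks roughly like the sketch below; the context struct and function names are placeholders, not the xen-netback code.

    /* Sketch: pin the kthread's task_struct so teardown can't race with thread exit. */
    #include <linux/err.h>
    #include <linux/kthread.h>
    #include <linux/sched/task.h>   /* get_task_struct / put_task_struct */

    struct backend_ctx {                    /* hypothetical context */
            struct task_struct *task;
    };

    static int backend_start(struct backend_ctx *ctx, int (*worker)(void *))
    {
            ctx->task = kthread_run(worker, ctx, "backend-worker");
            if (IS_ERR(ctx->task))
                    return PTR_ERR(ctx->task);

            /*
             * Pin the task_struct: the thread may return on its own (frontend-
             * triggered) before teardown calls kthread_stop(), and without this
             * reference the task could already be freed by then.
             */
            get_task_struct(ctx->task);
            return 0;
    }

    static void backend_stop(struct backend_ctx *ctx)
    {
            kthread_stop(ctx->task);        /* safe even if the thread already returned */
            put_task_struct(ctx->task);     /* drop the reference taken in backend_start() */
    }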
-
Committed by Pablo Neira Ayuso

stable inclusion from stable-5.10.43 commit 316de9a88c83c672c18d35bd76034d84e3769fe9
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=324
CVE: NA

--------------------------------

commit c781471d upstream.

Sometimes users forget to turn on the nftables extensions in Kconfig that they need. In such cases, the error reporting from userspace is misleading:

  $ sudo nft add rule x y counter
  Error: Could not process rule: No such file or directory
  add rule x y counter
  ^^^^^^^^^^^^^^^^^^^^

Add the missing NL_SET_BAD_ATTR() to provide a hint:

  $ nft add rule x y counter
  Error: Could not process rule: No such file or directory
  add rule x y counter
  ^^^^^^^

Fixes: 83d9dcba ("netfilter: nf_tables: extended netlink error reporting for expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: fanxingin <fanxingin@qq.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Roja Rani Yarubandi

stable inclusion from stable-5.10.43 commit eddf2d9f76b01201dd778f2d36d75b8050217cf7
bugzilla: 109284
CVE: NA

--------------------------------

commit 57648e86 upstream.

Mark the bus as suspended during system suspend to block future transfers. Implement geni_i2c_resume_noirq() to resume the bus.

Fixes: 37692de5 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Gao Xiang

stable inclusion from stable-5.10.43 commit f20eef4d068637dc48ed24887ebc7b1faa860ae5
bugzilla: 109284
CVE: NA

--------------------------------

commit 89b15863 upstream.

The LZ4 final literal copy can be overlapped when doing in-place decompression, so it's unsafe to just use memcpy() in an optimized memcpy approach; memmove() must be used instead. Upstream LZ4 updated this years ago [1] (the impact is not noticeable [2] and only a few bytes remain), so this commit just synchronizes the upstream LZ4 code to the kernel side as well.

It can be observed as an EROFS in-place decompression failure on specific files when X86_FEATURE_ERMS is unsupported, because the memcpy() optimization of commit 59daa706 ("x86, mem: Optimize memcpy by avoiding memory false dependece") is enabled then. Currently most modern x86 CPUs support ERMS and just use the "rep movsb" approach, so there is no problem at all. However, it can still be verified by forcibly disabling the ERMS feature:

  arch/x86/lib/memcpy_64.S:
    ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \
  -               "jmp memcpy_erms", X86_FEATURE_ERMS
  +               "jmp memcpy_orig", X86_FEATURE_ERMS

We didn't observe anything strange on arm64/arm/x86 platforms before, since most memcpy() implementations behave in increasing address order ("copy upwards" [3]), which is the correct order for in-place decompression. But it really needs an update to memmove(), considering it's undefined behavior according to the standard and some unique optimizations already exist in the kernel.

[1] https://github.com/lz4/lz4/commit/33cb8518ac385835cc17be9a770b27b40cd0e15b
[2] https://github.com/lz4/lz4/pull/717#issuecomment-497818921
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=12518

Link: https://lkml.kernel.org/r/20201122030749.2698994-1-hsiangkao@redhat.com
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Nick Terrell <terrelln@fb.com>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Miao Xie <miaoxie@huawei.com>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Li Guifu <bluce.liguifu@huawei.com>
Cc: Guo Xuenan <guoxuenan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
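The underlying C rule is easy to demonstrate in userspace: once the source and destination ranges overlap, only memmove() is defined. A minimal standalone example:

    /*
     * Why the final literal copy must be memmove(): in-place decompression makes
     * source and destination overlap, and memcpy() on overlapping buffers is
     * undefined behaviour (it only happened to work with "copy upwards" memcpy
     * implementations).
     */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            char buf[16] = "abcdefgh";

            /* Overlapping copy: shift "abcdefgh" right by two bytes. */
            memmove(buf + 2, buf, 8);       /* well-defined */
            buf[10] = '\0';
            printf("%s\n", buf);            /* prints "ababcdefgh" */

            /* memcpy(buf + 2, buf, 8) here would be undefined behaviour. */
            return 0;
    }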
-
Committed by Vitaly Kuznetsov

stable inclusion from stable-5.10.43 commit 334c59d58de5faf449d9c9feaa8c50dd8b4046a7
bugzilla: 109284
CVE: NA

--------------------------------

commit 3d6b8413 upstream.

The crash shutdown handler only disables kvmclock and steal time; other PV features remain active, so we risk corrupting memory or getting side effects in the kdump kernel. Move the crash handler to kvm.c and unify it with CPU offline.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210414123544.1060604-5-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Vitaly Kuznetsov

stable inclusion from stable-5.10.43 commit 3b0becf8b1ecf642a9edaf4c9628ffc641e490d6
bugzilla: 109284
CVE: NA

--------------------------------

commit c02027b5 upstream.

Currently, we disable kvmclock from the machine_shutdown() hook, and this only happens for the boot CPU. We need to disable it for all CPUs to guard against memory corruption, e.g. on restore from hibernation.

Note, writing '0' to the kvmclock MSR doesn't clear the memory location, it just prevents the hypervisor from updating the location, so for the short while after the write, and while the CPU is still alive, the clock remains usable and correct and we don't need to switch to some other clocksource.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210414123544.1060604-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Vitaly Kuznetsov

stable inclusion from stable-5.10.43 commit 38b858da1c58ad46519a257764e059e663b59ff2
bugzilla: 109284
CVE: NA

--------------------------------

commit 8b79feff upstream.

Various PV features (Async PF, PV EOI, steal time) work through memory shared with the hypervisor, and when we restore from hibernation we must properly tear down all these features to make sure the hypervisor doesn't write to stale locations after we jump to the previously hibernated kernel (which can try to place anything there). For secondary CPUs the job is already done by kvm_cpu_down_prepare(); register syscore ops to do the same for the boot CPU.

Krzysztof: This fixes memory corruption visible after a second resume from hibernation:

  BUG: Bad page state in process dbus-daemon  pfn:18b01
  page:ffffea000062c040 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 compound_mapcount: -30591
  flags: 0xfffffc0078141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)
  raw: 000fffffc0078141 dead0000000002d0 dead000000000100 0000000000000000
  raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
  bad because of flags: 0x78141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210414123544.1060604-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
[krzysztof: Extend the commit message, adjust for v5.10 context]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Marc Zyngier

stable inclusion from stable-5.10.43 commit b327c97747595b462a003a11e6728ebd860cd285
bugzilla: 109284
CVE: NA

--------------------------------

commit cb853ded upstream.

Commit 03fdfb26 ("KVM: arm64: Don't write junk to sysregs on reset") flipped the register number to 0 for all the debug registers in the sysreg table, hereby indicating that these registers live in a separate shadow structure. However, the author of this patch failed to realise that all the accessors are using that particular index instead of the register encoding, resulting in all the registers hitting index 0. Not quite a valid implementation of the architecture...

Address the issue by fixing all the accessors to use the CRm field of the encoding, which contains the debug register index.

Fixes: 03fdfb26 ("KVM: arm64: Don't write junk to sysregs on reset")
Reported-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Sean Christopherson

stable inclusion from stable-5.10.43 commit b3ee3f50ab1bf7b60ba4a8346dca05ba3412fead
bugzilla: 109284
CVE: NA

--------------------------------

commit 0884335a upstream.

Drop bits 63:32 on loads/stores to/from DRs and CRs when the vCPU is not in 64-bit mode. The APM states bits 63:32 are dropped for both DRs and CRs:

  In 64-bit mode, the operand size is fixed at 64 bits without the need for a REX prefix. In non-64-bit mode, the operand size is fixed at 32 bits and the upper 32 bits of the destination are forced to 0.

Fixes: 7ff76d58 ("KVM: SVM: enhance MOV CR intercept handler")
Fixes: cae3797a ("KVM: SVM: enhance mov DR intercept handler")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210422022128.3464144-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Anand Jain

stable inclusion from stable-5.10.43 commit fe910d20e2d8e0736bbea9c1efe6a49535e807ea
bugzilla: 109284
CVE: NA

--------------------------------

commit 5e753a81 upstream.

The following test case reproduces an issue of wrongly freeing in-use blocks on the readonly seed device when fstrim is called on the rw sprout device, as shown below.

Create a seed device and add a sprout device to it:

  $ mkfs.btrfs -fq -dsingle -msingle /dev/loop0
  $ btrfstune -S 1 /dev/loop0
  $ mount /dev/loop0 /btrfs
  $ btrfs dev add -f /dev/loop1 /btrfs
  BTRFS info (device loop0): relocating block group 290455552 flags system
  BTRFS info (device loop0): relocating block group 1048576 flags system
  BTRFS info (device loop0): disk added /dev/loop1
  $ umount /btrfs

Mount the sprout device and run fstrim:

  $ mount /dev/loop1 /btrfs
  $ fstrim /btrfs
  $ umount /btrfs

Now try to mount the seed device, and it fails:

  $ mount /dev/loop0 /btrfs
  mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.

Block 5292032 is missing on the readonly seed device:

  $ dmesg -kt | tail
  <snip>
  BTRFS error (device loop0): bad tree block start, want 5292032 have 0
  BTRFS warning (device loop0): couldn't read tree root
  BTRFS error (device loop0): open_ctree failed

From the dump-tree of the seed device (taken before the fstrim), block 5292032 belonged to the block group starting at 5242880:

  $ btrfs inspect dump-tree -e /dev/loop0 | grep -A1 BLOCK_GROUP
  <snip>
  item 3 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16169 itemsize 24
      block group used 114688 chunk_objectid 256 flags METADATA
  <snip>

From the dump-tree of the sprout device (taken before the fstrim), fstrim used block group 5242880 to find the related free space to free:

  $ btrfs inspect dump-tree -e /dev/loop1 | grep -A1 BLOCK_GROUP
  <snip>
  item 1 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16226 itemsize 24
      block group used 32768 chunk_objectid 256 flags METADATA
  <snip>

BPF kernel tracing of the fstrim command finds the missing block 5292032 within the range of the discarded blocks, as below:

  kprobe:btrfs_discard_extent {
      printf("freeing start %llu end %llu num_bytes %llu:\n", arg1, arg1 + arg2, arg2);
  }

  freeing start 5259264 end 5406720 num_bytes 147456
  <snip>

Fix this by avoiding the discard command to the readonly seed device.

Reported-by: Chris Murphy <lists@colorremedies.com>
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Dmitry Baryshkov

stable inclusion from stable-5.10.43 commit 05e41f6f1c4e8c42edb9715b6629d9ab2af61064
bugzilla: 109284
CVE: NA

--------------------------------

commit a670ff57 upstream.

Currently the DPU driver scales bandwidth and core clock for sc7180 only, while the rest of the chips get static bandwidth votes. Make all chipsets scale bandwidth and clock per composition requirements like sc7180 does. Drop the old voting path completely.

Tested on RB3 (SDM845) and RB5 (SM8250).

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210401020533.3956787-2-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Mina Almasry

stable inclusion from stable-5.10.43 commit 2eb4ec9c2c3535b9755c484183cc5c4d90fd37ff
bugzilla: 109284
CVE: NA

--------------------------------

[ Upstream commit d84cf06e ]

The userfaultfd hugetlb tests cause a resv_huge_pages underflow. This happens when hugetlb_mcopy_atomic_pte() is called with !is_continue on an index for which we already have a page in the cache. When this happens, we allocate a second page, double consuming the reservation, and then fail to insert the page into the cache and return -EEXIST.

To fix this, we first check if there is a page in the cache which already consumed the reservation, and return -EEXIST immediately if so.

There is still a rare condition where we fail to copy the page contents AND race with a call to hugetlb_no_page() for this index, and again we will underflow resv_huge_pages. That is fixed in a more complicated patch not targeted for -stable.

Test: Hacked the code locally such that resv_huge_pages underflows produce a warning, then:

  ./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
  ./tools/testing/selftests/vm/userfaultfd hugetlb 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success

Both tests succeed and produce no warnings. After the test runs, the number of free/resv hugepages is correct.

[mike.kravetz@oracle.com: changelog fixes]

Link: https://lkml.kernel.org/r/20210528004649.85298-1-almasrymina@google.com
Fixes: 8fb5debc ("userfaultfd: hugetlbfs: add hugetlb_mcopy_atomic_pte for userfaultfd support")
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Filipe Manana

stable inclusion from stable-5.10.43 commit baa6763123e2b63b8289943c7211ba0e3220432f
bugzilla: 109284
CVE: NA

--------------------------------

commit 76a6d5cd upstream.

There are a few cases where cloning an inline extent requires copying data into a page of the destination inode. For these cases we are allocating the required data and metadata space while holding a leaf locked. This can result in a deadlock when we are low on available space, because allocating the space may flush delalloc, and two deadlock scenarios can happen:

1) When starting writeback for an inode with a very small dirty range that fits in an inline extent, we deadlock during the writeback when trying to insert the inline extent, at cow_file_range_inline(), if the extent is going to be located in the leaf for which we are already holding a read lock;

2) After successfully starting writeback, for non-inline extent cases, the async reclaim thread will hang waiting for an ordered extent to complete if the ordered extent completion needs to modify the leaf for which the clone task is holding a read lock (for adding or replacing file extent items). So the cloning task will wait forever on the async reclaim thread to make progress, which in turn is waiting for the ordered extent completion, which in turn is waiting to acquire a write lock on the same leaf.

So fix this by making sure we release the path (and therefore the leaf) every time we need to copy the inline extent's data into a page of the destination inode, as by that time we do not need to have the leaf locked.

Fixes: 05a5a762 ("Btrfs: implement full reflink support for inline extents")
CC: stable@vger.kernel.org # 5.10+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Josef Bacik

stable inclusion from stable-5.10.43 commit 0df50d47d17401f9f140dfbe752a65e5d72f9932
bugzilla: 109284
CVE: NA

--------------------------------

commit dc09ef35 upstream.

Error injection stress uncovered a problem where we'd leave a dangling inode ref if we failed during a rename_exchange. This happens because we insert the inode ref for one side of the rename, and then for the other side. If this second inode ref insert fails we'll leave the first one dangling and leave a corrupt file system behind. Fix this by aborting if we did the insert for the first inode ref.

CC: stable@vger.kernel.org # 4.9+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Josef Bacik

stable inclusion from stable-5.10.43 commit 48568f3944ee7357e8fed394804745bd981e978a
bugzilla: 109284
CVE: NA

--------------------------------

commit 011b28ac upstream.

This function has the following pattern:

    while (1) {
        ret = whatever();
        if (ret)
            goto out;
    }
    ret = 0;
out:
    return ret;

However, in several places in this while loop we simply break when there's a problem, thus clearing the return value, and in one case we do a return -EIO and leak the memory for the path.

Fix this by re-arranging the loop to deal with ret == 1 coming from btrfs_search_slot, and then simply delete the "ret = 0; out:" bit so everybody can break if there is an error, which will allow for proper error handling to occur.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
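The shape of the rework is generic enough to sketch outside btrfs: let break carry the error and translate only the "iteration finished" case, instead of resetting ret after the loop. A standalone illustration; whatever() and process() are stand-ins, not btrfs functions.

    /*
     * whatever() returns <0 on error, 1 when there is nothing left, 0 otherwise
     * (mirroring btrfs_search_slot-style return values).
     */
    #include <stdio.h>

    static int whatever(void)
    {
            static int calls;

            return (++calls > 3) ? 1 : 0;   /* pretend the 4th call finds nothing left */
    }

    static void process(void) { }

    static int walk_items(void)
    {
            int ret;

            while (1) {
                    ret = whatever();
                    if (ret < 0)
                            break;          /* real error: propagated as-is to the caller */
                    if (ret > 0) {
                            ret = 0;        /* clean end of iteration */
                            break;
                    }
                    process();
            }

            /* No trailing "ret = 0; out:" to silently eat errors from break paths. */
            return ret;
    }

    int main(void)
    {
            printf("walk_items() = %d\n", walk_items());
            return 0;
    }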
-
Committed by Josef Bacik

stable inclusion from stable-5.10.43 commit 466d83fdbbe345f3cfd5f7b2633f740ecad67853
bugzilla: 109284
CVE: NA

--------------------------------

commit 856bd270 upstream.

We are unconditionally returning 0 in cleanup_ref_head, despite the fact that btrfs_del_csums could fail. We need to return the error so the transaction gets aborted properly; fix this by returning ret from btrfs_del_csums in cleanup_ref_head.

Reviewed-by: Qu Wenruo <wqu@suse.com>
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Josef Bacik

stable inclusion from stable-5.10.43 commit 5a89982fa2bba459b82323655df986945a853bbe
bugzilla: 109284
CVE: NA

--------------------------------

commit b86652be upstream.

Error injection stress would sometimes fail with checksums on disk that did not have a corresponding extent. This occurred because the pattern in btrfs_del_csums was:

    while (1) {
        ret = btrfs_search_slot();
        if (ret < 0)
            break;
    }
    ret = 0;
out:
    btrfs_free_path(path);
    return ret;

If we got an error from btrfs_search_slot we'd clear the error because we were breaking instead of going to out. Instead of using goto out, simply handle the cases where we may leave a random value in ret, get rid of the "ret = 0; out:" pattern, and simply allow break to have the proper error reporting. With this fix we properly abort the transaction and do not commit thinking we successfully deleted the csum.

Reviewed-by: Qu Wenruo <wqu@suse.com>
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Josef Bacik

stable inclusion from stable-5.10.43 commit b547a16b24918edd63042f9d81c0d310212d2e94
bugzilla: 109284
CVE: NA

--------------------------------

commit d61bec08 upstream.

While doing error injection testing I saw that sometimes we'd get an abort that wouldn't stop the current transaction commit from completing. This abort was coming from finish ordered IO, but at this point in the transaction commit we should have gotten an error and stopped. It turns out the abort came from finish ordered IO while trying to write out the free space cache.

It occurred to me that any failure inside of finish_ordered_io isn't actually raised to the person doing the writing, so we could have any number of failures in this path and think the ordered extent completed successfully and the inode was fine.

Fix this by marking the ordered extent with BTRFS_ORDERED_IOERR and marking the mapping of the inode with mapping_set_error, so any callers that simply call fdatawait will also get the error. With this we're seeing the IO error on the free space inode when we fail to do the finish_ordered_io.

CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
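The key call is mapping_set_error(): recording the failure on the inode's address_space is what lets a later fdatawait/fsync caller see the error. A minimal sketch of that step only; the ordered-extent flag handling around it is omitted, and the wrapper name is made up.

    /* Sketch of the error-propagation step described above. */
    #include <linux/fs.h>
    #include <linux/pagemap.h>

    static void note_writeback_error(struct address_space *mapping, int err)
    {
            if (err < 0)
                    mapping_set_error(mapping, err);    /* later fdatawait/fsync sees -EIO/-ENOSPC */
    }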
-
Committed by Naveen N. Rao

stable inclusion from stable-5.10.43 commit 5e5e63bacbe8f1ef9688e7804275eb88cf0be51a
bugzilla: 109284
CVE: NA

--------------------------------

commit 82123a3d upstream.

When checking if the probed instruction is the suffix of a prefixed instruction, we access the instruction at the previous word. If the probed instruction is the very first word of a module, we can end up trying to access an invalid page.

Fix this by skipping the check for all instructions at the beginning of a page. Prefixed instructions cannot cross a 64-byte boundary, and as such we don't expect to encounter a suffix as the very first word in a page for kernel text. Even if there are prefixed instructions crossing a page boundary (from a module, for instance), the instruction will be illegal, so preventing probing on the suffix of such prefixed instructions isn't worthwhile.

Fixes: b4657f76 ("powerpc/kprobes: Don't allow breakpoints on suffixes")
Cc: stable@vger.kernel.org # v5.8+
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0df9a032a05576a2fa8e97d1b769af2ff0eafbd6.1621416666.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
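The boundary test itself is simple; a sketch follows, where is_prefix_word() is a stand-in for the powerpc-specific prefix check, not the real helper.

    /*
     * Sketch: never look at the previous word when the probed address is the
     * first word of its page, since that word may live on an unmapped page.
     */
    #include <linux/mm.h>       /* offset_in_page() */
    #include <linux/types.h>

    static bool is_prefix_word(u32 insn);   /* stand-in for the real arch check */

    static bool probing_a_suffix(const u32 *addr)
    {
            if (!offset_in_page(addr))
                    return false;   /* first word in the page: skip the check entirely */

            return is_prefix_word(addr[-1]);
    }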
-
Committed by Thomas Gleixner

stable inclusion from stable-5.10.43 commit 42f75a4381a4ffb1b7488f90c657ea0b5461d3b7
bugzilla: 109284
CVE: NA

--------------------------------

commit 7d65f9e8 upstream.

PIC interrupts do not support affinity setting and they can end up on any online CPU. Therefore, it's required to mark the associated vectors as system-wide reserved. Otherwise, the corresponding irq descriptors are copied to the secondary CPUs but the vectors are not marked as assigned or reserved. This works correctly for the IO/APIC case.

When the IO/APIC is disabled via config, kernel command line or lack of enumeration, then all legacy interrupts are routed through the PIC, but nothing marks them as system-wide reserved vectors. As a consequence, a subsequent allocation on a secondary CPU can result in allocating one of these vectors, which triggers the BUG() in apic_update_vector() because the interrupt descriptor slot is not empty.

Imran tried to work around that by marking those interrupts as allocated when a CPU comes online. But that's wrong in case the IO/APIC is available and one of the legacy interrupts, e.g. IRQ0, has been switched to PIC mode, because then marking them as allocated will fail as they are already marked as system vectors.

Stay consistent and update the legacy vectors after attempting IO/APIC initialization, and mark them as system vectors in case no IO/APIC is available.

Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
Reported-by: Imran Khan <imran.f.khan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210519233928.2157496-1-imran.f.khan@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Nirmoy Das

stable inclusion from stable-5.10.43 commit 3a6b69221f96f87c680bbc9fba01db3415b18f27
bugzilla: 109284
CVE: NA

--------------------------------

commit 07438603 upstream.

Releasing pinned BOs is illegal now. UVD 6 was missing from:

commit 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")

Fixes: 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")
Cc: stable@vger.kernel.org
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Luben Tuikov

stable inclusion from stable-5.10.43 commit 58da0b509e4b8f4a3a4b1b2e23871d108f81338a
bugzilla: 109284
CVE: NA

--------------------------------

commit dce3d8e1 upstream.

On the QUERY2 IOCTL, don't query counts of correctable and uncorrectable errors, since when RAS is enabled and supported on Vega20 server boards this takes an insurmountably long time, O(n^3), which slows the system down to the point of it being unusable when we have a GUI up.

Fixes: ae363a21 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-