- 29 1月, 2022 7 次提交
-
-
由 Alexander Mikhalitsyn 提交于
stable inclusion from linux-4.19.220 commit 4e91adc73764fd0cde00be381461e52fa641fadd -------------------------------- commit 85b6d246 upstream. Currently, the exit_shm() function not designed to work properly when task->sysvshm.shm_clist holds shm objects from different IPC namespaces. This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it leads to use-after-free (reproducer exists). This is an attempt to fix the problem by extending exit_shm mechanism to handle shm's destroy from several IPC ns'es. To achieve that we do several things: 1. add a namespace (non-refcounted) pointer to the struct shmid_kernel 2. during new shm object creation (newseg()/shmget syscall) we initialize this pointer by current task IPC ns 3. exit_shm() fully reworked such that it traverses over all shp's in task->sysvshm.shm_clist and gets IPC namespace not from current task as it was before but from shp's object itself, then call shm_destroy(shp, ns). Note: We need to be really careful here, because as it was said before (1), our pointer to IPC ns non-refcnt'ed. To be on the safe side we using special helper get_ipc_ns_not_zero() which allows to get IPC ns refcounter only if IPC ns not in the "state of destruction". Q/A Q: Why can we access shp->ns memory using non-refcounted pointer? A: Because shp object lifetime is always shorther than IPC namespace lifetime, so, if we get shp object from the task->sysvshm.shm_clist while holding task_lock(task) nobody can steal our namespace. Q: Does this patch change semantics of unshare/setns/clone syscalls? A: No. It's just fixes non-covered case when process may leave IPC namespace without getting task->sysvshm.shm_clist list cleaned up. Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.com Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com Fixes: ab602f79 ("shm: make exit_shm work proportional to task activity") Co-developed-by: NManfred Spraul <manfred@colorfullife.com> Signed-off-by: NManfred Spraul <manfred@colorfullife.com> Signed-off-by: NAlexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Miklos Szeredi 提交于
stable inclusion from linux-4.19.219 commit 22b814fdce55caf8db1cb550a32a64159e5b83a8 -------------------------------- commit 47344172 upstream. Checking buf->flags should be done before the pipe_buf_release() is called on the pipe buffer, since releasing the buffer might modify the flags. This is exactly what page_cache_pipe_buf_release() does, and which results in the same VM_BUG_ON_PAGE(PageLRU(page)) that the original patch was trying to fix. Reported-by: NJustin Forbes <jmforbes@linuxtx.org> Fixes: 712a9510 ("fuse: fix page stealing") Cc: <stable@vger.kernel.org> # v2.6.35 Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Steven Rostedt (VMware) 提交于
stable inclusion from linux-4.19.219 commit 2692931d92d83afaa636703e95203d53629ca7c1 -------------------------------- commit 6cb20650 upstream. When pid filtering is activated in an instance, all of the events trace files for that instance has the PID_FILTER flag set. This determines whether or not pid filtering needs to be done on the event, otherwise the event is executed as normal. If pid filtering is enabled when an event is created (via a dynamic event or modules), its flag is not updated to reflect the current state, and the events are not filtered properly. Cc: stable@vger.kernel.org Fixes: 3fdaf80f ("tracing: Implement event pid filtering") Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Eric Dumazet 提交于
stable inclusion from linux-4.19.219 commit 3b7c37106bcf08f154404f1f212a3713ebf37cec -------------------------------- [ Upstream commit 19d36c5f ] We deal with IPv6 packets, so we need to use IP6CB(skb)->flags and IP6SKB_REROUTED, instead of IPCB(skb)->flags and IPSKB_REROUTED Found by code inspection, please double check that fixing this bug does not surface other bugs. Fixes: 09ee9dba ("ipv6: Reinject IPv6 packets if IPsec policy matches after SNAT") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Tobias Brunner <tobias@strongswan.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Tested-by: NTobias Brunner <tobias@strongswan.org> Acked-by: NTobias Brunner <tobias@strongswan.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 David Hildenbrand 提交于
stable inclusion from linux-4.19.219 commit 9ef384ed300d1bcfb23d0ab0b487d544444d4b52 -------------------------------- commit c1e63117 upstream. To clear a user buffer we cannot simply use memset, we have to use clear_user(). With a virtio-mem device that registers a vmcore_cb and has some logically unplugged memory inside an added Linux memory block, I can easily trigger a BUG by copying the vmcore via "cp": systemd[1]: Starting Kdump Vmcore Save Service... kdump[420]: Kdump is using the default log level(3). kdump[453]: saving to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/ kdump[458]: saving vmcore-dmesg.txt to /sysroot/var/crash/127.0.0.1-2021-11-11-14:59:22/ kdump[465]: saving vmcore-dmesg.txt complete kdump[467]: saving vmcore BUG: unable to handle page fault for address: 00007f2374e01000 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 7a523067 P4D 7a523067 PUD 7a528067 PMD 7a525067 PTE 800000007048f867 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 468 Comm: cp Not tainted 5.15.0+ #6 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014 RIP: 0010:read_from_oldmem.part.0.cold+0x1d/0x86 Code: ff ff ff e8 05 ff fe ff e9 b9 e9 7f ff 48 89 de 48 c7 c7 38 3b 60 82 e8 f1 fe fe ff 83 fd 08 72 3c 49 8d 7d 08 4c 89 e9 89 e8 <49> c7 45 00 00 00 00 00 49 c7 44 05 f8 00 00 00 00 48 83 e7 f81 RSP: 0018:ffffc9000073be08 EFLAGS: 00010212 RAX: 0000000000001000 RBX: 00000000002fd000 RCX: 00007f2374e01000 RDX: 0000000000000001 RSI: 00000000ffffdfff RDI: 00007f2374e01008 RBP: 0000000000001000 R08: 0000000000000000 R09: ffffc9000073bc50 R10: ffffc9000073bc48 R11: ffffffff829461a8 R12: 000000000000f000 R13: 00007f2374e01000 R14: 0000000000000000 R15: ffff88807bd421e8 FS: 00007f2374e12140(0000) GS:ffff88807f000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2374e01000 CR3: 000000007a4aa000 CR4: 0000000000350eb0 Call Trace: read_vmcore+0x236/0x2c0 proc_reg_read+0x55/0xa0 vfs_read+0x95/0x190 ksys_read+0x4f/0xc0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Some x86-64 CPUs have a CPU feature called "Supervisor Mode Access Prevention (SMAP)", which is used to detect wrong access from the kernel to user buffers like this: SMAP triggers a permissions violation on wrong access. In the x86-64 variant of clear_user(), SMAP is properly handled via clac()+stac(). To fix, properly use clear_user() when we're dealing with a user buffer. Link: https://lkml.kernel.org/r/20211112092750.6921-1-david@redhat.com Fixes: 997c136f ("fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages") Signed-off-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NBaoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Philipp Rudo <prudo@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Steven Rostedt (VMware) 提交于
stable inclusion from linux-4.19.219 commit bbaf4fe5ec34e7931af9a03a60b4071c2bc01ae0 -------------------------------- commit a55f224f upstream. If a event is filtered by pid and a trigger that requires processing of the event to happen is a attached to the event, the discard portion does not take the pid filtering into account, and the event will then be recorded when it should not have been. Cc: stable@vger.kernel.org Fixes: 3fdaf80f ("tracing: Implement event pid filtering") Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Miklos Szeredi 提交于
stable inclusion from linux-4.19.219 commit 65f1f3eb09a3907c5f5edaff6c473b99e49e6020 -------------------------------- commit 712a9510 upstream. It is possible to trigger a crash by splicing anon pipe bufs to the fuse device. The reason for this is that anon_pipe_buf_release() will reuse buf->page if the refcount is 1, but that page might have already been stolen and its flags modified (e.g. PG_lru added). This happens in the unlikely case of fuse_dev_splice_write() getting around to calling pipe_buf_release() after a page has been stolen, added to the page cache and removed from the page cache. Fix by calling pipe_buf_release() right after the page was inserted into the page cache. In this case the page has an elevated refcount so any release function will know that the page isn't reusable. Reported-by: NFrank Dinoff <fdinoff@google.com> Link: https://lore.kernel.org/r/CAAmZXrsGg2xsP1CK+cbuEMumtrqdvD-NKnWzhNcvn71RV3c1yw@mail.gmail.com/ Fixes: dd3bb14f ("fuse: support splice() writing to fuse device") Cc: <stable@vger.kernel.org> # v2.6.35 Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 27 1月, 2022 1 次提交
-
-
由 Laibin Qiu 提交于
phytium inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4RK58 CVE: NA -------------------------------- The system would hang up when the Phytium S2500 communicates with some BMCs after several rounds of transactions, unless we reset the controller timeout counter manually by calling firmware through SMC. Signed-off-by: NWang Yinfeng <wangyinfeng@phytium.com.cn> Signed-off-by: Chen Baozi <chenbaozi@phytium.com.cn> #openEuler_contributor Signed-off-by: NLaibin Qiu <qiulaibin@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 26 1月, 2022 3 次提交
-
-
由 liubo 提交于
euleros inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4QVXW CVE: NA ------------------------------------------------- etmem, the memory vertical expansion technology, The existing memory expansion tool etmem swaps out all pages that can be swapped out for the process by default, unless the page is marked with lock flag. The function of swapping out specified pages is added. The process adds VM_SWAPFLAG flags for pages to be swapped out. The etmem adds filters to the scanning module and swaps out only these pages. Signed-off-by: Nliubo <liubo254@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 liubo 提交于
euleros inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4QVXW CVE: NA ------------------------------------------------- etmem, the memory vertical expansion technology, In the current etmem process, memory page swapping is implemented by invoking shrink_page_list. When this interface is invoked for the first time, pages are added to the swap cache and written to disks.The swap cache page is reclaimed only when this interface is invoked for the second time and no process accesses the page.However, in the etmem process, the user mode scans pages that have been accessed, and the migration is not delivered to pages that are not accessed by processes. Therefore, the swap cache may always be occupied. To solve the preceding problem, add the logic for actively reclaiming the swap cache.When the swap cache occupies a large amount of memory, the system proactively scans the LRU linked list and reclaims the swap cache to save memory within the specified range. Signed-off-by: Nliubo <liubo254@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 liubo 提交于
euleros inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4QVXW CVE: NA ------------------------------------------------- etmem, the memory vertical expansion technology, uses DRAM and high-performance storage new media to form multi-level memory storage. By grading the stored data, etmem migrates the classified cold storage data from the storage medium to the high-performance storage medium, so as to achieve the purpose of memory capacity expansion and memory cost reduction. When the memory expansion function etmem is running, the native swap function of the kernel needs to be disabled in certain scenarios to avoid the impact of kernel swap. This feature provides the preceding functions. The /sys/kernel/mm/swap/ directory provides the kernel_swap_enable sys interface to enable or disable the native swap function of the kernel. The default value of /sys/kernel/mm/swap/kernel_swap_enable is true, that is, kernel swap is enabled by default. Turn on kernel swap: echo true > /sys/kernel/mm/swap/kernel_swap_enable Turn off kernel swap: echo false > /sys/kernel/mm/swap/kernel_swap_enable Signed-off-by: Nliubo <liubo254@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 25 1月, 2022 1 次提交
-
-
由 Nikolay Aleksandrov 提交于
mainline inclusion from mainline-v5.9-rc1 commit fd65e5a9 category: bugfix bugzilla: 186114 CVE: NA -------------------------------- We need to clear all of the bridge private skb variables as they can be stale due to the packet being recirculated through the stack and then transmitted through the bridge device. Similar memset is already done on bridge's input. We've seen cases where proxyarp_replied was 1 on routed multicast packets transmitted through the bridge to ports with neigh suppress which were getting dropped. Same thing can in theory happen with the port isolation bit as well. Fixes: 821f1b21 ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood") Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NHuang Guobin <huangguobin4@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NWei Yongjun <weiyongjun1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 20 1月, 2022 1 次提交
-
-
由 Cui GaoSheng 提交于
hulk inclusion category: bugfix bugzilla: 186105, https://gitee.com/openeuler/kernel/issues/I4RGWS?from=project-issue CVE: NA ----------------------------------------------------------------- When we add "audit=1" to the cmdline, if we keep the audit_hold_queue non-empty, flush the hold queue will fall into an infinite loop. So we need to fix it by stoping flush the hold queue when netlink abnormal. Fixes: 3413ddc9 ("audit: improve robustness of the audit queue handling") Signed-off-by: NCui GaoSheng <cuigaosheng1@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 19 1月, 2022 2 次提交
-
-
由 Yu Kuai 提交于
hulk inclusion category: feature bugzilla: 186072, https://gitee.com/openeuler/kernel/issues/I4RH0V CVE: NA ----------------------------------------------- blkio subsytem is not under default hierarchy in cgroup v1 by default, which means configurations will only be effective on current cgroup for io throttle. This patch introduces a new feature that enable default hierarchy for io throttle, which means configurations will be effective on child cgroups. Such feature is disabled by default, and can be enabled by adding "blkcg_global_limit=1" or "blkcg_global_limit=Y" or "blkcg_global_limit=y" in boot cmd. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Darrick J. Wong 提交于
mainline inclusion from mainline-v5.16-rc5 commit 983d8e60 category: bugfix bugzilla: 186083 CVE: CVE-2021-4155 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=983d8e60f50806f90534cc5373d0ce867e5aaf79 -------------------------------- The old ALLOCSP/FREESP ioctls in XFS can be used to preallocate space at the end of files, just like fallocate and RESVSP. Make the behavior consistent with the other ioctls. Reported-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 18 1月, 2022 1 次提交
-
-
由 Willem de Bruijn 提交于
mainline inclusion from mainline-v5.14 commit 8a0ed250 category: bugfix bugzilla: NA CVE: CVE-2021-39633 ------------------------------------------------- The GRE tunnel device can pull existing outer headers in ipge_xmit. This is a rare path, apparently unique to this device. The below commit ensured that pulling does not move skb->data beyond csum_start. But it has a false positive if ip_summed is not CHECKSUM_PARTIAL and thus csum_start is irrelevant. Refine to exclude this. At the same time simplify and strengthen the test. Simplify, by moving the check next to the offending pull, making it more self documenting and removing an unnecessary branch from other code paths. Strengthen, by also ensuring that the transport header is correct and therefore the inner headers will be after skb_reset_inner_headers. The transport header is set to csum_start in skb_partial_csum_set. Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/ Fixes: 1d011c48 ("ip_gre: add validation for csum_start") Reported-by: NIdo Schimmel <idosch@idosch.org> Suggested-by: NAlexander Duyck <alexander.duyck@gmail.com> Signed-off-by: NWillem de Bruijn <willemb@google.com> Reviewed-by: NAlexander Duyck <alexanderduyck@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NHuang Guobin <huangguobin4@huawei.com> Reviewed-by: NWei Yongjun <weiyongjun1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 17 1月, 2022 20 次提交
-
-
由 Zhenguo Yao 提交于
mainline inclusion from mainline-v5.16-rc5 commit 4178158e category: bugfix bugzilla: 186043 CVE: NA -------------------------------- Preallocation of gigantic pages can't work bacause of commit b5389086 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation"). When nid is NUMA_NO_NODE(-1), alloc_bootmem_huge_page will always return without doing allocation. Fix this by adding more check. Link: https://lkml.kernel.org/r/20211129133803.15653-1-yaozhenguo1@gmail.com Fixes: b5389086 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation") Signed-off-by: NZhenguo Yao <yaozhenguo1@gmail.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Tested-by: NMaxim Levitsky <mlevitsk@redhat.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Reviewed-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Zhenguo Yao 提交于
mainline inclusion from mainline-v5.16-rc1 commit b5389086 category: feature bugzilla: 186043 CVE: NA -------------------------------- We can specify the number of hugepages to allocate at boot. But the hugepages is balanced in all nodes at present. In some scenarios, we only need hugepages in one node. For example: DPDK needs hugepages which are in the same node as NIC. If DPDK needs four hugepages of 1G size in node1 and system has 16 numa nodes we must reserve 64 hugepages on the kernel cmdline. But only four hugepages are used. The others should be free after boot. If the system memory is low(for example: 64G), it will be an impossible task. So extend the hugepages parameter to support specifying hugepages on a specific node. For example add following parameter: hugepagesz=1G hugepages=0:1,1:3 It will allocate 1 hugepage in node0 and 3 hugepages in node1. Link: https://lkml.kernel.org/r/20211005054729.86457-1-yaozhenguo1@gmail.comSigned-off-by: NZhenguo Yao <yaozhenguo1@gmail.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Zhenguo Yao <yaozhenguo1@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mike Rapoport <rppt@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Conflicts: Documentation/admin-guide/kernel-parameters.txt Documentation/admin-guide/mm/hugetlbpage.rst arch/powerpc/mm/hugetlbpage.c include/linux/hugetlb.h mm/hugetlb.c Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Guo Mengqi 提交于
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4ODJ6 CVE: NA --------------------------- Remove the unnecessary current->mm NULL check in sp_unshare_uva, and allow process to unshare kernel mapped addresses in do_exit(). Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Guo Mengqi 提交于
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4ODMN CVE: NA ------------------------------------------------- Fix following situation: when the last process in a group exits, and a second process tries to add to this group. The second process may get a invalid spg. However the group's use_count is increased by 1, which caused the first process failed to free the group when it exits. And then second process called sp_group_drop --> free_sp_group and cause a double request of rwsem. Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Yanling Song 提交于
Ramaxel inclusion category: features bugzilla: https://gitee.com/openeuler/kernel/issues/I4ON8F CVE: NA Changes: 1. Split scmd_tmout_nonpt into two parameters: scmd_tmout_vd/scmd_tmout_rawdisk 2. Return -ETIME instead of -EINVAL when command is timeout. 3. Add one module parameters: max_io_force. Signed-off-by: NYanling Song <songyl@ramaxel.com> Reviewed-by: NJiang Yu <yujiang@ramaxel.com> Reviewed-by: NZhang Lei <zhanglei48@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jason Wang 提交于
mainline inclusion from mainline-5.16 commit 6ae6ff6f category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- If an untrusted device neogitates BLK_F_MQ but advertises a zero num_queues, the driver may end up trying to allocating zero size buffers where ZERO_SIZE_PTR is returned which may pass the checking against the NULL. This will lead unexpected results. Fixing this by failing the probe in this case. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211019070152.8236-2-jasowang@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NWenchao Hao <haowenchao@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xie Yongji 提交于
mainline inclusion from mainline-5.16 commit 57a13a5b category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- The block layer can't support a block size larger than page size yet. And a block size that's too small or not a power of two won't work either. If a misconfigured device presents an invalid block size in configuration space, it will result in the kernel crash something like below: [ 506.154324] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 506.160416] RIP: 0010:create_empty_buffers+0x24/0x100 [ 506.174302] Call Trace: [ 506.174651] create_page_buffers+0x4d/0x60 [ 506.175207] block_read_full_page+0x50/0x380 [ 506.175798] ? __mod_lruvec_page_state+0x60/0xa0 [ 506.176412] ? __add_to_page_cache_locked+0x1b2/0x390 [ 506.177085] ? blkdev_direct_IO+0x4a0/0x4a0 [ 506.177644] ? scan_shadow_nodes+0x30/0x30 [ 506.178206] ? lru_cache_add+0x42/0x60 [ 506.178716] do_read_cache_page+0x695/0x740 [ 506.179278] ? read_part_sector+0xe0/0xe0 [ 506.179821] read_part_sector+0x36/0xe0 [ 506.180337] adfspart_check_ICS+0x32/0x320 [ 506.180890] ? snprintf+0x45/0x70 [ 506.181350] ? read_part_sector+0xe0/0xe0 [ 506.181906] bdev_disk_changed+0x229/0x5c0 [ 506.182483] blkdev_get_whole+0x6d/0x90 [ 506.183013] blkdev_get_by_dev+0x122/0x2d0 [ 506.183562] device_add_disk+0x39e/0x3c0 [ 506.184472] virtblk_probe+0x3f8/0x79b [virtio_blk] [ 506.185461] virtio_dev_probe+0x15e/0x1d0 [virtio] So let's use a block layer helper to validate the block size. Conflict: origin patch used blk_cleanup_disk() which is introduced in f525464a (block: add blk_alloc_disk and blk_cleanup_disk APIs) to clean resource, this patch just call blk_cleanup_queue() to perform the same operations. Signed-off-by: NXie Yongji <xieyongji@bytedance.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211026144015.188-5-xieyongji@bytedance.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NWenchao Hao <haowenchao@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xie Yongji 提交于
mainline inclusion from mainline-5.16 commit 570b1cac category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- There are some duplicated codes to validate the block size in block drivers. This limitation actually comes from block layer, so this patch tries to add a new block layer helper for that. Signed-off-by: NXie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20211026144015.188-2-xieyongji@bytedance.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NWenchao Hao <haowenchao@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Michael S. Tsirkin 提交于
mainline inclusion from mainline-5.15 commit ff631988 category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- It turns out that access to config space before completing the feature negotiation is broken for big endian guests at least with QEMU hosts up to 6.1 inclusive. This affects any device that accesses config space in the validate callback: at the moment that is virtio-net with VIRTIO_NET_F_MTU but since 82e89ea0 ("virtio-blk: Add validation for block size in config space") that also started affecting virtio-blk with VIRTIO_BLK_F_BLK_SIZE. Further, unlike VIRTIO_NET_F_MTU which is off by default on QEMU, VIRTIO_BLK_F_BLK_SIZE is on by default, which resulted in lots of people not being able to boot VMs on BE. The spec is very clear that what we are doing is legal so QEMU needs to be fixed, but given it's been broken for so many years and no one noticed, we need to give QEMU a bit more time before applying this. Further, this patch is incomplete (does not check blk size is a power of two) and it duplicates the logic from nbd. Revert for now, and we'll reapply a cleaner logic in the next release. Cc: stable@vger.kernel.org Fixes: 82e89ea0 ("virtio-blk: Add validation for block size in config space") Cc: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NWenchao Hao <haowenchao@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Matej Genci 提交于
mainline inclusion from mainline-5.10 commit beef6fd0 category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- VirtIO 1.0 spec says: The removed and rescan events ... when sent for LUN 0, they MAY apply to the entire target so the driver can ask the initiator to rescan the target to detect this. This change introduces the behaviour described above by scanning the entire SCSI target when LUN is set to 0. This is both a functional and a performance fix. It aligns the driver with the spec and allows control planes to hotplug targets with large numbers of LUNs without having to request a RESCAN for each one of them. Link: https://lore.kernel.org/r/CY4PR02MB33354370E0A81E75DD9DFE74FB520@CY4PR02MB3335.namprd02.prod.outlook.comSuggested-by: NFelipe Franciosi <felipe@nutanix.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NMatej Genci <matej.genci@nutanix.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NWenchao Hao <haowenchao@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xingang Wang 提交于
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2 CVE: NA ------------------------------------------------- This reverts commit 0cc88dd8. The commit "svm: Add support to get svm mpam configuration" add and export interface in svm module, this makes the mpam depend on the svm module, just revert this to avoid coupling. Signed-off-by: NXingang Wang <wangxingang5@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xingang Wang 提交于
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2 CVE: NA ------------------------------------------------- This reverts commit 464f6990. The commit "svm: Add support to set svm mpam configuration" add and export interface in svm module, this makes the mpam depend on the svm module, just revert this to avoid coupling. Signed-off-by: NXingang Wang <wangxingang5@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xingang Wang 提交于
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2 CVE: NA ------------------------------------------------- This reverts commit c50ad40d. The commit "svm: Add svm_set_user_mpam_en to enable/disable mpam for smmu" add and export interface in svm module, this makes the mpam depend on the svm module, just revert this to avoid coupling. Signed-off-by: NXingang Wang <wangxingang5@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-v5.16 commit e5745764 category: bugfix bugzilla: NA CVE: CVE-2021-4197 ------------------------------------------------------------------------ cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's cgroup namespace which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created. This patch makes cgroup remember the cgroup namespace at the time of open and uses it for migration permission checks instad of current's. Note that this only applies to cgroup2 as cgroup1 doesn't have namespace support. This also fixes a use-after-free bug on cgroupns reported in https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com Note that backporting this fix also requires the preceding patch. Reported-by: N"Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: NLinus Torvalds <torvalds@linuxfoundation.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com Fixes: 5136f636 ("cgroup: implement "nsdelegate" mount option") Signed-off-by: NTejun Heo <tj@kernel.org> Conflicts: kernel/cgroup/cgroup-internal.h kernel/cgroup/cgroup.c Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-v5.16 commit 0d2b5955 category: bugfix bugzilla: NA CVE: CVE-2021-4197 ------------------------------------------------------------------------- of->priv is currently used by each interface file implementation to store private information. This patch collects the current two private data usages into struct cgroup_file_ctx which is allocated and freed by the common path. This allows generic private data which applies to multiple files, which will be used to in the following patch. Note that cgroup_procs iterator is now embedded as procs.iter in the new cgroup_file_ctx so that it doesn't need to be allocated and freed separately. v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in cgroup_file_ctx as suggested by Linus. v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too. Converted. Didn't change to embedded allocation as cgroup1 pidlists get stored for caching. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NMichal Koutný <mkoutny@suse.com> Conflicts: kernel/cgroup/cgroup-internal.h kernel/cgroup/cgroup.c Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-v5.16 commit 1756d799 category: bugfix bugzilla: NA CVE: CVE-2021-4197 --------------------------------------------------- cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's credentials which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created. This patch makes both cgroup2 and cgroup1 process migration interfaces to use the credentials saved at the time of open (file->f_cred) instead of current's. Reported-by: N"Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: NLinus Torvalds <torvalds@linuxfoundation.org> Fixes: 187fe840 ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy") Reviewed-by: NMichal Koutný <mkoutny@suse.com> Signed-off-by: NTejun Heo <tj@kernel.org> Conflicts: kernel/cgroup/cgroup.c Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Lin Ma 提交于
mainline inclusion from mainline-v5.16-rc1 commit aedddb4e category: bugfix bugzilla: NA CVE: CVE-2021-4202 -------------------------------- The CAP_NET_ADMIN checks are needed to prevent attackers faking a device under NCIUARTSETDRIVER and exploit privileged commands. This patch add GENL_ADMIN_PERM flags in genl_ops to fulfill the check. Except for commands like NFC_CMD_GET_DEVICE, NFC_CMD_GET_TARGET, NFC_CMD_LLC_GET_PARAMS, and NFC_CMD_GET_SE, which are mainly information- read operations. Signed-off-by: NLin Ma <linma@zju.edu.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Conflicts: net/nfc/netlink.c Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Lin Ma 提交于
stable inclusion from linux-4.19.219 commit 2350cffd71e74bf81dedc989fdec12aebe89a4a5 CVE: CVE-2021-4202 -------------------------------- commit 48b71a9e upstream. There are two sites that calls queue_work() after the destroy_workqueue() and lead to possible UAF. The first site is nci_send_cmd(), which can happen after the nci_close_device as below nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | flush_workqueue | del_timer_sync | nci_unregister_device | nfc_get_device destroy_workqueue | nfc_dev_up nfc_unregister_device | nci_dev_up device_del | nci_open_device | __nci_request | nci_send_cmd | queue_work !!! Another site is nci_cmd_timer, awaked by the nci_cmd_work from the nci_send_cmd. ... | ... nci_unregister_device | queue_work destroy_workqueue | nfc_unregister_device | ... device_del | nci_cmd_work | mod_timer | ... | nci_cmd_timer | queue_work !!! For the above two UAF, the root cause is that the nfc_dev_up can race between the nci_unregister_device routine. Therefore, this patch introduce NCI_UNREG flag to easily eliminate the possible race. In addition, the mutex_lock in nci_close_device can act as a barrier. Signed-off-by: NLin Ma <linma@zju.edu.cn> Fixes: 6a2968aa ("NFC: basic NCI protocol implementation") Reviewed-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NKrzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211116152732.19238-1-linma@zju.edu.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Lin Ma 提交于
stable inclusion from linux-4.19.218 commit c45cea83e13699bdfd47842e04d09dd43af4c371 CVE: CVE-2021-4202 -------------------------------- [ Upstream commit 3e3b5dfc ] There is a potential UAF between the unregistration routine and the NFC netlink operations. The race that cause that UAF can be shown as below: (FREE) | (USE) nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | nci_unregister_device | nfc_get_device nfc_unregister_device | nfc_dev_up rfkill_destory | device_del | rfkill_blocked ... | ... The root cause for this race is concluded below: 1. The rfkill_blocked (USE) in nfc_dev_up is supposed to be placed after the device_is_registered check. 2. Since the netlink operations are possible just after the device_add in nfc_register_device, the nfc_dev_up() can happen anywhere during the rfkill creation process, which leads to data race. This patch reorder these actions to permit 1. Once device_del is finished, the nfc_dev_up cannot dereference the rfkill object. 2. The rfkill_register need to be placed after the device_add of nfc_dev because the parent device need to be created first. So this patch keeps the order but inject device_lock to prevent the data race. Signed-off-by: NLin Ma <linma@zju.edu.cn> Fixes: be055b2f ("NFC: RFKILL support") Reviewed-by: NJakub Kicinski <kuba@kernel.org> Reviewed-by: NKrzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211116152652.19217-1-linma@zju.edu.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Lin Ma 提交于
stable inclusion from linux-4.19.218 commit 62be2b1e7914b7340281f09412a7bbb62e6c8b67 CVE: CVE-2021-4202 -------------------------------- [ Upstream commit 86cdf8e3 ] There is a possible data race as shown below: thread-A in nci_request() | thread-B in nci_close_device() | mutex_lock(&ndev->req_lock); test_bit(NCI_UP, &ndev->flags); | ... | test_and_clear_bit(NCI_UP, &ndev->flags) mutex_lock(&ndev->req_lock); | | This race will allow __nci_request() to be awaked while the device is getting removed. Similar to commit e2cb6b89 ("bluetooth: eliminate the potential race condition when removing the HCI controller"). this patch alters the function sequence in nci_request() to prevent the data races between the nci_close_device(). Signed-off-by: NLin Ma <linma@zju.edu.cn> Fixes: 6a2968aa ("NFC: basic NCI protocol implementation") Link: https://lore.kernel.org/r/20211115145600.8320-1-linma@zju.edu.cnSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 13 1月, 2022 3 次提交
-
-
由 Ye Bin 提交于
mainline inclusion from mainline-v5.17 commit ce85548ab4295234b4f8e63a0eea0c157d2f6b25 category: bugfix bugzilla: 185930 CVE: NA ----------------------------------------------- We got issue as follows when run syzkaller: [ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user [ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only [ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0' [ 167.983601] ------------[ cut here ]------------ [ 167.984245] kernel BUG at fs/ext4/inode.c:847! [ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI [ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123 [ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504 [ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244 [ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282 [ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000 [ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66 [ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09 [ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8 [ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001 [ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000 [ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0 [ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 168.001899] Call Trace: [ 168.002235] <TASK> [ 168.007167] ext4_bread+0xd/0x53 [ 168.007612] ext4_quota_write+0x20c/0x5c0 [ 168.010457] write_blk+0x100/0x220 [ 168.010944] remove_free_dqentry+0x1c6/0x440 [ 168.011525] free_dqentry.isra.0+0x565/0x830 [ 168.012133] remove_tree+0x318/0x6d0 [ 168.014744] remove_tree+0x1eb/0x6d0 [ 168.017346] remove_tree+0x1eb/0x6d0 [ 168.019969] remove_tree+0x1eb/0x6d0 [ 168.022128] qtree_release_dquot+0x291/0x340 [ 168.023297] v2_release_dquot+0xce/0x120 [ 168.023847] dquot_release+0x197/0x3e0 [ 168.024358] ext4_release_dquot+0x22a/0x2d0 [ 168.024932] dqput.part.0+0x1c9/0x900 [ 168.025430] __dquot_drop+0x120/0x190 [ 168.025942] ext4_clear_inode+0x86/0x220 [ 168.026472] ext4_evict_inode+0x9e8/0xa22 [ 168.028200] evict+0x29e/0x4f0 [ 168.028625] dispose_list+0x102/0x1f0 [ 168.029148] evict_inodes+0x2c1/0x3e0 [ 168.030188] generic_shutdown_super+0xa4/0x3b0 [ 168.030817] kill_block_super+0x95/0xd0 [ 168.031360] deactivate_locked_super+0x85/0xd0 [ 168.031977] cleanup_mnt+0x2bc/0x480 [ 168.033062] task_work_run+0xd1/0x170 [ 168.033565] do_exit+0xa4f/0x2b50 [ 168.037155] do_group_exit+0xef/0x2d0 [ 168.037666] __x64_sys_exit_group+0x3a/0x50 [ 168.038237] do_syscall_64+0x3b/0x90 [ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae In order to reproduce this problem, the following conditions need to be met: 1. Ext4 filesystem with no journal; 2. Filesystem image with incorrect quota data; 3. Abort filesystem forced by user; 4. umount filesystem; As in ext4_quota_write: ... if (EXT4_SB(sb)->s_journal && !handle) { ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" " cancelled because transaction is not started", (unsigned long long)off, (unsigned long long)len); return -EIO; } ... We only check handle if NULL when filesystem has journal. There is need check handle if NULL even when filesystem has no journal. Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Thomas Zeitlhofer 提交于
stable inclusion from linux-v4.19.219 commit 68945e943519df1532e598fafab16ac54488933f --------------------------------------------------- [ Upstream commit cefcf24b ] Commit 39fbef4b ("PM: hibernate: Get block device exclusively in swsusp_check()") changed the opening mode of the block device to (FMODE_READ | FMODE_EXCL). In the corresponding calls to swsusp_close(), the mode is still just FMODE_READ which triggers the warning in blkdev_flush_mapping() on resume from hibernate. So, use the mode (FMODE_READ | FMODE_EXCL) also when closing the device. Fixes: 39fbef4b ("PM: hibernate: Get block device exclusively in swsusp_check()") Signed-off-by: NThomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYe Bin <yebin10@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Yang Yingliang 提交于
hulk inclusion category: bugfix bugzilla: 173968, https://gitee.com/openeuler/kernel/issues/I3J87Y CVE: NA --------------------------- This reverts commit b2e484e9. When CONFIG_LOCKDEP and CONFIG_DEBUG_LOCKDEP are enabled, it detects the following error: [ 10.145007] BUG: sleeping function called from invalid context at mm/slab.h:418 [ 10.145394] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0 [ 10.145765] Preemption disabled at: [ 10.145978] [<ffff000008f8e7b4>] hardlockup_detector_perf_init+0x20/0x100 [ 10.146770] CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.90+ #3 [ 10.148242] Hardware name: linux,dummy-virt (DT) [ 10.148572] Call trace: [ 10.148667] dump_backtrace+0x0/0x190 [ 10.148765] show_stack+0x24/0x30 [ 10.148875] dump_stack+0xa4/0xf8 [ 10.148964] ___might_sleep+0x150/0x180 [ 10.149065] __might_sleep+0x58/0x90 [ 10.149199] kmem_cache_alloc_trace+0x244/0x2b0 [ 10.149308] perf_event_alloc+0x74/0x680 [ 10.149402] perf_event_create_kernel_counter+0x2c/0x190 [ 10.149516] arch_probe_cpu_freq+0x84/0x1ac [ 10.149611] hw_nmi_get_sample_period+0xb8/0x180 [ 10.149713] hardlockup_detector_event_create+0x28/0xfc [ 10.149827] hardlockup_detector_perf_init+0x24/0x100 [ 10.149943] watchdog_nmi_probe+0x14/0x1c [ 10.150037] lockup_detector_init+0x58/0x98 [ 10.150173] kernel_init_freeable+0x10c/0x1c4 [ 10.150298] kernel_init+0x18/0x110 [ 10.150422] ret_from_fork+0x10/0x18 In 'b2e484e9 ("watchdog: Fix check_preemption_disabled() error")', we tried to fix check_preemption_disabled() error by disabling preemption in hardlockup_detector_perf_init(), but missed that function perf_event_create_kernel_counter() may sleep. The preemption is always disabled, the problem that wanted be fixed is not existed, so just revert this commit. Fixes: b2e484e9 ("watchdog: Fix check_preemption_disabled() error") Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
- 06 1月, 2022 1 次提交
-
-
由 Xingang Wang 提交于
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2 CVE: NA --------------------------------------------------- [ 0.596145] BUG: KASAN: global-out-of-bounds in __of_match_node.part.0+0xe0/0x110 [ 0.596731] Read of size 1 at addr ffff2000099a8288 by task swapper/0/1 [ 0.597247] [ 0.597372] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.90+ #34 [ 0.597858] Hardware name: linux,dummy-virt (DT) [ 0.598243] Call trace: [ 0.598443] dump_backtrace+0x0/0x360 [ 0.598734] show_stack+0x24/0x30 [ 0.599004] dump_stack+0xdc/0x128 [ 0.599323] print_address_description+0x184/0x278 [ 0.599771] kasan_report+0x204/0x330 [ 0.600117] __asan_report_load1_noabort+0x30/0x40 [ 0.600566] __of_match_node.part.0+0xe0/0x110 [ 0.600980] of_match_node+0x6c/0xa8 [ 0.601316] of_match_device+0x48/0x70 [ 0.601669] platform_match+0xa4/0x260 [ 0.602037] __driver_attach+0x68/0x128 [ 0.602397] bus_for_each_dev+0x118/0x198 [ 0.602773] driver_attach+0x48/0x60 [ 0.603112] bus_add_driver+0x330/0x658 [ 0.603472] driver_register+0x148/0x398 [ 0.603839] __platform_driver_register+0xd4/0x108 [ 0.604288] arm_mpam_driver_init+0x64/0x78 [ 0.604680] do_one_initcall+0xbc/0x488 [ 0.605039] kernel_init_freeable+0x604/0x6f8 [ 0.605447] kernel_init+0x18/0x130 [ 0.605775] ret_from_fork+0x10/0x18 [ 0.606130] [ 0.606274] The buggy address belongs to the variable: [ 0.606754] arm_mpam_of_device_ids+0xc8/0x380 [ 0.607168] [ 0.607314] Memory state around the buggy address: [ 0.607762] ffff2000099a8180: 00 00 00 fa fa fa fa fa 00 00 00 00 00 00 00 00 [ 0.608429] ffff2000099a8200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.609095] >ffff2000099a8280: 00 fa fa fa fa fa fa fa 05 fa fa fa fa fa fa fa [ 0.609760] ^ [ 0.610101] ffff2000099a8300: 00 00 07 fa fa fa fa fa 00 04 fa fa fa fa fa fa [ 0.610771] ffff2000099a8380: 00 00 00 06 fa fa fa fa 00 01 fa fa fa fa fa fa The arm_mpam_of_device_ids array has no end item, so the array access might be out of bounds. When enable the KASAN config, the out of bounds call trace occured. The add empty end item for arm_mpam_of_device_ids array to fix this issue. Fixes: b45bdb5a ("arm64/mpam: add device tree support for mpam initialization") Signed-off-by: NXingang Wang <wangxingang5@huawei.com> Reviewed-by: NCheng Jian <cj.chengjian@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-