- 09 3月, 2022 18 次提交
-
-
由 Miklos Szeredi 提交于
mainline inclusion from mainline-v5.11-rc1 commit 10c52c84 category: perf bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIR8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10c52c84e3f4872689a64ac7666b34d67e630691 -------------------------------- Kernel has: ATTR_KILL_PRIV -> clear "security.capability" ATTR_KILL_SUID -> clear S_ISUID ATTR_KILL_SGID -> clear S_ISGID if executable Fuse has: FUSE_WRITE_KILL_PRIV -> clear S_ISUID and S_ISGID if executable So FUSE_WRITE_KILL_PRIV implies the complement of ATTR_KILL_PRIV, which is somewhat confusing. Also PRIV implies all privileges, including "security.capability". Change the name to FUSE_WRITE_KILL_SUIDGID and make FUSE_WRITE_KILL_PRIV an alias to perserve API compatibility Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Vivek Goyal 提交于
mainline inclusion from mainline-v5.11-rc1 commit 63f9909f category: perf bugzilla: https://gitee.com/openeuler/kernel/issues/I4SIR8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=63f9909ff602082597849f684655e93336c50b11 -------------------------------- We already have FUSE_HANDLE_KILLPRIV flag that says that file server will remove suid/sgid/caps on truncate/chown/write. But that's little different from what Linux VFS implements. To be consistent with Linux VFS behavior what we want is. - caps are always cleared on chown/write/truncate - suid is always cleared on chown, while for truncate/write it is cleared only if caller does not have CAP_FSETID. - sgid is always cleared on chown, while for truncate/write it is cleared only if caller does not have CAP_FSETID as well as file has group execute permission. As previous flag did not provide above semantics. Implement a V2 of the protocol with above said constraints. Server does not know if caller has CAP_FSETID or not. So for the case of write()/truncate(), client will send information in special flag to indicate whether to kill priviliges or not. These changes are in subsequent patches. FUSE_HANDLE_KILLPRIV_V2 relies on WRITE being sent to server to clear suid/sgid/security.capability. But with ->writeback_cache, WRITES are cached in guest. So it is not recommended to use FUSE_HANDLE_KILLPRIV_V2 and writeback_cache together. Though it probably might be good enough for lot of use cases. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Brian Foster 提交于
mainline inclusion from mainline-v5.13-rc4 commit e53d3aa0 category: bugfix bugzilla: 185862 https://gitee.com/openeuler/kernel/issues/I4KIAO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e53d3aa0b605c49d780e1b2fd0b49dba4154f32b -------------------------------- This code goes back to a time when transaction commits wrote directly to iclogs. The associated log items were pinned, written to the log, and then "uncommitted" if some part of the log write had failed. This uncommit sequence called an ->iop_unpin_remove() handler that was eventually folded into ->iop_unpin() via the remove parameter. The log subsystem has since changed significantly in that transactions commit to the CIL instead of direct to iclogs, though log items must still be aborted in the event of an eventual log I/O error. However, the context for a log item abort is now asynchronous from transaction commit, which means the committing transaction has been freed by this point in time and the transaction uncommit sequence of events is no longer relevant. Further, since stale buffers remain locked at transaction commit through unpin, we can be certain that the buffer is not associated with any transaction when the unpin callback executes. Remove this unused hunk of code and replace it with an assertion that the buffer is disassociated from transaction context. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com> Reviewed-by: NLihong Kou <koulihong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Brian Foster 提交于
mainline inclusion from mainline-v5.13-rc4 commit 84d8949e category: bugfix bugzilla: 185862 https://gitee.com/openeuler/kernel/issues/I4KIAO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=84d8949e770745b16a7e8a68dcb1d0f3687bdee9 -------------------------------- The special processing used to simulate a buffer I/O failure on fs shutdown has a difficult to reproduce race that can result in a use after free of the associated buffer. Consider a buffer that has been committed to the on-disk log and thus is AIL resident. The buffer lands on the writeback delwri queue, but is subsequently locked, committed and pinned by another transaction before submitted for I/O. At this point, the buffer is stuck on the delwri queue as it cannot be submitted for I/O until it is unpinned. A log checkpoint I/O failure occurs sometime later, which aborts the bli. The unpin handler is called with the aborted log item, drops the bli reference count, the pin count, and falls into the I/O failure simulation path. The potential problem here is that once the pin count falls to zero in ->iop_unpin(), xfsaild is free to retry delwri submission of the buffer at any time, before the unpin handler even completes. If delwri queue submission wins the race to the buffer lock, it observes the shutdown state and simulates the I/O failure itself. This releases both the bli and delwri queue holds and frees the buffer while xfs_buf_item_unpin() sits on xfs_buf_lock() waiting to run through the same failure sequence. This problem is rare and requires many iterations of fstest generic/019 (which simulates disk I/O failures) to reproduce. To avoid this problem, grab a hold on the buffer before the log item is unpinned if the associated item has been aborted and will require a simulated I/O failure. The hold is already required for the simulated I/O failure, so the ordering simply guarantees the unpin handler access to the buffer before it is unpinned and thus processed by the AIL. This particular ordering is required so long as the AIL does not acquire a reference on the bli, which is the long term solution to this problem. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com> Reviewed-by: NLihong Kou <koulihong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Darrick J. Wong 提交于
mainline inclusion from mainline-v5.11-rc4 commit 6da1b4b1 category: bugfix bugzilla: 185867 https://gitee.com/openeuler/kernel/issues/I4KIAO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6da1b4b1ab36d80a3994fd4811c8381de10af604 -------------------------------- When overlayfs is running on top of xfs and the user unlinks a file in the overlay, overlayfs will create a whiteout inode and ask xfs to "rename" the whiteout file atop the one being unlinked. If the file being unlinked loses its one nlink, we then have to put the inode on the unlinked list. This requires us to grab the AGI buffer of the whiteout inode to take it off the unlinked list (which is where whiteouts are created) and to grab the AGI buffer of the file being deleted. If the whiteout was created in a higher numbered AG than the file being deleted, we'll lock the AGIs in the wrong order and deadlock. Therefore, grab all the AGI locks we think we'll need ahead of time, and in order of increasing AG number per the locking rules. Reported-by: Nwenli xie <wlxie7296@gmail.com> Fixes: 93597ae8 ("xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()") Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com> Reviewed-by: NLihong Kou <koulihong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yang Yingliang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4VSGH CVE: NA -------------------------------- This reverts commit c6d2a109. I got the following messages when booting kernel: EFI stub: Booting Linux Kernel... EFI stub: EFI_RNG_PROTOCOL unavailable, KASLR will be disabled EFI stub: Using DTB from configuration table EFI stub: Exiting boot services and installing virtual address map... ... [ 0.000000] CPU features: kernel page table isolation forced ON by KASLR [ 0.000000] CPU features: detected: Kernel page table isolation (KPTI) [ 3.393380] KASLR disabled due to lack of seed KPTI is forced on by KASLR, but in fact KASLR is not enabled, it's because kaslr_offset() returns non-zero in kaslr_requires_kpti(). To avoid this problem, when efi kaslr is disabled, make image MIN_KIMG_ALIGN align which is used to get KASLR offset in primary_entry(), so kaslr_offset() will returns 0. Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kai Ye 提交于
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ ---------------------------------------------------------------------- Due to that extra page addr is used as a qp error flag when the device resetting. So it not should to clear this qp flag in userspace. Signed-off-by: NKai Ye <yekai13@huawei.com> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ ---------------------------------------------------------------------- If the device master ooo is blocked, there is no need to empty the queue. Only the PF can obtain the status of the device. If the VF runs on the host, the device status can be obtained by PF. Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ ---------------------------------------------------------------------- Currently, the memory of the queue's sqe is freed when the driver is removed, not the put queue. Therefore, it is only necessary to write back the data in the hardware cache to memory before removing the driver. Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
mainline inclusion from mainline-v5.17-rc1 commit 696645d2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ CVE: NA -------------------------------- If the hardware reports the 'CQ' overflow or 'CQE' error by the abnormal interrupt, disable the queue and stop tasks send to hardware. Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
mainline inclusion from mainline-v5.17-rc1 commit 95f0b6d5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ CVE: NA -------------------------------- If the hardware reports the event queue overflow by the abnormal interrupt, the driver needs to reset the function and re-enable the event queue interrupt and abnormal interrupt. Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
mainline inclusion from mainline-v5.17-rc1 commit a0a9486b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ CVE: NA -------------------------------- The abnormal interrupt method needs to be changed, and the changed method needs to be locked in order to maintain atomicity. Therefore, replace request_irq() with request_threaded_irq(). Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
mainline inclusion from mainline-v5.17-rc1 commit 145dcedd category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ CVE: NA -------------------------------- After processing an interrupt event and the interrupt function is enabled by writing the QM_DOORBELL_CMD_AEQ register, the hardware may generate new interrupt events due to processing other user's task when the subsequent interrupt events have not been processed. The new interrupt event will disrupt the current normal processing flow and cause other problems. Therefore, the operation of writing the QM_DOORBELL_CMD_AEQ doorbell register needs to be placed after all interrupt events processing are completed. Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
mainline inclusion from mainline-v5.17-rc1 commit 9ee401ea category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ CVE: NA -------------------------------- This patch does not change any code, just code movement. Preparing for next patch. Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Weili Qian 提交于
mainline inclusion from mainline-v5.17-rc1 commit f123e66d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3OQ CVE: NA -------------------------------- The internal memory of the device needs to be reset only when the device is globally initialized. Other scenarios, such as function reset, do not need to perform reset. Signed-off-by: NWeili Qian <qianweili@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yang Shen 提交于
mainline inclusion from mainline-v5.16-rc1 commit fc6c01f0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W3EG CVE: NA -------------------------------- When remove the driver and executing the task occur at the same time, the following deadlock will be triggered: Chain exists of: sva_lock --> uacce_mutex --> &qm->qps_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&qm->qps_lock); lock(uacce_mutex); lock(&qm->qps_lock); lock(sva_lock); And the lock 'qps_lock' is used to protect qp. Therefore, it's reasonable cycle is to continue until the qp memory is released. So move the release lock infront of 'uacce_remove'. Signed-off-by: NYang Shen <shenyang39@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kai Ye 提交于
mainline inclusion from mainline-crypto-master commit e764d81d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WC3O CVE: NA -------------------------------- Modify the print of information that might lead to user misunderstanding. Currently only XTS mode need the fallback tfm when using 192bit key. Others algs not need soft fallback tfm. So others algs can return directly. Signed-off-by: NKai Ye <yekai13@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kai Ye 提交于
mainline inclusion from mainline-crypto-master commit 0a2a464f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4W4WU CVE: NA -------------------------------- Due to the subreq pointer misuse the private context memory. The aead soft crypto occasionally casues the OS panic as setting the 64K page. Here is fix it. Fixes: 6c46a329 ("crypto: hisilicon/sec - add fallback tfm...") Signed-off-by: NKai Ye <yekai13@huawei.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NYang Shen <shenyang39@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 08 3月, 2022 22 次提交
-
-
由 Laibin Qiu 提交于
hulk inclusion category: bugfix bugzilla: 185779, https://gitee.com/openeuler/kernel/issues/I4WFIY CVE: NA ------------------------------------------------- 1.In current process, all bio will set the BIO_THROTTLED flag after __blk_throtl_bio(). 2.If bio needs to be throttled, it will start the timer and stop submit bio directly. Bio will submit in blk_throtl_dispatch_work_fn() when the timer expires.But in the current process, if bio is throttled. The BIO_THROTTLED will be set to bio after timer start. If the bio has been completed, it may cause use-after-free blow. BUG: KASAN: use-after-free in blk_throtl_bio+0x12f0/0x2c70 Read of size 2 at addr ffff88801b8902d4 by task fio/26380 dump_stack+0x9b/0xce print_address_description.constprop.6+0x3e/0x60 kasan_report.cold.9+0x22/0x3a blk_throtl_bio+0x12f0/0x2c70 submit_bio_checks+0x701/0x1550 submit_bio_noacct+0x83/0xc80 submit_bio+0xa7/0x330 mpage_readahead+0x380/0x500 read_pages+0x1c1/0xbf0 page_cache_ra_unbounded+0x471/0x6f0 do_page_cache_ra+0xda/0x110 ondemand_readahead+0x442/0xae0 page_cache_async_ra+0x210/0x300 generic_file_buffered_read+0x4d9/0x2130 generic_file_read_iter+0x315/0x490 blkdev_read_iter+0x113/0x1b0 aio_read+0x2ad/0x450 io_submit_one+0xc8e/0x1d60 __se_sys_io_submit+0x125/0x350 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Allocated by task 26380: kasan_save_stack+0x19/0x40 __kasan_kmalloc.constprop.2+0xc1/0xd0 kmem_cache_alloc+0x146/0x440 mempool_alloc+0x125/0x2f0 bio_alloc_bioset+0x353/0x590 mpage_alloc+0x3b/0x240 do_mpage_readpage+0xddf/0x1ef0 mpage_readahead+0x264/0x500 read_pages+0x1c1/0xbf0 page_cache_ra_unbounded+0x471/0x6f0 do_page_cache_ra+0xda/0x110 ondemand_readahead+0x442/0xae0 page_cache_async_ra+0x210/0x300 generic_file_buffered_read+0x4d9/0x2130 generic_file_read_iter+0x315/0x490 blkdev_read_iter+0x113/0x1b0 aio_read+0x2ad/0x450 io_submit_one+0xc8e/0x1d60 __se_sys_io_submit+0x125/0x350 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 0: kasan_save_stack+0x19/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x1b/0x30 __kasan_slab_free+0x111/0x160 kmem_cache_free+0x94/0x460 mempool_free+0xd6/0x320 bio_free+0xe0/0x130 bio_put+0xab/0xe0 bio_endio+0x3a6/0x5d0 blk_update_request+0x590/0x1370 scsi_end_request+0x7d/0x400 scsi_io_completion+0x1aa/0xe50 scsi_softirq_done+0x11b/0x240 blk_mq_complete_request+0xd4/0x120 scsi_mq_done+0xf0/0x200 virtscsi_vq_done+0xbc/0x150 vring_interrupt+0x179/0x390 __handle_irq_event_percpu+0xf7/0x490 handle_irq_event_percpu+0x7b/0x160 handle_irq_event+0xcc/0x170 handle_edge_irq+0x215/0xb20 common_interrupt+0x60/0x120 asm_common_interrupt+0x1e/0x40 Fix this by move BIO_THROTTLED set into the queue_lock. Signed-off-by: NLaibin Qiu <qiulaibin@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v5.17-rc1 commit 37c8d480 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37c8d4807d1b8b521b30310dce97f6695dc2c2c6 -------------------------------- Add two tests, one which asserts that ring buffer memory can be passed to other helpers for populating its entry area, and another one where verifier rejects different type of memory passed to bpf_ringbuf_submit(). Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit 44bab87d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=44bab87d8ca6f0544a9f8fc97bdf33aa5b3c899e -------------------------------- The second parameter of bpf_d_path() can only accept writable memories. Rdonly_mem obtained from bpf_per_cpu_ptr() can not be passed into bpf_d_path for modification. This patch adds a selftest to verify this behavior. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAndrii Nakryiko <andrii@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220106205525.2116218-1-haoluo@google.comSigned-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v5.17-rc1 commit 722e4db3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=722e4db3ae0d52b2e3801280afbe19cf2d188e91 -------------------------------- Assert that the verifier is rejecting invalid offsets on the ringbuf entries: # ./test_verifier | grep ring #947/u ringbuf: invalid reservation offset 1 OK #947/p ringbuf: invalid reservation offset 1 OK #948/u ringbuf: invalid reservation offset 2 OK #948/p ringbuf: invalid reservation offset 2 OK Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Gilad Reti 提交于
mainline inclusion from mainline-v5.11-rc5 commit 4237e9f4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4237e9f4a96228ccc8a7abe5e4b30834323cd353 -------------------------------- Add a test to check that the verifier is able to recognize spilling of PTR_TO_MEM registers, by reserving a ringbuf buffer, forcing the spill of a pointer holding the buffer address to the stack, filling it back in from the stack and writing to the memory area pointed by it. The patch was partially contributed by CyberArk Software, Inc. Signed-off-by: NGilad Reti <gilad.reti@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NYonghong Song <yhs@fb.com> Acked-by: NKP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20210113053810.13518-2-gilad.reti@gmail.comSigned-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v5.17-rc1 commit a672b2e3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a672b2e36a648afb04ad3bda93b6bda947a479a5 -------------------------------- The bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM in their bpf_func_proto definition as their first argument, and thus both expect the result from a prior bpf_ringbuf_reserve() call which has a return type of RET_PTR_TO_ALLOC_MEM_OR_NULL. While the non-NULL memory from bpf_ringbuf_reserve() can be passed to other helpers, the two sinks (bpf_ringbuf_submit(), bpf_ringbuf_discard()) right now only enforce a register type of PTR_TO_MEM. This can lead to potential type confusion since it would allow other PTR_TO_MEM memory to be passed into the two sinks which did not come from bpf_ringbuf_reserve(). Add a new MEM_ALLOC composable type attribute for PTR_TO_MEM, and enforce that: - bpf_ringbuf_reserve() returns NULL or PTR_TO_MEM | MEM_ALLOC - bpf_ringbuf_submit() and bpf_ringbuf_discard() only take PTR_TO_MEM | MEM_ALLOC but not plain PTR_TO_MEM arguments via ARG_PTR_TO_ALLOC_MEM - however, other helpers might treat PTR_TO_MEM | MEM_ALLOC as plain PTR_TO_MEM to populate the memory area when they use ARG_PTR_TO_{UNINIT_,}MEM in their func proto description Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it") Reported-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v5.17-rc1 commit 64620e0a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64620e0a1e712a778095bd35cbb277dc2259281f -------------------------------- Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM in their bpf_func_proto definition as their first argument. They both expect the result from a prior bpf_ringbuf_reserve() call which has a return type of RET_PTR_TO_ALLOC_MEM_OR_NULL. Meaning, after a NULL check in the code, the verifier will promote the register type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the latter could have an offset. The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non- zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit() or bpf_ringbuf_discard() but with the original offset given it will then read out the struct bpf_ringbuf_hdr mapping. The verifier missed to enforce a zero offset, so that out of bounds access can be triggered which could be used to escalate privileges if unprivileged BPF was enabled (disabled by default in kernel). Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it") Reported-by: <tr3e.wang@gmail.com> (SecCoder Security Lab) Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v5.17-rc1 commit 6788ab23 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6788ab23508bddb0a9d88e104284922cb2c22b77 -------------------------------- Right now the assertion on check_ptr_off_reg() is only enforced for register types PTR_TO_CTX (and open coded also for PTR_TO_BTF_ID), however, this is insufficient since many other PTR_TO_* register types such as PTR_TO_FUNC do not handle/expect register offsets when passed to helper functions. Given this can slip-through easily when adding new types, make this an explicit allow-list and reject all other current and future types by default if this is encountered. Also, extend check_ptr_off_reg() to handle PTR_TO_BTF_ID as well instead of duplicating it. For PTR_TO_BTF_ID, reg->off is used for BTF to match expected BTF ids if struct offset is used. This part still needs to be allowed, but the dynamic off from the tnum must be rejected. Fixes: 69c087ba ("bpf: Add bpf_for_each_map_elem() helper") Fixes: eaa6bcb7 ("bpf: Introduce bpf_per_cpu_ptr()") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v5.17-rc1 commit d400a6cf category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d400a6cf1c8a57cdf10f35220ead3284320d85ff -------------------------------- Similar as with other pointer types where we use ldimm64, clear the register content to zero first, and then populate the PTR_TO_FUNC type and subprogno number. Currently this is not done, and leads to reuse of stale register tracking data. Given for special ldimm64 cases we always clear the register offset, make it common for all cases, so it won't be forgotten in future. Fixes: 69c087ba ("bpf: Add bpf_for_each_map_elem() helper") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v5.17-rc1 commit be80a1d3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WT90 CVE: CVE-2021-4204 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=be80a1d3f9dbe5aee79a325964f7037fe2d92f30 -------------------------------- Generalize the check_ctx_reg() helper function into a more generic named one so that it can be reused for other register types as well to check whether their offset is non-zero. No functional change. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Conflicts: include/linux/bpf_verifier.h kernel/bpf/btf.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit 9497c458 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9497c458c10b049438ef6e6ddda898edbc3ec6a8 -------------------------------- This test verifies that a ksym of non-struct can not be directly updated. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-10-haoluo@google.com Conflicts: tools/testing/selftests/bpf/prog_tests/ksyms_btf.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit 216e3cd2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=216e3cd2f28dbbf1fe86848e0e29e6693b9f0a20 -------------------------------- Some helper functions may modify its arguments, for example, bpf_d_path, bpf_get_stack etc. Previously, their argument types were marked as ARG_PTR_TO_MEM, which is compatible with read-only mem types, such as PTR_TO_RDONLY_BUF. Therefore it's legitimate, but technically incorrect, to modify a read-only memory by passing it into one of such helper functions. This patch tags the bpf_args compatible with immutable memory with MEM_RDONLY flag. The arguments that don't have this flag will be only compatible with mutable memory types, preventing the helper from modifying a read-only memory. The bpf_args that have MEM_RDONLY are compatible with both mutable memory and immutable memory. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-9-haoluo@google.com Conflicts: kernel/bpf/btf.c kernel/bpf/helpers.c kernel/bpf/syscall.c kernel/trace/bpf_trace.c net/core/filter.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit 34d3a78c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=34d3a78c681e8e7844b43d1a2f4671a04249c821 -------------------------------- Tag the return type of {per, this}_cpu_ptr with RDONLY_MEM. The returned value of this pair of helpers is kernel object, which can not be updated by bpf programs. Previously these two helpers return PTR_OT_MEM for kernel objects of scalar type, which allows one to directly modify the memory. Now with RDONLY_MEM tagging, the verifier will reject programs that write into RDONLY_MEM. Fixes: 63d9b80d ("bpf: Introducte bpf_this_cpu_ptr()") Fixes: eaa6bcb7 ("bpf: Introduce bpf_per_cpu_ptr()") Fixes: 4976b718 ("bpf: Introduce pseudo_btf_id") Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-8-haoluo@google.comSigned-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit cf9f2f8d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cf9f2f8d62eca810afbd1ee6cc0800202b000e57 -------------------------------- Remove PTR_TO_MEM_OR_NULL and replace it with PTR_TO_MEM combined with flag PTR_MAYBE_NULL. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-7-haoluo@google.com Conflicts: kernel/bpf/btf.c kernel/bpf/verifier.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit 20b2aff4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=20b2aff4bc15bda809f994761d5719827d66c0b4 -------------------------------- This patch introduce a flag MEM_RDONLY to tag a reg value pointing to read-only memory. It makes the following changes: 1. PTR_TO_RDWR_BUF -> PTR_TO_BUF 2. PTR_TO_RDONLY_BUF -> PTR_TO_BUF | MEM_RDONLY Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-6-haoluo@google.com Conflicts: kernel/bpf/verifier.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit c25b2ae1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c25b2ae136039ffa820c26138ed4a5e5f3ab3841 -------------------------------- We have introduced a new type to make bpf_reg composable, by allocating bits in the type to represent flags. One of the flags is PTR_MAYBE_NULL which indicates a pointer may be NULL. This patch switches the qualified reg_types to use this flag. The reg_types changed in this patch include: 1. PTR_TO_MAP_VALUE_OR_NULL 2. PTR_TO_SOCKET_OR_NULL 3. PTR_TO_SOCK_COMMON_OR_NULL 4. PTR_TO_TCP_SOCK_OR_NULL 5. PTR_TO_BTF_ID_OR_NULL 6. PTR_TO_MEM_OR_NULL 7. PTR_TO_RDONLY_BUF_OR_NULL 8. PTR_TO_RDWR_BUF_OR_NULL Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211217003152.48334-5-haoluo@google.com Conflicts: include/linux/bpf.h include/linux/bpf_verifier.h kernel/bpf/verifier.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit 3c480732 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c4807322660d4290ac9062c034aed6b87243861 -------------------------------- We have introduced a new type to make bpf_ret composable, by reserving high bits to represent flags. One of the flag is PTR_MAYBE_NULL, which indicates a pointer may be NULL. When applying this flag to ret_types, it means the returned value could be a NULL pointer. This patch switches the qualified arg_types to use this flag. The ret_types changed in this patch include: 1. RET_PTR_TO_MAP_VALUE_OR_NULL 2. RET_PTR_TO_SOCKET_OR_NULL 3. RET_PTR_TO_TCP_SOCK_OR_NULL 4. RET_PTR_TO_SOCK_COMMON_OR_NULL 5. RET_PTR_TO_ALLOC_MEM_OR_NULL 6. RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL 7. RET_PTR_TO_BTF_ID_OR_NULL This patch doesn't eliminate the use of these names, instead it makes them aliases to 'RET_PTR_TO_XXX | PTR_MAYBE_NULL'. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-4-haoluo@google.com Conflicts: kernel/bpf/verifier.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit 48946bd6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=48946bd6a5d695c50b34546864b79c1f910a33c1 -------------------------------- We have introduced a new type to make bpf_arg composable, by reserving high bits of bpf_arg to represent flags of a type. One of the flags is PTR_MAYBE_NULL which indicates a pointer may be NULL. When applying this flag to an arg_type, it means the arg can take NULL pointer. This patch switches the qualified arg_types to use this flag. The arg_types changed in this patch include: 1. ARG_PTR_TO_MAP_VALUE_OR_NULL 2. ARG_PTR_TO_MEM_OR_NULL 3. ARG_PTR_TO_CTX_OR_NULL 4. ARG_PTR_TO_SOCKET_OR_NULL 5. ARG_PTR_TO_ALLOC_MEM_OR_NULL 6. ARG_PTR_TO_STACK_OR_NULL This patch does not eliminate the use of these arg_types, instead it makes them an alias to the 'ARG_XXX | PTR_MAYBE_NULL'. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-3-haoluo@google.com Conflicts: include/linux/bpf.h kernel/bpf/verifier.c Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Luo 提交于
mainline inclusion from mainline-v5.17-rc1 commit d639b9d1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV CVE: CVE-2022-0500 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d639b9d13a39cf15639cbe6e8b2c43eb60148a73 -------------------------------- There are some common properties shared between bpf reg, ret and arg values. For instance, a value may be a NULL pointer, or a pointer to a read-only memory. Previously, to express these properties, enumeration was used. For example, in order to test whether a reg value can be NULL, reg_type_may_be_null() simply enumerates all types that are possibly NULL. The problem of this approach is that it's not scalable and causes a lot of duplication. These properties can be combined, for example, a type could be either MAYBE_NULL or RDONLY, or both. This patch series rewrites the layout of reg_type, arg_type and ret_type, so that common properties can be extracted and represented as composable flag. For example, one can write ARG_PTR_TO_MEM | PTR_MAYBE_NULL which is equivalent to the previous ARG_PTR_TO_MEM_OR_NULL The type ARG_PTR_TO_MEM are called "base type" in this patch. Base types can be extended with flags. A flag occupies the higher bits while base types sits in the lower bits. This patch in particular sets up a set of macro for this purpose. The following patches will rewrite arg_types, ret_types and reg_types respectively. Signed-off-by: NHao Luo <haoluo@google.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-2-haoluo@google.com Conflicts: include/linux/bpf.h include/linux/bpf_verifier.h Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Daniel Borkmann 提交于
stable inclusion from stable-v5.10.92 commit 35ab8c9085b0af847df7fac9571ccd26d9f0f513 bugzilla: https://gitee.com/openeuler/kernel/issues/I4WRPV Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=35ab8c9085b0af847df7fac9571ccd26d9f0f513 -------------------------------- [ no upstream commit given implicitly fixed through the larger refactoring in c25b2ae1 ] While auditing some other code, I noticed missing checks inside the pointer arithmetic simulation, more specifically, adjust_ptr_min_max_vals(). Several *_OR_NULL types are not rejected whereas they are _required_ to be rejected given the expectation is that they get promoted into a 'real' pointer type for the success case, that is, after an explicit != NULL check. One case which stands out and is accessible from unprivileged (iff enabled given disabled by default) is BPF ring buffer. From crafting a PoC, the NULL check can be bypassed through an offset, and its id marking will then lead to promotion of mem_or_null to a mem type. bpf_ringbuf_reserve() helper can trigger this case through passing of reserved flags, for example. func#0 @0 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (7a) *(u64 *)(r10 -8) = 0 1: R1=ctx(id=0,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm 1: (18) r1 = 0x0 3: R1_w=map_ptr(id=0,off=0,ks=0,vs=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm 3: (b7) r2 = 8 4: R1_w=map_ptr(id=0,off=0,ks=0,vs=0,imm=0) R2_w=invP8 R10=fp0 fp-8_w=mmmmmmmm 4: (b7) r3 = 0 5: R1_w=map_ptr(id=0,off=0,ks=0,vs=0,imm=0) R2_w=invP8 R3_w=invP0 R10=fp0 fp-8_w=mmmmmmmm 5: (85) call bpf_ringbuf_reserve#131 6: R0_w=mem_or_null(id=2,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 6: (bf) r6 = r0 7: R0_w=mem_or_null(id=2,ref_obj_id=2,off=0,imm=0) R6_w=mem_or_null(id=2,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 7: (07) r0 += 1 8: R0_w=mem_or_null(id=2,ref_obj_id=2,off=1,imm=0) R6_w=mem_or_null(id=2,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 8: (15) if r0 == 0x0 goto pc+4 R0_w=mem(id=0,ref_obj_id=0,off=0,imm=0) R6_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 9: R0_w=mem(id=0,ref_obj_id=0,off=0,imm=0) R6_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 9: (62) *(u32 *)(r6 +0) = 0 R0_w=mem(id=0,ref_obj_id=0,off=0,imm=0) R6_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 10: R0_w=mem(id=0,ref_obj_id=0,off=0,imm=0) R6_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 10: (bf) r1 = r6 11: R0_w=mem(id=0,ref_obj_id=0,off=0,imm=0) R1_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R6_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 11: (b7) r2 = 0 12: R0_w=mem(id=0,ref_obj_id=0,off=0,imm=0) R1_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R2_w=invP0 R6_w=mem(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8_w=mmmmmmmm refs=2 12: (85) call bpf_ringbuf_submit#132 13: R6=invP(id=0) R10=fp0 fp-8=mmmmmmmm 13: (b7) r0 = 0 14: R0_w=invP0 R6=invP(id=0) R10=fp0 fp-8=mmmmmmmm 14: (95) exit from 8 to 13: safe processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0 OK All three commits, that is b121b341 ("bpf: Add PTR_TO_BTF_ID_OR_NULL support"), 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it"), and the afbf21dc ("bpf: Support readonly/readwrite buffers in verifier") suffer the same cause and their *_OR_NULL type pendants must be rejected in adjust_ptr_min_max_vals(). Make the test more robust by reusing reg_type_may_be_null() helper such that we catch all *_OR_NULL types we have today and in future. Note that pointer arithmetic on PTR_TO_BTF_ID, PTR_TO_RDONLY_BUF, and PTR_TO_RDWR_BUF is generally allowed. Fixes: b121b341 ("bpf: Add PTR_TO_BTF_ID_OR_NULL support") Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it") Fixes: afbf21dc ("bpf: Support readonly/readwrite buffers in verifier") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NPu Lehui <pulehui@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: performance bugzilla: https://gitee.com/openeuler/kernel/issues/I4S8DW --------------------------- If pending_queues is increased once, it will only be decreased when nr_active is zero, and that will lead to the under-utilization of host tags because pending_queues is non-zero and the available tags for the queue will be max(host tags / active_queues, 4) instead of the needed tags of the queue. Fix it by adding an expiration time for the increasement of pending_queues, and decrease it when it expires, so pending_queues will be decreased to zero if there is no tag allocation failure, and the available tags for the queue will be the whole host tags. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: performance bugzilla: https://gitee.com/openeuler/kernel/issues/I4S8DW --------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-