Commits · 11e2bfddce4a40a3c996f1eef7e563c03b16839c · openeuler / Kernel

26 Mar, 2022 1 commit

arm64/mpam: fix __mpam_device_create() section mismatch error · 11e2bfdd

Committed by Xingang Wang on Mar 26, 2022

ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2
CVE: NA

---------------------------------------------------

Fix modpost section mismatch errors in __mpam_device_create() and other functions.
These warnings occur with newer GCC versions, for example 10.1.0.

  [...]
  WARNING: vmlinux.o(.text+0x2ed88): Section mismatch in reference from the
  function __mpam_device_create() to the function .init.text:mpam_device_alloc()
  The function __mpam_device_create() references
  the function __init mpam_device_alloc().
  This is often because __mpam_device_create lacks a __init
  annotation or the annotation of mpam_device_alloc is wrong.

  WARNING: vmlinux.o(.text.unlikely+0xa5c): Section mismatch in reference from
  the function mpam_resctrl_init() to the function .init.text:mpam_init_padding()
  The function mpam_resctrl_init() references
  the function __init mpam_init_padding().
  This is often because mpam_resctrl_init lacks a __init
  annotation or the annotation of mpam_init_padding is wrong.

  WARNING: vmlinux.o(.text.unlikely+0x5a9c): Section mismatch in reference from
  the function resctrl_group_init() to the function .init.text:resctrl_group_setup_root()
  The function resctrl_group_init() references
  the function __init resctrl_group_setup_root().
  This is often because resctrl_group_init lacks a __init
  annotation or the annotation of resctrl_group_setup_root is wrong.
  [...]

Fixes: c5e27c39 ("arm64/mpam: remove __init macro to support driver probe")
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

24 Mar, 2022 2 commits

block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern · b253ac1c

Committed by Haimin Zhang on Mar 24, 2022

mainline inclusion
from mainline-v5.17-rc5
commit cc8f7fe1
category: bugfix
bugzilla: 186474, https://gitee.com/openeuler/kernel/issues/I4Z2LA
CVE: CVE-2022-0494

--------------------------------

Add __GFP_ZERO flag for alloc_page in function bio_copy_kern to initialize
the buffer of a bio.
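
As a rough userspace analogy (not the kernel code), the sketch below uses calloc() in the role of alloc_page() with __GFP_ZERO. The helper names are hypothetical. It shows that when a device fills only part of a zeroed bounce buffer, the untouched tail contains no stale bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a kernel bounce buffer: calloc()
 * plays the role of alloc_page(... | __GFP_ZERO).
 */
static unsigned char *bounce_alloc_zeroed(size_t len)
{
    return calloc(1, len);
}

/*
 * Simulate a device that fills only `written` bytes of a `len`-byte read
 * and count the non-zero bytes left in the untouched tail.
 */
static size_t stale_bytes_after_short_read(size_t len, size_t written)
{
    unsigned char *buf = bounce_alloc_zeroed(len);
    size_t i, stale = 0;

    assert(buf != NULL && written <= len);
    memset(buf, 0xAB, written); /* data the device really produced */
    for (i = written; i < len; i++)
        if (buf[i] != 0)
            stale++;
    free(buf);
    return stale;
}
```

With a buffer from plain malloc() the tail would be indeterminate, which is the userspace counterpart of the leak this commit closes.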
Signed-off-by: Haimin Zhang <tcs.kernel@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220216084038.15635-1-tcs.kernel@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflict: commit ce288e05 ("block: remove BLK_BOUNCE_ISA support")
is not backported.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

hugetlb: Add huge page alloced limit · c7c20ad0

Committed by Kefeng Wang on Mar 24, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4YTLN
CVE: NA

--------------------------------

The user wants to reserve a certain amount of memory for normal
non-huge pages, that is, hugetlb must not be allowed to use all of
the memory.

Add a new kernel parameter "hugepage_prohibit_sz=" to set the size
reserved for normal non-huge pages. When allocating a huge page,
fail if the new allocation would exceed the limit.
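
The limit check described above could look roughly like the sketch below. It is a simplified model, not the patch itself: the function name and the idea of passing all sizes in as plain byte counts are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check. All values use the same unit.
 *   total          : total memory
 *   prohibit       : amount reserved for normal (non-huge) pages
 *   huge_allocated : memory already handed out as huge pages
 *   huge_page_size : size of the huge page being requested
 */
static bool huge_alloc_allowed(uint64_t total, uint64_t prohibit,
                               uint64_t huge_allocated,
                               uint64_t huge_page_size)
{
    uint64_t limit;

    assert(huge_page_size > 0);
    if (prohibit >= total)
        return false;
    limit = total - prohibit;
    /* Written so that huge_allocated + huge_page_size cannot overflow. */
    return huge_allocated <= limit &&
           huge_page_size <= limit - huge_allocated;
}
```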
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Peng Liu <liupeng256@huawei.com>
Reviewed-by: Chen Wandun <chenwandun@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

23 Mar, 2022 2 commits

swiotlb: rework "fix info leak with DMA_FROM_DEVICE" · 3f80e186

Committed by Halil Pasic on Mar 23, 2022

mainline inclusion
from mainline-v5.17-rc8
commit aa6f8dcb
category: bugfix
bugzilla: 186478, https://gitee.com/openeuler/kernel/issues/I4Z86P
CVE: CVE-2022-0854

--------------------------------

Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I pointed out the mix-up and asked him for guidance).
So here we go.

The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
  must take precedence over DMA_ATTR_SKIP_CPU_SYNC

Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.

Note that if the size used with the dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.

To get this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: ddbd89de ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
	Documentation/core-api/dma-attributes.rst
	include/linux/dma-mapping.h
	kernel/dma/swiotlb.c
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

swiotlb: fix info leak with DMA_FROM_DEVICE · 04c20fc8

Committed by Halil Pasic on Mar 23, 2022

mainline inclusion
from mainline-v5.17-rc6
commit ddbd89de
category: bugfix
bugzilla: 186478, https://gitee.com/openeuler/kernel/issues/I4Z86P
CVE: CVE-2022-0854

--------------------------------

The problem I'm addressing was discovered by the LTP test covering
CVE-2018-1000204.

A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
   interface with: dxfer_len == 524288, dxfer_direction == SG_DXFER_FROM_DEV
   and a corresponding dxferp. The peculiar thing about this is that TUR
   is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
   bounces the user-space buffer, as if the device were to transfer into
   it. Since commit a45b599a ("scsi: sg: allocate with __GFP_ZERO in
   sg_build_indirect()") we make sure this first bounce buffer is
   allocated with __GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
   device won't touch the buffer we prepare as if we had a
   DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
   and the buffer allocated by SG is mapped by the function
   virtqueue_add_split(), which uses DMA_FROM_DEVICE for the "in" sgs (here
   scatter-gather and not scsi generics). This mapping involves bouncing
   via the swiotlb (we need swiotlb to do virtio in protected guests such as
   s390 Secure Execution or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
   (that is swiotlb) bounce buffer (which most likely contains some
   previous IO data), to the first bounce buffer, which contains all
   zeros.  Then we copy back the content of the first bounce buffer to
   the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
   ain't all zeros and fails.

One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in the sense that
it does not affect the outcome (if all other participants are well
behaved).

Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.
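
The idea of copying the original buffer into the bounce buffer at map time can be illustrated with a small userspace simulation. This is only a sketch: the single static slot, its size, and all function names are assumptions, and no real DMA API is involved.

```c
#include <assert.h>
#include <string.h>

#define SLOT_SIZE 64

/* One hypothetical bounce slot; it may hold data from earlier I/O. */
static unsigned char slot[SLOT_SIZE];

/* Map for DMA_FROM_DEVICE: copy the original buffer into the slot first. */
static void bounce_map_from_device(const unsigned char *orig, size_t len)
{
    assert(len <= SLOT_SIZE);
    memcpy(slot, orig, len);
}

/* Unmap: copy the slot back, whatever the device did or did not write. */
static void bounce_unmap_from_device(unsigned char *orig, size_t len)
{
    assert(len <= SLOT_SIZE);
    memcpy(orig, slot, len);
}

/* Returns 1 if a zeroed buffer is still all zeros after a no-op transfer. */
static int noop_transfer_is_transparent(void)
{
    unsigned char user[SLOT_SIZE] = {0};
    size_t i;

    memset(slot, 0x5A, sizeof(slot)); /* stale data from previous I/O */
    bounce_map_from_device(user, sizeof(user));
    /* The device writes nothing here, as with TEST UNIT READY. */
    bounce_unmap_from_device(user, sizeof(user));
    for (i = 0; i < sizeof(user); i++)
        if (user[i] != 0)
            return 0;
    return 1;
}
```

Dropping the memcpy() in bounce_map_from_device() reproduces the leak in this model: the 0x5A bytes would be copied back to the caller.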
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Conflicts:
	Documentation/core-api/dma-attributes.rst
	include/linux/dma-mapping.h
	kernel/dma/swiotlb.c
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

22 Mar, 2022 7 commits

esp: Fix possible buffer overflow in ESP transformation · 019f9120

Committed by Steffen Klassert on Mar 22, 2022

mainline inclusion
from mainline
commit ebe48d36
category: bugfix
bugzilla: 186409, https://gitee.com/openeuler/kernel/issues/I4Z0V2
CVE: CVE-2022-0886

--------------------------------

The maximum message size that can be sent is bigger than
the maximum size that skb_page_frag_refill can allocate.
So it is possible to write beyond the allocated buffer.

Fix this by doing a fallback to COW in that case.
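
A size check of this kind might be sketched as follows. The constants and function names are assumptions chosen for illustration. The real limits depend on the kernel configuration, and the actual patch may test against a different bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only. */
#define FRAG_MAX_BYTES 32768u
#define CACHE_LINE     64u

/* Round n up to a multiple of a, where a is a power of two. */
static size_t align_up(size_t n, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (n + a - 1) & ~(a - 1);
}

/* true: the page-fragment fast path is safe; false: fall back to COW. */
static bool can_use_page_frag(size_t payload_len, size_t trailer_len)
{
    return align_up(payload_len, CACHE_LINE) <= FRAG_MAX_BYTES &&
           align_up(trailer_len, CACHE_LINE) <= FRAG_MAX_BYTES;
}
```

The check runs before any data is written, so an oversized message never reaches the fixed-size fragment buffer.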

v2:

Avoid get_order() costs as suggested by Linus Torvalds.

Fixes: cac2661c ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f ("esp6: Avoid skb_cow_data whenever possible")
Reported-by: valis <sec@valis.email>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

sock: remove one redundant SKB_FRAG_PAGE_ORDER macro · b630f785

Committed by Yunsheng Lin on Mar 22, 2022

mainline inclusion
from mainline-v5.15-rc1
commit 723783d0
category: bugfix
bugzilla: 186409, https://gitee.com/openeuler/kernel/issues/I4Z0V2
CVE: CVE-2022-0886

--------------------------------

SKB_FRAG_PAGE_ORDER is defined to the same value in both
net/core/sock.c and drivers/vhost/net.c.

Move the SKB_FRAG_PAGE_ORDER definition to include/net/sock.h,
as both net/core/sock.c and drivers/vhost/net.c include it,
and it seems a reasonable place for the macro.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Xu Jia <xujia39@huawei.com>
conflict:
	drivers/vhost/net.c
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

io_uring: fix UAF in get_files_struct() · 0213acd0

Committed by Luo Meng on Mar 22, 2022

hulk inclusion
category: bugfix
bugzilla: 186337, https://gitee.com/openeuler/kernel/issues/I4XA09
CVE: NA

--------------------------------

If two tasks are running concurrently as follows:
     task1                                        |       task2
io_uring_enter                                    |  io_wqe_worker
  io_submit_sqes                                  |
    io_submit_sqe                                 |
      io_queue_sqe                                |
        io_req_defer                              |
          io_req_defer_prep                       |
            io_prep_work_files                    |
              io_grab_files                       |
                req->work.files = current->files  |
          io_queue_async_work                     |
            __io_queue_async_work                 |
              io_wq_enqueue                       |
                io_wqe_insert_work                |
                                                  |  io_worker_handle_work
                                                  |    io_impersonate_work
                                                  |      current->files = work->files

One possible concurrent UAF is then as follows:
          free                                    |  use (task3 ls -l /proc/io_wqe_worker id/fd)
do_exit // tsk = current = work->files            |
  exit_files                                      |
    put_files_struct                              |
      tsk->files // tsk->files = work->files      |
                                                  |  iterate_dir
                                                  |    proc_readfd_common
                                                  |      p = get_proc_task(file_inode(file))
                                                  |       get_files_struct
                                                  |         files = task->files
                                                  |         atomic_inc(&files->count)

The root cause of the UAF is that req->work.files is taken without
incrementing the reference count. Mainline commit 0f212204 ("io_uring:
don't rely on weak ->files references") fixes this problem; this patch
resolves the problem based on that commit.
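
The difference between a weak pointer and a counted reference can be shown with a minimal userspace model. The struct and helpers below are hypothetical stand-ins, not the kernel's files_struct API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted files table. */
struct files_model {
    int count;
};

/* Take an additional reference and return the same object. */
static struct files_model *files_get(struct files_model *f)
{
    assert(f != NULL && f->count > 0);
    f->count++;
    return f;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int files_put(struct files_model *f)
{
    assert(f != NULL && f->count > 0);
    if (--f->count == 0) {
        free(f);
        return 1;
    }
    return 0;
}

/*
 * The owner exits while deferred work still holds the table. Returns 1
 * when the owner's put did not free it and the work's put did.
 */
static int work_reference_keeps_table_alive(void)
{
    struct files_model *f = malloc(sizeof(*f));
    struct files_model *work_files;
    int freed_by_owner, freed_by_work;

    assert(f != NULL);
    f->count = 1;                          /* owner's reference */
    work_files = files_get(f);             /* counted reference for work */
    freed_by_owner = files_put(f);         /* owner exits */
    freed_by_work = files_put(work_files); /* work completes */
    return !freed_by_owner && freed_by_work;
}
```

If the work item stored the bare pointer without files_get(), the owner's files_put() would free the table while the work still pointed at it.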
Signed-off-by: Luo Meng <luomeng12@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

xfs: fix an undefined behaviour in _da3_path_shift · 76f51e9e

Committed by Qian Cai on Mar 22, 2022

mainline inclusion
from mainline-v5.6-rc4
commit 4982bff1
category: bugfix
bugzilla: 186464, https://gitee.com/openeuler/kernel/issues/I4YYIZ

--------------------------------

In xfs_da3_path_shift() "blk" can be assigned to state->path.blk[-1] if
state->path.active is 1 (which is a valid state) when it tries to add an
entry to a single dir leaf block and then to shift forward to see if
there's a sibling block that would be a better place to put the new
entry. This causes a UBSAN warning given negative array indices are
undefined behavior in C. In practice the warning is entirely harmless
given that "blk" is never dereferenced in this case, but it is still
better to fix up the warning and slightly improve the code.
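
One way to avoid forming an out-of-range element address is to compute it only inside the loop, after the index has been checked. The sketch below uses hypothetical types and is not the xfs code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 5

struct blk_model {
    int blkno;
};

struct path_model {
    int active;                      /* number of valid levels */
    struct blk_model blk[MAX_DEPTH];
};

/*
 * Walk upward from the level above the bottom block. &path->blk[level]
 * is evaluated only after level >= 0 has been checked, so blk[-1] is
 * never formed even when active == 1.
 * Returns the number of levels visited.
 */
static int walk_parents(struct path_model *path)
{
    int level, visited = 0;

    assert(path != NULL && path->active >= 1 && path->active <= MAX_DEPTH);
    for (level = path->active - 2; level >= 0; level--) {
        struct blk_model *blk = &path->blk[level];

        (void)blk->blkno;
        visited++;
    }
    return visited;
}

/* Convenience wrapper: build a path of the given depth and walk it. */
static int parents_visited_for_depth(int active)
{
    struct path_model path = { .active = active };

    return walk_parents(&path);
}
```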

 UBSAN: Undefined behaviour in fs/xfs/libxfs/xfs_da_btree.c:1989:14
 index -1 is out of range for type 'xfs_da_state_blk_t [5]'
 Call trace:
  dump_backtrace+0x0/0x2c8
  show_stack+0x20/0x2c
  dump_stack+0xe8/0x150
  __ubsan_handle_out_of_bounds+0xe4/0xfc
  xfs_da3_path_shift+0x860/0x86c [xfs]
  xfs_da3_node_lookup_int+0x7c8/0x934 [xfs]
  xfs_dir2_node_addname+0x2c8/0xcd0 [xfs]
  xfs_dir_createname+0x348/0x38c [xfs]
  xfs_create+0x6b0/0x8b4 [xfs]
  xfs_generic_create+0x12c/0x1f8 [xfs]
  xfs_vn_mknod+0x3c/0x4c [xfs]
  xfs_vn_create+0x34/0x44 [xfs]
  do_last+0xd4c/0x10c8
  path_openat+0xbc/0x2f4
  do_filp_open+0x74/0xf4
  do_sys_openat2+0x98/0x180
  __arm64_sys_openat+0xf8/0x170
  do_el0_svc+0x170/0x240
  el0_sync_handler+0x150/0x250
  el0_sync+0x164/0x180
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>

Conflicts:
	fs/xfs/libxfs/xfs_da_btree.c
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

xfs: Fix possible null-pointer dereferences in xchk_da_btree_block_check_sibling() · a4e35984

Committed by Jia-Ju Bai on Mar 22, 2022

mainline inclusion
from mainline-v5.3-rc2
commit afa1d96d
category: bugfix
bugzilla: 186464, https://gitee.com/openeuler/kernel/issues/I4YYIZ

--------------------------------

In xchk_da_btree_block_check_sibling(), there is an if statement on
line 274 to check whether ds->state->altpath.blk[level].bp is NULL:
    if (ds->state->altpath.blk[level].bp)

When ds->state->altpath.blk[level].bp is NULL, it is used on line 281:
    xfs_trans_brelse(..., ds->state->altpath.blk[level].bp);
        struct xfs_buf_log_item *bip = bp->b_log_item;
        ASSERT(bp->b_transp == tp);

Thus, possible null-pointer dereferences may occur.

To fix these bugs, ds->state->altpath.blk[level].bp is checked before
being used.

These bugs are found by a static analysis tool STCheck written by us.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

xfs: fix use after free in buf log item unlock assert · afea1700

Committed by Brian Foster on Mar 22, 2022

mainline inclusion
from mainline-v5.1-rc5
commit 4d09807f
category: bugfix
bugzilla: 186464, https://gitee.com/openeuler/kernel/issues/I4YYIZ

--------------------------------

The xfs_buf_log_item ->iop_unlock() callback asserts that the buffer
is unlocked when either non-stale or aborted. This assert occurs
after the bli refcount has been dropped and the log item potentially
freed. The aborted check is thus a potential use after free. This
problem has been reproduced with KASAN enabled via generic/475.

Fix up xfs_buf_item_unlock() to query aborted state before the bli
reference is dropped to prevent a potential use after free.
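
The pattern of reading state before dropping the reference can be sketched in userspace as below. The types and helper names are hypothetical, not the xfs_buf_log_item code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted log item. */
struct item_model {
    int refcount;
    bool aborted;
};

/* Drop a reference; the item may be freed and must not be touched after. */
static void item_put(struct item_model *it)
{
    if (--it->refcount == 0)
        free(it);
}

/* Read `aborted` into a local before the put, then use only the local. */
static bool unlock_and_report_aborted(struct item_model *it)
{
    bool aborted;

    assert(it != NULL && it->refcount > 0);
    aborted = it->aborted;
    item_put(it);
    return aborted; /* `it` is not dereferenced past the put */
}

/* Convenience wrapper: create an item with one reference and unlock it. */
static bool unlock_last_reference(bool aborted)
{
    struct item_model *it = malloc(sizeof(*it));

    assert(it != NULL);
    it->refcount = 1;
    it->aborted = aborted;
    return unlock_and_report_aborted(it);
}
```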
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ACPI/IORT: Do not blindly trust DMA masks from firmware · 1b70444c

Committed by Moritz Fischer on Mar 22, 2022

mainline inclusion
from mainline-v5.11-rc6
commit a1df829e
category: bugfix
bugzilla: https://gitee.com/openeuler/qemu/issues/I4WE3Y
CVE: NA

--------------------------------

Address issue observed on real world system with suboptimal IORT table
where DMA masks of PCI devices would get set to 0 as result.

iort_dma_setup() would query the root complex'/named component IORT
entry for a DMA mask, and use that over the one the device has been
configured with earlier.

Ideally we want to use the minimum mask of what the IORT contains for
the root complex and what the device was configured with.
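
Taking the minimum of the two masks might look like the following sketch. The helper names are assumptions, and describing the firmware limit as a bit width is a simplification of what the IORT table actually encodes.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set, in the spirit of DMA_BIT_MASK(). */
static uint64_t dma_bit_mask(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* Keep the more restrictive of the device mask and the firmware mask. */
static uint64_t effective_dma_mask(uint64_t dev_mask, unsigned int fw_bits)
{
    uint64_t fw_mask = dma_bit_mask(fw_bits);

    return dev_mask < fw_mask ? dev_mask : fw_mask;
}
```

With this rule, a firmware value wider than the device's own mask leaves the device mask unchanged, and a narrower one tightens it.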

Fixes: 5ac65e8c ("ACPI/IORT: Support address size limit for root complexes")
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20210122012419.95010-1-mdf@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Conflicts:
	drivers/acpi/arm64/iort.c
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

21 Mar, 2022 5 commits

kabi: fix kabi broken in struct fuse_in · 10856783

Committed by Zhang Wensheng on Mar 21, 2022

mainline inclusion
from mainline-v5.17-rc8
commit 0c4bcfde
category: bugfix
bugzilla: 186448, https://gitee.com/openeuler/kernel/issues/I4YORE
CVE: CVE-2022-1011

--------------------------------

A new user_pages member was added to struct fuse_in; fix the resulting kABI change.
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Tao Hou <houtao1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

fuse: fix pipe buffer lifetime for direct_io · ee8a1d43

Committed by Miklos Szeredi on Mar 21, 2022

mainline inclusion
from mainline-v5.17-rc8
commit 0c4bcfde
category: bugfix
bugzilla: 186448, https://gitee.com/openeuler/kernel/issues/I4YORE
CVE: CVE-2022-1011

--------------------------------

In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.

On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.

This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.

Fix by copying pages coming from the user address space to new pipe
buffers.
Reported-by: Jann Horn <jannh@google.com>
Fixes: c3021629 ("fuse: support splice() reading from fuse device")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Conflicts:
	fs/fuse/file.c
	fs/fuse/fuse_i.h
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Tao Hou <houtao1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

blk-throtl: fix race in io dispatching · 06ff79d5

Committed by Yu Kuai on Mar 21, 2022

hulk inclusion
category: bugfix
bugzilla: 186449, https://gitee.com/openeuler/kernel/issues/I4YSPC
CVE: NA

--------------------------------

If io is throttled, such io will be issued by blk_throtl_dispatch_work_fn()
or blk_throtl_drain(), and the io is fetched by throtl_pop_queued().
throtl_pop_queued() should be protected by 'queue_lock', as is done in
blk_throtl_dispatch_work_fn(). However, it's not protected in
blk_throtl_drain(), which may lead to concurrent bio_list_pop(), and may
end up crashing the kernel.

Fix the problem by protecting throtl_pop_queued() through 'queue_lock'
in blk_throtl_drain().
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ext4: Fix symlink file size not match to file content · f91c9577

Committed by Ye Bin on Mar 21, 2022

hulk inclusion
category: bugfix
bugzilla: 186450, https://gitee.com/openeuler/kernel/issues/I4YSJ7
CVE: NA

-----------------------------------------------

We hit the following issue:
[root@yebin home]# fsck.ext4  -fn  ram0yb
e2fsck 1.45.6 (20-Mar-2020)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Symlink /p3/d14/d1a/l3d (inode #3494) is invalid.
Clear? no
Entry 'l3d' in /p3/d14/d1a (3383) has an incorrect filetype (was 7, should be 0).
Fix? no

The symlink file size does not match the file content. If writeback of the
symlink data block fails, ext4_finish_bio() is called to end the I/O, but this
path does not mark the buffer with an error. When umount performs a checkpoint,
it cannot detect the buffer error and goes on to clean up the journal, even
though the correct data may still be in the journal area.
To solve this issue, mark the buffer with an error when a bio error is detected
in ext4_finish_bio().
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

livepatch/core: Check klp_func before 'klp_init_object_loaded' · b89cbe68

Committed by Zheng Yejian on Mar 18, 2022

hulk inclusion
category: feature
bugzilla: 186346, https://gitee.com/openeuler/kernel/issues/I4WBFN
CVE: NA

--------------------------------

Consider the following call sequence:
  klp_init_object
    klp_init_object_loaded
      klp_find_object_symbol <-- 1. oops happens when old_name is NULL
    klp_init_func  <-- 2. old_name is currently first checked here

This problem was introduced in commit 453d3845 ("livepatch/arm64:
fix func size less than limit"), which swapped the order of 'klp_init_func'
and 'klp_init_object_loaded', causing old_name to be used before it is checked.

Move these checks before 'klp_init_object_loaded' and add several
log messages that explain why a check failed.
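
The validate-before-use ordering can be sketched as below. This is a simplified model with hypothetical names. It is not the livepatch code, and the log messages mentioned above are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical patched-function descriptor. */
struct func_model {
    const char *old_name;
    const void *new_func;
};

/* Returns 0 if the entry is usable, -1 otherwise (stand-in for -EINVAL). */
static int func_check(const struct func_model *f)
{
    if (f == NULL || f->old_name == NULL || f->new_func == NULL)
        return -1;
    if (strlen(f->old_name) == 0)
        return -1;
    return 0;
}

/* Stand-in for a symbol lookup that dereferences old_name. */
static size_t lookup_symbol_len(const struct func_model *f)
{
    return strlen(f->old_name);
}

/* Validate every entry first; only then run the step that uses old_name. */
static int object_init(const struct func_model *funcs, size_t n)
{
    size_t i;

    assert(funcs != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (func_check(&funcs[i]) != 0)
            return -1;
    for (i = 0; i < n; i++)
        (void)lookup_symbol_len(&funcs[i]);
    return 0;
}

/* Convenience wrapper: initialise an object with a single function. */
static int init_single(const char *old_name)
{
    static const int replacement = 0; /* any non-NULL address will do */
    struct func_model f = { old_name, &replacement };

    return object_init(&f, 1);
}
```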

Fixes: 453d3845 ("livepatch/arm64: fix func size less than limit")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

18 Mar, 2022 2 commits

irqchip/gic-phytium-2500: Fix issue that interrupts are concentrated in one cpu · c61648a1

Committed by Mao HongBo on Mar 18, 2022

Phytium inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ
CVE: NA

-------------------------------------------------

Fix the issue that interrupts are concentrated on one CPU
on the Phytium S2500 server.
Signed-off-by: Mao HongBo <maohongbo@phytium.com.cn>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

blk-mq: add exception handling when srcu->sda alloc failed · ab774358

Committed by Laibin Qiu on Mar 18, 2022

hulk inclusion
category: bugfix
bugzilla: 186352, https://gitee.com/openeuler/kernel/issues/I4YADX
DTS: DTS2022031707143
CVE: NA

--------------------------------

In case of BLK_MQ_F_BLOCKING, per-hctx srcu is used to protect the dispatch
critical area. However, blk_mq_alloc_hctx() currently does not notice when
the srcu memory allocation fails, which leads to an illegal address BUG.
Validate the return value to avoid this problem.
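
Checking the return value of a nested initialiser and unwinding on failure could be sketched like this. The types, the helper names, and the `fail` parameter (which injects an allocation failure for demonstration) are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for srcu state and a hardware queue context. */
struct srcu_model {
    int *per_cpu;
};

struct hctx_model {
    struct srcu_model srcu;
};

/* Returns 0 on success, -1 if the per-CPU allocation fails. */
static int srcu_model_init(struct srcu_model *s, size_t ncpus, int fail)
{
    assert(s != NULL && ncpus > 0);
    s->per_cpu = fail ? NULL : calloc(ncpus, sizeof(*s->per_cpu));
    return s->per_cpu ? 0 : -1;
}

/* Allocate a context; if the nested init fails, free it and return NULL. */
static struct hctx_model *hctx_alloc(size_t ncpus, int fail)
{
    struct hctx_model *h = malloc(sizeof(*h));

    if (h == NULL)
        return NULL;
    if (srcu_model_init(&h->srcu, ncpus, fail) != 0) {
        free(h);
        return NULL;
    }
    return h;
}

static void hctx_free(struct hctx_model *h)
{
    if (h == NULL)
        return;
    free(h->srcu.per_cpu);
    free(h);
}

/* Convenience wrapper: returns 1 if allocation succeeded and was freed. */
static int alloc_then_free_ok(size_t ncpus)
{
    struct hctx_model *h = hctx_alloc(ncpus, 0);

    if (h == NULL)
        return 0;
    hctx_free(h);
    return 1;
}
```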
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

17 Mar, 2022 3 commits

audit: improve audit queue handling when "audit=1" on cmdline · 1518b80b

Committed by Paul Moore on Mar 17, 2022

mainline inclusion
from mainline-v5.17-rc3
commit f26d0433
category: bugfix
bugzilla: 186384 https://gitee.com/openeuler/kernel/issues/I4X1AI?from=project-issue
CVE: NA

--------------------------------

When an admin enables audit at early boot via the "audit=1" kernel
command line the audit queue behavior is slightly different; the
audit subsystem goes to greater lengths to avoid dropping records,
which unfortunately can result in problems when the audit daemon is
forcibly stopped for an extended period of time.

This patch makes a number of changes designed to improve the audit
queuing behavior so that leaving the audit daemon in a stopped state
for an extended period does not cause a significant impact to the
system.

- kauditd_send_queue() is now limited to looping through the
  passed queue only once per call.  This not only prevents the
  function from looping indefinitely when records are returned
  to the current queue, it also allows any recovery handling in
  kauditd_thread() to take place when kauditd_send_queue()
  returns.

- Transient netlink send errors seen as -EAGAIN now cause the
  record to be returned to the retry queue instead of going to
  the hold queue.  The intention of the hold queue is to store,
  perhaps for an extended period of time, the events which led
  up to the audit daemon going offline.  The retry queue remains
  a temporary queue intended to protect against transient issues
  between the kernel and the audit daemon.

- The retry queue is now limited by the audit_backlog_limit
  setting, the same as the other queues.  This allows admins
  to bound the size of all of the audit queues on the system.

- kauditd_rehold_skb() now returns records to the end of the
  hold queue to ensure ordering is preserved in the face of
  recent changes to kauditd_send_queue().
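
The first point above, looping through the queue only once per call, can be illustrated with a small ring-buffer model. This is a sketch with hypothetical names. In particular, returning a failed record to the tail of the same queue is a simplification of the retry and hold queue handling described above.

```c
#include <assert.h>
#include <stddef.h>

#define QCAP 8

/* Hypothetical fixed-capacity FIFO of record ids. */
struct queue_model {
    int items[QCAP];
    size_t head, len;
};

static void q_push(struct queue_model *q, int v)
{
    assert(q->len < QCAP);
    q->items[(q->head + q->len) % QCAP] = v;
    q->len++;
}

static int q_pop(struct queue_model *q)
{
    int v;

    assert(q->len > 0);
    v = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->len--;
    return v;
}

/*
 * Try to send each record that was queued when the call started, once.
 * A failed record goes back to the tail of the same queue. Because the
 * loop count is fixed up front, a send that always fails cannot spin.
 * Returns the number of send attempts.
 */
static size_t send_queue_once(struct queue_model *q, int (*send)(int))
{
    size_t attempts = 0, todo = q->len;

    while (todo-- > 0) {
        int rec = q_pop(q);

        attempts++;
        if (send(rec) != 0)
            q_push(q, rec);
    }
    return attempts;
}

/* A sender that never succeeds, as when the daemon is stopped. */
static int send_always_fails(int rec)
{
    (void)rec;
    return -1;
}

/* Convenience wrapper: queue n records and run one pass. */
static size_t attempts_with_failing_send(size_t nrecords)
{
    struct queue_model q = { {0}, 0, 0 };
    size_t i;

    for (i = 0; i < nrecords; i++)
        q_push(&q, (int)i);
    return send_queue_once(&q, send_always_fails);
}
```

A loop written as `while (q->len > 0)` would never terminate with this sender, which is the failure mode the first bullet addresses.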

Cc: stable@vger.kernel.org
Fixes: 5b52330b ("audit: fix auditd/kernel connection state tracking")
Fixes: f4b3ee3c ("audit: improve robustness of the audit queue handling")
Reported-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Tested-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Cui GaoSheng <cuigaosheng1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

Revert "audit: bugfix for infinite loop when flush the hold queue" · a9140c08

Committed by Cui GaoSheng on Mar 17, 2022

hulk inclusion
category: bugfix
bugzilla: 186384 https://gitee.com/openeuler/kernel/issues/I4X1AI?from=project-issue
CVE: NA

--------------------------------

This reverts commit 67ab712f.
Signed-off-by: Cui GaoSheng <cuigaosheng1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

veth: Do not record rx queue hint in veth_xmit · 2632c58b

Committed by Daniel Borkmann on Mar 17, 2022

stable inclusion
from linux-4.19.226
commit bd6e97e2b6f59a19894c7032a83f03ad38ede28e

--------------------------------

commit 710ad98c upstream.

Laurent reported that they have seen a significant amount of TCP retransmissions
at high throughput from applications residing in network namespaces talking to
the outside world via veths. The drops were seen on the qdisc layer (fq_codel,
as per systemd default) of the phys device such as ena or virtio_net due to all
traffic hitting a _single_ TX queue _despite_ multi-queue device. (Note that the
setup was _not_ using XDP on veths as the issue is generic.)

More specifically, after edbea922 ("veth: Store queue_mapping independently
of XDP prog presence") which made it all the way back to v4.19.184+,
skb_record_rx_queue() would set skb->queue_mapping to 1 (given 1 RX and 1 TX
queue by default for veths) instead of leaving at 0.

This is eventually retained and callbacks like ena_select_queue() will also pick
single queue via netdev_core_pick_tx()'s ndo_select_queue() once all the traffic
is forwarded to that device via upper stack or other means. Similarly, for others
not implementing ndo_select_queue() if XPS is disabled, netdev_pick_tx() might
call into the skb_tx_hash() and check for prior skb_rx_queue_recorded() as well.

In general, it is a _bad_ idea for virtual devices like veth to mess around with
queue selection [by default]. Given dev->real_num_tx_queues is by default 1,
the skb->queue_mapping was left untouched, and so prior to edbea922 the
netdev_core_pick_tx() could do its job upon __dev_queue_xmit() on the phys device.

Unbreak this and restore prior behavior by removing the skb_record_rx_queue()
from veth_xmit() altogether.

If the veth peer has an XDP program attached, then it would return the first RX
queue index in xdp_md->rx_queue_index (unless configured in non-default manner).
However, this is still better than breaking the generic case.

Fixes: edbea922 ("veth: Store queue_mapping independently of XDP prog presence")
Fixes: 638264dc ("veth: Support per queue XDP ring")
Reported-by: Laurent Bernaille <laurent.bernaille@datadoghq.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Cc: Toshiaki Makita <toshiaki.makita1@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
	drivers/net/veth.c
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

15 Mar, 2022 1 commit

crypto: pcrypt - Fix user-after-free on module unload · 939559f5

Committed by Herbert Xu on Mar 15, 2022

stable inclusion
from linux-4.19.102
commit 47ef5cb878817127bd3d54c3578bbbd3f7c2bf2c
CVE: NA

-------------------------------

[ Upstream commit 07bfd9bd ]

On module unload of pcrypt we must unregister the crypto algorithms
first and then tear down the padata structure.  As otherwise the
crypto algorithms are still alive and can be used while the padata
structure is being freed.

Fixes: 5068c7a8 ("crypto: pcrypt - Add pcrypt crypto...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

939559f5

14 Mar, 2022 17 commits

lib/iov_iter: initialize "flags" in new pipe_buffer · 46e70fd2

Committed by Max Kellermann on Mar 14, 2022

mainline inclusion
from mainline-v5.17-rc6
commit 9d2231c5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4X1GI?from=project-issue
CVE: NA

--------------------------------

The functions copy_page_to_iter_pipe() and push_pipe() can both
allocate a new pipe_buffer, but the "flags" member initializer is
missing.
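A minimal sketch of the pattern, assuming that a buffer slot may be reused and therefore can still hold values from an earlier use. `pipe_buffer_sketch` is a hypothetical, simplified stand-in, not the kernel's `struct pipe_buffer`.

```c
/* Hypothetical, simplified stand-in for the kernel's struct pipe_buffer. */
struct pipe_buffer_sketch {
	void *page;
	unsigned int offset;
	unsigned int len;
	unsigned int flags;
};

/* Fill a slot; every member is assigned explicitly, including flags. */
static void fill_slot_sketch(struct pipe_buffer_sketch *buf, void *page,
			     unsigned int offset, unsigned int len)
{
	buf->page = page;
	buf->offset = offset;
	buf->len = len;
	buf->flags = 0;	/* without this, the slot keeps whatever flags it held before */
}
```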

Fixes: 241699cd ("new iov_iter flavour: pipe-backed")
To: Alexander Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
To: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

46e70fd2

mm: Count reliable shmem used based on NR_SHMEM · fde6ae61

Committed by Ma Wupeng on Mar 14, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S
CVE: NA

------------------------------------------

With this patch, the reliable memory counter is updated whenever NR_SHMEM is
updated. The previous shmem reliable memory counter was not accurate if swap
was enabled.

NR_SHMEM updates in the memcg scenario are ignored because they have nothing to
do with the global counter. If shmem pages are migrated or collapsed from
one region to another, the reliable memory counter needs to be updated
because the reliable status of these pages may differ.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

fde6ae61

mm: fix zoneref mapping problem in memory reliable · 9b6c51cd

Committed by Ma Wupeng on Mar 14, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S
CVE: NA

--------------------------------

The mapping between zoneref and zone is updated if __GFP_THISNODE is
set and memory reliable fallback is enabled. This puts ZONE_MOVABLE
in the wrong zonerefs slot and makes the original zone unselectable.

With this patch, high_zoneidx is updated via gfp_zone() with
___GFP_RELIABILITY removed from the original gfp_mask. The preferred zoneref
is recalculated afterwards.

Fixes: 3023a4b3 ("mm: Introduce fallback mechanism for memory reliable")
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

9b6c51cd

mm: disable memory reliable when kdump is in progress · 6fb0d251

Committed by Ma Wupeng on Mar 14, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S
CVE: NA

--------------------------------

Kdump has only limited memory, and the memory reliable feature misbehaves
if it is enabled in that environment. So disable memory reliable when kdump
is in progress.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

6fb0d251

mm: introduce "clear_freelist" kernel parameter · 216ce413

Committed by Yu Liao on Mar 14, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4XB9H
CVE: NA

--------------------------------

CONFIG_CLEAR_FREELIST_PAGE specifies the default value for clear
freelist. Add a kernel parameter to make it possible to
override the default. Keep clear_freelist disabled unless
"clear_freelist" is explicitly specified.

Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

216ce413

mm: fix unable to use reliable memory in page cache · 612e8299

Committed by Chen Wandun on Mar 14, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S
CVE: NA

----------------------------------------------

Accumulating a percpu variable without a lock is inaccurate: it can
make pagecache_reliable_pages negative at times, which prevents the
page cache from using reliable memory.

For more accurate statistics, replace the percpu variable with percpu_counter.

The additional percpu_counter is accessed in alloc_pages, and initializing
the two percpu_counters in late_initcall is too late: allocations
that use alloc_pages would have to check whether the two counters have been
initialized, which introduces latency. To solve this, initialize the
two percpu_counters earlier.

Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

612e8299

nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION · faba439f

Committed by Jordy Zomer on Mar 14, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 4fbcc1a4
category: bugfix
bugzilla: 186393
CVE: CVE-2022-26490

-------------------------------------------------

It appears that there are some buffer overflows in EVT_TRANSACTION.
This happens because the length parameters that are passed to memcpy
come directly from skb->data and are not guarded in any way.
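The kind of guard that is missing can be sketched as follows. This is an illustration under assumptions, not the driver's code: the packet layout (a tag byte, a length byte, then the payload), `AID_MAX` and `parse_aid_sketch()` are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AID_MAX 16	/* hypothetical capacity of the destination buffer */

struct transaction_sketch {
	uint8_t aid_len;
	uint8_t aid[AID_MAX];
};

/*
 * Hypothetical layout: data[0] is a tag, data[1] is the AID length,
 * and the AID bytes follow. Returns 0 on success, -1 on malformed input.
 */
static int parse_aid_sketch(const uint8_t *data, size_t len,
			    struct transaction_sketch *out)
{
	uint8_t aid_len;

	if (len < 2)
		return -1;		/* the header itself is truncated */
	aid_len = data[1];
	if (aid_len > AID_MAX)
		return -1;		/* would overflow out->aid */
	if ((size_t)aid_len > len - 2)
		return -1;		/* would read past the end of the packet */

	out->aid_len = aid_len;
	memcpy(out->aid, data + 2, aid_len);
	return 0;
}
```

Both checks matter: the first bounds the write into the destination, the second bounds the read from the received data.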

Signed-off-by: Jordy Zomer <jordy@pwning.systems>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

faba439f

select: Fix indefinitely sleeping task in poll_schedule_timeout() · 3dfc9fb2

Committed by Jan Kara on Mar 14, 2022

stable inclusion
from linux-4.19.227
commit 6717900f775a6129a7b4d03ba4922218d8bf1caa

--------------------------------

commit 68514dac upstream.

A task can end up indefinitely sleeping in do_select() ->
poll_schedule_timeout() when the following race happens:

  TASK1 (thread1)             TASK2                   TASK1 (thread2)
  do_select()
    setup poll_wqueues table
    with 'fd'
                              write data to 'fd'
                                pollwake()
                                  table->triggered = 1
                                                      closes 'fd' thread1 is
                                                        waiting for
    poll_schedule_timeout()
      - sees table->triggered
      table->triggered = 0
      return -EINTR
    loop back in do_select()

But at this point when TASK1 loops back, the fdget() in the setup of
poll_wqueues fails.  So now we never find that 'fd' is ready for reading
and sleep in poll_schedule_timeout() indefinitely.

Treat an fd that got closed as an fd on which some event happened.  This
makes sure we cannot block indefinitely in do_select().

Another option would be to return -EBADF in this case but that has a
potential of subtly breaking applications that exercise this behavior
and it happens to work for them.  So returning fd as active seems like a
safer choice.
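The chosen behavior can be sketched in userspace as below. This is only an illustration of "a closed fd counts as active"; the table, the `EV_*_SKETCH` bits and both functions are hypothetical and do not reproduce do_select().

```c
#include <stdbool.h>

#define EV_IN_SKETCH	0x1u	/* hypothetical "readable" bit */
#define EV_NVAL_SKETCH	0x20u	/* hypothetical "fd is not open" bit */

/* Hypothetical descriptor table with a readiness mask per fd. */
struct fd_table_sketch {
	bool open[8];
	unsigned int ready[8];
};

/* Poll one fd; a descriptor that is no longer open is reported as an event. */
static unsigned int poll_one_sketch(const struct fd_table_sketch *t, int fd)
{
	if (fd < 0 || fd >= 8 || !t->open[fd])
		return EV_NVAL_SKETCH;
	return t->ready[fd];
}

/* The "not open" bit is part of the readable set, so the waiter wakes up. */
static bool counts_as_readable_sketch(unsigned int mask)
{
	return (mask & (EV_IN_SKETCH | EV_NVAL_SKETCH)) != 0;
}
```

With this shape, a waiter that loops back after a wakeup finds the closed descriptor active and returns instead of sleeping again.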

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

3dfc9fb2

mtd: nand: bbt: Fix corner case in bad block table handling · 969a3664

Committed by Doyle, Patrick on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit 1550a97e4a5d8bf29071bd6c17355ca173e90f73

--------------------------------

commit fd0d8d85 upstream.

In the unlikely event that both blocks 10 and 11 are marked as bad (on a
32 bit machine), then the process of marking block 10 as bad stomps on
cached entry for block 11.  There are (of course) other examples.
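The block numbers make sense if each entry is a few bits wide and packed into machine words. The sketch below assumes 3 bits per entry and 32-bit words, so block 10 occupies bits 30 to 32 and straddles two words while block 11 starts at bit 33. The entry width and the `bbt_*_sketch` helpers are assumptions for illustration, not the driver's code.

```c
#include <stdint.h>

#define BITS_PER_ENTRY	3u	/* assumed entry width */
#define WORD_BITS	32u	/* assumed word size (32-bit machine) */

/* Read the packed entry for a block, joining both halves if it straddles words. */
static unsigned int bbt_get_sketch(const uint32_t *tbl, unsigned int block)
{
	unsigned int bit = block * BITS_PER_ENTRY;
	unsigned int word = bit / WORD_BITS, offs = bit % WORD_BITS;
	uint32_t mask = (1u << BITS_PER_ENTRY) - 1;
	uint32_t val = tbl[word] >> offs;

	if (offs + BITS_PER_ENTRY > WORD_BITS) {
		unsigned int low = WORD_BITS - offs;	/* bits held in the first word */

		val |= tbl[word + 1] << low;
	}
	return val & mask;
}

/* Write the packed entry for a block without touching its neighbours. */
static void bbt_set_sketch(uint32_t *tbl, unsigned int block, unsigned int val)
{
	unsigned int bit = block * BITS_PER_ENTRY;
	unsigned int word = bit / WORD_BITS, offs = bit % WORD_BITS;
	uint32_t mask = (1u << BITS_PER_ENTRY) - 1;

	val &= mask;
	tbl[word] &= ~(mask << offs);
	tbl[word] |= (uint32_t)val << offs;

	if (offs + BITS_PER_ENTRY > WORD_BITS) {
		unsigned int low = WORD_BITS - offs;	/* bits already stored above */

		/* Only the spilled high bits go into the next word. */
		tbl[word + 1] &= ~(mask >> low);
		tbl[word + 1] |= val >> low;
	}
}
```

The delicate part is the shift in the straddling branch: shifting by the wrong amount or clearing too wide a mask there overwrites the low bits of the next word, which is where the neighbouring entry lives.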

Signed-off-by: Patrick Doyle <pdoyle@irobot.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Yoshio Furuyama <ytc-mb-yfuruyama7@kioxia.com>
[<miquel.raynal@bootlin.com>: Fixed the title]
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Frieder Schrempf <frieder.schrempf@kontron.de>
Link: https://lore.kernel.org/linux-mtd/774a92693f311e7de01e5935e720a179fb1b2468.1616635406.git.ytc-mb-yfuruyama7@kioxia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

969a3664

netns: add schedule point in ops_exit_list() · dcc76734

Committed by Eric Dumazet on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit 4048cedfd16a995b2ef4294b9539da17e2fea750

--------------------------------

commit 2836615a upstream.

When under stress, cleanup_net() can have to dismantle
netns in big numbers. ops_exit_list() currently calls
many helpers [1] that have no schedule point, and we can
end up with soft lockups, particularly on hosts
with many cpus.

Even for moderate amount of netns processed by cleanup_net()
this patch avoids latency spikes.

[1] Some of these helpers like fib_sync_up() and fib_sync_down_dev()
are very slow because net/ipv4/fib_semantics.c uses host-wide hash tables,
and ifindex is used as the only input of two hash functions.
    ifindexes tend to be the same for all netns (lo.ifindex==1 per instance)
    This will be fixed in a separate patch.
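As a rough userspace analogue of adding a schedule point to a long loop, the sketch below yields the CPU after each item. `net_sketch` and `exit_list_sketch()` are hypothetical, and POSIX `sched_yield()` merely stands in for an in-kernel schedule point.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical per-namespace state. */
struct net_sketch { int exited; };

/* Run the exit step for every entry, offering to yield between entries. */
static size_t exit_list_sketch(struct net_sketch *nets, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		nets[i].exited = 1;	/* stands in for a slow per-netns exit helper */
		sched_yield();		/* userspace analogue of a schedule point */
	}
	return n;
}
```

The total work is unchanged; the loop simply stops monopolizing the CPU for the whole batch.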

Fixes: 72ad937a ("net: Add support for batching network namespace cleanups")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

dcc76734

af_unix: annote lockless accesses to unix_tot_inflight & gc_in_progress · c26e8934

Committed by Eric Dumazet on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit cf3c4b5912cb208194507a21f6209e8ae4e6c260

--------------------------------

commit 9d6d7f1c upstream.

wait_for_unix_gc() reads unix_tot_inflight & gc_in_progress
without synchronization.

Adds READ_ONCE()/WRITE_ONCE() and their associated comments
to better document the intent.
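A simplified userspace approximation of the annotation pattern is sketched below. The `*_SKETCH` macros are not the kernel's definitions (those are more elaborate), they rely on the GCC/Clang `__typeof__` extension, and the counter and helper names are hypothetical. The writer is assumed to be serialized by a lock that is not shown.

```c
/* Simplified approximations: force a single volatile access to the variable. */
#define READ_ONCE_SKETCH(x)	(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

static unsigned int tot_inflight_sketch;	/* hypothetical shared counter */

/* Writer side: assumed to run under a lock, so the plain read on the right is safe. */
static void inflight_inc_sketch(void)
{
	WRITE_ONCE_SKETCH(tot_inflight_sketch, tot_inflight_sketch + 1);
}

/* Lockless reader: the annotation marks the unsynchronized read as intentional. */
static int over_limit_sketch(unsigned int limit)
{
	return READ_ONCE_SKETCH(tot_inflight_sketch) > limit;
}
```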

BUG: KCSAN: data-race in unix_inflight / wait_for_unix_gc

write to 0xffffffff86e2b7c0 of 4 bytes by task 9380 on cpu 0:
 unix_inflight+0x1e8/0x260 net/unix/scm.c:63
 unix_attach_fds+0x10c/0x1e0 net/unix/scm.c:121
 unix_scm_to_skb net/unix/af_unix.c:1674 [inline]
 unix_dgram_sendmsg+0x679/0x16b0 net/unix/af_unix.c:1817
 unix_seqpacket_sendmsg+0xcc/0x110 net/unix/af_unix.c:2258
 sock_sendmsg_nosec net/socket.c:704 [inline]
 sock_sendmsg net/socket.c:724 [inline]
 ____sys_sendmsg+0x39a/0x510 net/socket.c:2409
 ___sys_sendmsg net/socket.c:2463 [inline]
 __sys_sendmmsg+0x267/0x4c0 net/socket.c:2549
 __do_sys_sendmmsg net/socket.c:2578 [inline]
 __se_sys_sendmmsg net/socket.c:2575 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2575
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffffffff86e2b7c0 of 4 bytes by task 9375 on cpu 1:
 wait_for_unix_gc+0x24/0x160 net/unix/garbage.c:196
 unix_dgram_sendmsg+0x8e/0x16b0 net/unix/af_unix.c:1772
 unix_seqpacket_sendmsg+0xcc/0x110 net/unix/af_unix.c:2258
 sock_sendmsg_nosec net/socket.c:704 [inline]
 sock_sendmsg net/socket.c:724 [inline]
 ____sys_sendmsg+0x39a/0x510 net/socket.c:2409
 ___sys_sendmsg net/socket.c:2463 [inline]
 __sys_sendmmsg+0x267/0x4c0 net/socket.c:2549
 __do_sys_sendmmsg net/socket.c:2578 [inline]
 __se_sys_sendmmsg net/socket.c:2575 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2575
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x00000002 -> 0x00000004

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 9375 Comm: syz-executor.1 Not tainted 5.16.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 9915672d ("af_unix: limit unix_tot_inflight")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220114164328.2038499-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

c26e8934

crypto: stm32/crc32 - Fix kernel BUG triggered in probe() · d116bcc0

Committed by Marek Vasut on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit e670c4b7c1ca134463211b27af69695a3adcb846

--------------------------------

commit 29009604 upstream.

The include/linux/crypto.h struct crypto_alg field cra_driver_name description
states "Unique name of the transformation provider. " ... " this contains the
name of the chip or provider and the name of the transformation algorithm."

In case of the stm32-crc driver, field cra_driver_name is identical for all
registered transformation providers and set to the name of the driver itself,
which is incorrect. This patch fixes it by assigning a unique cra_driver_name
to each registered transformation provider.

The kernel crash is triggered when the driver calls crypto_register_shashes()
which calls crypto_register_shash(), which calls crypto_register_alg(), which
calls __crypto_register_alg(), which returns -EEXIST, which is propagated
back through this call chain. Upon -EEXIST from crypto_register_shash(), the
crypto_register_shashes() starts unregistering the providers back, and calls
crypto_unregister_shash(), which calls crypto_unregister_alg(), and this is
where the BUG() triggers due to incorrect cra_refcnt.
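Why identical driver names collide can be shown with a toy registry. This is a sketch under assumptions: `register_alg_sketch()` and the name strings are hypothetical, and only the -EEXIST result mirrors what the commit message describes.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALGS 8

static const char *registered[MAX_ALGS];
static size_t nr_registered;

/* Hypothetical registry: a driver name may be registered only once. */
static int register_alg_sketch(const char *driver_name)
{
	size_t i;

	for (i = 0; i < nr_registered; i++)
		if (strcmp(registered[i], driver_name) == 0)
			return -EEXIST;	/* duplicate name is rejected */
	if (nr_registered == MAX_ALGS)
		return -ENOSPC;
	registered[nr_registered++] = driver_name;
	return 0;
}
```

Registering two providers under one shared name fails on the second call, while giving each provider its own name lets both succeed.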

Fixes: b51dbe90 ("crypto: stm32 - Support for STM32 CRC32 crypto module")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: <stable@vger.kernel.org> # 4.12+
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Fabien Dessenne <fabien.dessenne@st.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Lionel Debieve <lionel.debieve@st.com>
Cc: Nicolas Toromanoff <nicolas.toromanoff@st.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
To: linux-crypto@vger.kernel.org
Acked-by: Nicolas Toromanoff <nicolas.toromanoff@foss.st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

d116bcc0

ext4: don't use the orphan list when migrating an inode · 277de38d

Committed by Theodore Ts'o on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit 33446496d21753b5ceb55be4d6e593b487a61239

--------------------------------

commit 6eeaf88f upstream.

We probably want to remove the indirect block to extents migration
feature after a deprecation window, but until then, let's fix a
potential data loss problem caused by the fact that we put the
tmp_inode on the orphan list.  In the unlikely case where we crash and
do a journal recovery, the data blocks belonging to the inode being
migrated are also represented in the tmp_inode on the orphan list ---
and so its data blocks will get marked unallocated, and available for
reuse.

Instead, stop putting the tmp_inode on the orphan list.  So in the
case where we crash while migrating the inode, we'll leak an inode,
which is not a disaster.  It will be easily fixed the next time we run
fsck, and it's better than potentially having blocks getting claimed
by two different files, and losing data as a result.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
277de38d

ext4: set csum seed in tmp inode while migrating to extents · 7e65b849

Committed by Luís Henriques on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit 9103cafdc4531285b6ededd0a3437effd71ff255

--------------------------------

commit e81c9302 upstream.

When migrating to extents, the temporary inode will have its own checksum
seed.  This means that, when swapping the inodes' data, the inode checksums
will be incorrect.

This can be fixed by recalculating the extent checksums, or simply
by copying the seed into the temporary inode.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
Reported-by: Jeroen van Wolffelaar <jeroen@wolffelaar.nl>
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Link: https://lore.kernel.org/r/20211214175058.19511-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

7e65b849

ext4: make sure quota gets properly shutdown on error · 546ab4a5

Committed by Jan Kara on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit 841bba6544e10cab41535b76bbdd37555a6ab9df

--------------------------------

commit 15fc69bb upstream.

When we hit an error when enabling quotas and setting inode flags, we do
not properly shut down the quota subsystem despite returning an error from
the Q_QUOTAON quotactl. This can lead to some odd situations like the kernel
using a quota file while it is still writeable for userspace. Make sure we
properly clean up the quota subsystem in case of error.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20211007155336.12493-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

546ab4a5

ext4: make sure to reset inode lockdep class when quota enabling fails · 825982c0

Committed by Jan Kara on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit ef41f72716c469a670b9d556b65e5ed83a3a5fd7

--------------------------------

commit 4013d47a upstream.

When we succeed in enabling some quota type but fail to enable another
one with quota feature, we correctly disable all enabled quota types.
However we forget to reset i_data_sem lockdep class. When the inode gets
freed and reused, it will inherit this lockdep class (i_data_sem is
initialized only when a slab is created) and thus eventually lockdep
barfs about possible deadlocks.

Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20211007155336.12493-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

825982c0

cputime, cpuacct: Include guest time in user time in cpuacct.stat · 5b303f3d

Committed by Andrey Ryabinin on Mar 14, 2022

stable inclusion
from linux-4.19.226
commit 952514c8565cf72a966993b473fae1708c3684f3

--------------------------------

commit 9731698e upstream.

cpuacct.stat in non-root cgroups shows user time without guest time
included in it. This doesn't match the user time shown in root
cpuacct.stat and /proc/<pid>/stat. This also affects cgroup2's cpu.stat
in the same way.

Make account_guest_time() add user time to the cgroup's cpustat to
fix this.
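The accounting rule can be sketched with two counters. `cpustat_sketch` and both helpers are hypothetical and only illustrate that guest time is also counted as user time; they are not the scheduler's code.

```c
#include <stdint.h>

/* Hypothetical per-cgroup counters. */
struct cpustat_sketch {
	uint64_t user;
	uint64_t guest;
};

/* Ordinary user time only advances the user counter. */
static void account_user_sketch(struct cpustat_sketch *cg, uint64_t delta)
{
	cg->user += delta;
}

/* Guest time counts as user time too, so both counters advance. */
static void account_guest_sketch(struct cpustat_sketch *cg, uint64_t delta)
{
	cg->user += delta;
	cg->guest += delta;
}
```

With both counters advancing, the per-cgroup user total includes guest time, consistent with the root-level figures the commit message refers to.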

Fixes: ef12fefa ("cpuacct: add per-cgroup utime/stime statistics")
Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211115164607.23784-1-arbn@yandex-team.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

5b303f3d

openeuler / Kernel: synced successfully about 1 year ago
