Commits · 5.10.0-153.3.0 · openeuler / Kernel

07 Jun, 2023 6 commits

!964 [sync] PR-937: tcp: restrict net.ipv4.tcp_app_win · b8e1d215

Committed by openeuler-ci-bot on Jun 07, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/937 
 
PR sync from:  YueHaibing <yuehaibing@huawei.com>
 https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/IQ4SJZTGXXAAT4OQ72ZLUTPEDFFVDQQX/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/964 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

b8e1d215

!961 [sync] PR-925: tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent · e897e3ca

Committed by openeuler-ci-bot on Jun 07, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/925 
 
PR sync from: Lu Wei luwei32@huawei.com https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/I2EQIIQZEV4HWCI4CUWWZFLX2LZQGUDT/ 
 
Link:https://gitee.com/openeuler/kernel/pulls/961 

Reviewed-by: Yue Haibing <yuehaibing@huawei.com> 
Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

e897e3ca

!957 [sync] PR-938: config: Disable CONFIG_EULER_FS by default · 1604b194

Committed by openeuler-ci-bot on Jun 07, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/938 
 
PR sync from:  Wei Li <liwei391@huawei.com>
 https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/WZJYEXFZMOTCHU42AQNLSLRHTTRTZU4I/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/957 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

1604b194

tcp: restrict net.ipv4.tcp_app_win · f3df2fef

Committed by YueHaibing on Jun 06, 2023

mainline inclusion
from mainline-v6.3-rc7
commit dc5110c2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WB6P

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dc5110c2d959c1707e12df5f792f41d90614adaa

--------------------------------

UBSAN: shift-out-of-bounds in net/ipv4/tcp_input.c:555:23
shift exponent 255 is too large for 32-bit type 'int'
CPU: 1 PID: 7907 Comm: ssh Not tainted 6.3.0-rc4-00161-g62bad54b-dirty #206
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x136/0x150
 __ubsan_handle_shift_out_of_bounds+0x21f/0x5a0
 tcp_init_transfer.cold+0x3a/0xb9
 tcp_finish_connect+0x1d0/0x620
 tcp_rcv_state_process+0xd78/0x4d60
 tcp_v4_do_rcv+0x33d/0x9d0
 __release_sock+0x133/0x3b0
 release_sock+0x58/0x1b0

'maxwin' is int, shifting int for 32 or more bits is undefined behaviour.
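
The failure mode can be illustrated outside the kernel. The following is a minimal Python sketch, not the kernel code: `INT_BITS`, `is_valid_shift` and `checked_shift_right` are hypothetical names, and the model only shows that a shift count such as 255 lies outside the range that is defined for a 32-bit int, which is why the sysctl value needs an upper bound.

```python
INT_BITS = 32  # assumed width of the C 'int' type, as in the UBSAN report


def is_valid_shift(exponent: int, bits: int = INT_BITS) -> bool:
    """Return True if shifting a `bits`-wide integer by `exponent` is defined in C."""
    return 0 <= exponent < bits


def checked_shift_right(value: int, exponent: int) -> int:
    """Shift right only when the exponent is inside the defined range."""
    if not is_valid_shift(exponent):
        raise ValueError(
            f"shift exponent {exponent} is out of range for {INT_BITS}-bit int"
        )
    return value >> exponent
```

With this model, `is_valid_shift(31)` is true while `is_valid_shift(255)` is false, so a value of 255 would be rejected before any shift is attempted.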

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
(cherry picked from commit bb54ca93)

f3df2fef

tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent · 4b715882

Committed by Lu Wei on Jun 06, 2023

stable inclusion
from stable-v5.10.155
commit 02f8dfee7580b65449a67baa65cc2da4e5ffc473
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZG7O

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=02f8dfee7580b65449a67baa65cc2da4e5ffc473

--------------------------------

[ Upstream commit 0c175da7 ]

If setsockopt with option name TCP_REPAIR_OPTIONS and opt_code
TCPOPT_SACK_PERM is called to enable sack after data is sent
and dupacks are received, it will trigger a warning in function
tcp_verify_left_out() as follows:

============================================
WARNING: CPU: 8 PID: 0 at net/ipv4/tcp_input.c:2132
tcp_timeout_mark_lost+0x154/0x160
tcp_enter_loss+0x2b/0x290
tcp_retransmit_timer+0x50b/0x640
tcp_write_timer_handler+0x1c8/0x340
tcp_write_timer+0xe5/0x140
call_timer_fn+0x3a/0x1b0
__run_timers.part.0+0x1bf/0x2d0
run_timer_softirq+0x43/0xb0
__do_softirq+0xfd/0x373
__irq_exit_rcu+0xf6/0x140

The warning is triggered by the following steps:
1. a socket named socketA is created
2. socketA enters repair mode without building a connection
3. socketA calls connect() and its state is changed to TCP_ESTABLISHED
   directly
4. socketA leaves repair mode
5. socketA calls sendmsg() to send data; packets_out and sacked_out
   (dup acks received) increase
6. socketA enters repair mode again
7. socketA calls setsockopt with TCPOPT_SACK_PERM to enable sack
8. the retransmit timer expires and calls tcp_timeout_mark_lost(), so
   lost_out increases
9. sacked_out + lost_out > packets_out triggers the warning, since
   lost_out and sacked_out increase repeatedly

In function tcp_timeout_mark_lost(), tp->sacked_out would be cleared if
step 7 did not happen, and the warning would not be triggered. As
suggested by Denis and Eric, TCP_REPAIR_OPTIONS should be prohibited if
data was already sent.
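
The counter invariant behind the warning can be sketched as follows. This is a simplified Python model under stated assumptions, not the kernel implementation: `TcpCounters` and `timeout_mark_lost` are hypothetical names, and the timer path is reduced to "reset sacked_out when SACK is off, then mark every outstanding packet lost".

```python
from dataclasses import dataclass


@dataclass
class TcpCounters:
    packets_out: int = 0
    sacked_out: int = 0
    lost_out: int = 0

    def left_out(self) -> int:
        # Packets considered to have left the network.
        return self.sacked_out + self.lost_out

    def invariant_holds(self) -> bool:
        # The condition the warning checks: left_out must not exceed packets_out.
        return self.left_out() <= self.packets_out


def timeout_mark_lost(tp: TcpCounters, sack_enabled: bool) -> None:
    """Simplified retransmit-timeout handling (assumption, see lead-in)."""
    if not sack_enabled:
        # Without SACK, the dupack-derived sacked_out is cleared first.
        tp.sacked_out = 0
    # Every outstanding packet is then marked lost.
    tp.lost_out = tp.packets_out
```

Starting from `TcpCounters(packets_out=3, sacked_out=2)`, the invariant survives the timeout when SACK stays off, but breaks when SACK was switched on in between, because the stale sacked_out is no longer cleared.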

The socket-tcp tests in CRIU have been run as follows:
$ sudo ./test/zdtm.py run -t zdtm/static/socket-tcp*  --keep-going \
       --ignore-taint

socket-tcp* represents all socket-tcp tests in test/zdtm/static/.

Fixes: b139ba4e ("tcp: Repair connection-time negotiated parameters")
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lu Wei <luwei32@huawei.com>
(cherry picked from commit cd2fedb1)

4b715882

config: Disable CONFIG_EULER_FS by default · 81c657cc

Committed by Wei Li on Jun 06, 2023

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7BAJ0

--------------------------------

EulerFS was introduced as a technical preview feature in the 21.09
innovation version. Considering the current discontinuation of Intel
Optane hardware and immature ecosystem, it has been decided to turn it
off in the 22.03 LTS version. It will continue to evolve as an innovative
feature in future innovation versions.
Signed-off-by: Wei Li <liwei391@huawei.com>
(cherry picked from commit 42e496a1)

81c657cc

06 Jun, 2023 9 commits

!933 [sync] PR-922: jbd2: fix checkpoint inconsistent · 36771615

Committed by openeuler-ci-bot on Jun 06, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/922 
 
PR sync from:  Zhihao Cheng <chengzhihao1@huawei.com>
 https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/LVVMWDDI7DASB3DYASLNKKPERIERFPSU/ 
Zhang Yi (2):
  jbd2: recheck chechpointing non-dirty buffer
  jbd2: remove t_checkpoint_io_list


-- 
2.31.1
 
 
Link:https://gitee.com/openeuler/kernel/pulls/933 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

36771615

jbd2: remove t_checkpoint_io_list · 37c8e4d7

Committed by Zhang Yi on Jun 06, 2023

maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70WHL

Reference: https://lore.kernel.org/linux-ext4/20230531115100.2779605-1-yi.zhang@huaweicloud.com/T/#t

---------------------------------------------------------------

Since jbd2_log_do_checkpoint() no longer uses t_checkpoint_io_list,
it's time to remove the whole t_checkpoint_io_list logic.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Conflicts:
	include/linux/jbd2.h
	[ Keep t_checkpoint_io_list in the struct to avoid breaking KABI. ]
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit ae9c0722)

37c8e4d7

jbd2: recheck chechpointing non-dirty buffer · 88579155

Committed by Zhang Yi on Jun 06, 2023

maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70WHL

Reference: https://lore.kernel.org/linux-ext4/20230531115100.2779605-1-yi.zhang@huaweicloud.com/T/#t

---------------------------------------------------------------

There is a long-standing metadata corruption issue that happens from
time to time, but it is very difficult to reproduce and analyse. With
the help of the JBD2_CYCLE_RECORD option, we found that the checkpointing
process fails to write out some buffers that are raced by another
do_get_write_access(). See below for details.

jbd2_log_do_checkpoint() //transaction X
 //buffer A is dirty and does not belong to any transaction
 __buffer_relink_io() //move it to the IO list
 __flush_batch()
  write_dirty_buffer()
                             do_get_write_access()
                             clear_buffer_dirty
                             __jbd2_journal_file_buffer()
                             //add buffer A to a new transaction Y
   lock_buffer(bh)
   //doesn't write out
 __jbd2_journal_remove_checkpoint()
 //finish checkpoint except buffer A
 //filesystem is corrupted if the new transaction Y isn't fully written out.

The t_checkpoint_list walking loop in jbd2_log_do_checkpoint() already
handles waiting for buffers under IO, waits for the commit to complete
when a buffer has been re-added to a new transaction, and removes
cleaned buffers, which makes sure the list will eventually become empty.
So it's fine to leave buffers on the t_checkpoint_list while flushing
them out and to completely stop using the t_checkpoint_io_list.

Cc: stable@vger.kernel.org
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Tested-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit caa8415e)

88579155

!923 [sync] PR-918: Misc fixes for Kunpeng accelerator drivers! · 1f5570d3

Committed by openeuler-ci-bot on Jun 06, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/918 
 
* crypto: hisilicon/qm - remove unnecessary aer.h include
* crypto: hisilicon/qm - prevent soft lockup in qm_poll_req_cb()'s loop
* crypto: hisilicon/hpre - ensure private key less than n
* crypto: hisilicon/qm - stop function and write data to memory

issue: https://gitee.com/openeuler/kernel/issues/I7AUVE 
 
Link:https://gitee.com/openeuler/kernel/pulls/923 

Reviewed-by: Yang Shen <shenyang39@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

1f5570d3

!914 [sync] PR-906: ipv6: Add lwtunnel encap size of all siblings in nexthop calculation · 7ab3b6b8

Committed by openeuler-ci-bot on Jun 06, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/906 
 
PR sync from: Lu Wei luwei32@huawei.com https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/P2VNBKLIL5YGSGNPIHF44HRLVST76J4E/ 
 
Link:https://gitee.com/openeuler/kernel/pulls/914 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Reviewed-by: Yue Haibing <yuehaibing@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

7ab3b6b8

crypto: hisilicon/qm - remove unnecessary aer.h include · f366751c

Committed by Bjorn Helgaas on Jun 05, 2023

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7AUVE
CVE: NA

----------------------------------------------------------------------

<linux/aer.h> is unused, so remove it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Weili Qian <qianweili@huawei.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Acked-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: JiangShui Yang <yangjiangshui@h-partners.com>
(cherry picked from commit c23d3855)

f366751c

crypto: hisilicon/qm - stop function and write data to memory · b0620a52

Committed by Weili Qian on Jun 05, 2023

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7AUVE
CVE: NA

----------------------------------------------------------------------

Before the system is shut down, the accelerator driver
needs to stop the device and write data to the memory.
Otherwise the accelerator may keep accessing addresses
and writing data to the memory after the memory is reclaimed
by the system, causing device exceptions and generating NFE errors.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: JiangShui Yang <yangjiangshui@h-partners.com>
(cherry picked from commit 23bdb7d8)

b0620a52

crypto: hisilicon/hpre - ensure private key less than n · 388ade40

Committed by Weili Qian on Jun 05, 2023

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7AUVE
CVE: NA

----------------------------------------------------------------------

A private key of the full curve key size generated by stdrng may not be
less than n. So a private key of the curve key size minus 1 is
generated to ensure that the private key is less than n.
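
The size argument can be checked with a small sketch. This is an illustration only, not the driver code: it uses Python's `secrets` module in place of stdrng, takes NIST P-256 as an assumed example curve, and `gen_private_key` is a hypothetical helper. Rejecting a zero key is left out.

```python
import secrets

# Order n of the NIST P-256 curve (public constant), used here as an example.
P256_N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
P256_KEY_BYTES = 32


def gen_private_key(key_bytes: int = P256_KEY_BYTES) -> int:
    """Draw key_bytes - 1 random bytes and read them as a big-endian integer."""
    raw = secrets.token_bytes(key_bytes - 1)
    return int.from_bytes(raw, "big")
```

A value built from 31 bytes is below 2^248, and n for this curve is above 2^248, so every key produced this way is less than n without any comparison or retry loop.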
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: JiangShui Yang <yangjiangshui@h-partners.com>
(cherry picked from commit 91c618f0)

388ade40

crypto: hisilicon/qm - prevent soft lockup in qm_poll_req_cb()'s loop · 4591e909

Committed by Weili Qian on Jun 05, 2023

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7AUVE
CVE: NA

----------------------------------------------------------------------

The function qm_poll_req_cb() may take a while due to complex req_cb,
so soft lockup may occur in kernel with preemption disabled.
Add a cond_resched() to prevent that.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: JiangShui Yang <yangjiangshui@h-partners.com>
(cherry picked from commit d07dbb66)

4591e909

05 Jun, 2023 11 commits

!921 [sync] PR-919: Revert "ext4: dio take shared inode lock when overwriting preallocated blocks" · c944f16a

Committed by openeuler-ci-bot on Jun 05, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/919 
 
PR sync from:  Baokun Li <libaokun1@huawei.com>
 https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/TMNW5KJPDRL5VBVVST3Y343U5DVD4WWG/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/921 

Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

c944f16a

Revert "ext4: dio take shared inode lock when overwriting preallocated blocks" · dab7ae63

Committed by Baokun Li on Jun 05, 2023

hulk inclusion
category: perf
bugzilla: 188836, https://gitee.com/openeuler/kernel/issues/I7AYNZ

--------------------------------

This reverts commit 5193a88e.
This commit may cause performance degradation, so it is being reverted
temporarily.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit 6ff958e1)

dab7ae63

!898 [sync] PR-894: Fixed two accelerator bugfixes · a661c929

Committed by openeuler-ci-bot on Jun 05, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/894 
 
1. The accelerator queue parameter configuration is incorrect.
2. uacce: use filep->f_mapping to replace inode->i_mapping

issue: https://gitee.com/openeuler/kernel/issues/I79JRM
 
 
Link:https://gitee.com/openeuler/kernel/pulls/898 

Reviewed-by: Yang Shen <shenyang39@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

a661c929

!908 [sync] PR-874: nic: hns3: fix pointer cast for wol and fix getting GE... · 96304d93

Committed by openeuler-ci-bot on Jun 05, 2023

!908 [sync] PR-874: nic: hns3: fix pointer cast for wol and fix getting GE port lanes error and set cpu affinity

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/874 
 
This pull request fixes the pointer cast for wol, fixes the error in getting GE port lanes, and fixes setting cpu affinity.

issue:
https://gitee.com/openeuler/kernel/issues/I7A712 
 
Link:https://gitee.com/openeuler/kernel/pulls/908 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

96304d93

ipv6: Add lwtunnel encap size of all siblings in nexthop calculation · ceef32aa

Committed by Lu Wei on Jun 05, 2023

stable inclusion
from stable-v5.10.173
commit da26369377f0b671c14692e2d65ceb38131053e1
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6GT9T

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=da26369377f0b671c14692e2d65ceb38131053e1

--------------------------------

[ Upstream commit 4cc59f38 ]

In function rt6_nlmsg_size(), the length of nexthop is calculated
by multiplying the nexthop length of fib6_info and the number of
siblings. However if the fib6_info has no lwtunnel but the siblings
have lwtunnels, the nexthop length is less than it should be, and
it will trigger a warning in inet6_rt_notify() as follows:

WARNING: CPU: 0 PID: 6082 at net/ipv6/route.c:6180 inet6_rt_notify+0x120/0x130
......
Call Trace:
 <TASK>
 fib6_add_rt2node+0x685/0xa30
 fib6_add+0x96/0x1b0
 ip6_route_add+0x50/0xd0
 inet6_rtm_newroute+0x97/0xa0
 rtnetlink_rcv_msg+0x156/0x3d0
 netlink_rcv_skb+0x5a/0x110
 netlink_unicast+0x246/0x350
 netlink_sendmsg+0x250/0x4c0
 sock_sendmsg+0x66/0x70
 ___sys_sendmsg+0x7c/0xd0
 __sys_sendmsg+0x5d/0xb0
 do_syscall_64+0x3f/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

This bug can be reproduced by script:

ip -6 addr add 2002::2/64 dev ens2
ip -6 route add 100::/64 via 2002::1 dev ens2 metric 100

for i in 10 20 30 40 50 60 70;
do
	ip link add link ens2 name ipv_$i type ipvlan
	ip -6 addr add 2002::$i/64 dev ipv_$i
	ifconfig ipv_$i up
done

for i in 10 20 30 40 50 60;
do
	ip -6 route append 100::/64 encap ip6 dst 2002::$i via 2002::1 \
		dev ipv_$i metric 100
done

ip -6 route append 100::/64 via 2002::1 dev ipv_70 metric 100

This patch fixes it by adding the nexthop_len of every sibling using
rt6_nh_nlmsg_size().
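
The difference between the two size estimates can be sketched as below. This is a simplified Python model, not the kernel code: `NEXTHOP_BASE_LEN`, the encap sizes, and both function names are hypothetical, and the whole multipath group is passed as a list of per-route encap lengths.

```python
from typing import List

NEXTHOP_BASE_LEN = 28  # hypothetical fixed bytes per nexthop


def nexthop_len(encap_len: int) -> int:
    """Size of one nexthop: fixed part plus its lwtunnel encap (0 if none)."""
    return NEXTHOP_BASE_LEN + encap_len


def size_from_first(encap_lens: List[int]) -> int:
    """Estimate that multiplies the first route's nexthop length by the route count."""
    return nexthop_len(encap_lens[0]) * len(encap_lens)


def size_per_sibling(encap_lens: List[int]) -> int:
    """Estimate that adds up the nexthop length of every route in the group."""
    return sum(nexthop_len(e) for e in encap_lens)
```

For a group whose first route has no encap and whose six siblings each carry a 40-byte encap, `size_from_first([0, 40, 40, 40, 40, 40, 40])` gives 196 while `size_per_sibling` gives 436, which is the kind of shortfall that leads to the warning.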

Fixes: beb1afac ("net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attribute")
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230222083629.335683-2-luwei32@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lu Wei <luwei32@huawei.com>
(cherry picked from commit d48ab856)

ceef32aa

!909 [sync] PR-907: tcp/dccp: Add another way to allocate local ports in connect() · 3267f97c

Committed by openeuler-ci-bot on Jun 05, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/907 
 
PR sync from: Liu Jian liujian56@huawei.com https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/BSVWS6P3YEQUKJMVM2L3BMIPHWVKODOD/ 
 
Link:https://gitee.com/openeuler/kernel/pulls/909 

Reviewed-by: Yue Haibing <yuehaibing@huawei.com> 
Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

3267f97c

!893 mitigatin cacheline false sharing · 7feb1eea

Committed by openeuler-ci-bot on Jun 05, 2023

Merge Pull Request from: @zhangjialin11 
 
In the execl, shell1 and shell8 tests of UnixBench, L3 false sharing occurs between rwsem_try_write_lock_unqueued() and filemap_map_pages().

address_space.host and address_space.i_mmap_rwsem are 48 bytes apart, so they can fall into the same L3 cache line and cause false sharing. Their offsets in struct ext4_inode_info are 696 and 744, so whether false sharing occurs depends on the offset of ext4_inode_info from an L3 cache-line boundary:

[0x00 ~ 0x10] false sharing
[0x18 ~ 0x40] no false sharing
[0x48 ~ 0x80] false sharing

Change the offset of 'vfs_inode' in ext4_inode_info from 320 to 360 and make the address of ext4_inode_info L3 aligned. The offsets of host and i_mmap_rwsem in ext4_inode_info then become 736 and 784, which places them in different L3 cache lines and avoids false sharing.
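
The offset arithmetic can be reproduced with a short sketch. It assumes a 128-byte L3 cache line, which is not stated in the description but is consistent with the 0x80 period of the table above; `L3_LINE`, `same_line` and `sharing_offsets` are hypothetical names.

```python
from typing import List

L3_LINE = 128  # assumed L3 cache-line size in bytes (see lead-in)


def same_line(base: int, off_a: int, off_b: int, line: int = L3_LINE) -> bool:
    """True if two fields of a struct placed at `base` land in the same cache line."""
    return (base + off_a) // line == (base + off_b) // line


def sharing_offsets(off_a: int, off_b: int, line: int = L3_LINE, step: int = 8) -> List[int]:
    """Struct start offsets within one line (8-byte steps) where the fields share a line."""
    return [b for b in range(0, line, step) if same_line(b, off_a, off_b, line)]
```

Under that assumption, `sharing_offsets(696, 744)` returns 0x00 to 0x10 and 0x48 to 0x78, matching the table, while `same_line(0, 736, 784)` is false, so the new offsets with an aligned struct keep the two fields apart.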

./Run -c 96 -i 3 execl

Before this patch:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Execl Throughput                                 43.0      24238.0   5636.8
                                                                   ========
System Benchmarks Index Score (Partial Only)                         5636.8

After this patch:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Execl Throughput                                 43.0      29363.7   6828.8
                                                                   ========
System Benchmarks Index Score (Partial Only)                         6828.8 
 
Link:https://gitee.com/openeuler/kernel/pulls/893 

Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

7feb1eea

tcp/dccp: Add another way to allocate local ports in connect() · 4820557e

Committed by Lu Wei on Jun 05, 2023

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7AO8G
CVE: NA

--------------------------------

Commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range
in connect()") allocates even ports for connect() first while leaving
odd ports for bind() and this works well in busy servers.

But this strategy causes severe performance degradation in busy clients.
When a client has used more than half of the local ports set in
/proc/sys/net/ipv4/ip_local_port_range and tries to connect to a server
again, the connect time increases rapidly, since it traverses all the
even ports even though they are exhausted.

So this patch provides another strategy by introducing a system option:
local_port_allocation. On a busy client, users should set it to 1 to use
sequential allocation; in other situations it should be set to 0. Its
default value is 0.
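
The cost difference can be sketched with a toy model. This is not the kernel's port selection code: the function names are hypothetical, randomized start offsets are ignored, and "sequential allocation" is read here as a plain linear scan from a given start port, which is an assumption.

```python
from typing import Optional, Set, Tuple


def pick_even_first(low: int, high: int, used: Set[int]) -> Tuple[Optional[int], int]:
    """Scan all even ports first, then all odd ports; return (port, probes)."""
    probes = 0
    for parity in (0, 1):
        start = low if low % 2 == parity else low + 1
        for port in range(start, high + 1, 2):
            probes += 1
            if port not in used:
                return port, probes
    return None, probes


def pick_sequential(low: int, high: int, used: Set[int], start: int) -> Tuple[Optional[int], int]:
    """Scan ports one by one from `start`, wrapping around; return (port, probes)."""
    probes = 0
    span = high - low + 1
    for i in range(span):
        port = low + (start - low + i) % span
        probes += 1
        if port not in used:
            return port, probes
    return None, probes
```

With 1000 ports starting at 32768 and every even port in use, the even-first scan needs 501 probes to reach 32769, while the sequential scan starting at 32768 needs 2.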
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: Liu Jian <liujian56@huawei.com>
(cherry picked from commit 726c5265)

4820557e

net: hns3: fix set cpu affinity when state down · 8174d796

Committed by Jiantao Xiao on Jun 01, 2023

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7A712
CVE: NA

----------------------------------------------------------------------

The CPU affinity can be configured when the network
port is down. The patch fixes the problem.
Signed-off-by: Jiantao Xiao <xiaojiantao1@h-partners.com>
(cherry picked from commit b2b62276)

8174d796

net: hns3: add support for getting GE port lanes · 20a1e3d0

Committed by Hao Chen on Dec 15, 2022

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7A712
CVE: NA

----------------------------------------------------------------------

The number of lanes on the electrical port is reported as 0, which does
not meet the expectation. The patch adds support for getting GE port
lanes.
Signed-off-by: Hao Chen <chenhao418@huawei.com>
(cherry picked from commit 49acadc5)

20a1e3d0

net: hns3: fix pointer cast to different type for wol · 7635f442

Committed by Hao Lan on Dec 13, 2022

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7A712
CVE: NA

----------------------------------------------------------------------

The pointer to decs.data has type __le32 (*)[6], which is incompatible
with "struct hclge_wol_cfg_cmd *". So fix the pointer cast to use the
correct type for decs.data.
Signed-off-by: Hao Lan <lanhao@huawei.com>
(cherry picked from commit f5432d46)

7635f442

03 Jun, 2023 14 commits

!903 backport block bugfix · 4ac8d141

Committed by openeuler-ci-bot on Jun 03, 2023

Merge Pull Request from: @zhangjialin11 
 
This patch series fixes block layer bugs.
Three patches fix iocost bugs. The other patches fix raid10 and badblocks bugs.
 
 
Link:https://gitee.com/openeuler/kernel/pulls/903 

Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

4ac8d141

md/raid10: fix incorrect done of recovery · b0ac58c9

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188535, https://gitee.com/openeuler/kernel/issues/I6O61Q
CVE: NA

--------------------------------

If there are some bad blocks, recovery goes to giveup and increments
chunks_skipped in raid10_sync_request(), which returns max_sector when
chunks_skipped >= geo.raid_disks. Recovery has then failed and the data
is inconsistent, but the user thinks recovery is done, which is wrong.

Fix it by setting the mirror's recovery_disabled; a spare device
shouldn't be added here.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

b0ac58c9

md/raid10: fix null-ptr-deref in raid10_sync_request · 2de30b8f

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188378, https://gitee.com/openeuler/kernel/issues/I6GGV7
CVE: NA

--------------------------------

init_resync() initializes the mempool and sets conf->have_replacement at
the beginning of sync; close_sync() frees the mempool when sync is
completed.

After commit 7e83ccbe ("md/raid10: Allow skipping recovery when clean
arrays are assembled"), recovery might be skipped, in which case
init_resync() is called but close_sync() is not. A null-ptr-deref occurs
as below:
  1) create an array and wait for resync to complete; mddev->recovery_cp
     is set to MaxSector.
  2) recovery is woken and is skipped. conf->have_replacement is set to
     0 in init_resync(). close_sync() is not called.
  3) some io errors occur and rdev A is set to WantReplacement.
  4) a new device is added and set as A's replacement.
  5) recovery is woken. A has a replacement, but conf->have_replacement
     is 0, so r10bio->dev[i].repl_bio is not allocated and the
     null-ptr-deref occurs.

Fix it by not calling init_resync() if recovery is skipped.

Fixes: 7e83ccbe ("md/raid10: Allow skipping recovery when clean arrays are assembled")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

2de30b8f

block/badblocks: fix badblocks loss when badblocks combine · e35a7762

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B
CVE: NA

--------------------------------

Badblocks will be lost if we set them as below:

  # echo 1 1 > bad_blocks
  # echo 3 1 > bad_blocks
  # echo 1 5 > bad_blocks
  # cat bad_blocks
    1 3

We combine badblocks in badblocks_set() if there is an intersection between
p[lo] and p[hi]. The end of the new badblocks is currently p[hi]'s end, but
p[lo] may extend past p[hi], so the new end should be the larger of the two.
  lo: |------------------------|
  hi:		|--------|
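
The end-of-range choice can be sketched as follows. This is a simplified Python model with hypothetical function names, not the badblocks_set() code; each range is a (start, length) pair as in the bad_blocks output above.

```python
from typing import Tuple

Range = Tuple[int, int]  # (start sector, length)


def merge_end_from_hi(lo: Range, hi: Range) -> Range:
    """Merge two overlapping ranges, taking hi's end as the merged end."""
    lo_start, _ = lo
    hi_start, hi_len = hi
    end = hi_start + hi_len
    return (lo_start, end - lo_start)


def merge_end_from_max(lo: Range, hi: Range) -> Range:
    """Merge two overlapping ranges, taking the larger of the two ends."""
    lo_start, lo_len = lo
    hi_start, hi_len = hi
    end = max(lo_start + lo_len, hi_start + hi_len)
    return (lo_start, end - lo_start)
```

With `lo = (1, 5)` and `hi = (3, 1)`, the first variant yields `(1, 3)`, the truncated entry shown in the reproduction, and the second yields `(1, 5)`.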

Fixes: 9e0e252a ("badblocks: Add core badblock management code")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

e35a7762

block/badblocks: fix the bug of reverse order · f9a3eea0

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B
CVE: NA

--------------------------------

The order of badblocks is reversed if we set a large area at once. It is
wrong for 'hi' to remain unchanged while adding continuous badblocks: the
next setting is greater than 'hi', so it should be added at the next
position. Increment 'hi' by 1 in each cycle.

  # echo 0 2048 > bad_blocks
  # cat bad_blocks
    1536 512
    1024 512
    512 512
    0 512
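
The ordering effect can be reproduced with a small model. This is a sketch under stated assumptions, not the kernel code: the table is a plain Python list, `set_range` is a hypothetical helper, and the 512-sector chunk size is taken from the entries shown above.

```python
from typing import List, Tuple

BB_MAX_LEN = 512  # chunk size implied by the 512-sector entries above


def set_range(start: int, length: int, advance_hi: bool) -> List[Tuple[int, int]]:
    """Split a large range into chunks and insert each chunk at index `hi`."""
    table: List[Tuple[int, int]] = []
    hi = 0
    while length > 0:
        chunk = min(length, BB_MAX_LEN)
        table.insert(hi, (start, chunk))
        start += chunk
        length -= chunk
        if advance_hi:
            # Move the insert position past the entry that was just added.
            hi += 1
    return table
```

`set_range(0, 2048, advance_hi=False)` returns the entries in descending order, as in the output above, while `advance_hi=True` returns them in ascending order.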

Fixes: 9e0e252a ("badblocks: Add core badblock management code")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

f9a3eea0

md: fix unexpected changes of return value in rdev_set_badblocks · bebf3d97

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6XBZQ
CVE: NA

--------------------------------

If setting any badblocks fails, we remove this rdev (set it to Faulty
or set recovery_disabled). The previous patch "md/raid10: fix io hung in
md_wait_for_blocked_rdev()" checks badblocks->changed instead of the
return value in rdev_set_badblocks(), but the return value of this
function also changed accordingly, which is not what we expected.

Keep the return value consistent with before.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

bebf3d97

md/raid10: fix io hung in md_wait_for_blocked_rdev() · c23e1cd1

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6XBZQ
CVE: NA

--------------------------------

If badblocks are merged but bb->count is exceeded, badblocks_set() will
return 1 and the merged badblocks will become un-acked. rdev_set_badblocks()
will not set sb_flags and wake up mddev->thread, so io waiting in
md_wait_for_blocked_rdev() will hang because BlockedBadBlocks may not be
cleared.

Fix it by checking badblocks->changed instead of the return value. This flag
is set when badblocks changes.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

c23e1cd1

block: Only set bb->changed when badblocks changes · 78cba163

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6XBZQ
CVE: NA

--------------------------------

bb->changed and unacked_exist are set and badblocks_update_acked() is
invoked even if no badblocks change in badblocks_set(). Only update
them when badblocks change.

Fixes: 9e0e252a ("badblocks: Add core badblock management code")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

78cba163

md/raid10: fix incorrect counting of rdev->nr_pending · 7b3b8187

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188605, https://gitee.com/openeuler/kernel/issues/I6ZJ3T
CVE: NA

--------------------------------

We get rdev from mirrors.replacement twice in raid10_write_request().
If replacement changes between two reads, it will increase A->nr_pending
and decrease B->nr_pending.

  T1 (write)	   T2 (remove)	    T3 (add)
                   raid10_remove_disk

  raid10_write_request
   rrdev = conf->mirrors[d].replacement; ->rdev A
   A nr_pending++

                    p->rdev = p->replacement; ->rdev A
                    p->replacement = NULL;

				    //A is set to WantReplacement
                                    raid10_add_disk
				     p->replacement = rdev; ->rdev B

   if blocked_rdev
    rdev = conf->mirrors[d].replacement; ->rdev B
    B nr_pending--

We will record rdev in r10bio, and get rdev from r10bio to fix it.
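
The counter imbalance and the fix can be sketched as below. This is a simplified, single-threaded Python model with hypothetical names; the concurrent remove and add are represented by a `swap` callback that runs between the two reads, and a local variable stands in for the pointer recorded in r10bio.

```python
from typing import Callable, Optional


class Rdev:
    def __init__(self, name: str) -> None:
        self.name = name
        self.nr_pending = 0


class Mirror:
    def __init__(self, replacement: Optional[Rdev]) -> None:
        self.replacement = replacement


def write_request_reread(mirror: Mirror, swap: Callable[[Mirror], None]) -> None:
    """Read mirror.replacement twice: once to increment, once to decrement."""
    rrdev = mirror.replacement
    rrdev.nr_pending += 1
    swap(mirror)                 # replacement changes between the two reads
    rdev = mirror.replacement    # second read may see a different device
    rdev.nr_pending -= 1


def write_request_recorded(mirror: Mirror, swap: Callable[[Mirror], None]) -> None:
    """Read mirror.replacement once and reuse the recorded pointer."""
    recorded = mirror.replacement
    recorded.nr_pending += 1
    swap(mirror)
    recorded.nr_pending -= 1     # same device as the increment
```

If `swap` replaces device A with device B, the first variant leaves A at 1 and B at -1, while the second leaves both counters at 0.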

Fixes: 475b0321 ("md/raid10: writes should get directed to replacement as well as original.")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

7b3b8187

md/raid10: remove WANR_ON_ONCE in raid10_end_write_request · a3ebeed7

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188605, https://gitee.com/openeuler/kernel/issues/I6GOYF
CVE: NA

--------------------------------

In raid10_end_write_request(), mirror->rdev might be read first and then
mirror->replacement because of memory reordering, so a WARN_ON occurs if
we remove a disk at the same time.

  T1 remove			T2 io end
  raid10_remove_disk		raid10_end_write_request
   p->rdev = NULL
				 read rdev -> NULL
   smp_mb
   p->replacement = NULL
				 read replacement -> NULL

It is meaningless to compare rdev with mirror->rdev after we get it from
r10_bio in raid10_end_write_request(). Remove this WARN_ON_ONCE.

Fixes: 2ecf5e6ecbfd ("md/raid10: fix uaf if replacement replaces rdev")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

a3ebeed7

md/raid10: fix uaf if replacement replaces rdev · af959500

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188377, https://gitee.com/openeuler/kernel/issues/I6GOYF
CVE: NA

--------------------------------

After commit 4ca40c2c ("md/raid10: Allow replacement device to be
replace old drive."), mirrors->replacement can replace rdev while the
replacement's I/O is still pending, so repl_bio ends up writing to rdev (see
raid10_write_one_disk()). raid10_end_write_request() then gets the wrong
device from r10conf. In that case r10_bio->devs[slot].repl_bio is put
without being set to IO_MADE_GOOD, and it is put again later in
raid_end_bio_io(), which causes a use-after-free.

Fix it by recording rdev in r10_bio. Handle the I/O-failure and
no-replacement cases together, so repl no longer needs to be changed.
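
The idea of recording the device in the per-I/O structure can be illustrated
with a small userspace model. This is only a sketch, not the kernel patch:
toy_rdev, toy_mirror, toy_io and the is_repl field are hypothetical
stand-ins, and no locking or RCU is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev. */
struct toy_rdev {
	int id;
};

/* Hypothetical stand-in for one conf->mirrors[] entry (shared state). */
struct toy_mirror {
	struct toy_rdev *rdev;
	struct toy_rdev *replacement;
};

/* Per-I/O record, loosely analogous to one r10_bio->devs[] entry. */
struct toy_io {
	struct toy_rdev *rdev;	/* device the write was issued to */
	bool is_repl;		/* issued as the replacement write? */
};

/* Submit path: remember which device the replacement write targets. */
void toy_submit_repl_write(struct toy_io *io, const struct toy_mirror *m)
{
	assert(io != NULL && m != NULL && m->replacement != NULL);
	io->rdev = m->replacement;
	io->is_repl = true;
}

/*
 * Completion, old style: re-derive the role from shared state. This goes
 * stale if the replacement was promoted to rdev while the I/O was pending.
 */
bool toy_is_repl_from_conf(const struct toy_mirror *m)
{
	return m->replacement != NULL;
}

/* Completion, new style: trust what was recorded at submit time. */
bool toy_is_repl_from_io(const struct toy_io *io)
{
	return io->is_repl;
}
```

In this model, promoting the replacement between submit and completion makes
toy_is_repl_from_conf() report the wrong role, while the recorded value stays
correct.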

  ==================================================================
  BUG: KASAN: use-after-free in bio_flagged include/linux/bio.h:238 [inline]
  BUG: KASAN: use-after-free in bio_put+0x78/0x80 block/bio.c:650
  Read of size 2 at addr ffff888116524dd4 by task md0_raid10/2618

  CPU: 0 PID: 2618 Comm: md0_raid10 Not tainted 5.10.0+ #3
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  sd 0:0:0:0: rejecting I/O to offline device
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x107/0x167 lib/dump_stack.c:118
   print_address_description.constprop.0+0x1c/0x270 mm/kasan/report.c:390
   __kasan_report mm/kasan/report.c:550 [inline]
   kasan_report.cold+0x22/0x3a mm/kasan/report.c:567
   bio_flagged include/linux/bio.h:238 [inline]
   bio_put+0x78/0x80 block/bio.c:650
   put_all_bios drivers/md/raid10.c:248 [inline]
   free_r10bio drivers/md/raid10.c:257 [inline]
   raid_end_bio_io+0x3b5/0x590 drivers/md/raid10.c:309
   handle_write_completed drivers/md/raid10.c:2699 [inline]
   raid10d+0x2f85/0x5af0 drivers/md/raid10.c:2759
   md_thread+0x444/0x4b0 drivers/md/md.c:7932
   kthread+0x38c/0x470 kernel/kthread.c:313
   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299

  Allocated by task 1400:
   kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
   kasan_set_track mm/kasan/common.c:56 [inline]
   set_alloc_info mm/kasan/common.c:498 [inline]
   __kasan_kmalloc.constprop.0+0xb5/0xe0 mm/kasan/common.c:530
   slab_post_alloc_hook mm/slab.h:512 [inline]
   slab_alloc_node mm/slub.c:2923 [inline]
   slab_alloc mm/slub.c:2931 [inline]
   kmem_cache_alloc+0x144/0x360 mm/slub.c:2936
   mempool_alloc+0x146/0x360 mm/mempool.c:391
   bio_alloc_bioset+0x375/0x610 block/bio.c:486
   bio_clone_fast+0x20/0x50 block/bio.c:711
   raid10_write_one_disk+0x166/0xd30 drivers/md/raid10.c:1240
   raid10_write_request+0x1600/0x2c90 drivers/md/raid10.c:1484
   __make_request drivers/md/raid10.c:1508 [inline]
   raid10_make_request+0x376/0x620 drivers/md/raid10.c:1537
   md_handle_request+0x699/0x970 drivers/md/md.c:451
   md_submit_bio+0x204/0x400 drivers/md/md.c:489
   __submit_bio block/blk-core.c:959 [inline]
   __submit_bio_noacct block/blk-core.c:1007 [inline]
   submit_bio_noacct+0x2e3/0xcf0 block/blk-core.c:1086
   submit_bio+0x1a0/0x3a0 block/blk-core.c:1146
   submit_bh_wbc+0x685/0x8e0 fs/buffer.c:3053
   ext4_commit_super+0x37e/0x6c0 fs/ext4/super.c:5696
   flush_stashed_error_work+0x28b/0x400 fs/ext4/super.c:791
   process_one_work+0x9a6/0x1590 kernel/workqueue.c:2280
   worker_thread+0x61d/0x1310 kernel/workqueue.c:2426
   kthread+0x38c/0x470 kernel/kthread.c:313
   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299

  Freed by task 2618:
   kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
   kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
   kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:361
   __kasan_slab_free+0x151/0x180 mm/kasan/common.c:482
   slab_free_hook mm/slub.c:1569 [inline]
   slab_free_freelist_hook+0xa9/0x180 mm/slub.c:1608
   slab_free mm/slub.c:3179 [inline]
   kmem_cache_free+0xcd/0x3d0 mm/slub.c:3196
   mempool_free+0xe3/0x3b0 mm/mempool.c:500
   bio_free+0xe2/0x140 block/bio.c:266
   bio_put+0x58/0x80 block/bio.c:651
   raid10_end_write_request+0x885/0xb60 drivers/md/raid10.c:516
   bio_endio+0x376/0x6a0 block/bio.c:1465
   req_bio_endio block/blk-core.c:289 [inline]
   blk_update_request+0x5f5/0xf40 block/blk-core.c:1525
   blk_mq_end_request+0x4c/0x510 block/blk-mq.c:654
   blk_flush_complete_seq+0x835/0xd80 block/blk-flush.c:204
   flush_end_io+0x7b7/0xb90 block/blk-flush.c:261
   __blk_mq_end_request+0x282/0x4c0 block/blk-mq.c:645
   scsi_end_request+0x3a8/0x850 drivers/scsi/scsi_lib.c:607
   scsi_io_completion+0x3f5/0x1320 drivers/scsi/scsi_lib.c:970
   scsi_softirq_done+0x11b/0x490 drivers/scsi/scsi_lib.c:1448
   blk_mq_complete_request block/blk-mq.c:788 [inline]
   blk_mq_complete_request+0x84/0xb0 block/blk-mq.c:785
   scsi_mq_done+0x155/0x360 drivers/scsi/scsi_lib.c:1603
   virtscsi_vq_done drivers/scsi/virtio_scsi.c:184 [inline]
   virtscsi_req_done+0x14c/0x220 drivers/scsi/virtio_scsi.c:199
   vring_interrupt drivers/virtio/virtio_ring.c:2061 [inline]
   vring_interrupt+0x27a/0x300 drivers/virtio/virtio_ring.c:2047
   __handle_irq_event_percpu+0x2f8/0x830 kernel/irq/handle.c:156
   handle_irq_event_percpu kernel/irq/handle.c:196 [inline]
   handle_irq_event+0x105/0x280 kernel/irq/handle.c:213
   handle_edge_irq+0x258/0xd20 kernel/irq/chip.c:828
   asm_call_irq_on_stack+0xf/0x20
   __run_irq_on_irqstack arch/x86/include/asm/irq_stack.h:48 [inline]
   run_irq_on_irqstack_cond arch/x86/include/asm/irq_stack.h:101 [inline]
   handle_irq arch/x86/kernel/irq.c:230 [inline]
   __common_interrupt arch/x86/kernel/irq.c:249 [inline]
   common_interrupt+0xe2/0x190 arch/x86/kernel/irq.c:239
   asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:626

Fixes: 4ca40c2c ("md/raid10: Allow replacement device to be replace old drive.")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

af959500

md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request · 7718714e

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188527, https://gitee.com/openeuler/kernel/issues/I6O3HO
CVE: NA

--------------------------------

need_replace is set to 1 if a non-Faulty mreplace exists, and mreplace is
dereferenced later. However, a later check of mreplace might set it to
NULL, and a null-ptr-deref occurs if need_replace is still 1 at that point.

Fix it by merging the two checks into one.
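
A minimal sketch of the merged check, assuming a simplified model where only
the Faulty bit matters. toy_rdev and select_replacement() are hypothetical
names for illustration and do not appear in raid10_sync_request():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev; only Faulty is modeled. */
struct toy_rdev {
	bool faulty;
};

/*
 * Single combined check: the pointer is cleared and need_replace is
 * derived from the same test, so need_replace can never be true while
 * the returned pointer is NULL.
 */
struct toy_rdev *select_replacement(struct toy_rdev *mreplace,
				    bool *need_replace)
{
	assert(need_replace != NULL);

	if (mreplace != NULL && mreplace->faulty)
		mreplace = NULL;

	*need_replace = (mreplace != NULL);
	return mreplace;
}
```

With two separate checks, the Faulty bit could change between them and leave
the flag and the pointer inconsistent; deriving both from one test avoids
that.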

Fixes: ee37d731 ("md/raid10: Fix raid10 replace hang when new added disk faulty")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

7718714e

md/raid10: fix io loss while replacement replace rdev · e8025850

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188787, https://gitee.com/openeuler/kernel/issues/I78YIW
CVE: NA

--------------------------------

When removing a disk that has a replacement, rdev is first set to NULL,
then rdev is set to the replacement, and finally replacement is set to NULL
(see raid10_remove_disk()). If I/O is submitted at the same time, it might
read both rdev and replacement as NULL, and the I/O will not be submitted.

  rdev -> NULL
                        read rdev
  replacement -> NULL
                        read replacement

Fix it by reading replacement first and rdev afterwards, and use smp_mb()
to prevent memory reordering.
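
The ordering argument can be sketched in userspace with C11 atomics. This is
an assumption-laden model rather than the kernel code: toy_mirror and both
functions are hypothetical, atomic_thread_fence(memory_order_seq_cst) stands
in for smp_mb(), and the placement of the fences is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev. */
struct toy_rdev {
	int id;
};

struct toy_mirror {
	_Atomic(struct toy_rdev *) rdev;
	_Atomic(struct toy_rdev *) replacement;
};

struct toy_pair {
	struct toy_rdev *rdev;
	struct toy_rdev *replacement;
};

/* Writer: the three steps described above, with a full fence before the
 * final store so a reader that sees replacement == NULL also sees the
 * promoted rdev. */
void toy_promote_replacement(struct toy_mirror *m)
{
	struct toy_rdev *none = NULL;
	struct toy_rdev *repl;

	assert(m != NULL);
	repl = atomic_load_explicit(&m->replacement, memory_order_relaxed);
	atomic_store_explicit(&m->rdev, none, memory_order_relaxed);
	atomic_store_explicit(&m->rdev, repl, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */
	atomic_store_explicit(&m->replacement, none, memory_order_relaxed);
}

/* Reader: load replacement first, fence, then load rdev. */
struct toy_pair toy_read_devices(struct toy_mirror *m)
{
	struct toy_pair p;

	assert(m != NULL);
	p.replacement = atomic_load_explicit(&m->replacement,
					     memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */
	p.rdev = atomic_load_explicit(&m->rdev, memory_order_relaxed);
	return p;
}
```

If the reader observes replacement as NULL because the writer already cleared
it, the paired fences guarantee that the later load of rdev sees the promoted
device, so the reader does not end up with both pointers NULL because of the
promotion.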

Fixes: 475b0321 ("md/raid10: writes should get directed to replacement as well as original.")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

e8025850

md/raid10: prioritize adding disk to 'removed' mirror · 2e2e7ab6

Committed by Li Nan on Jun 03, 2023

hulk inclusion
category: bugfix
bugzilla: 188804, https://gitee.com/openeuler/kernel/issues/I78YIS
CVE: NA

--------------------------------

When a new disk is added to raid10, conf->mirror is traversed from the
start to find the first mirror that matches one of the following:
  1. mirror->rdev is set to WantReplacement and it has no replacement:
     set the new disk as mirror->replacement.
  2. no rdev: set the new disk as mirror->rdev.

Consider the array below (sda is set to WantReplacement):

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync set-A   /dev/sda
       -       0        0        1      removed
       2       8       32        2      active sync set-A   /dev/sdc
       3       8       48        3      active sync set-B   /dev/sdd

If 'mdadm --add' is used to add a new disk to this array, the new disk
becomes sda's replacement instead of filling the removed position, which
is confusing for users. Moreover, once the new disk has recovered
successfully, sda is set to Faulty.

Prioritizing the 'removed' mirror when adding a disk is a better choice. In
the above scenario, the behavior is the same as before, except that sda is
not deleted. Until other disks are added, continuing to use sda is more
reliable.
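
The selection policy described above can be sketched as follows. This is a
simplified model, not the raid10_add_disk() code: toy_slot and
toy_pick_slot() are hypothetical names, and only the three conditions
mentioned in this message are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant state of one mirror. */
struct toy_slot {
	bool has_rdev;
	bool want_replacement;
	bool has_replacement;
};

/*
 * Return the slot for a new disk, or -1 if none fits. An empty
 * ('removed') slot always wins; a WantReplacement slot is only
 * remembered as a fallback and used if no empty slot exists.
 */
int toy_pick_slot(const struct toy_slot *slots, int n, bool *as_replacement)
{
	int repl_slot = -1;

	assert(slots != NULL && as_replacement != NULL);

	for (int i = 0; i < n; i++) {
		if (!slots[i].has_rdev) {
			*as_replacement = false;
			return i;
		}
		if (repl_slot < 0 && slots[i].want_replacement &&
		    !slots[i].has_replacement)
			repl_slot = i;
	}

	*as_replacement = (repl_slot >= 0);
	return repl_slot;
}
```

For the array shown above (slot 0 is WantReplacement, slot 1 is removed),
this model picks slot 1 as a regular member rather than making the new disk
the replacement for sda.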

Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>

2e2e7ab6
