- 14 Jul, 2023 4 commits
-
-
Committed by Zhihao Cheng
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS
CVE: NA
--------------------------------
Add a debug message to notify the user that ext4_writepages is stuck in a loop caused by ENOSPC.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 4ae7e703)
-
Committed by Zhihao Cheng
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS
CVE: NA
--------------------------------
This reverts commit 07a8109d.
When ext4 runs out of space, there could be a potential data loss in ext4_writepages: if there are many preallocated blocks for some files, the e4b bitmap differs from the block bitmap, and the block bitmap accounts more free blocks.

    ext4_writepages                               P2
                                                  ext4_mb_new_blocks
     ext4_map_blocks
      ext4_mb_regular_allocator // No free bits in e4b bitmap
      ext4_mb_discard_preallocations_should_retry
       ext4_mb_discard_preallocations
        ext4_mb_discard_group_preallocations
         ext4_mb_release_inode_pa // updates e4b bitmap by pa->pa_free
          mb_free_blocks
                                                  ext4_mb_new_blocks
                                                   ext4_mb_regular_allocator
                                                   // Got e4b bitmap's free bits
      ext4_mb_regular_allocator // After 3 times retrying, ret ENOSPC

    ext4_writepages
     mpage_map_and_submit_extent
      mpage_map_one_extent // ret ENOSPC
      if (err == -ENOSPC && EXT4_SB(sb)->s_mb_free_pending)
      // s_mb_free_pending is 0
      *give_up_on_write = true // Abandon writeback, data lost!

Fixes: 07a8109d ("ext4: Stop trying writing pages if no free ...")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 5f142164)
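The give-up decision quoted in the trace can be illustrated with a small userspace model. This is only a sketch: the function name `gives_up_on_write` and the `block_bitmap_free` figure are hypothetical, and the only logic taken from the commit message is the quoted condition on `err` and `s_mb_free_pending`.

```python
import errno


def gives_up_on_write(err: int, s_mb_free_pending: int) -> bool:
    """Return True when writeback would be abandoned under the quoted check."""
    # Condition quoted in the trace: keep retrying only for ENOSPC while
    # frees are still pending; otherwise fall through to giving up.
    if err == -errno.ENOSPC and s_mb_free_pending:
        return False
    return True


# Hypothetical state for the scenario: the block bitmap still accounts free
# blocks (tied up in preallocations), but nothing is pending to be freed.
block_bitmap_free = 128
abandoned = gives_up_on_write(-errno.ENOSPC, s_mb_free_pending=0)
data_lost = abandoned and block_bitmap_free > 0
print(data_lost)  # True
```

In this model, writeback is abandoned even though free blocks are still accounted, which is the data-loss case the revert is meant to avoid.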
-
Committed by openeuler-sync-bot
Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel into openEuler-22.03-LTS-SP1
-
Committed by openeuler-ci-bot
!759 【kernel-openEuler-22.03-LTS-SP1】kernel: fix a type error with 5.10 kernel on openEuler 22.03 LTS SP1 system
Merge Pull Request from: @zhujun3
This PR adapts the 5.10 kernel to BC-Linux for Euler V22.10 U1 OS; step one is to compile the kernel.
Kernel Issue: https://gitee.com/openeuler/kernel/issues/I7E2XC?from=project-issue
Link: https://gitee.com/openeuler/kernel/pulls/759
Reviewed-by: sanglipeng <sanglipeng1@jd.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
-
- 13 Jul, 2023 6 commits
-
-
Committed by openeuler-sync-bot
Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel into openEuler-22.03-LTS-SP1
-
Committed by Mårten Lindahl
mainline inclusion
from mainline-v6.4-rc1
commit 3a36d20e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7JO0G
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a36d20e012903f45714df2731261fdefac900cb
--------------------------------
If renaming a file in an encrypted directory, function fscrypt_setup_filename allocates memory for a file name. This name is never used, and before returning to the caller the memory for it is not freed.
When running kmemleak on it we see that it is registered as a leak. The report below is triggered by a simple program 'rename' that renames a file in an encrypted directory:

    unreferenced object 0xffff888101502840 (size 32):
      comm "rename", pid 9404, jiffies 4302582475 (age 435.735s)
      backtrace:
        __kmem_cache_alloc_node
        __kmalloc
        fscrypt_setup_filename
        do_rename
        ubifs_rename
        vfs_rename
        do_renameat2

To fix this we can remove the call to fscrypt_setup_filename as it's not needed.
Fixes: 278d9a24 ("ubifs: Rename whiteout atomically")
Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
(cherry picked from commit 6bc63230)
-
Committed by Mårten Lindahl
mainline inclusion
from mainline-v6.4-rc1
commit 1fb815b3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7JO0G
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1fb815b38bb31d6af9bd0540b8652a0d6fe6cfd3
--------------------------------
When opening a ubifs tmpfile on an encrypted directory, function fscrypt_setup_filename allocates memory for the name that is to be stored in the directory entry, but after the name has been copied to the directory entry inode, the memory is not freed.
When running kmemleak on it we see that it is registered as a leak. The report below is triggered by a simple program 'tmpfile' just opening a tmpfile:

    unreferenced object 0xffff88810178f380 (size 32):
      comm "tmpfile", pid 509, jiffies 4294934744 (age 1524.742s)
      backtrace:
        __kmem_cache_alloc_node
        __kmalloc
        fscrypt_setup_filename
        ubifs_tmpfile
        vfs_tmpfile
        path_openat

Free this memory after it has been copied to the inode.
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
(cherry picked from commit 3c594ca7)
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot
Origin pull request: https://gitee.com/openeuler/kernel/pulls/1312
PR sync from: Baokun Li <libaokun1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/7ATD3RNUBURBEYA34VGOOZB53J377OZQ/
Baokun Li (5):
  quota: factor out dquot_write_dquot()
  quota: rename dquot_active() to inode_quota_active()
  quota: add new helper dquot_active()
  quota: fix dqput() to follow the guarantees dquot_srcu should provide
  quota: simplify drop_dquot_ref()
--
2.31.1
Link: https://gitee.com/openeuler/kernel/pulls/1389
Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot Origin pull request: https://gitee.com/openeuler/kernel/pulls/1376 PR sync from: Zhihao Cheng <chengzhihao1@huawei.com> https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/XNJZFYFNQIMIIQRPICSJB7KUZJDPS27T/ Link:https://gitee.com/openeuler/kernel/pulls/1392 Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot
Origin pull request: https://gitee.com/openeuler/kernel/pulls/1280
A successful call to cgroup_css_set_fork() will always have taken a ref on kargs->cset (regardless of CLONE_INTO_CGROUP), so always do a corresponding put in cgroup_css_set_put_fork().
Without this, a cset and its contained css structures will be leaked for some fork failures. The following script reproduces the leak for a fork failure due to exceeding pids.max in the pids controller. A similar thing can happen if we jump to the bad_fork_cancel_cgroup label in copy_process().

    [ -z "$1" ] && echo "Usage $0 pids-root" && exit 1
    PID_ROOT=$1
    CGROUP=$PID_ROOT/foo

    [ -e $CGROUP ] && rmdir -f $CGROUP
    mkdir $CGROUP
    echo 5 > $CGROUP/pids.max
    echo $$ > $CGROUP/cgroup.procs

    fork_bomb()
    {
        set -e
        for i in $(seq 10); do
            /bin/sleep 3600 &
        done
    }

    (fork_bomb) &
    wait
    echo $$ > $PID_ROOT/cgroup.procs
    kill $(cat $CGROUP/cgroup.procs)
    rmdir $CGROUP

Link: https://gitee.com/openeuler/kernel/pulls/1308
Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
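The get/put imbalance described above can be modelled with a toy reference count. This is a sketch under stated assumptions: `CssSet`, `css_set_fork`, `css_set_put_fork` and `failed_fork` are hypothetical stand-ins, and the `always_put=False` branch assumes the old put was skipped when CLONE_INTO_CGROUP was not set, which is inferred from the commit text rather than quoted from the kernel source.

```python
class CssSet:
    """Toy stand-in for a css_set with a reference count."""

    def __init__(self):
        self.refcount = 1  # reference already held by the parent task


def css_set_fork(cset):
    # A successful call always takes a reference, with or without
    # CLONE_INTO_CGROUP.
    cset.refcount += 1


def css_set_put_fork(cset, clone_into_cgroup, always_put):
    # always_put=False models a put that is skipped without the flag
    # (assumed old behaviour); always_put=True models the unconditional put.
    if always_put or clone_into_cgroup:
        cset.refcount -= 1


def failed_fork(always_put, clone_into_cgroup=False):
    """Run one failing fork and return the refcount left behind."""
    cset = CssSet()
    css_set_fork(cset)
    # The fork fails afterwards (for example pids.max exceeded): undo the get.
    css_set_put_fork(cset, clone_into_cgroup, always_put)
    return cset.refcount


print(failed_fork(always_put=False))  # 2: one reference leaked
print(failed_fork(always_put=True))   # 1: back to the baseline
```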
-
- 12 Jul, 2023 10 commits
-
-
Committed by Zhihao Cheng
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70WHL
CVE: NA
--------------------------------
The following process will corrupt an ext4 image:

    Step 1:
    jbd2_journal_commit_transaction
     __jbd2_journal_insert_checkpoint(jh, commit_transaction)
     // Put jh into trans1->t_checkpoint_list
     journal->j_checkpoint_transactions = commit_transaction
     // Put trans1 into journal->j_checkpoint_transactions

    Step 2:
    do_get_write_access
     test_clear_buffer_dirty(bh) // clear buffer dirty, set jbd dirty
     __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2

    Step 3:
    drop_cache
     journal_shrink_one_cp_list
      jbd2_journal_try_remove_checkpoint
       if (!trylock_buffer(bh)) // lock bh, true
       if (buffer_dirty(bh)) // buffer is not dirty
       __jbd2_journal_remove_checkpoint(jh)
       // remove jh from trans1->t_checkpoint_list

    Step 4:
    jbd2_log_do_checkpoint
     trans1 = journal->j_checkpoint_transactions
     // jh is not in trans1->t_checkpoint_list
     jbd2_cleanup_journal_tail(journal) // trans1 is done

    Step 5: Power cut, trans2 is not committed, jh is lost in next mounting.

Fix it by checking 'jh->b_transaction' before removing it from the checkpoint.
Fixes: 80079353 ("jbd2: fix a race when checking checkpoint ...")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 7723e91d)
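The effect of the extra 'jh->b_transaction' check can be sketched with a small model of step 3. Everything here is illustrative: `JournalHead` and `try_remove_checkpoint` are hypothetical names, and the model only mirrors the three conditions mentioned in the steps above (buffer locked, buffer dirty, buffer owned by a running transaction).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JournalHead:
    b_cp_transaction: Optional[str]  # checkpoint list the buffer sits on
    b_transaction: Optional[str]     # transaction that currently owns it
    buffer_dirty: bool = False
    buffer_locked: bool = False


def try_remove_checkpoint(jh: JournalHead, check_b_transaction: bool) -> bool:
    """Return True if jh was dropped from its checkpoint list."""
    if jh.buffer_locked:
        return False  # could not lock the buffer: treat it as busy
    if jh.buffer_dirty:
        return False  # still dirty: treat it as busy
    if check_b_transaction and jh.b_transaction is not None:
        return False  # owned by a newer transaction: keep it checkpointed
    jh.b_cp_transaction = None
    return True


# State after step 2: buffer dirty bit cleared, jh filed on trans2,
# but still on the checkpoint list of trans1.
without_check = JournalHead("trans1", "trans2")
with_check = JournalHead("trans1", "trans2")
print(try_remove_checkpoint(without_check, check_b_transaction=False))  # True
print(try_remove_checkpoint(with_check, check_b_transaction=True))      # False
```

With the check enabled, the buffer stays on the checkpoint list of trans1, so trans1 is not considered done while trans2 is still uncommitted.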
-
Committed by Baokun Li
maillist inclusion
category: bugfix
bugzilla: 188812, https://gitee.com/openeuler/kernel/issues/I7E0YR
Reference: https://www.spinics.net/lists/kernel/msg4844759.html
----------------------------------------
As Honza said, remove_inode_dquot_ref() currently does not release the last dquot reference but instead adds the dquot to tofree_head list. This is because dqput() can sleep while dropping of the last dquot reference (writing back the dquot and calling ->release_dquot()) and that must not happen under dq_list_lock. Now that dqput() queues the final dquot cleanup into a workqueue, remove_inode_dquot_ref() can call dqput() unconditionally and we can significantly simplify it.
Here we open code the simplified code of remove_inode_dquot_ref() into remove_dquot_ref() and remove the function put_dquot_list() which is no longer used.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit a13fcef3)
-
Committed by Baokun Li
maillist inclusion
category: bugfix
bugzilla: 188812, https://gitee.com/openeuler/kernel/issues/I7E0YR
Reference: https://www.spinics.net/lists/kernel/msg4844759.html
----------------------------------------
The dquot_mark_dquot_dirty() using dquot references from the inode should be protected by dquot_srcu. quota_off code takes care to call synchronize_srcu(&dquot_srcu) to not drop dquot references while they are used by other users. But dquot_transfer() breaks this assumption. We call dquot_transfer() to drop the last reference of dquot and add it to free_dquots, but there may still be other users using the dquot at this time, as shown in the function graph below:

           cpu1              cpu2
    _________________|_________________
    wb_do_writeback         CHOWN(1)
     ...
      ext4_da_update_reserve_space
       dquot_claim_block
        ...
         dquot_mark_dquot_dirty // try to dirty old quota
          test_bit(DQ_ACTIVE_B, &dquot->dq_flags) // still ACTIVE
          if (test_bit(DQ_MOD_B, &dquot->dq_flags))
          // test no dirty, wait dq_list_lock
                        ...
                         dquot_transfer
                          __dquot_transfer
                          dqput_all(transfer_from) // rls old dquot
                           dqput // last dqput
                            dquot_release
                             clear_bit(DQ_ACTIVE_B, &dquot->dq_flags)
                            atomic_dec(&dquot->dq_count)
                            put_dquot_last(dquot)
                             list_add_tail(&dquot->dq_free, &free_dquots)
                             // add the dquot to free_dquots
          if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags))
            add dqi_dirty_list // add released dquot to dirty_list

This can cause various issues, such as dquot being destroyed by dqcache_shrink_scan() after being added to free_dquots, which can trigger a UAF in dquot_mark_dquot_dirty(); or after dquot is added to free_dquots and then to dirty_list, it is added to free_dquots again after dquot_writeback_dquots() is executed, which causes the free_dquots list to be corrupted and triggers a UAF when dqcache_shrink_scan() is called for freeing dquot twice.
As Honza said, we need to fix dquot_transfer() to follow the guarantees dquot_srcu should provide. But calling synchronize_srcu() directly from dquot_transfer() is too expensive (and mostly unnecessary). So we add dquot whose last reference should be dropped to the new global dquot list releasing_dquots, and then queue a work item which would call synchronize_srcu() and after that perform the final cleanup of all the dquots on releasing_dquots.
Fixes: 4580b30e ("quota: Do not dirty bad dquots")
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit d82ddaab)
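The deferred-release scheme described above (park the dquot on releasing_dquots, then let a work item synchronize and clean up) can be sketched as a single-threaded model. The class and method names are hypothetical, `synchronize_srcu` is passed in as a plain callback, and the reference counting is simplified; this is not the kernel implementation.

```python
class Dquot:
    def __init__(self):
        self.count = 1      # one remaining reference
        self.active = True  # stands in for DQ_ACTIVE_B


class QuotaCache:
    def __init__(self):
        self.releasing_dquots = []  # last reference dropped, cleanup pending
        self.free_dquots = []       # cleanup finished
        self.work_queued = False

    def dqput(self, dquot):
        if dquot.count > 1:
            dquot.count -= 1
            return
        # Last reference: defer the cleanup instead of doing it inline, so
        # concurrent readers never see a half-released dquot.
        self.releasing_dquots.append(dquot)
        self.work_queued = True

    def release_work(self, synchronize_srcu):
        # Wait for readers that may still be using the parked dquots.
        synchronize_srcu()
        while self.releasing_dquots:
            dquot = self.releasing_dquots.pop(0)
            dquot.active = False
            dquot.count -= 1
            self.free_dquots.append(dquot)
        self.work_queued = False


cache = QuotaCache()
dq = Dquot()
cache.dqput(dq)
print(dq.active)  # True: still usable until the work item has run
cache.release_work(synchronize_srcu=lambda: None)
print(dq.active)  # False: released only after the synchronize step
```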
-
Committed by Baokun Li
maillist inclusion
category: bugfix
bugzilla: 188812, https://gitee.com/openeuler/kernel/issues/I7E0YR
Reference: https://www.spinics.net/lists/kernel/msg4844759.html
----------------------------------------
Add new helper function dquot_active() to make the code more concise.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit 3fb7aa3a)
-
Committed by Baokun Li
maillist inclusion
category: bugfix
bugzilla: 188812, https://gitee.com/openeuler/kernel/issues/I7E0YR
Reference: https://www.spinics.net/lists/kernel/msg4844759.html
----------------------------------------
Now we have a helper function dquot_dirty() to determine if dquot has DQ_MOD_B bit. dquot_active() can easily be misunderstood as a helper function to determine if dquot has DQ_ACTIVE_B bit. So we avoid this by renaming it to inode_quota_active() and later on we will add the helper function dquot_active() to determine if dquot has DQ_ACTIVE_B bit.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit 329a1eb4)
-
Committed by Baokun Li
maillist inclusion
category: bugfix
bugzilla: 188812, https://gitee.com/openeuler/kernel/issues/I7E0YR
Reference: https://www.spinics.net/lists/kernel/msg4844759.html
----------------------------------------
Factor out dquot_write_dquot() to reduce duplicate code.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit 0a3781ae)
-
Committed by openeuler-sync-bot
Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel into openEuler-22.03-LTS-SP1
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot
Origin pull request: https://gitee.com/openeuler/kernel/pulls/1325
PR sync from: Zhihao Cheng <chengzhihao1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/QARA5X5OQUKRFUIORG2YVB6YE3V5CGQB/
Zhang Yi (4):
  jbd2: remove journal_clean_one_cp_list()
  jbd2: fix a race when checking checkpoint buffer busy
  jbd2: remove __journal_try_to_free_buffer()
  jbd2: fix checkpoint cleanup performance regression
Zhihao Cheng (1):
  jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
--
2.31.1
Link: https://gitee.com/openeuler/kernel/pulls/1329
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot Origin pull request: https://gitee.com/openeuler/kernel/pulls/1314 PR sync from: Zhihao Cheng <chengzhihao1@huawei.com> https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/ALOJ633HB2KNGCGZVSSVUI34JMM2MTRP/ Link:https://gitee.com/openeuler/kernel/pulls/1332 Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Committed by Coly Li
mainline inclusion
from mainline-v6.3-rc4
commit 9bbf5fee
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7JLUM
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.4&id=9bbf5feecc7eab2c370496c1c161bbfe62084028
----------------------------------------
This is an already known issue that dm-thin volume cannot be used as swap, otherwise a deadlock may happen when dm-thin internal memory demand triggers swap I/O on the dm-thin volume itself.
But thanks to commit a666e5c0 ("dm: fix deadlock when swapping to encrypted device"), the limit_swap_bios target flag can also be used for dm-thin to avoid the recursive I/O when it is used as swap.
Fix is to simply set ti->limit_swap_bios to true in both pool_ctr() and thin_ctr().
In my test, I create a dm-thin volume /dev/vg/swap and use it as swap device. Then I run fio on another dm-thin volume /dev/vg/main and use large --blocksize to trigger swap I/O onto /dev/vg/swap.
The following fio command line is used in my test:

    fio --name recursive-swap-io --lockmem 1 --iodepth 128 \
        --ioengine libaio --filename /dev/vg/main --rw randrw \
        --blocksize 1M --numjobs 32 --time_based --runtime=12h

Without this fix, the whole system can be locked up within 15 seconds.
With this fix, no deadlock or hung task is observed after 2 hours of running fio.
Furthermore, if blocksize is changed from 1M to 128M, after around 30 seconds fio has no visible I/O, and the out-of-memory killer message shows up in kernel message. After around 20 minutes all fio processes are killed and the whole system is back to being alive.
This is exactly what is expected when recursive I/O happens on dm-thin volume when it is used as swap.
Depends-on: a666e5c0 ("dm: fix deadlock when swapping to encrypted device")
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Conflict: drivers/md/dm-thin.c
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
(cherry picked from commit 6283fa7e)
-
- 11 Jul, 2023 6 commits
-
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot Origin pull request: https://gitee.com/openeuler/kernel/pulls/1286 PR sync from: Baokun Li <libaokun1@huawei.com> https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/X3ZSP2AARUKCTNGQH7V2EC4D2KQ67AMO/ Link:https://gitee.com/openeuler/kernel/pulls/1340 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot Origin pull request: https://gitee.com/openeuler/kernel/pulls/1324 PR sync from: Zhong Jinghua <zhongjinghua@huawei.com> https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/2P2KGVU22TWAYJ5N3JDYWA7EXWJOL2OS/ Link:https://gitee.com/openeuler/kernel/pulls/1367 Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot Origin pull request: https://gitee.com/openeuler/kernel/pulls/1287 PR sync from: Zhengchao Shao <shaozhengchao@huawei.com> https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/AM4RDLF2OSU74VL45PDNQCRW7E3VXA63/ Link:https://gitee.com/openeuler/kernel/pulls/1363 Reviewed-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
Committed by Jens Axboe
stable inclusion
from stable-v5.10.185
commit 4716c73b188566865bdd79c3a6709696a224ac04
category: bugfix
bugzilla: 188954, https://gitee.com/src-openeuler/kernel/issues/I7GVI5?from=project-issue
CVE: CVE-2023-3389
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4716c73b188566865bdd79c3a6709696a224ac04
----------------------------------------
Snipped from commit 9ca9fb24 upstream.
While reworking the poll hashing in the v6.0 kernel, we ended up grabbing the ctx->uring_lock in poll update/removal. This also fixed a bug with linked timeouts racing with timeout expiry and poll removal.
Bring back just the locking fix for that.
Reported-and-tested-by: Querijn Voet <querijnqyn@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
(cherry picked from commit 43a7aef4)
-
Committed by t.feng
stable inclusion
from stable-v5.10.181
commit f4a371d3f5a7a71dff1ab48b3122c5cf23cc7ad5
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7GVI1
CVE: CVE-2023-3090
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f4a371d3f5a7a71dff1ab48b3122c5cf23cc7ad5
--------------------------------
[ Upstream commit 90cbed52 ]
If skb enqueue the qdisc, fq_skb_cb(skb)->time_to_send is changed which is actually skb->cb, and IPCB(skb_in)->opt will be used in __ip_options_echo. It is possible that memcpy is out of bounds and lead to stack overflow.
We should clear skb->cb before ip_local_out or ip6_local_out.
v2:
1. clean the stack info
2. use IPCB/IP6CB instead of skb->cb
crash on stable-5.10 (reproduce in kasan kernel).
Stack info:
[ 2203.651571] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x589/0x800
[ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task swapper/3/0
[ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1
[ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014
[ 2203.655475] Call Trace:
[ 2203.655481]  <IRQ>
[ 2203.655501]  dump_stack+0x9c/0xd3
[ 2203.655514]  print_address_description.constprop.0+0x19/0x170
[ 2203.655530]  __kasan_report.cold+0x6c/0x84
[ 2203.655586]  kasan_report+0x3a/0x50
[ 2203.655594]  check_memory_region+0xfd/0x1f0
[ 2203.655601]  memcpy+0x39/0x60
[ 2203.655608]  __ip_options_echo+0x589/0x800
[ 2203.655654]  __icmp_send+0x59a/0x960
[ 2203.655755]  nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4]
[ 2203.655763]  reject_tg+0x77/0x1bf [ipt_REJECT]
[ 2203.655772]  ipt_do_table+0x691/0xa40 [ip_tables]
[ 2203.655821]  nf_hook_slow+0x69/0x100
[ 2203.655828]  __ip_local_out+0x21e/0x2b0
[ 2203.655857]  ip_local_out+0x28/0x90
[ 2203.655868]  ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan]
[ 2203.655931]  ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan]
[ 2203.655967]  ipvlan_queue_xmit+0xb3/0x190 [ipvlan]
[ 2203.655977]  ipvlan_start_xmit+0x2e/0xb0 [ipvlan]
[ 2203.655984]  xmit_one.constprop.0+0xe1/0x280
[ 2203.655992]  dev_hard_start_xmit+0x62/0x100
[ 2203.656000]  sch_direct_xmit+0x215/0x640
[ 2203.656028]  __qdisc_run+0x153/0x1f0
[ 2203.656069]  __dev_queue_xmit+0x77f/0x1030
[ 2203.656173]  ip_finish_output2+0x59b/0xc20
[ 2203.656244]  __ip_finish_output.part.0+0x318/0x3d0
[ 2203.656312]  ip_finish_output+0x168/0x190
[ 2203.656320]  ip_output+0x12d/0x220
[ 2203.656357]  __ip_queue_xmit+0x392/0x880
[ 2203.656380]  __tcp_transmit_skb+0x1088/0x11c0
[ 2203.656436]  __tcp_retransmit_skb+0x475/0xa30
[ 2203.656505]  tcp_retransmit_skb+0x2d/0x190
[ 2203.656512]  tcp_retransmit_timer+0x3af/0x9a0
[ 2203.656519]  tcp_write_timer_handler+0x3ba/0x510
[ 2203.656529]  tcp_write_timer+0x55/0x180
[ 2203.656542]  call_timer_fn+0x3f/0x1d0
[ 2203.656555]  expire_timers+0x160/0x200
[ 2203.656562]  run_timer_softirq+0x1f4/0x480
[ 2203.656606]  __do_softirq+0xfd/0x402
[ 2203.656613]  asm_call_irq_on_stack+0x12/0x20
[ 2203.656617]  </IRQ>
[ 2203.656623]  do_softirq_own_stack+0x37/0x50
[ 2203.656631]  irq_exit_rcu+0x134/0x1a0
[ 2203.656639]  sysvec_apic_timer_interrupt+0x36/0x80
[ 2203.656646]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 2203.656654] RIP: 0010:default_idle+0x13/0x20
[ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08
[ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256
[ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX: ffffffffaf290191
[ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI: ffff88811a3c4f60
[ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88811a3c4f63
[ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12: 0000000000000003
[ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15: 0000000000000000
[ 2203.656729]  default_idle_call+0x5a/0x150
[ 2203.656735]  cpuidle_idle_call+0x1c6/0x220
[ 2203.656780]  do_idle+0xab/0x100
[ 2203.656786]  cpu_startup_entry+0x19/0x20
[ 2203.656793]  secondary_startup_64_no_verify+0xc2/0xcb
[ 2203.657409] The buggy address belongs to the page:
[ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11a388
[ 2203.658665] flags: 0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
[ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208 0000000000000000
[ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 2203.658686] page dumped because: kasan: bad access detected
To reproduce (ipvlan with IPVLAN_MODE_L3):
Env setting:
=======================================================
modprobe ipvlan ipvlan_default_mode=1
sysctl net.ipv4.conf.eth0.forwarding=1
iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j MASQUERADE
ip link add gw link eth0 type ipvlan
ip -4 addr add 20.0.0.254/24 dev gw
ip netns add net1
ip link add ipv1 link eth0 type ipvlan
ip link set ipv1 netns net1
ip netns exec net1 ip link set ipv1 up
ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1
ip netns exec net1 route add default gw 20.0.0.254
ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10%
ifconfig gw up
iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with icmp-port-unreachable
=======================================================
And then execute the shell (curl any address eth0 can reach):
for((i=1;i<=100000;i++))
do
        ip netns exec net1 curl x.x.x.x:8888
done
=======================================================
Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: "t.feng" <fengtao40@huawei.com>
Suggested-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
(cherry picked from commit 2572b83c)
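The underlying hazard is that one scratch buffer (skb->cb) is interpreted differently by different layers. The sketch below models that with a plain byte buffer. The field offsets are hypothetical and chosen only for illustration; they are not the real fq or IPCB layouts. The 48-byte size matches the cb array in struct sk_buff.

```python
import struct

CB_SIZE = 48  # size of the sk_buff control buffer

# Hypothetical layout, for illustration only: the queueing discipline stores
# a 64-bit timestamp at offset 0, and the IP layer later reads a one-byte
# option length from offset 4 of the same buffer.
TIME_TO_SEND_OFFSET = 0
OPTLEN_OFFSET = 4


def enqueue(cb: bytearray, time_to_send: int) -> None:
    """Model the qdisc writing its private data into the shared buffer."""
    struct.pack_into("<Q", cb, TIME_TO_SEND_OFFSET, time_to_send)


def optlen_seen_by_ip(cb: bytearray) -> int:
    """Model the IP layer reinterpreting the same bytes as option state."""
    return cb[OPTLEN_OFFSET]


def clear_cb(cb: bytearray) -> None:
    """Model zeroing the control buffer before handing the skb to IP output."""
    cb[:] = bytes(len(cb))


cb = bytearray(CB_SIZE)
enqueue(cb, 0x1122334455667788)
print(optlen_seen_by_ip(cb))  # 68: stale timestamp bytes read as a length
clear_cb(cb)
print(optlen_seen_by_ip(cb))  # 0: no bogus option length after clearing
```

A stale nonzero length of this kind is what would let a later copy run past its destination, which is why the fix clears the buffer before ip_local_out or ip6_local_out.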
-
Committed by openeuler-ci-bot
Merge Pull Request from: @openeuler-sync-bot
Origin pull request: https://gitee.com/openeuler/kernel/pulls/1272
PR sync from: Long Li <leo.lilong@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/W6KN2XSLJE5HZR2Y5D2OTDQ2GTLDGC5O/
Patches 1-6 fix some recent problems. Patches 7-8 are backported from mainline.
Darrick J. Wong (1):
  xfs: fix uninitialized variable access
Dave Chinner (1):
  xfs: set XFS_FEAT_NLINK correctly
Long Li (4):
  xfs: factor out xfs_defer_pending_abort
  xfs: don't leak intent item when recovery intents fail
  xfs: factor out xfs_destroy_perag()
  xfs: don't leak perag when growfs fails
Ye Bin (1):
  xfs: fix warning in xfs_vm_writepages()
yangerkun (1):
  xfs: fix mounting failed caused by sequencing problem in the log records
--
2.31.1
Link: https://gitee.com/openeuler/kernel/pulls/1343
Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
- 07 Jul, 2023 9 commits
-
-
Committed by Darrick J. Wong
mainline inclusion
from mainline-v6.2-rc6
commit 60b730a4
category: bugfix
bugzilla: 188220, https://gitee.com/openeuler/kernel/issues/I4KIAO
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=60b730a40c43fbcc034970d3e77eb0f25b8cc1cf
--------------------------------
If the end position of a GETFSMAP query overlaps an allocated space and we're using the free space info to generate fsmap info, the akeys information gets fed into the fsmap formatter with bad results. Zero-init the space.
Reported-by: syzbot+090ae72d552e6bd93cfe@syzkaller.appspotmail.com
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit aebc38d3)
-
Committed by Dave Chinner
mainline inclusion
from mainline-v5.18-rc2
commit dd0d2f97
category: bugfix
bugzilla: 188220, https://gitee.com/openeuler/kernel/issues/I4KIAO
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dd0d2f9755191690541b09e6385d0f8cd8bc9d8f
--------------------------------
While xfs_has_nlink() is not used in kernel, it is used in userspace (e.g. by xfs_db) so we need to set the XFS_FEAT_NLINK flag correctly in xfs_sb_version_to_features().
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit f2096cec)
-
Committed by Long Li
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188878, https://gitee.com/openeuler/kernel/issues/I76JSK
--------------------------------
During growfs, if the new AGs have been initialized in memory but sb_agcount has not yet been updated, an error at that point leaks the new perags as shown below. They are not freed during umount because sb_agcount was never updated.

    unreferenced object 0xffff88810751b000 (size 1024):
      comm "xfs_growfs", pid 123624, jiffies 4300733989 (age 124294.081s)
      hex dump (first 32 bytes):
        00 a0 38 16 81 88 ff ff 05 00 00 00 00 00 00 00  ..8.............
        00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<00000000725c8ae4>] kmem_alloc+0x92/0x1d0 [xfs]
        [<000000005c32d74e>] xfs_initialize_perag+0x8d/0x3b0 [xfs]
        [<00000000830354cf>] xfs_growfs_data_private.isra.0+0x2af/0x610 [xfs]
        [<0000000038a29cb1>] xfs_growfs_data+0x228/0x300 [xfs]
        [<0000000004937dd2>] xfs_file_ioctl+0x8f3/0x10d0 [xfs]
        [<000000001a5d29a8>] __se_sys_ioctl+0xeb/0x120
        [<00000000cf30385a>] do_syscall_64+0x30/0x40
        [<00000000e4a6fd2f>] entry_SYSCALL_64_after_hwframe+0x61/0xc6

When growfs fails, use xfs_destroy_perag() to destroy the newly initialized AGs in the error handling path.
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 670cd2c8)
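The leak and its fix follow a common pattern: resources are created for a new range, and the counter that teardown relies on is only bumped on success. A minimal model, assuming a dict in place of the perag radix tree and hypothetical helper names (`initialize_perag`, `destroy_perag`, `growfs`, `leaked_perags`):

```python
def initialize_perag(perag_tree, old_agcount, new_agcount):
    """Create in-memory entries for the newly added AGs."""
    for agno in range(old_agcount, new_agcount):
        perag_tree[agno] = {"agno": agno}


def destroy_perag(perag_tree, first_agno, end_agno):
    """Tear down the entries in [first_agno, end_agno)."""
    for agno in range(first_agno, end_agno):
        perag_tree.pop(agno, None)


def growfs(state, new_agcount, fail_after_init, cleanup_on_error):
    old_agcount = state["sb_agcount"]
    initialize_perag(state["perag_tree"], old_agcount, new_agcount)
    if fail_after_init:
        if cleanup_on_error:
            # Error path with the fix: drop only the newly created entries.
            destroy_perag(state["perag_tree"], old_agcount, new_agcount)
        return False
    state["sb_agcount"] = new_agcount
    return True


def leaked_perags(state):
    # Unmount only walks AGs below sb_agcount, so anything at or above it
    # would never be freed.
    return sorted(a for a in state["perag_tree"] if a >= state["sb_agcount"])


def fresh_state():
    return {"sb_agcount": 4, "perag_tree": {i: {"agno": i} for i in range(4)}}


state = fresh_state()
growfs(state, 6, fail_after_init=True, cleanup_on_error=False)
print(leaked_perags(state))  # [4, 5]
```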
-
Committed by Long Li
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188878, https://gitee.com/openeuler/kernel/issues/I76JSK
--------------------------------
Factor out xfs_destroy_perag() from xfs_initialize_perag() for error handling. Deleting a perag from the radix tree requires lock protection, just like every other place where the perag tree is modified.
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 42297cd9)
-
Committed by Ye Bin
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188782, https://gitee.com/openeuler/kernel/issues/I76JSK
-----------------------------------------------
The BULKSTAT test hit the following issue:

    WARNING: CPU: 3 PID: 8425 at fs/xfs/xfs_aops.c:509 xfs_vm_writepages+0x184/0x1c0
    Modules linked in:
    CPU: 3 PID: 8425 Comm: xfs_bulkstat Not tainted 6.3.0-next-20230505-00003-gf3329adf5424-dirty #456
    RIP: 0010:xfs_vm_writepages+0x184/0x1c0
    RSP: 0018:ffffc90014bb7088 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 1ffff92002976e11 RCX: ffff88817aef8000
    RDX: 0000000000000000 RSI: ffff88817aef8000 RDI: 0000000000000002
    RBP: ffff888267dd2ad8 R08: ffffffff8313f414 R09: ffffed1022377c18
    R10: ffff888111bbe0bb R11: ffffed1022377c17 R12: ffff88817aef8000
    R13: ffffc90014bb7358 R14: dffffc0000000000 R15: ffffffff8313f290
    FS:  00007f9568bb0440(0000) GS:ffff88882fc80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000d7a008 CR3: 000000024e11f000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     do_writepages+0x1a8/0x630
     __writeback_single_inode+0x126/0xe00
     writeback_single_inode+0x2ae/0x530
     write_inode_now+0x16e/0x1e0
     iput.part.0+0x46c/0x730
     iput+0x60/0x80
     xfs_bulkstat_one_int+0xd87/0x1580
     xfs_bulkstat_iwalk+0x6e/0xd0
     xfs_iwalk_ag_recs+0x449/0x770
     xfs_iwalk_run_callbacks+0x305/0x630
     xfs_iwalk_ag+0x819/0xae0
     xfs_iwalk+0x2d5/0x4e0
     xfs_bulkstat+0x358/0x520
     xfs_ioc_bulkstat.isra.0+0x242/0x340
     xfs_file_ioctl+0x1d6/0x1ba0
     __x64_sys_ioctl+0x197/0x210
     do_syscall_64+0x39/0xb0
     entry_SYSCALL_64_after_hwframe+0x63/0xcd

The above issue may happen as follows (steps listed in the order they occur):

    Process1 Process2 process3 process4
    xfs_bulkstat
    xfs_trans_alloc_empty
    xfs_bulkstat_one_int
    xfs_iget(XFS_IGET_DONTCACHE)
    ->Get inode from disk and mark inode with I_DONTCACHE
    xfs_lookup
    xfs_iget
    ->Hold inode refcount
    xfs_irele
    xfs_file_write_iter
    ->Write file made some dirty pages
    close file
    xfs_bulkstat
    xfs_trans_alloc_empty
    xfs_bulkstat_one_int
    xfs_iget(XFS_IGET_DONTCACHE)
    -> process4 close file
    ******Trigger dentry reclaim, inode refcount is 1******
    xfs_irele
    iput
    ->Put the last refcount
    iput_final
    write_inode_now
    xfs_vm_writepages
    WARN_ON_ONCE(current->journal_info)
    ->Trigger warning

Commit a6343e4d grabs an empty transaction when doing BULKSTAT. Putting the last refcount of the inode there may cause writepages to trigger the warning, and may also lead to data loss.
To solve the above issue, just clear the inode's I_DONTCACHE flag in the xfs_iget_cache_hit() case.
Fixes: a6343e4d ("xfs: avoid buffer deadlocks when walking fs inodes")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 28ce0ae2)
-
Committed by Long Li
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188865, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

Recovering intents may capture some deferred ops and commit new intent items. If intent recovery then fails, no done item drops the reference to the new intent item, which leads to a memory leak as follows:

unreferenced object 0xffff888016719108 (size 432):
comm "mount", pid 529, jiffies 4294706839 (age 144.463s)
hex dump (first 32 bytes):
08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff ..q.......q.....
18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff ..q.......q.....
backtrace:
[<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0
[<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150
[<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340
[<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0
[<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980
[<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70
[<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0
[<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0
[<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70
[<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20
[<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0
[<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0
[<ffffffff81a9f35f>] path_mount+0xecf/0x1800
[<ffffffff81a9fd83>] do_mount+0xf3/0x110
[<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0
[<ffffffff83968739>] do_syscall_64+0x39/0x80

Fix it by aborting the intent items in the capture list that have no done item when intent recovery fails. If committing a transaction that has deferred ops fails in xfs_defer_ops_capture_and_commit(), the defer capture is not added to the capture list, so it needs to be aborted too.

Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit c1b08a41)
-
Committed by Long Li
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188865, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(). The new helper does not use the transaction parameter, so it can be called after the transaction life cycle has ended.

Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 9bd2b3bd)
-
Committed by yangerkun
Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188870, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

During a growfs + power-off test, we encountered a mount failure. The call stack is as follows:

[584505.210179] XFS (loop0): xfs_buf_find: daddr 0x6d6002 out of range, EOFS 0x6d6000
...
[584505.210739] Call Trace:
[584505.210776] xfs_buf_get_map+0x44/0x230 [xfs]
[584505.210780] ? trace_event_buffer_commit+0x57/0x140
[584505.210818] xfs_buf_read_map+0x54/0x280 [xfs]
[584505.210858] ? xlog_recover_items_pass2+0x53/0xb0 [xfs]
[584505.210899] xlog_recover_buf_commit_pass2+0x112/0x440 [xfs]
[584505.210939] ? xlog_recover_items_pass2+0x53/0xb0 [xfs]
[584505.210980] xlog_recover_items_pass2+0x53/0xb0 [xfs]
[584505.211020] xlog_recover_commit_trans+0x2ca/0x320 [xfs]
[584505.211061] xlog_recovery_process_trans+0xc6/0xf0 [xfs]
[584505.211101] xlog_recover_process_data+0x9e/0x110 [xfs]
[584505.211141] xlog_do_recovery_pass+0x3b4/0x5c0 [xfs]
[584505.211181] xlog_do_log_recovery+0x5e/0x80 [xfs]
[584505.211223] xlog_do_recover+0x33/0x1a0 [xfs]
[584505.211262] xlog_recover+0xd7/0x170 [xfs]
[584505.211303] xfs_log_mount+0x217/0x2b0 [xfs]
[584505.211341] xfs_mountfs+0x3da/0x870 [xfs]
[584505.211384] xfs_fc_fill_super+0x3fa/0x7a0 [xfs]
[584505.211428] ? xfs_setup_devices+0x80/0x80 [xfs]
[584505.211432] get_tree_bdev+0x16f/0x260
[584505.211434] vfs_get_tree+0x25/0xc0
[584505.211436] do_new_mount+0x156/0x1b0
[584505.211438] __se_sys_mount+0x165/0x1d0
[584505.211440] do_syscall_64+0x33/0x40
[584505.211442] entry_SYSCALL_64_after_hwframe+0x61/0xc6

Analyzing the log records showed the following:

============================================================================
cycle: 173 version: 2 lsn: 173,2742 tail_lsn: 173,1243
length of Log Record: 25600 prev offset: 2702 num ops: 258
uuid: fb958458-48a3-4c76-ae23-7a1cf3053065 format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
...
----------------------------------------------------------------------------
Oper (100): tid: 1c010724 len: 24 clientid: TRANS flags: none
BUF: #regs: 2 start blkno: 7168002 (0x6d6002) len: 1 bmap size: 1 flags: 0x3800
Oper (101): tid: 1c010724 len: 128 clientid: TRANS flags: none
AGI Buffer: XAGI
ver: 1 seq#: 28 len: 2048 cnt: 0 root: 3 level: 1
free#: 0x0 newino: 0x140
bucket[0 - 3]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[4 - 7]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[8 - 11]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[12 - 15]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[16 - 19]: 0xffffffff
----------------------------------------------------------------------------
...
----------------------------------------------------------------------------
Oper (108): tid: 1c010724 len: 24 clientid: TRANS flags: none
BUF: #regs: 2 start blkno: 0 (0x0) len: 1 bmap size: 1 flags: 0x9000
Oper (109): tid: 1c010724 len: 384 clientid: TRANS flags: none
SUPER BLOCK Buffer:
icount: 6360863066640355328 ifree: 898048 fdblks: 0 frext: 0
----------------------------------------------------------------------------
...

In the log records, the transaction that modifies the newly expanded block appears before the growfs transaction, which leads to a verification failure during log replay. We need to ensure that transactions related to the superblock are replayed first when replaying the log.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit dba19fb8)
-
Committed by Baokun Li
mainline inclusion
from mainline-v6.5
commit d13f99632748462c32fc95d729f5e754bab06064
category: bugfix
bugzilla: 188906, https://gitee.com/openeuler/kernel/issues/I7E9M5
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d13f99632748462c32fc95d729f5e754bab06064

--------------------------------

While reviewing the patch "ext4: don't BUG on inconsistent journal feature", Yi found that when ext4_mark_recovery_complete() returns an error value, the error handling path does not turn off the enabled quotas, which triggers the following kmemleak:

================================================================
unreferenced object 0xffff8cf68678e7c0 (size 64):
comm "mount", pid 746, jiffies 4294871231 (age 11.540s)
hex dump (first 32 bytes):
00 90 ef 82 f6 8c ff ff 00 00 00 00 41 01 00 00 ............A...
c7 00 00 00 bd 00 00 00 0a 00 00 00 48 00 00 00 ............H...
backtrace:
[<00000000c561ef24>] __kmem_cache_alloc_node+0x4d4/0x880
[<00000000d4e621d7>] kmalloc_trace+0x39/0x140
[<00000000837eee74>] v2_read_file_info+0x18a/0x3a0
[<0000000088f6c877>] dquot_load_quota_sb+0x2ed/0x770
[<00000000340a4782>] dquot_load_quota_inode+0xc6/0x1c0
[<0000000089a18bd5>] ext4_enable_quotas+0x17e/0x3a0 [ext4]
[<000000003a0268fa>] __ext4_fill_super+0x3448/0x3910 [ext4]
[<00000000b0f2a8a8>] ext4_fill_super+0x13d/0x340 [ext4]
[<000000004a9489c4>] get_tree_bdev+0x1dc/0x370
[<000000006e723bf1>] ext4_get_tree+0x1d/0x30 [ext4]
[<00000000c7cb663d>] vfs_get_tree+0x31/0x160
[<00000000320e1bed>] do_new_mount+0x1d5/0x480
[<00000000c074654c>] path_mount+0x22e/0xbe0
[<0000000003e97a8e>] do_mount+0x95/0xc0
[<000000002f3d3736>] __x64_sys_mount+0xc4/0x160
[<0000000027d2140c>] do_syscall_64+0x3f/0x90
================================================================

To solve this problem, we add a "failed_mount10" label and call ext4_quota_off_umount() under it to release the enabled quotas.

Fixes: 11215630 ("ext4: don't BUG on inconsistent journal feature")
Cc: stable@kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230327141630.156875-2-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

Conflicts:
fs/ext4/super.c

Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit e980e714)
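The error-path pattern described in this commit can be illustrated in user space. This is only a sketch under stated assumptions, not the ext4 code: struct model_sb and model_fill_super() are hypothetical names, and the boolean field merely stands in for the state that ext4_enable_quotas() sets up and ext4_quota_off_umount() tears down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a superblock; only tracks quota state. */
struct model_sb {
    bool quota_enabled;
};

/*
 * Hypothetical model of the mount path. recovery_err plays the role of
 * the value returned by ext4_mark_recovery_complete().
 */
static int model_fill_super(struct model_sb *sb, int recovery_err)
{
    assert(sb != NULL);

    sb->quota_enabled = true;       /* stands in for ext4_enable_quotas() */

    if (recovery_err)
        goto failed_mount10;        /* unwind instead of returning directly */

    return 0;

failed_mount10:
    sb->quota_enabled = false;      /* stands in for ext4_quota_off_umount() */
    return recovery_err;
}
```

Returning the error directly at the failure point would leave quota_enabled set, which is the user-space analogue of the leak reported by kmemleak above.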
-
- 06 Jul, 2023 5 commits
-
-
Committed by Zhihao Cheng
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS

--------------------------------

The following steps can trap ext4_writepages in a dead loop:

1. Consume free_clusters until free_clusters > 2 * sbi->s_resv_clusters, and free_clusters > EXT4_FREECLUSTERS_WATERMARK.
   // eg. free_clusters = 1422, sbi->s_resv_clusters = 512
   // nr_cpus = 4, EXT4_FREECLUSTERS_WATERMARK = 512
2. umount && mount. // dirty_clusters = 0
3. Run free_clusters tasks concurrently to write different files. Many tasks write (append) 4K of data by the da_write method, and each inode will consume one data block and one extent block in map_block.
   // There are (free_clusters - EXT4_FREECLUSTERS_WATERMARK = 910)
   // tasks choosing the da_write method; the remaining 512 tasks choose
   // the write_begin method. If the tasks that choose the da_write path run first:
   // dirty_clusters = 910, free_clusters = 1422
   // Tasks that choose the write_begin path will get ENOSPC:
   // free_clusters < (nclusters + dirty_clusters + resv_clusters)
   // 1422 < (1 + 910 + 512)
4. After a certain number of map_block iterations in ext4_writepages:
   // free_clusters = 0,
   // dirty_clusters = 910 - (1422 / 2) = 199
5. Delete one 4K file. // free_clusters = 1
6. ext4_writepages traps into a dead loop:
   mpage_map_and_submit_extent
   mpage_map_one_extent // ret = ENOSPC
   ext4_map_blocks -> ext4_ext_map_blocks -> ext4_mb_new_blocks -> ext4_claim_free_clusters:
   if (free_clusters >= (nclusters + dirty_clusters)) // false
   if (err == -ENOSPC && ext4_count_free_clusters(sb)) // true
   return err
   *give_up_on_write = true // won't be executed

Fix it by terminating ext4_writepages if no free blocks are generated.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 07a8109d)
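The arithmetic in steps 3 to 6 can be replayed in user space. The following is a minimal sketch, not the kernel code: struct cluster_counters, can_reserve(), can_allocate() and writeback_gives_up() are hypothetical names that only mirror the inequalities quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of the counters named in the message. */
struct cluster_counters {
    int64_t free_clusters;   /* clusters still free on disk */
    int64_t dirty_clusters;  /* clusters reserved by delayed allocation */
    int64_t resv_clusters;   /* clusters held back as a reserve */
};

/* Step 3 check: the reserve counts against the writer. */
static bool can_reserve(const struct cluster_counters *c, int64_t nclusters)
{
    assert(nclusters >= 0);
    return c->free_clusters >=
           nclusters + c->dirty_clusters + c->resv_clusters;
}

/* Step 6 check: only dirty clusters count against the allocation. */
static bool can_allocate(const struct cluster_counters *c, int64_t nclusters)
{
    assert(nclusters >= 0);
    return c->free_clusters >= nclusters + c->dirty_clusters;
}

/*
 * Step 6 retry decision before the fix: writeback only gives up on
 * ENOSPC when the free cluster count is zero.
 */
static bool writeback_gives_up(const struct cluster_counters *c)
{
    return c->free_clusters == 0;
}
```

With the values from step 6 (free_clusters = 1, dirty_clusters = 199), can_allocate() fails while writeback_gives_up() stays false, so the model retries forever, which matches the dead loop described above.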
-
Committed by Zhang Yi
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7IO1D
CVE: NA

--------------------------------

journal_clean_one_cp_list() has been merged into journal_shrink_one_cp_list(), but checkpoint buffer cleanup from the committing process is only a best effort: it should stop scanning once it meets a busy buffer, or else it causes a lot of useless buffer scans and checks. We caught a performance regression when running the fs_mark test below.

Test cmd:
./fs_mark -d scratch -s 1024 -n 10000 -t 1 -D 100 -N 100

Before merging checkpoint buffer cleanup:
FSUse%  Count  Size  Files/sec  App Overhead
    95  10000  1024     8304.9         49033

After merging checkpoint buffer cleanup:
FSUse%  Count  Size  Files/sec  App Overhead
    95  10000  1024     7649.0         50012
FSUse%  Count  Size  Files/sec  App Overhead
    95  10000  1024     2107.1         50871

After merging checkpoint buffer cleanup, the total loop count in journal_shrink_one_cp_list() can reach 6,261,600+ (50,000+ ~ 100,000+ in general), and most of those iterations are useless. This patch fixes it by passing 'shrink_type' into journal_shrink_one_cp_list() and adding a new 'SHRINK_BUSY_STOP' type to indicate that the scan should stop once it meets a busy buffer. After the fix, the loop count drops back to 10,000+.

After this fix:
FSUse%  Count  Size  Files/sec  App Overhead
    95  10000  1024     8558.4         49109

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 30b833d5)
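The effect of a stop-on-busy scan can be shown with a small user-space model. This is a sketch only: the array of flags stands in for the checkpoint list, shrink_list() and struct scan_result are hypothetical names, and SHRINK_BUSY_SKIP is an assumed counterpart to the 'SHRINK_BUSY_STOP' type named in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* SHRINK_BUSY_STOP comes from the message; SHRINK_BUSY_SKIP is assumed. */
enum shrink_type { SHRINK_BUSY_STOP, SHRINK_BUSY_SKIP };

struct scan_result {
    size_t visited;   /* loop iterations spent */
    size_t released;  /* non-busy buffers released */
};

/* Scan a list of buffers, where busy[i] marks buffer i as busy. */
static struct scan_result shrink_list(const bool *busy, size_t n,
                                      enum shrink_type type)
{
    struct scan_result r = { 0, 0 };

    assert(busy != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        r.visited++;
        if (busy[i]) {
            if (type == SHRINK_BUSY_STOP)
                break;      /* best effort: stop at the first busy buffer */
            continue;       /* otherwise keep scanning past it */
        }
        r.released++;
    }
    return r;
}
```

On a list whose second entry is busy, the stop variant spends two iterations where the skip variant walks the whole list, which is the kind of loop-count reduction the numbers above report.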
-
Committed by Zhang Yi
maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70WHL
CVE: NA

Reference: https://lore.kernel.org/linux-ext4/20230606135928.434610-1-yi.zhang@huaweicloud.com/T/#t

--------------------------------

__journal_try_to_free_buffer() has only one caller and its logic is much simpler now, so just remove it and open-code it in jbd2_journal_try_to_free_buffers().

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit b177d4d4)
-
Committed by Zhang Yi
maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70WHL
CVE: NA

Reference: https://lore.kernel.org/linux-ext4/20230606135928.434610-1-yi.zhang@huaweicloud.com/T/#t

--------------------------------

Before removing a checkpoint buffer from the t_checkpoint_list, we have to check the BH_Dirty and BH_Lock bits together to distinguish buffers that have not been written back from buffers that are being written back. But __cp_buffer_busy() checks them separately: it first checks the lock state and then checks dirty. The window between these two checks can race with the writeback procedure, which locks the buffer and clears buffer dirty before I/O completes. So it cannot guarantee that checkpointing buffers have been written back to disk if some error happens later. In the end, it may clean checkpoint transactions and lead to an inconsistent filesystem.

jbd2_journal_forget() and __journal_try_to_free_buffer() have the same problem (journal_unmap_buffer() escapes this issue because it runs under the buffer lock), so fix them by introducing a new helper that tries to hold the buffer lock and removes only buffers that are really clean.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Cc: stable@vger.kernel.org
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 80079353)
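The idea of checking dirtiness only while the buffer lock is held can be modeled with C11 atomics. This is a user-space sketch under stated assumptions, not the jbd2 helper: struct model_buffer, model_trylock(), model_unlock() and try_remove_clean() are hypothetical names, and the two atomic fields stand in for the BH_Lock and BH_Dirty bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer head with lock and dirty bits. */
struct model_buffer {
    atomic_flag locked;  /* models BH_Lock */
    atomic_bool dirty;   /* models BH_Dirty */
};

static void model_buffer_init(struct model_buffer *b)
{
    atomic_flag_clear(&b->locked);
    atomic_init(&b->dirty, false);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool model_trylock(struct model_buffer *b)
{
    return !atomic_flag_test_and_set(&b->locked);
}

static void model_unlock(struct model_buffer *b)
{
    atomic_flag_clear(&b->locked);
}

/*
 * Returns true only if the buffer is neither locked nor dirty. The
 * dirty bit is read while the lock is held, so a writer that locks the
 * buffer and then clears dirty cannot slip between the two checks.
 */
static bool try_remove_clean(struct model_buffer *b)
{
    bool clean;

    assert(b != NULL);

    if (!model_trylock(b))
        return false;            /* writeback may be in flight */
    clean = !atomic_load(&b->dirty);
    model_unlock(b);
    return clean;
}
```

In this model, a buffer whose writeback has locked it and cleared dirty still reports as not removable until the lock is released, which is the case the separate lock-then-dirty checks miss.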
-
Committed by Zhihao Cheng
maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70WHL
CVE: NA

Reference: https://lore.kernel.org/linux-ext4/20230606135928.434610-1-yi.zhang@huaweicloud.com/T/#t

--------------------------------

Consider the following process:

jbd2_journal_commit_transaction
// there are several dirty buffer heads in transaction->t_checkpoint_list
P1 wb_workfn
jbd2_log_do_checkpoint
if (buffer_locked(bh)) // false
__block_write_full_page
trylock_buffer(bh)
test_clear_buffer_dirty(bh)
if (!buffer_dirty(bh))
__jbd2_journal_remove_checkpoint(jh)
if (buffer_write_io_error(bh)) // false
>> bh IO error occurs <<
jbd2_cleanup_journal_tail
__jbd2_update_log_tail
jbd2_write_superblock
// The bh won't be replayed in next mount.

This can corrupt the ext4 image; a reproducer is available at [Link].

Since the writeback process clears buffer dirty after locking the buffer head, we can fix it by trying to lock the buffer and checking dirtiness while the buffer is locked. The buffer head can be removed if it is neither dirty nor locked.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Fixes: 470decc6 ("[PATCH] jbd2: initial copy of files from jbd")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 782635a8)
-