Commits · 3ed55b9fad0d8c21ba6c0803b4288c044cbbce07 · openeuler / Kernel

18 Jul, 2023 3 commits

netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain · 3ed55b9f

Committed by Pablo Neira Ayuso on Jul 17, 2023

mainline inclusion
from mainline-v6.4
commit 26b5a571
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7H68N
CVE: CVE-2023-3117

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=26b5a5712eb85e253724e56a54c17f8519bd8e4e

--------------------------------

Add a new state to deal with rule expressions deactivation from the
newrule error path, otherwise the anonymous set remains in the list in
inactive state for the next generation. Mark the set/chain transaction
as unbound so the abort path releases this object, set it as inactive in
the next generation so it is not reachable anymore from this transaction
and reference counter is dropped.
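
The state handling described above can be sketched as a small user-space model. This is an illustration only, not the kernel code: the `model_*` names, the struct layout, and the two-value phase enum are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the transaction phases. */
enum model_phase {
    MODEL_PREPARE,        /* normal deactivation during preparation */
    MODEL_PREPARE_ERROR,  /* deactivation from the newrule error path */
};

struct model_set {
    bool anonymous;    /* set created implicitly by a rule */
    bool bound;        /* transaction record is marked as bound */
    bool active_next;  /* reachable in the next generation */
    int use;           /* reference counter */
};

/* Deactivate a set that a failed (or deleted) rule referred to. */
static void model_deactivate_set(struct model_set *set, enum model_phase phase)
{
    assert(set->use > 0);
    if (phase == MODEL_PREPARE_ERROR)
        set->bound = false;        /* unbind so the abort path releases it */
    if (set->anonymous)
        set->active_next = false;  /* hide it from the next generation */
    set->use--;                    /* drop the reference held by the rule */
}

/* In this model the abort path skips records still marked as bound. */
static bool model_abort_releases(const struct model_set *set)
{
    return !set->bound;
}
```

With the error phase, a bound anonymous set ends up unbound, inactive in the next generation, and with its reference dropped, so the abort path in this model is the one that releases it.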

Fixes: 1240eb93 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

conflict:
	include/net/netfilter/nf_tables.h
	net/netfilter/nf_tables_api.c
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
(cherry picked from commit af739b3b)


netfilter: nf_tables: fix chain binding transaction logic · 6e6933ec

Committed by Pablo Neira Ayuso on Jul 17, 2023

mainline inclusion
from mainline-v6.4
commit 4bedf9ee
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7H68N
CVE: CVE-2023-3117

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bedf9eee016286c835e3d8fa981ddece5338795

--------------------------------

Add bound flag to rule and chain transactions as in 6a0a8d10
("netfilter: nf_tables: use-after-free in failing rule with bound set")
to skip them in case that the chain is already bound from the abort
path.

This patch fixes an imbalance in the chain use refcnt that triggers a
WARN_ON on the table and chain destroy path.

This patch also disallows nested chain bindings, which is not
supported from userspace.

The logic to deal with chain binding in nft_data_hold() and
nft_data_release() is not correct. The NFT_TRANS_PREPARE state needs a
special handling in case a chain is bound but next expressions in the
same rule fail to initialize as described by 1240eb93 ("netfilter:
nf_tables: incorrect error path handling with NFT_MSG_NEWRULE").

The chain is left bound if rule construction fails, so the objects
stored in this chain (and the chain itself) are released by the
transaction records from the abort path, follow up patch ("netfilter:
nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain")
completes this error handling.

When deleting an existing rule, chain bound flag is set off so the
rule expression .destroy path releases the objects.

Fixes: d0e2c7de ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

conflict:
	include/net/netfilter/nf_tables.h
	net/netfilter/nf_tables_api.c
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
(cherry picked from commit 04982868)


netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE · 72de2bad

Committed by Pablo Neira Ayuso on Jul 17, 2023

mainline inclusion
from mainline-v6.4-rc7
commit 1240eb93
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7H68N
CVE: CVE-2023-3117

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1240eb93f0616b21c675416516ff3d74798fdc97

--------------------------------

In case of error when adding a new rule that refers to an anonymous set,
deactivate expressions via NFT_TRANS_PREPARE state, not NFT_TRANS_RELEASE.
Thus, the lookup expression marks anonymous sets as inactive in the next
generation to ensure it is not reachable in this transaction anymore and
decrement the set refcount as introduced by c1592a89 ("netfilter:
nf_tables: deactivate anonymous set from preparation phase"). The abort
step takes care of undoing the anonymous set.

This is also consistent with rule deletion, where NFT_TRANS_PREPARE is
used. Note that this error path is exercised in the preparation step of
the commit protocol. This patch replaces nf_tables_rule_release() by the
deactivate and destroy calls, this time with NFT_TRANS_PREPARE.

Due to this incorrect error handling, it is possible to access a
dangling pointer to the anonymous set that remains in the transaction
list.

[1009.379054] BUG: KASAN: use-after-free in nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379106] Read of size 8 at addr ffff88816c4c8020 by task nft-rule-add/137110
[1009.379116] CPU: 7 PID: 137110 Comm: nft-rule-add Not tainted 6.4.0-rc4+ #256
[1009.379128] Call Trace:
[1009.379132]  <TASK>
[1009.379135]  dump_stack_lvl+0x33/0x50
[1009.379146]  ? nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379191]  print_address_description.constprop.0+0x27/0x300
[1009.379201]  kasan_report+0x107/0x120
[1009.379210]  ? nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379255]  nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379302]  nft_lookup_init+0xa5/0x270 [nf_tables]
[1009.379350]  nf_tables_newrule+0x698/0xe50 [nf_tables]
[1009.379397]  ? nf_tables_rule_release+0xe0/0xe0 [nf_tables]
[1009.379441]  ? kasan_unpoison+0x23/0x50
[1009.379450]  nfnetlink_rcv_batch+0x97c/0xd90 [nfnetlink]
[1009.379470]  ? nfnetlink_rcv_msg+0x480/0x480 [nfnetlink]
[1009.379485]  ? __alloc_skb+0xb8/0x1e0
[1009.379493]  ? __alloc_skb+0xb8/0x1e0
[1009.379502]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[1009.379509]  ? unwind_get_return_address+0x2a/0x40
[1009.379517]  ? write_profile+0xc0/0xc0
[1009.379524]  ? avc_lookup+0x8f/0xc0
[1009.379532]  ? __rcu_read_unlock+0x43/0x60

Fixes: 958bee14 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

conflict:
	net/netfilter/nf_tables_api.c
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
(cherry picked from commit a45e538a)


17 Jul, 2023 2 commits

!1420 [sync] PR-1415: Fix generic/299 fail · 4758cc88

Committed by openeuler-ci-bot on Jul 17, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1415 
 
PR sync from: Zhihao Cheng <chengzhihao1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/TSXRUVDLVGILRT2XURWM3RIMGTKSEUZT/ 
Revert origin fix, add debug message.

Zhihao Cheng (2):
  Revert "ext4: Stop trying writing pages if no free blocks generated"
  ext4: Add debug message to notify user space is out of free


-- 
2.31.1
 
https://gitee.com/openeuler/kernel/issues/I7CBCS 
 
Link:https://gitee.com/openeuler/kernel/pulls/1420 

Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


!1378 [sync] PR-1295: blk-wbt: don't show valid wbt_lat_usec in · 374ca1b0

Committed by openeuler-ci-bot on Jul 17, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1295 
 
PR sync from: Li Lingfeng <lilingfeng3@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/OFH6G7EUCRF236P635HQ5LEDXVZ4AEJJ/ 
Yu Kuai (2):
  blk-wbt: make enable_state more accurate
  blk-wbt: don't show valid wbt_lat_usec in sysfs while wbt is disabled


-- 
2.31.1
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1378 

Reviewed-by: Yu Kuai <yukuai3@huawei.com> 
Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


14 Jul, 2023 4 commits

ext4: Add debug message to notify user space is out of free · ad36cedd

Committed by Zhihao Cheng on Jul 14, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS
CVE: NA

--------------------------------

Add debug message to notify user that ext4_writepages is stuck in loop
caused by ENOSPC.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 4ae7e703)


Revert "ext4: Stop trying writing pages if no free blocks generated" · b42d3e12

Committed by Zhihao Cheng on Jul 14, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7CBCS
CVE: NA

--------------------------------

This reverts commit 07a8109d.

When ext4 runs out of space, data can be lost in ext4_writepages:
if many blocks are preallocated for some files, the e4b bitmap
differs from the block bitmap, and the block bitmap accounts for
more free blocks.

    ext4_writepages                         P2
ext4_mb_new_blocks                  ext4_map_blocks
 ext4_mb_regular_allocator // No free bits in e4b bitmap
 ext4_mb_discard_preallocations_should_retry
  ext4_mb_discard_preallocations
   ext4_mb_discard_group_preallocations
    ext4_mb_release_inode_pa // updates e4b bitmap by pa->pa_free
     mb_free_blocks
                                     ext4_mb_new_blocks
                                      ext4_mb_regular_allocator
                                      // Got e4b bitmap's free bits
 ext4_mb_regular_allocator  // After 3 times retrying, ret ENOSPC

ext4_writepages
 mpage_map_and_submit_extent
  mpage_map_one_extent // ret ENOSPC
  if (err == -ENOSPC && EXT4_SB(sb)->s_mb_free_pending)
  // s_mb_free_pending is 0
  *give_up_on_write = true  // Abandon writeback, data lost!
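
The decision at the end of the trace above can be modelled in a few lines. This is a sketch based only on the condition quoted in the trace; `model_give_up_on_write` and its parameters are hypothetical names, and the real function takes more state into account.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the check being reverted: on -ENOSPC, writeback is retried only
 * while blocks freed by pending transactions may still become available.
 * free_pending stands in for EXT4_SB(sb)->s_mb_free_pending.
 */
static bool model_give_up_on_write(int err, unsigned int free_pending)
{
    assert(err <= 0);              /* kernel-style error codes */
    if (err == 0)
        return false;              /* mapping succeeded, nothing to give up */
    if (err == -ENOSPC && free_pending)
        return false;              /* retry later */
    return true;                   /* abandon writeback for these pages */
}
```

In the race shown above, P2 consumes the bits released from the preallocation, so the writer sees -ENOSPC while `free_pending` is 0 and the model returns true, which corresponds to the lost data.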

Fixes: 07a8109d ("ext4: Stop trying writing pages if no free ...")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 5f142164)


Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel... · 3fa6953f
Committed by openeuler-sync-bot on Jul 14, 2023
```
Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel into openEuler-22.03-LTS-SP1
```

!759 [kernel-openEuler-22.03-LTS-SP1] kernel: fix a type error with 5.10 kernel... · f4841ef1

Committed by openeuler-ci-bot on Jul 14, 2023

!759 [kernel-openEuler-22.03-LTS-SP1] kernel: fix a type error with 5.10 kernel on openEuler 22.03 LTS SP1 system

Merge Pull Request from: @zhujun3 
 
This PR adapts the 5.10 kernel to the BC-Linux for Euler V22.10 U1 OS; step one is to compile the kernel.

Kernel Issue:

(https://gitee.com/openeuler/kernel/issues/I7E2XC?from=project-issue)
    
    
 
Link:https://gitee.com/openeuler/kernel/pulls/759 

Reviewed-by: sanglipeng <sanglipeng1@jd.com> 
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com> 
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>


13 Jul, 2023 6 commits

Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel... · 52b0d429
Committed by openeuler-sync-bot on Jul 13, 2023
```
Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel into openEuler-22.03-LTS-SP1
```

ubifs: Fix memory leak in do_rename · 12d98636

Committed by Mårten Lindahl on Jul 08, 2023

mainline inclusion
from mainline-v6.4-rc1
commit 3a36d20e
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7JO0G
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a36d20e012903f45714df2731261fdefac900cb

--------------------------------

If renaming a file in an encrypted directory, function
fscrypt_setup_filename allocates memory for a file name. This name is
never used, and before returning to the caller the memory for it is not
freed.

When running kmemleak on it we see that it is registered as a leak. The
report below is triggered by a simple program 'rename' that renames a
file in an encrypted directory:

  unreferenced object 0xffff888101502840 (size 32):
    comm "rename", pid 9404, jiffies 4302582475 (age 435.735s)
    backtrace:
      __kmem_cache_alloc_node
      __kmalloc
      fscrypt_setup_filename
      do_rename
      ubifs_rename
      vfs_rename
      do_renameat2

To fix this we can remove the call to fscrypt_setup_filename as it's not
needed.

Fixes: 278d9a24 ("ubifs: Rename whiteout atomically")
Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
(cherry picked from commit 6bc63230)


ubifs: Free memory for tmpfile name · 939a5822

Committed by Mårten Lindahl on Jul 08, 2023

mainline inclusion
from mainline-v6.4-rc1
commit 1fb815b3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7JO0G
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1fb815b38bb31d6af9bd0540b8652a0d6fe6cfd3

--------------------------------

When opening a ubifs tmpfile on an encrypted directory, function
fscrypt_setup_filename allocates memory for the name that is to be
stored in the directory entry, but after the name has been copied to the
directory entry inode, the memory is not freed.

When running kmemleak on it we see that it is registered as a leak. The
report below is triggered by a simple program 'tmpfile' just opening a
tmpfile:

  unreferenced object 0xffff88810178f380 (size 32):
    comm "tmpfile", pid 509, jiffies 4294934744 (age 1524.742s)
    backtrace:
      __kmem_cache_alloc_node
      __kmalloc
      fscrypt_setup_filename
      ubifs_tmpfile
      vfs_tmpfile
      path_openat

Free this memory after it has been copied to the inode.
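
The allocate, copy, free pairing described above can be illustrated with a user-space model. All `model_*` names are hypothetical; `malloc`/`free` stand in for the kernel allocation done by fscrypt_setup_filename and its matching release.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int model_live_allocs;  /* number of name buffers not yet freed */

struct model_name {
    char *buf;
    size_t len;
};

/* Stand-in for the setup step: allocates a buffer holding the name. */
static int model_setup_filename(const char *src, struct model_name *nm)
{
    nm->len = strlen(src);
    nm->buf = malloc(nm->len + 1);
    if (!nm->buf)
        return -1;
    memcpy(nm->buf, src, nm->len + 1);
    model_live_allocs++;
    return 0;
}

/* Stand-in for the release step that the fix adds to the tmpfile path. */
static void model_free_filename(struct model_name *nm)
{
    if (nm->buf) {
        free(nm->buf);
        nm->buf = NULL;
        model_live_allocs--;
    }
}

/* Copy the name into the (model) inode, then free the temporary buffer. */
static int model_tmpfile(const char *src, char *inode_name, size_t cap)
{
    struct model_name nm;

    assert(cap > 0);
    if (model_setup_filename(src, &nm))
        return -1;
    if (nm.len + 1 > cap) {
        model_free_filename(&nm);  /* error path must free as well */
        return -1;
    }
    memcpy(inode_name, nm.buf, nm.len + 1);
    model_free_filename(&nm);      /* without this call the buffer leaks */
    return 0;
}
```

After any call to `model_tmpfile`, successful or not, `model_live_allocs` is back at zero, which is the property kmemleak was reporting as violated.
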
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
(cherry picked from commit 3c594ca7)


!1389 [sync] PR-1312: quota: fix race condition between dqput() and dquot_mark_dquot_dirty() · c3ef7795

Committed by openeuler-ci-bot on Jul 13, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1312 
 
PR sync from: Baokun Li <libaokun1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/7ATD3RNUBURBEYA34VGOOZB53J377OZQ/ 
Baokun Li (5):
  quota: factor out dquot_write_dquot()
  quota: rename dquot_active() to inode_quota_active()
  quota: add new helper dquot_active()
  quota: fix dqput() to follow the guarantees dquot_srcu should provide
  quota: simplify drop_dquot_ref()


-- 
2.31.1
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1389 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


!1392 [sync] PR-1376: jbd2: Check 'jh->b_transaction' before remove it from checkpoint · 564bbed3

Committed by openeuler-ci-bot on Jul 13, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1376 
 
PR sync from: Zhihao Cheng <chengzhihao1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/XNJZFYFNQIMIIQRPICSJB7KUZJDPS27T/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1392 

Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


!1308 [sync] PR-1280: cgroup: always put cset in cgroup_css_set_put_fork · 2c5ad3ab

Committed by openeuler-ci-bot on Jul 13, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1280 
 
    A successful call to cgroup_css_set_fork() will always have taken
    a ref on kargs->cset (regardless of CLONE_INTO_CGROUP), so always
    do a corresponding put in cgroup_css_set_put_fork().

    Without this, a cset and its contained css structures will be
    leaked for some fork failures.  The following script reproduces
    the leak for a fork failure due to exceeding pids.max in the
    pids controller.  A similar thing can happen if we jump to the
    bad_fork_cancel_cgroup label in copy_process().

    [ -z "$1" ] && echo "Usage $0 pids-root" && exit 1
    PID_ROOT=$1
    CGROUP=$PID_ROOT/foo

    [ -e $CGROUP ] && rmdir $CGROUP
    mkdir $CGROUP
    echo 5 > $CGROUP/pids.max
    echo $$ > $CGROUP/cgroup.procs

    fork_bomb()
    {
            set -e
            for i in $(seq 10); do
                    /bin/sleep 3600 &
            done
    }

    (fork_bomb) &
    wait
    echo $$ > $PID_ROOT/cgroup.procs
    kill $(cat $CGROUP/cgroup.procs)
    rmdir $CGROUP 
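
The reference balance that the description above relies on can be sketched as follows. This is a simplified model with hypothetical `model_*` names; it only tracks the one reference on the css_set and ignores locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_cset {
    int refs;
};

struct model_fork_args {
    bool into_cgroup;         /* stands in for CLONE_INTO_CGROUP */
    struct model_cset *cset;  /* reference taken for the fork */
};

/* A successful setup always takes a reference on the cset. */
static void model_css_set_fork(struct model_fork_args *args,
                               struct model_cset *cset)
{
    cset->refs++;
    args->cset = cset;
}

/* The fix: drop that reference regardless of into_cgroup. */
static void model_css_set_put_fork(struct model_fork_args *args)
{
    if (args->cset) {
        assert(args->cset->refs > 0);
        args->cset->refs--;
        args->cset = NULL;
    }
    /* cleanup specific to into_cgroup would follow here */
}
```

If the put were done only when `into_cgroup` is set, a failed plain fork would leave `refs` one higher than before, which is the leak the reproducer triggers.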
 
Link:https://gitee.com/openeuler/kernel/pulls/1308 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


12 Jul, 2023 12 commits

jbd2: Check 'jh->b_transaction' before remove it from checkpoint · 663a92d7

Committed by Zhihao Cheng on Jul 11, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70WHL
CVE: NA

--------------------------------

The following process will corrupt an ext4 image:
Step 1:
jbd2_journal_commit_transaction
 __jbd2_journal_insert_checkpoint(jh, commit_transaction)
 // Put jh into trans1->t_checkpoint_list
 journal->j_checkpoint_transactions = commit_transaction
 // Put trans1 into journal->j_checkpoint_transactions

Step 2:
do_get_write_access
 test_clear_buffer_dirty(bh) // clear buffer dirty, set jbd dirty
 __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2

Step 3:
drop_cache
 journal_shrink_one_cp_list
  jbd2_journal_try_remove_checkpoint
   if (!trylock_buffer(bh))  // lock bh, true
   if (buffer_dirty(bh))     // buffer is not dirty
   __jbd2_journal_remove_checkpoint(jh)
   // remove jh from trans1->t_checkpoint_list

Step 4:
jbd2_log_do_checkpoint
 trans1 = journal->j_checkpoint_transactions
 // jh is not in trans1->t_checkpoint_list
 jbd2_cleanup_journal_tail(journal)  // trans1 is done

Step 5: Power cut. trans2 is not committed, so jh is lost on the next mount.

Fix it by checking 'jh->b_transaction' before removing it from the checkpoint.
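
The added check can be modelled in user space as below. This is a sketch under assumptions: `struct model_jh` and `model_try_remove_checkpoint` are hypothetical names, and the boolean fields replace the real buffer-head lock and dirty state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_jh {
    const void *b_transaction;  /* transaction currently owning jh, or NULL */
    bool buffer_locked;         /* someone else holds the buffer lock */
    bool buffer_dirty;          /* buffer still has to be written back */
    bool on_checkpoint_list;
};

/* Remove jh from the checkpoint list only if nothing else still needs it. */
static int model_try_remove_checkpoint(struct model_jh *jh)
{
    assert(jh != NULL);
    if (jh->b_transaction)      /* the new check: jh was re-filed (Step 2) */
        return -EBUSY;
    if (jh->buffer_locked)
        return -EBUSY;
    if (jh->buffer_dirty)
        return -EBUSY;
    jh->on_checkpoint_list = false;
    return 0;
}
```

In the scenario above the buffer is clean and unlocked after Step 2, so only the `b_transaction` check keeps jh on trans1's checkpoint list in this model.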

Fixes: 80079353 ("jbd2: fix a race when checking checkpoint ...")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
(cherry picked from commit 7723e91d)


quota: simplify drop_dquot_ref() · 8d16ece8

Committed by Baokun Li on Jul 05, 2023

maillist inclusion
category: bugfix
bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR

Reference: https://www.spinics.net/lists/kernel/msg4844759.html

----------------------------------------

As Honza said, remove_inode_dquot_ref() currently does not release the
last dquot reference but instead adds the dquot to tofree_head list. This
is because dqput() can sleep while dropping of the last dquot reference
(writing back the dquot and calling ->release_dquot()) and that must not
happen under dq_list_lock. Now that dqput() queues the final dquot cleanup
into a workqueue, remove_inode_dquot_ref() can call dqput() unconditionally
and we can significantly simplify it.

Here we open code the simplified code of remove_inode_dquot_ref() into
remove_dquot_ref() and remove the function put_dquot_list() which is no
longer used.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit a13fcef3)


quota: fix dqput() to follow the guarantees dquot_srcu should provide · 50a9c1dc

Committed by Baokun Li on Jul 05, 2023

maillist inclusion
category: bugfix
bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR

Reference: https://www.spinics.net/lists/kernel/msg4844759.html

----------------------------------------

The dquot_mark_dquot_dirty() using dquot references from the inode
should be protected by dquot_srcu. quota_off code takes care to call
synchronize_srcu(&dquot_srcu) to not drop dquot references while they
are used by other users. But dquot_transfer() breaks this assumption.
We call dquot_transfer() to drop the last reference of dquot and add
it to free_dquots, but there may still be other users using the dquot
at this time, as shown in the function graph below:

       cpu1              cpu2
_________________|_________________
wb_do_writeback         CHOWN(1)
 ...
  ext4_da_update_reserve_space
   dquot_claim_block
    ...
     dquot_mark_dquot_dirty // try to dirty old quota
      test_bit(DQ_ACTIVE_B, &dquot->dq_flags) // still ACTIVE
      if (test_bit(DQ_MOD_B, &dquot->dq_flags))
      // test no dirty, wait dq_list_lock
                    ...
                     dquot_transfer
                      __dquot_transfer
                      dqput_all(transfer_from) // rls old dquot
                       dqput // last dqput
                        dquot_release
                         clear_bit(DQ_ACTIVE_B, &dquot->dq_flags)
                        atomic_dec(&dquot->dq_count)
                        put_dquot_last(dquot)
                         list_add_tail(&dquot->dq_free, &free_dquots)
                         // add the dquot to free_dquots
      if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags))
        add dqi_dirty_list // add released dquot to dirty_list

This can cause various issues, such as dquot being destroyed by
dqcache_shrink_scan() after being added to free_dquots, which can trigger
a UAF in dquot_mark_dquot_dirty(); or after dquot is added to free_dquots
and then to dirty_list, it is added to free_dquots again after
dquot_writeback_dquots() is executed, which causes the free_dquots list to
be corrupted and triggers a UAF when dqcache_shrink_scan() is called for
freeing dquot twice.

As Honza said, we need to fix dquot_transfer() to follow the guarantees
dquot_srcu should provide. But calling synchronize_srcu() directly from
dquot_transfer() is too expensive (and mostly unnecessary). So we add
dquot whose last reference should be dropped to the new global dquot
list releasing_dquots, and then queue work item which would call
synchronize_srcu() and after that perform the final cleanup of all the
dquots on releasing_dquots.
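
The deferred-release scheme described above can be sketched with a small state machine. This is a simplified illustration, not the kernel implementation: the `model_*` names and the three-state enum are assumptions, and the grace period is represented only by the point at which the worker is called.

```c
#include <assert.h>
#include <stddef.h>

enum model_dq_state {
    MODEL_DQ_IN_USE,     /* referenced by at least one user */
    MODEL_DQ_RELEASING,  /* last reference dropped, cleanup is queued */
    MODEL_DQ_FREE,       /* cleanup done, dquot may be reclaimed */
};

struct model_dquot {
    int count;
    enum model_dq_state state;
};

/* Drop one reference; the last put only queues the dquot for release. */
static void model_dqput(struct model_dquot *dq)
{
    assert(dq->count > 0);
    if (--dq->count == 0)
        dq->state = MODEL_DQ_RELEASING;
}

/*
 * Deferred worker. In the kernel it would first wait for a grace period
 * (synchronize_srcu()), so no reader can still be using a queued dquot.
 * Returns how many dquots were moved to the free state.
 */
static size_t model_release_work(struct model_dquot *dqs, size_t n)
{
    size_t released = 0;

    for (size_t i = 0; i < n; i++) {
        if (dqs[i].state == MODEL_DQ_RELEASING && dqs[i].count == 0) {
            dqs[i].state = MODEL_DQ_FREE;
            released++;
        }
    }
    return released;
}
```

The design point is that the expensive wait happens once per batch in the worker rather than on every final put.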

Fixes: 4580b30e ("quota: Do not dirty bad dquots")
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit d82ddaab)


quota: add new helper dquot_active() · 364aa369

Committed by Baokun Li on Jul 05, 2023

maillist inclusion
category: bugfix
bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR

Reference: https://www.spinics.net/lists/kernel/msg4844759.html

----------------------------------------

Add new helper function dquot_active() to make the code more concise.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit 3fb7aa3a)


quota: rename dquot_active() to inode_quota_active() · 2dc40f74

Committed by Baokun Li on Jul 05, 2023

maillist inclusion
category: bugfix
bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR

Reference: https://www.spinics.net/lists/kernel/msg4844759.html

----------------------------------------

Now we have a helper function dquot_dirty() to determine if dquot has
DQ_MOD_B bit. dquot_active() can easily be misunderstood as a helper
function to determine if dquot has DQ_ACTIVE_B bit. So we avoid this by
renaming it to inode_quota_active() and later on we will add the helper
function dquot_active() to determine if dquot has DQ_ACTIVE_B bit.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit 329a1eb4)


quota: factor out dquot_write_dquot() · 42d3a2de

Committed by Baokun Li on Jul 05, 2023

maillist inclusion
category: bugfix
bugzilla: 188812,https://gitee.com/openeuler/kernel/issues/I7E0YR

Reference: https://www.spinics.net/lists/kernel/msg4844759.html

----------------------------------------

Refactor out dquot_write_dquot() to reduce duplicate code.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
(cherry picked from commit 0a3781ae)


Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel... · 593f244e
Committed by openeuler-sync-bot on Jul 12, 2023
```
Merge branch 'openEuler-22.03-LTS-SP1' of https://gitee.com/openeuler/kernel into openEuler-22.03-LTS-SP1
```

!1329 [sync] PR-1325: jbd2: fix several checkpoint · 6c44b563

Committed by openeuler-ci-bot on Jul 12, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1325 
 
PR sync from: Zhihao Cheng <chengzhihao1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/QARA5X5OQUKRFUIORG2YVB6YE3V5CGQB/ 
Zhang Yi (4):
  jbd2: remove journal_clean_one_cp_list()
  jbd2: fix a race when checking checkpoint buffer busy
  jbd2: remove __journal_try_to_free_buffer()
  jbd2: fix checkpoint cleanup performance regression

Zhihao Cheng (1):
  jbd2: Fix wrongly judgement for buffer head removing while doing
    checkpoint


-- 
2.31.1
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1329 

Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


!1332 [sync] PR-1314: ext4: Stop trying writing pages if no free blocks generated · 3bb5ef86

Committed by openeuler-ci-bot on Jul 12, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1314 
 
PR sync from: Zhihao Cheng <chengzhihao1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/ALOJ633HB2KNGCGZVSSVUI34JMM2MTRP/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1332 

Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


dm thin: fix deadlock when swapping to thin device · 73c633e6

Committed by Coly Li on Jul 08, 2023

mainline inclusion
from mainline-v6.3-rc4
commit 9bbf5fee
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7JLUM
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.4&id=9bbf5feecc7eab2c370496c1c161bbfe62084028

----------------------------------------

It is a known issue that a dm-thin volume cannot be used as
swap; otherwise a deadlock may happen when dm-thin's internal memory
demand triggers swap I/O on the dm-thin volume itself.

But thanks to commit a666e5c0 ("dm: fix deadlock when swapping to
encrypted device"), the limit_swap_bios target flag can also be used
for dm-thin to avoid the recursive I/O when it is used as swap.

The fix is simply to set ti->limit_swap_bios to true in both pool_ctr()
and thin_ctr().

In my test, I create a dm-thin volume /dev/vg/swap and use it as swap
device. Then I run fio on another dm-thin volume /dev/vg/main and use
large --blocksize to trigger swap I/O onto /dev/vg/swap.

The following fio command line is used in my test:
  fio --name recursive-swap-io --lockmem 1 --iodepth 128 \
      --ioengine libaio --filename /dev/vg/main --rw randrw \
      --blocksize 1M --numjobs 32 --time_based --runtime=12h

Without this fix, the whole system can be locked up within 15 seconds.

With this fix, no deadlock or hung task is observed after
2 hours of running fio.

Furthermore, if blocksize is changed from 1M to 128M, after around 30
seconds fio has no visible I/O, and the out-of-memory killer message
shows up in the kernel log. After around 20 minutes all fio processes
are killed and the whole system becomes responsive again.

This is exactly what is expected when recursive I/O happens on dm-thin
volume when it is used as swap.

Depends-on: a666e5c0 ("dm: fix deadlock when swapping to encrypted device")
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Conflict:
  drivers/md/dm-thin.c
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
(cherry picked from commit 6283fa7e)


blk-wbt: don't show valid wbt_lat_usec in sysfs while wbt is disabled · 9fb770fe

Committed by Yu Kuai on Jul 03, 2023

mainline inclusion
from mainline-v6.2-rc1
commit 3642ef4d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6Z1UG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc7&id=3642ef4d95699193c4a461862382e643ae3720f0

----------------------------------------

Currently, if wbt is initialized and then disabled by
wbt_disable_default(), sysfs still shows a valid wbt_lat_usec, which
can mislead users into thinking that wbt is still enabled.

This patch shows wbt_lat_usec as zero if wbt is disabled.
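
The reporting rule above amounts to a single check, sketched here as a user-space function. `model_wbt_lat_usec` and its parameters are hypothetical names; the conversion from nanoseconds to microseconds is an assumption about how the stored value relates to the sysfs unit.

```c
#include <stdbool.h>

/*
 * Value reported for wbt_lat_usec in this model: zero when throttling is
 * disabled, otherwise the configured minimum latency in microseconds.
 */
static unsigned long long model_wbt_lat_usec(bool wbt_disabled,
                                             unsigned long long min_lat_nsec)
{
    if (wbt_disabled)
        return 0;
    return min_lat_nsec / 1000;
}
```

Reporting zero makes the disabled state visible to a reader of the attribute without needing a second attribute.
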
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reported-and-tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019121518.3865235-5-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
(cherry picked from commit 18e44529)


blk-wbt: make enable_state more accurate · 4db0497b

Committed by Yu Kuai on Jul 03, 2023

mainline inclusion
from mainline-v6.2-rc1
commit a9a236d2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6Z1UG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3&id=a9a236d238a5e8ab2e74ca62c2c7ba5dd435af77

----------------------------------------

Currently, if the user disables wbt through sysfs, 'enable_state' will be
'WBT_STATE_ON_MANUAL', which is confusing. Add a new state
'WBT_STATE_OFF_MANUAL' to cover that case.
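
A minimal sketch of the resulting state choice, assuming that a sysfs write of 0 means "disable" and any other latency means "enable". The `MODEL_*` enum and the function name are hypothetical, the enumerator values are not meant to match the kernel's, and special values such as a reset to the default are left out.

```c
/* Hypothetical mirror of the enable states, including the new one. */
enum model_wbt_state {
    MODEL_WBT_ON_DEFAULT,
    MODEL_WBT_ON_MANUAL,
    MODEL_WBT_OFF_DEFAULT,
    MODEL_WBT_OFF_MANUAL,  /* the state this patch adds */
};

/* State recorded after the user writes a latency target through sysfs. */
static enum model_wbt_state
model_state_after_sysfs_write(unsigned long long lat_usec)
{
    return lat_usec ? MODEL_WBT_ON_MANUAL : MODEL_WBT_OFF_MANUAL;
}
```

With a distinct manual-off state, later code can tell a user's explicit choice apart from a default that the kernel is free to change.
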
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019121518.3865235-4-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
(cherry picked from commit 52ad37a2)


11 Jul, 2023 6 commits

!1340 [sync] PR-1286: ext4: turning quotas off if mount failed after enable quotas · 36fbed46

Committed by openeuler-ci-bot on Jul 11, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1286 
 
PR sync from: Baokun Li <libaokun1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/X3ZSP2AARUKCTNGQH7V2EC4D2KQ67AMO/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1340 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


!1367 [sync] PR-1324: io_uring: hold uring mutex around poll removal · 617c037e

Committed by openeuler-ci-bot on Jul 11, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1324 
 
PR sync from: Zhong Jinghua <zhongjinghua@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/2P2KGVU22TWAYJ5N3JDYWA7EXWJOL2OS/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1367 

Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


!1363 [sync] PR-1287: ipvlan:Fix out-of-bounds caused by unclear skb->cb · 492a8f90

Committed by openeuler-ci-bot on Jul 11, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1287 
 
PR sync from: Zhengchao Shao <shaozhengchao@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/AM4RDLF2OSU74VL45PDNQCRW7E3VXA63/ 
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1363 

Reviewed-by: Yue Haibing <yuehaibing@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>


io_uring: hold uring mutex around poll removal · 1f614ed5

Committed by Jens Axboe on Jul 05, 2023

stable inclusion
from stable-v5.10.185
commit 4716c73b188566865bdd79c3a6709696a224ac04
category: bugfix
bugzilla: 188954, https://gitee.com/src-openeuler/kernel/issues/I7GVI5?from=project-issue
CVE: CVE-2023-3389

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4716c73b188566865bdd79c3a6709696a224ac04

----------------------------------------

Snipped from commit 9ca9fb24 upstream.

While reworking the poll hashing in the v6.0 kernel, we ended up
grabbing the ctx->uring_lock in poll update/removal. This also fixed
a bug with linked timeouts racing with timeout expiry and poll
removal.

Bring back just the locking fix for that.
Reported-and-tested-by: Querijn Voet <querijnqyn@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
(cherry picked from commit 43a7aef4)

1f614ed5

ipvlan:Fix out-of-bounds caused by unclear skb->cb · 16bcf782

Committed by t.feng on Jul 03, 2023

stable inclusion
from stable-v5.10.181
commit f4a371d3f5a7a71dff1ab48b3122c5cf23cc7ad5
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7GVI1
CVE: CVE-2023-3090

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f4a371d3f5a7a71dff1ab48b3122c5cf23cc7ad5

--------------------------------

[ Upstream commit 90cbed52 ]

When an skb is enqueued on the qdisc, fq_skb_cb(skb)->time_to_send is
written, and that field lives in skb->cb. IPCB(skb_in)->opt, which
aliases the same bytes, is later used in __ip_options_echo, so the
memcpy there can go out of bounds and corrupt the stack.
We should clear skb->cb before ip_local_out or ip6_local_out.
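The aliasing described above can be illustrated with a small user-space sketch. This is not the ipvlan code: `toy_skb`, `toy_sched_cb`, `toy_ip_cb`, `OPT_BUF_SIZE` and all `toy_*` functions are hypothetical names made up for the illustration. Only the 48-byte size of the scratch area matches skb->cb.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CB_SIZE 48       /* skb->cb is 48 bytes in the kernel */
#define OPT_BUF_SIZE 40  /* hypothetical size of the buffer options are copied into */

/* Hypothetical stand-in for a packet that carries a shared scratch area. */
struct toy_skb {
    unsigned char cb[CB_SIZE];
};

/* How a (hypothetical) queueing layer interprets the scratch area. */
struct toy_sched_cb {
    uint64_t time_to_send;
};

/* How a (hypothetical) IP layer interprets the same bytes. */
struct toy_ip_cb {
    unsigned char optlen;
    unsigned char pad[7];
};

/* The queueing layer stores its timestamp at the start of the scratch area. */
void toy_enqueue(struct toy_skb *skb, uint64_t when)
{
    struct toy_sched_cb s = { .time_to_send = when };

    assert(sizeof(s) <= sizeof(skb->cb));
    memcpy(skb->cb, &s, sizeof(s));
}

/* The IP layer reads an option length from the very same bytes. */
unsigned char toy_ip_optlen(const struct toy_skb *skb)
{
    struct toy_ip_cb ip;

    memcpy(&ip, skb->cb, sizeof(ip));
    return ip.optlen;
}

/* The fix in miniature: wipe the scratch area before handing the packet on. */
void toy_clear_cb(struct toy_skb *skb)
{
    memset(skb->cb, 0, sizeof(skb->cb));
}

/* Returns 1 if copying optlen bytes would overrun the destination buffer. */
int toy_echo_would_overflow(const struct toy_skb *skb)
{
    return toy_ip_optlen(skb) > OPT_BUF_SIZE;
}
```

In this model a stale timestamp such as `UINT64_MAX` reads back as an option length of 255, which exceeds the destination buffer. After `toy_clear_cb()` the length reads as 0.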

v2:
1. clean the stack info
2. use IPCB/IP6CB instead of skb->cb

Crash on stable-5.10 (reproduced on a KASAN kernel).
Stack info:
[ 2203.651571] BUG: KASAN: stack-out-of-bounds in
__ip_options_echo+0x589/0x800
[ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task
swapper/3/0
[ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted
5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1
[ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014
[ 2203.655475] Call Trace:
[ 2203.655481]  <IRQ>
[ 2203.655501]  dump_stack+0x9c/0xd3
[ 2203.655514]  print_address_description.constprop.0+0x19/0x170
[ 2203.655530]  __kasan_report.cold+0x6c/0x84
[ 2203.655586]  kasan_report+0x3a/0x50
[ 2203.655594]  check_memory_region+0xfd/0x1f0
[ 2203.655601]  memcpy+0x39/0x60
[ 2203.655608]  __ip_options_echo+0x589/0x800
[ 2203.655654]  __icmp_send+0x59a/0x960
[ 2203.655755]  nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4]
[ 2203.655763]  reject_tg+0x77/0x1bf [ipt_REJECT]
[ 2203.655772]  ipt_do_table+0x691/0xa40 [ip_tables]
[ 2203.655821]  nf_hook_slow+0x69/0x100
[ 2203.655828]  __ip_local_out+0x21e/0x2b0
[ 2203.655857]  ip_local_out+0x28/0x90
[ 2203.655868]  ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan]
[ 2203.655931]  ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan]
[ 2203.655967]  ipvlan_queue_xmit+0xb3/0x190 [ipvlan]
[ 2203.655977]  ipvlan_start_xmit+0x2e/0xb0 [ipvlan]
[ 2203.655984]  xmit_one.constprop.0+0xe1/0x280
[ 2203.655992]  dev_hard_start_xmit+0x62/0x100
[ 2203.656000]  sch_direct_xmit+0x215/0x640
[ 2203.656028]  __qdisc_run+0x153/0x1f0
[ 2203.656069]  __dev_queue_xmit+0x77f/0x1030
[ 2203.656173]  ip_finish_output2+0x59b/0xc20
[ 2203.656244]  __ip_finish_output.part.0+0x318/0x3d0
[ 2203.656312]  ip_finish_output+0x168/0x190
[ 2203.656320]  ip_output+0x12d/0x220
[ 2203.656357]  __ip_queue_xmit+0x392/0x880
[ 2203.656380]  __tcp_transmit_skb+0x1088/0x11c0
[ 2203.656436]  __tcp_retransmit_skb+0x475/0xa30
[ 2203.656505]  tcp_retransmit_skb+0x2d/0x190
[ 2203.656512]  tcp_retransmit_timer+0x3af/0x9a0
[ 2203.656519]  tcp_write_timer_handler+0x3ba/0x510
[ 2203.656529]  tcp_write_timer+0x55/0x180
[ 2203.656542]  call_timer_fn+0x3f/0x1d0
[ 2203.656555]  expire_timers+0x160/0x200
[ 2203.656562]  run_timer_softirq+0x1f4/0x480
[ 2203.656606]  __do_softirq+0xfd/0x402
[ 2203.656613]  asm_call_irq_on_stack+0x12/0x20
[ 2203.656617]  </IRQ>
[ 2203.656623]  do_softirq_own_stack+0x37/0x50
[ 2203.656631]  irq_exit_rcu+0x134/0x1a0
[ 2203.656639]  sysvec_apic_timer_interrupt+0x36/0x80
[ 2203.656646]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 2203.656654] RIP: 0010:default_idle+0x13/0x20
[ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc
cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb
f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08
[ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256
[ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX:
ffffffffaf290191
[ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI:
ffff88811a3c4f60
[ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09:
ffff88811a3c4f63
[ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12:
0000000000000003
[ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15:
0000000000000000
[ 2203.656729]  default_idle_call+0x5a/0x150
[ 2203.656735]  cpuidle_idle_call+0x1c6/0x220
[ 2203.656780]  do_idle+0xab/0x100
[ 2203.656786]  cpu_startup_entry+0x19/0x20
[ 2203.656793]  secondary_startup_64_no_verify+0xc2/0xcb

[ 2203.657409] The buggy address belongs to the page:
[ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0
mapping:0000000000000000 index:0x0 pfn:0x11a388
[ 2203.658665] flags:
0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
[ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208
0000000000000000
[ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff
0000000000000000
[ 2203.658686] page dumped because: kasan: bad access detected

To reproduce (ipvlan with IPVLAN_MODE_L3):
Env setting:
=======================================================
modprobe ipvlan ipvlan_default_mode=1
sysctl net.ipv4.conf.eth0.forwarding=1
iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j MASQUERADE
ip link add gw link eth0 type ipvlan
ip -4 addr add 20.0.0.254/24 dev gw
ip netns add net1
ip link add ipv1 link eth0 type ipvlan
ip link set ipv1 netns net1
ip netns exec net1 ip link set ipv1 up
ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1
ip netns exec net1 route add default gw 20.0.0.254
ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10%
ifconfig gw up
iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with icmp-port-unreachable
=======================================================
Then run the following shell loop (curl any address that eth0 can reach):

for((i=1;i<=100000;i++))
do
        ip netns exec net1 curl x.x.x.x:8888
done
=======================================================

Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: "t.feng" <fengtao40@huawei.com>
Suggested-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
(cherry picked from commit 2572b83c)

16bcf782

!1343 [sync] PR-1272: xfs: fix some problems recently · b0421760

Committed by openeuler-ci-bot on Jul 11, 2023

Merge Pull Request from: @openeuler-sync-bot 
 

Origin pull request: 
https://gitee.com/openeuler/kernel/pulls/1272 
 
PR sync from: Long Li <leo.lilong@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/W6KN2XSLJE5HZR2Y5D2OTDQ2GTLDGC5O/ 
Patches 1-6 fix some recently found problems.
Patches 7-8 are backported from mainline.

Darrick J. Wong (1):
  xfs: fix uninitialized variable access

Dave Chinner (1):
  xfs: set XFS_FEAT_NLINK correctly

Long Li (4):
  xfs: factor out xfs_defer_pending_abort
  xfs: don't leak intent item when recovery intents fail
  xfs: factor out xfs_destroy_perag()
  xfs: don't leak perag when growfs fails

Ye Bin (1):
  xfs: fix warning in xfs_vm_writepages()

yangerkun (1):
  xfs: fix mounting failed caused by sequencing problem in the log
    records


-- 
2.31.1
 
 
Link:https://gitee.com/openeuler/kernel/pulls/1343 

Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> 
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>

b0421760

07 Jul, 2023 7 commits

xfs: fix uninitialized variable access · e157b904

Committed by Darrick J. Wong on Jun 29, 2023

mainline inclusion
from mainline-v6.2-rc6
commit 60b730a4
category: bugfix
bugzilla: 188220, https://gitee.com/openeuler/kernel/issues/I4KIAO

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=60b730a40c43fbcc034970d3e77eb0f25b8cc1cf

--------------------------------

If the end position of a GETFSMAP query overlaps an allocated space and
we're using the free space info to generate fsmap info, the akeys
information gets fed into the fsmap formatter with bad results.
Zero-init the space.
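As a rough illustration of the zero-init idea, and not the actual XFS change, the sketch below builds a two-key query where only the low key is filled in explicitly. `toy_key`, `toy_query` and `toy_build_query` are hypothetical names. The point is that the aggregate initializer guarantees the untouched key is all zeroes instead of stack garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key record, standing in for the keys fed to a formatter. */
struct toy_key {
    uint32_t start;
    uint32_t count;
};

/* Hypothetical query made of a low and a high key. */
struct toy_query {
    struct toy_key low;
    struct toy_key high;
};

/*
 * Build a query in which only the low key is set explicitly.
 * The "= { 0 }" initializer zeroes both array elements, so the high key
 * handed to the consumer is well defined even though it is never written.
 */
void toy_build_query(uint32_t start, uint32_t count, struct toy_query *out)
{
    struct toy_key keys[2] = { 0 };

    assert(out != NULL);
    keys[0].start = start;
    keys[0].count = count;
    /* keys[1] is intentionally left untouched: it stays zeroed. */

    out->low = keys[0];
    out->high = keys[1];
}
```

Without the initializer, reading `keys[1]` would be an uninitialized access, which is the class of bug the commit message describes.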

Reported-by: syzbot+090ae72d552e6bd93cfe@syzkaller.appspotmail.com
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit aebc38d3)

e157b904

xfs: set XFS_FEAT_NLINK correctly · 163b4f9f

Committed by Dave Chinner on Jun 29, 2023

mainline inclusion
from mainline-v5.18-rc2
commit dd0d2f97
category: bugfix
bugzilla: 188220, https://gitee.com/openeuler/kernel/issues/I4KIAO

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dd0d2f9755191690541b09e6385d0f8cd8bc9d8f

--------------------------------

While xfs_has_nlink() is not used in kernel, it is used in userspace
(e.g. by xfs_db) so we need to set the XFS_FEAT_NLINK flag correctly
in xfs_sb_version_to_features().
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit f2096cec)

163b4f9f

xfs: don't leak perag when growfs fails · ebe9ed75

Committed by Long Li on Jun 29, 2023

Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188878, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

During growfs, if the new AGs have already been initialized in memory but
sb_agcount has not yet been updated, an error at that point leaks the new
AGs, as shown below. They are not freed during umount because sb_agcount
was never updated.

unreferenced object 0xffff88810751b000 (size 1024):
  comm "xfs_growfs", pid 123624, jiffies 4300733989 (age 124294.081s)
  hex dump (first 32 bytes):
    00 a0 38 16 81 88 ff ff 05 00 00 00 00 00 00 00  ..8.............
    00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000725c8ae4>] kmem_alloc+0x92/0x1d0 [xfs]
    [<000000005c32d74e>] xfs_initialize_perag+0x8d/0x3b0 [xfs]
    [<00000000830354cf>] xfs_growfs_data_private.isra.0+0x2af/0x610 [xfs]
    [<0000000038a29cb1>] xfs_growfs_data+0x228/0x300 [xfs]
    [<0000000004937dd2>] xfs_file_ioctl+0x8f3/0x10d0 [xfs]
    [<000000001a5d29a8>] __se_sys_ioctl+0xeb/0x120
    [<00000000cf30385a>] do_syscall_64+0x30/0x40
    [<00000000e4a6fd2f>] entry_SYSCALL_64_after_hwframe+0x61/0xc6

When growfs fails, use xfs_destroy_perag() to destroy the newly initialized
AGs in the error handling path.
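The error-path pattern can be sketched in user-space C as follows. This is a simplified model under stated assumptions, not the XFS implementation: `toy_mount`, `toy_destroy_perag`, `toy_initialize_perag`, `toy_growfs` and the `inject_error` parameter are all hypothetical, and the real code also has to take a lock around tree modifications, which the sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_AGS 16

/* Hypothetical mount structure: agcount plays the role of sb_agcount. */
struct toy_mount {
    unsigned int agcount;      /* committed number of AGs */
    void *perag[TOY_MAX_AGS];  /* per-AG in-memory objects */
};

/* Free the per-AG objects in the half-open range [first, end). */
void toy_destroy_perag(struct toy_mount *mp, unsigned int first,
                       unsigned int end)
{
    assert(first <= end && end <= TOY_MAX_AGS);
    for (unsigned int i = first; i < end; i++) {
        free(mp->perag[i]);
        mp->perag[i] = NULL;
    }
}

/* Allocate per-AG objects for [agcount, new_count); undo on failure. */
int toy_initialize_perag(struct toy_mount *mp, unsigned int new_count)
{
    for (unsigned int i = mp->agcount; i < new_count; i++) {
        mp->perag[i] = calloc(1, 64);
        if (!mp->perag[i]) {
            toy_destroy_perag(mp, mp->agcount, i);
            return -1;
        }
    }
    return 0;
}

/*
 * Grow to new_count AGs. inject_error simulates a failure that happens
 * after the new objects exist but before agcount is updated.
 */
int toy_growfs(struct toy_mount *mp, unsigned int new_count, int inject_error)
{
    if (new_count <= mp->agcount || new_count > TOY_MAX_AGS)
        return -1;
    if (toy_initialize_perag(mp, new_count) != 0)
        return -1;
    if (inject_error) {
        /* Teardown only walks [0, agcount), so free the new range here. */
        toy_destroy_perag(mp, mp->agcount, new_count);
        return -1;
    }
    mp->agcount = new_count;
    return 0;
}
```

The key design point is that normal teardown only iterates up to the committed count, so anything allocated beyond it must be released on the failing path itself.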
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 670cd2c8)

ebe9ed75

xfs: factor out xfs_destroy_perag() · ffbfbe96

Committed by Long Li on Jun 29, 2023

Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188878, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

Factor out xfs_destroy_perag() from xfs_initialize_perag() for error
handling. Deleting a perag from the radix tree requires lock protection,
just like every other place where the perag tree is modified.
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 42297cd9)

ffbfbe96

xfs: fix warning in xfs_vm_writepages() · 11a04e90

Committed by Ye Bin on Jun 29, 2023

Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188782, https://gitee.com/openeuler/kernel/issues/I76JSK

-----------------------------------------------

A BULKSTAT test triggered the following warning:
WARNING: CPU: 3 PID: 8425 at fs/xfs/xfs_aops.c:509 xfs_vm_writepages+0x184/0x1c0
Modules linked in:
CPU: 3 PID: 8425 Comm: xfs_bulkstat Not tainted 6.3.0-next-20230505-00003-gf3329adf5424-dirty #456
RIP: 0010:xfs_vm_writepages+0x184/0x1c0
RSP: 0018:ffffc90014bb7088 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 1ffff92002976e11 RCX: ffff88817aef8000
RDX: 0000000000000000 RSI: ffff88817aef8000 RDI: 0000000000000002
RBP: ffff888267dd2ad8 R08: ffffffff8313f414 R09: ffffed1022377c18
R10: ffff888111bbe0bb R11: ffffed1022377c17 R12: ffff88817aef8000
R13: ffffc90014bb7358 R14: dffffc0000000000 R15: ffffffff8313f290
FS:  00007f9568bb0440(0000) GS:ffff88882fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000d7a008 CR3: 000000024e11f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 do_writepages+0x1a8/0x630
 __writeback_single_inode+0x126/0xe00
 writeback_single_inode+0x2ae/0x530
 write_inode_now+0x16e/0x1e0
 iput.part.0+0x46c/0x730
 iput+0x60/0x80
 xfs_bulkstat_one_int+0xd87/0x1580
 xfs_bulkstat_iwalk+0x6e/0xd0
 xfs_iwalk_ag_recs+0x449/0x770
 xfs_iwalk_run_callbacks+0x305/0x630
 xfs_iwalk_ag+0x819/0xae0
 xfs_iwalk+0x2d5/0x4e0
 xfs_bulkstat+0x358/0x520
 xfs_ioc_bulkstat.isra.0+0x242/0x340
 xfs_file_ioctl+0x1d6/0x1ba0
 __x64_sys_ioctl+0x197/0x210
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The above issue may happen as follows:
Process1          Process2           Process3         Process4
xfs_bulkstat
 xfs_trans_alloc_empty
 xfs_bulkstat_one_int
   xfs_iget(XFS_IGET_DONTCACHE)
   ->Get inode from disk and mark
     inode with I_DONTCACHE

                                                     xfs_lookup
                                                       xfs_iget
                                                       ->Hold inode refcount
   xfs_irele

                 xfs_file_write_iter
                 ->Write file made some dirty pages
                 close file

                                    xfs_bulkstat
                                      xfs_trans_alloc_empty
                                      xfs_bulkstat_one_int
                                        xfs_iget(XFS_IGET_DONTCACHE)

                                                       -> process4 close file

        ******Trigger dentry reclaim, inode refcount is 1******
                                        xfs_irele
                                          iput ->Put the last refcount
                                            iput_final
                                              write_inode_now
                                                xfs_vm_writepages
                                                  WARN_ON_ONCE(current->journal_info)
                                                  ->Trigger warning

Since commit a6343e4d, BULKSTAT grabs an empty transaction. If the last
reference to the inode is dropped while that transaction is held,
writepages runs with current->journal_info set, which triggers the warning
and can also lead to data loss.
To solve this, clear the inode's I_DONTCACHE flag when
xfs_iget_cache_hit() finds the inode in the cache.
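The effect of clearing the flag on a cache hit can be modeled with a toy reference-counted inode. Everything here is a hypothetical simplification (`toy_inode`, `TOY_I_DONTCACHE`, the `toy_*` functions), not the VFS or XFS code: the final put only forces writeback in the caller's context when the inode is dirty and still marked as not to be cached.

```c
#include <assert.h>

#define TOY_I_DONTCACHE 0x1u  /* hypothetical stand-in for I_DONTCACHE */

/* Hypothetical, heavily simplified inode. */
struct toy_inode {
    unsigned int state;  /* flag bits */
    int refcount;        /* active references */
    int dirty;           /* non-zero if there are dirty pages */
};

/* Cache miss: read the inode "from disk", optionally marking it DONTCACHE. */
void toy_iget_from_disk(struct toy_inode *ip, int dontcache)
{
    ip->state = dontcache ? TOY_I_DONTCACHE : 0;
    ip->refcount = 1;
    ip->dirty = 0;
}

/*
 * Cache hit: someone else already holds the inode, so it is worth keeping.
 * Clearing the flag here is the idea described in the commit message.
 */
void toy_iget_cache_hit(struct toy_inode *ip)
{
    ip->state &= ~TOY_I_DONTCACHE;
    ip->refcount++;
}

/*
 * Drop one reference. Returns 1 if this was the final put of a dirty
 * DONTCACHE inode, i.e. writeback would run in the caller's context.
 */
int toy_iput(struct toy_inode *ip)
{
    assert(ip->refcount > 0);
    ip->refcount--;
    return ip->refcount == 0 && ip->dirty &&
           (ip->state & TOY_I_DONTCACHE) != 0;
}
```

Replaying the interleaving from the diagram against this model, the last `toy_iput()` returns 0 once the flag has been cleared on a cache hit, so no writeback is forced under the bulkstat transaction.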

Fixes: a6343e4d ("xfs: avoid buffer deadlocks when walking fs inodes")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 28ce0ae2)

11a04e90

xfs: don't leak intent item when recovery intents fail · c1df2e81

Committed by Long Li on Jun 29, 2023

Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188865, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

Recovering intents may capture some deferred ops and commit new intent
items. If intent recovery then fails, there is no done item to drop the
reference to the new intent item. This leads to a memory leak as
follows:

unreferenced object 0xffff888016719108 (size 432):
  comm "mount", pid 529, jiffies 4294706839 (age 144.463s)
  hex dump (first 32 bytes):
    08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff  ..q.......q.....
    18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff  ..q.......q.....
  backtrace:
    [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0
    [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150
    [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340
    [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0
    [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980
    [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70
    [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0
    [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0
    [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70
    [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20
    [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0
    [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0
    [<ffffffff81a9f35f>] path_mount+0xecf/0x1800
    [<ffffffff81a9fd83>] do_mount+0xf3/0x110
    [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0
    [<ffffffff83968739>] do_syscall_64+0x39/0x80

Fix it by aborting the intent items in the capture list that don't have a
done item when intent recovery fails. If a transaction that has deferred
ops fails to commit in xfs_defer_ops_capture_and_commit(), the defer
capture is not added to the capture list, so it needs to be aborted too.
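A minimal sketch of the abort step, assuming a flat array in place of the real capture list: each captured intent holds a reference that a done item would normally drop, and the abort pass releases it for every intent that never got one. `toy_intent` and `toy_abort_captured` are hypothetical names, and the single-reference model is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intent item tracked on a capture list. */
struct toy_intent {
    int refcount;  /* reference a done item would normally drop */
    int has_done;  /* non-zero once a done item exists */
};

/*
 * Called when recovery fails: release the reference of every captured
 * intent that has no done item. Returns the number of intents released.
 */
int toy_abort_captured(struct toy_intent *items, size_t n)
{
    int released = 0;

    assert(items != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!items[i].has_done && items[i].refcount > 0) {
            items[i].refcount--;
            released++;
        }
    }
    return released;
}
```

Intents that already have a done item are skipped, since the done item owns the job of dropping their reference.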
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit c1b08a41)

c1df2e81

xfs: factor out xfs_defer_pending_abort · 4ef24aa2

Committed by Long Li on Jun 29, 2023

Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188865, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(). The new
helper does not use the transaction parameter, so it can be used after the
transaction's life cycle has ended.
Signed-off-by: Long Li <leo.lilong@huawei.com>
(cherry picked from commit 9bd2b3bd)

4ef24aa2
