1. 20 Jul, 2023 4 commits
  2. 18 Jul, 2023 1 commit
  3. 12 Jul, 2023 1 commit
    • C
      dm thin: fix deadlock when swapping to thin device · 73c633e6
      Committed by Coly Li
      mainline inclusion
      from mainline-v6.3-rc4
      commit 9bbf5fee
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7JLUM
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.4&id=9bbf5feecc7eab2c370496c1c161bbfe62084028
      
      ----------------------------------------
      
      It is a known issue that a dm-thin volume cannot be used as swap:
      a deadlock may happen when dm-thin's internal memory demand triggers
      swap I/O on the dm-thin volume itself.
      
      But thanks to commit a666e5c0 ("dm: fix deadlock when swapping to
      encrypted device"), the limit_swap_bios target flag can also be used
      for dm-thin to avoid the recursive I/O when it is used as swap.
      
      Fix is to simply set ti->limit_swap_bios to true in both pool_ctr()
      and thin_ctr().
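
The change described above can be pictured with a small userspace model. This is only a sketch: `struct dm_target_sketch`, `pool_ctr_sketch()` and `thin_ctr_sketch()` are hypothetical stand-ins, and the real constructors in drivers/md/dm-thin.c parse arguments and do far more. Only the `limit_swap_bios` assignment is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct dm_target; only the flag
 * named in the commit message is modelled. */
struct dm_target_sketch {
	bool limit_swap_bios;
};

/* Sketch of the pool constructor: opt in to swap-bio limiting. */
static int pool_ctr_sketch(struct dm_target_sketch *ti)
{
	assert(ti != NULL);
	ti->limit_swap_bios = true;
	return 0;
}

/* Sketch of the thin constructor: the same flag is set here too. */
static int thin_ctr_sketch(struct dm_target_sketch *ti)
{
	assert(ti != NULL);
	ti->limit_swap_bios = true;
	return 0;
}
```

Setting the flag in both constructors matters because the pool target and each thin target are separate device-mapper targets, so each one has to opt in on its own.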
      
      In my test, I create a dm-thin volume /dev/vg/swap and use it as swap
      device. Then I run fio on another dm-thin volume /dev/vg/main and use
      large --blocksize to trigger swap I/O onto /dev/vg/swap.
      
      The following fio command line is used in my test:
        fio --name recursive-swap-io --lockmem 1 --iodepth 128 \
            --ioengine libaio --filename /dev/vg/main --rw randrw \
            --blocksize 1M --numjobs 32 --time_based --runtime=12h
      
      Without this fix, the whole system can be locked up within 15 seconds.

      With this fix, no deadlock or hung task is observed after 2 hours of
      running fio.

      Furthermore, if blocksize is changed from 1M to 128M, after around 30
      seconds fio has no visible I/O, and the out-of-memory killer message
      shows up in the kernel log. After around 20 minutes all fio processes
      are killed and the whole system becomes responsive again.

      This is exactly what is expected when recursive I/O happens on a dm-thin
      volume that is used as swap.
      
      Depends-on: a666e5c0 ("dm: fix deadlock when swapping to encrypted device")
      Cc: stable@vger.kernel.org
      Signed-off-by: Coly Li <colyli@suse.de>
      Acked-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
      
      Conflict:
        drivers/md/dm-thin.c
      Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
      (cherry picked from commit 6283fa7e)
      73c633e6
  4. 11 Jul, 2023 1 commit
    • T
      ipvlan:Fix out-of-bounds caused by unclear skb->cb · 16bcf782
      Committed by t.feng
      stable inclusion
      from stable-v5.10.181
      commit f4a371d3f5a7a71dff1ab48b3122c5cf23cc7ad5
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7GVI1
      CVE: CVE-2023-3090
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f4a371d3f5a7a71dff1ab48b3122c5cf23cc7ad5
      
      --------------------------------
      
      [ Upstream commit 90cbed52 ]
      
      If an skb is enqueued to the qdisc, fq_skb_cb(skb)->time_to_send, which
      actually lives in skb->cb, is modified, and IPCB(skb_in)->opt is later
      used in __ip_options_echo(). The memcpy there can then go out of bounds
      and lead to a stack overflow.
      We should clear skb->cb before ip_local_out() or ip6_local_out().
      
      v2:
      1. clean the stack info
      2. use IPCB/IP6CB instead of skb->cb
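
The aliasing problem described above can be illustrated with a userspace model. This is a sketch under stated assumptions: every type and function name below is hypothetical, the layouts are heavily simplified compared with the kernel's real control-block structures, and the model only shows why a stale value written by one layer becomes a bogus option length for another, and how zeroing the IP view of the control block removes it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified layouts; the real fq and IP control
 * blocks in the kernel have more fields. */
struct fq_cb_sketch {
	uint64_t time_to_send;
};

struct ip_cb_sketch {
	int iif;
	unsigned char optlen;	/* length later trusted by the option-echo code */
};

/* skb->cb is one scratch area that each layer reinterprets for itself. */
union cb_sketch {
	unsigned char raw[48];
	struct fq_cb_sketch fq;
	struct ip_cb_sketch ip;
};

struct skb_sketch {
	union cb_sketch cb;
};

/* The qdisc stores its timestamp in the shared scratch area. */
static void qdisc_enqueue_sketch(struct skb_sketch *skb, uint64_t when)
{
	assert(skb != NULL);
	skb->cb.fq.time_to_send = when;
}

/* What the IP layer would read as the option length. */
static unsigned char ip_optlen_sketch(const struct skb_sketch *skb)
{
	return skb->cb.ip.optlen;
}

/* The idea of the fix: wipe the IP view of the control block before the
 * packet is handed back to the IP output path. */
static void clear_ip_cb_sketch(struct skb_sketch *skb)
{
	memset(&skb->cb.ip, 0, sizeof(skb->cb.ip));
}
```

Clearing through the typed IP view rather than the raw byte array mirrors item 2 of the v2 notes, which prefers IPCB/IP6CB over touching skb->cb directly.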
      
      Crash on stable-5.10 (reproduced on a KASAN kernel).
      Stack info:
      [ 2203.651571] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x589/0x800
      [ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task swapper/3/0
      [ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1
      [ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014
      [ 2203.655475] Call Trace:
      [ 2203.655481]  <IRQ>
      [ 2203.655501]  dump_stack+0x9c/0xd3
      [ 2203.655514]  print_address_description.constprop.0+0x19/0x170
      [ 2203.655530]  __kasan_report.cold+0x6c/0x84
      [ 2203.655586]  kasan_report+0x3a/0x50
      [ 2203.655594]  check_memory_region+0xfd/0x1f0
      [ 2203.655601]  memcpy+0x39/0x60
      [ 2203.655608]  __ip_options_echo+0x589/0x800
      [ 2203.655654]  __icmp_send+0x59a/0x960
      [ 2203.655755]  nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4]
      [ 2203.655763]  reject_tg+0x77/0x1bf [ipt_REJECT]
      [ 2203.655772]  ipt_do_table+0x691/0xa40 [ip_tables]
      [ 2203.655821]  nf_hook_slow+0x69/0x100
      [ 2203.655828]  __ip_local_out+0x21e/0x2b0
      [ 2203.655857]  ip_local_out+0x28/0x90
      [ 2203.655868]  ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan]
      [ 2203.655931]  ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan]
      [ 2203.655967]  ipvlan_queue_xmit+0xb3/0x190 [ipvlan]
      [ 2203.655977]  ipvlan_start_xmit+0x2e/0xb0 [ipvlan]
      [ 2203.655984]  xmit_one.constprop.0+0xe1/0x280
      [ 2203.655992]  dev_hard_start_xmit+0x62/0x100
      [ 2203.656000]  sch_direct_xmit+0x215/0x640
      [ 2203.656028]  __qdisc_run+0x153/0x1f0
      [ 2203.656069]  __dev_queue_xmit+0x77f/0x1030
      [ 2203.656173]  ip_finish_output2+0x59b/0xc20
      [ 2203.656244]  __ip_finish_output.part.0+0x318/0x3d0
      [ 2203.656312]  ip_finish_output+0x168/0x190
      [ 2203.656320]  ip_output+0x12d/0x220
      [ 2203.656357]  __ip_queue_xmit+0x392/0x880
      [ 2203.656380]  __tcp_transmit_skb+0x1088/0x11c0
      [ 2203.656436]  __tcp_retransmit_skb+0x475/0xa30
      [ 2203.656505]  tcp_retransmit_skb+0x2d/0x190
      [ 2203.656512]  tcp_retransmit_timer+0x3af/0x9a0
      [ 2203.656519]  tcp_write_timer_handler+0x3ba/0x510
      [ 2203.656529]  tcp_write_timer+0x55/0x180
      [ 2203.656542]  call_timer_fn+0x3f/0x1d0
      [ 2203.656555]  expire_timers+0x160/0x200
      [ 2203.656562]  run_timer_softirq+0x1f4/0x480
      [ 2203.656606]  __do_softirq+0xfd/0x402
      [ 2203.656613]  asm_call_irq_on_stack+0x12/0x20
      [ 2203.656617]  </IRQ>
      [ 2203.656623]  do_softirq_own_stack+0x37/0x50
      [ 2203.656631]  irq_exit_rcu+0x134/0x1a0
      [ 2203.656639]  sysvec_apic_timer_interrupt+0x36/0x80
      [ 2203.656646]  asm_sysvec_apic_timer_interrupt+0x12/0x20
      [ 2203.656654] RIP: 0010:default_idle+0x13/0x20
      [ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08
      [ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256
      [ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX: ffffffffaf290191
      [ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI: ffff88811a3c4f60
      [ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88811a3c4f63
      [ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12: 0000000000000003
      [ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15: 0000000000000000
      [ 2203.656729]  default_idle_call+0x5a/0x150
      [ 2203.656735]  cpuidle_idle_call+0x1c6/0x220
      [ 2203.656780]  do_idle+0xab/0x100
      [ 2203.656786]  cpu_startup_entry+0x19/0x20
      [ 2203.656793]  secondary_startup_64_no_verify+0xc2/0xcb
      
      [ 2203.657409] The buggy address belongs to the page:
      [ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11a388
      [ 2203.658665] flags: 0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
      [ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208 0000000000000000
      [ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
      [ 2203.658686] page dumped because: kasan: bad access detected
      
      To reproduce (ipvlan with IPVLAN_MODE_L3):
      Env setting:
      =======================================================
      modprobe ipvlan ipvlan_default_mode=1
      sysctl net.ipv4.conf.eth0.forwarding=1
      iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j MASQUERADE
      ip link add gw link eth0 type ipvlan
      ip -4 addr add 20.0.0.254/24 dev gw
      ip netns add net1
      ip link add ipv1 link eth0 type ipvlan
      ip link set ipv1 netns net1
      ip netns exec net1 ip link set ipv1 up
      ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1
      ip netns exec net1 route add default gw 20.0.0.254
      ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10%
      ifconfig gw up
      iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with icmp-port-unreachable
      =======================================================
      Then execute the following shell loop (curl any address that eth0 can reach):
      
      for((i=1;i<=100000;i++))
      do
              ip netns exec net1 curl x.x.x.x:8888
      done
      =======================================================
      
      Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
      Signed-off-by: "t.feng" <fengtao40@huawei.com>
      Suggested-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      (cherry picked from commit 2572b83c)
      16bcf782
  5. 05 Jul, 2023 3 commits
  6. 04 Jul, 2023 2 commits
  7. 03 Jul, 2023 1 commit
  8. 30 Jun, 2023 1 commit
  9. 28 Jun, 2023 2 commits
  10. 27 Jun, 2023 1 commit
  11. 26 Jun, 2023 1 commit
  12. 25 Jun, 2023 1 commit
  13. 21 Jun, 2023 3 commits
  14. 19 Jun, 2023 1 commit
  15. 14 Jun, 2023 1 commit
    • W
      drm/qxl: Fix missing free_irq · be99270d
      Committed by Wei Li
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q4S3
      
      --------------------------------
      
      When doing "cat /proc/interrupts" after qxl.ko is unloaded, an oops occurs:
      
      BUG: unable to handle page fault for address: ffffffffc0274769
      PGD 2a0d067 P4D 2a0d067 PUD 2a0f067 PMD 103f39067 PTE 0
      Oops: 0000 [#1] PREEMPT SMP PTI
      CPU: 6 PID: 246 Comm: cat Not tainted 6.1.0-rc2 #24
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
      RIP: 0010:string_nocheck+0x34/0x50
      Code: 66 85 c0 74 3c 83 e8 01 4c 8d 5c 07 01 31 c0 eb 19 49 39 fa 76 03 44 88 07 48 83 c7
      RSP: 0018:ffffc90000893bb8 EFLAGS: 00010046
      RAX: 0000000000000000 RBX: ffffc90000893c50 RCX: ffff0a00ffffff04
      RDX: ffffffffc0274769 RSI: ffff888102812000 RDI: ffff88810281133e
      RBP: ffff888102812000 R08: ffffffff823fa5e6 R09: 0000000000000007
      R10: ffff888102812000 R11: ffff88820281133d R12: ffffffffc0274769
      R13: ffff0a00ffffff04 R14: 0000000000000cc4 R15: ffffffff823276b4
      FS:  000000000214f8c0(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffc0274769 CR3: 00000001025c4005 CR4: 0000000000770ee0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <TASK>
       string+0x46/0x60
       vsnprintf+0x27a/0x4f0
       seq_vprintf+0x34/0x50
       seq_printf+0x53/0x70
       ? seq_read_iter+0x365/0x450
       show_interrupts+0x259/0x330
       seq_read_iter+0x2a3/0x450
       proc_reg_read_iter+0x47/0x70
       generic_file_splice_read+0x94/0x160
       splice_direct_to_actor+0xb0/0x230
       ? do_splice_direct+0xd0/0xd0
       do_splice_direct+0x8b/0xd0
       do_sendfile+0x345/0x4f0
       __x64_sys_sendfile64+0xa1/0xc0
       do_syscall_64+0x38/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x4bb0ce
      Code: c3 0f 1f 00 4c 89 d2 4c 89 c6 e9 bd fd ff ff 0f 1f 44 00 00 31 c0 c3 0f 1f 44 00 00
      RSP: 002b:00007ffd99dc3fb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000001000000 RCX: 00000000004bb0ce
      RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
      RBP: 0000000000000001 R08: 000000000068f240 R09: 0000000001000000
      R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000000003
      R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
      
      It seems that qxl doesn't free the interrupt it requests during unload.
      Fix this by adding the missing free_irq().
      
      Fixes: f64122c1 ("drm: add new QXL driver. (v1.4)")
      Signed-off-by: Wei Li <liwei391@huawei.com>
      (cherry picked from commit ed64582f)
      be99270d
  16. 13 Jun, 2023 1 commit
  17. 08 Jun, 2023 6 commits
  18. 03 Jun, 2023 9 commits
    • L
      md/raid10: fix incorrect done of recovery · 304e8d84
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188535, https://gitee.com/openeuler/kernel/issues/I6O61Q
      CVE: NA
      
      --------------------------------
      
      If there are bad blocks, recovery goes to giveup in raid10_sync_request()
      and increments chunks_skipped, and the function returns max_sector once
      chunks_skipped >= geo.raid_disks. Recovery has then failed and the data
      is inconsistent, but the user thinks recovery is done, which is wrong.

      Fix it by setting the mirror's recovery_disabled; a spare device should
      not be added here.
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit b0ac58c9)
      304e8d84
    • L
      md/raid10: fix null-ptr-deref in raid10_sync_request · 94831546
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188378, https://gitee.com/openeuler/kernel/issues/I6GGV7
      CVE: NA
      
      --------------------------------
      
      init_resync() initializes the mempool and sets conf->have_replacement at
      the beginning of sync, and close_sync() frees the mempool when sync is
      completed.

      After commit 7e83ccbe ("md/raid10: Allow skipping recovery when clean
      arrays are assembled"), recovery might be skipped, in which case
      init_resync() is called but close_sync() is not. A null-ptr-deref then
      occurs as follows:
        1) Create an array and wait for resync to complete; mddev->recovery_cp
           is set to MaxSector.
        2) Recovery is woken and skipped. conf->have_replacement is set to 0 in
           init_resync(). close_sync() is not called.
        3) Some I/O errors occur and rdev A is set to WantReplacement.
        4) A new device is added and set as A's replacement.
        5) Recovery is woken and A has a replacement, but
           conf->have_replacement is 0. r10bio->dev[i].repl_bio is not
           allocated and a null-ptr-deref occurs.

      Fix it by not calling init_resync() if recovery is skipped.
      
      Fixes: 7e83ccbe ("md/raid10: Allow skipping recovery when clean arrays are assembled")
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit 2de30b8f)
      94831546
    • L
      md: fix unexpected changes of return value in rdev_set_badblocks · 74720ee6
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6XBZQ
      CVE: NA
      
      --------------------------------
      
      If setting any badblocks fails, we remove this rdev (set it to Faulty
      or set recovery_disabled). The previous patch "md/raid10: fix io hung in
      md_wait_for_blocked_rdev()" checks badblocks->changed instead of the
      return value in rdev_set_badblocks(), but the return value of this
      function changed accordingly, which is not what we expected.

      Keep the return value consistent with the previous behavior.
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit bebf3d97)
      74720ee6
    • L
      md/raid10: fix io hung in md_wait_for_blocked_rdev() · 1f407ca9
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6XBZQ
      CVE: NA
      
      --------------------------------
      
      If badblocks are merged but bb->count is exceeded, badblocks_set()
      returns 1 and the merged badblocks become unacknowledged.
      rdev_set_badblocks() then neither sets sb_flags nor wakes up
      mddev->thread, so I/O waiting in md_wait_for_blocked_rdev() hangs
      because BlockedBadBlocks may never be cleared.

      Fix it by checking badblocks->changed instead of the return value. This
      flag is set when badblocks changes.
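
The difference between the two decisions can be sketched in a few lines of userspace C. All names below are hypothetical, and "notify" stands for the pair of actions named above (setting sb_flags and waking mddev->thread). The model only covers the single case from the commit message, in which the table was modified but the call still reports failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bad-block table state. */
struct badblocks_sketch {
	int changed;	/* set whenever the table was modified */
};

/* Models badblocks_set() in the case from the commit message: ranges
 * were merged (table modified) but the table is full, so 1 is returned. */
static int badblocks_set_full_sketch(struct badblocks_sketch *bb)
{
	assert(bb != NULL);
	bb->changed = 1;
	return 1;
}

/* Old decision: notify only when the return value reports success. */
static bool should_notify_old(int rv, const struct badblocks_sketch *bb)
{
	(void)bb;
	return rv == 0;
}

/* New decision: notify whenever the table actually changed. */
static bool should_notify_new(int rv, const struct badblocks_sketch *bb)
{
	(void)rv;
	return bb->changed != 0;
}
```

In this model the old rule skips the notification even though the table changed, which corresponds to the hang described above, while the new rule notifies.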
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit c23e1cd1)
      1f407ca9
    • L
      md/raid10: fix incorrect counting of rdev->nr_pending · 24ad8fdd
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188605, https://gitee.com/openeuler/kernel/issues/I6ZJ3T
      CVE: NA
      
      --------------------------------
      
      We get rdev from mirrors.replacement twice in raid10_write_request().
      If the replacement changes between the two reads, A->nr_pending is
      increased but B->nr_pending is decreased.
      
        T1 (write)	   T2 (remove)	    T3 (add)
                         raid10_remove_disk
      
        raid10_write_request
         rrdev = conf->mirrors[d].replacement; ->rdev A
         A nr_pending++
      
                          p->rdev = p->replacement; ->rdev A
                          p->replacement = NULL;
      
      				    //A is set to WantReplacement
                                          raid10_add_disk
      				     p->replacement = rdev; ->rdev B
      
         if blocked_rdev
          rdev = conf->mirrors[d].replacement; ->rdev B
          B nr_pending--
      
      Fix it by recording rdev in r10bio and getting rdev from r10bio afterwards.
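
The interleaving in the diagram can be replayed deterministically in userspace. This is a sketch with hypothetical types and names (`rdev_sketch`, `mirror_sketch`, `r10bio_sketch`, `repl_rdev`); the concurrent remove and add are collapsed into a single pointer swap in the middle of the write path, and the `use_recorded` flag switches between re-reading the mirror slot and using the pointer recorded in the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rdev_sketch { int nr_pending; };
struct mirror_sketch { struct rdev_sketch *replacement; };
struct r10bio_sketch { struct rdev_sketch *repl_rdev; };	/* hypothetical field */

/* Replays the write path with a remove + add happening in the middle. */
static void write_request_sketch(struct mirror_sketch *m,
				 struct r10bio_sketch *bio,
				 struct rdev_sketch *swapped_in,
				 bool use_recorded)
{
	struct rdev_sketch *rrdev, *drop;

	assert(m != NULL && bio != NULL && m->replacement != NULL);

	/* First read: take a reference and remember which device it was. */
	rrdev = m->replacement;
	rrdev->nr_pending++;
	bio->repl_rdev = rrdev;

	/* Concurrent remove + add: the slot now points at another device. */
	m->replacement = swapped_in;

	/* Blocked path: drop the reference again. Re-reading the slot hits
	 * the wrong device; the recorded pointer hits the right one. */
	drop = use_recorded ? bio->repl_rdev : m->replacement;
	drop->nr_pending--;
}
```

With `use_recorded` false the counters end up unbalanced on both devices; with it true the increment and decrement land on the same device.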
      
      Fixes: 475b0321 ("md/raid10: writes should get directed to replacement as well as original.")
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit 7b3b8187)
      24ad8fdd
    • L
      md/raid10: remove WANR_ON_ONCE in raid10_end_write_request · 7599ee43
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188605, https://gitee.com/openeuler/kernel/issues/I6GOYF
      CVE: NA
      
      --------------------------------
      
      Because of memory reordering in raid10_end_write_request(), mirror->rdev
      might be read first and mirror->replacement afterwards, and a WARN_ON
      occurs if we remove the disk at the same time.
      
        T1 remove			T2 io end
        raid10_remove_disk		raid10_end_write_request
         p->rdev = NULL
      				 read rdev -> NULL
         smp_mb
         p->replacement = NULL
      				 read replacement -> NULL
      
      It is meaningless to compare rdev with mirror->rdev after we get it from
      r10_bio in raid10_end_write_request(). Remove this WARN_ON_ONCE.
      
      Fixes: 2ecf5e6ecbfd ("md/raid10: fix uaf if replacement replaces rdev")
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit a3ebeed7)
      7599ee43
    • L
      md/raid10: fix uaf if replacement replaces rdev · a7cc3cf3
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188377, https://gitee.com/openeuler/kernel/issues/I6GOYF
      CVE: NA
      
      --------------------------------
      
      After commit 4ca40c2c ("md/raid10: Allow replacement device to be
      replace old drive."), mirrors->replacement can replace rdev while the
      replacement's I/O is pending, and repl_bio will write to rdev (see
      raid10_write_one_disk()). We then get the wrong device from r10conf in
      raid10_end_write_request(). In that case r10_bio->devs[slot].repl_bio
      is put but not set to IO_MADE_GOOD, and it is put again later in
      raid_end_bio_io(), so a use-after-free (UAF) occurs.

      Fix it by using r10_bio to record rdev. Handle the I/O-failure and
      no-replacement cases together, so there is no need to change repl.
      
        ==================================================================
        BUG: KASAN: use-after-free in bio_flagged include/linux/bio.h:238 [inline]
        BUG: KASAN: use-after-free in bio_put+0x78/0x80 block/bio.c:650
        Read of size 2 at addr ffff888116524dd4 by task md0_raid10/2618
      
        CPU: 0 PID: 2618 Comm: md0_raid10 Not tainted 5.10.0+ #3
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
        sd 0:0:0:0: rejecting I/O to offline device
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x107/0x167 lib/dump_stack.c:118
         print_address_description.constprop.0+0x1c/0x270 mm/kasan/report.c:390
         __kasan_report mm/kasan/report.c:550 [inline]
         kasan_report.cold+0x22/0x3a mm/kasan/report.c:567
         bio_flagged include/linux/bio.h:238 [inline]
         bio_put+0x78/0x80 block/bio.c:650
         put_all_bios drivers/md/raid10.c:248 [inline]
         free_r10bio drivers/md/raid10.c:257 [inline]
         raid_end_bio_io+0x3b5/0x590 drivers/md/raid10.c:309
         handle_write_completed drivers/md/raid10.c:2699 [inline]
         raid10d+0x2f85/0x5af0 drivers/md/raid10.c:2759
         md_thread+0x444/0x4b0 drivers/md/md.c:7932
         kthread+0x38c/0x470 kernel/kthread.c:313
         ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299
      
        Allocated by task 1400:
         kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
         kasan_set_track mm/kasan/common.c:56 [inline]
         set_alloc_info mm/kasan/common.c:498 [inline]
         __kasan_kmalloc.constprop.0+0xb5/0xe0 mm/kasan/common.c:530
         slab_post_alloc_hook mm/slab.h:512 [inline]
         slab_alloc_node mm/slub.c:2923 [inline]
         slab_alloc mm/slub.c:2931 [inline]
         kmem_cache_alloc+0x144/0x360 mm/slub.c:2936
         mempool_alloc+0x146/0x360 mm/mempool.c:391
         bio_alloc_bioset+0x375/0x610 block/bio.c:486
         bio_clone_fast+0x20/0x50 block/bio.c:711
         raid10_write_one_disk+0x166/0xd30 drivers/md/raid10.c:1240
         raid10_write_request+0x1600/0x2c90 drivers/md/raid10.c:1484
         __make_request drivers/md/raid10.c:1508 [inline]
         raid10_make_request+0x376/0x620 drivers/md/raid10.c:1537
         md_handle_request+0x699/0x970 drivers/md/md.c:451
         md_submit_bio+0x204/0x400 drivers/md/md.c:489
         __submit_bio block/blk-core.c:959 [inline]
         __submit_bio_noacct block/blk-core.c:1007 [inline]
         submit_bio_noacct+0x2e3/0xcf0 block/blk-core.c:1086
         submit_bio+0x1a0/0x3a0 block/blk-core.c:1146
         submit_bh_wbc+0x685/0x8e0 fs/buffer.c:3053
         ext4_commit_super+0x37e/0x6c0 fs/ext4/super.c:5696
         flush_stashed_error_work+0x28b/0x400 fs/ext4/super.c:791
         process_one_work+0x9a6/0x1590 kernel/workqueue.c:2280
         worker_thread+0x61d/0x1310 kernel/workqueue.c:2426
         kthread+0x38c/0x470 kernel/kthread.c:313
         ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299
      
        Freed by task 2618:
         kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
         kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
         kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:361
         __kasan_slab_free+0x151/0x180 mm/kasan/common.c:482
         slab_free_hook mm/slub.c:1569 [inline]
         slab_free_freelist_hook+0xa9/0x180 mm/slub.c:1608
         slab_free mm/slub.c:3179 [inline]
         kmem_cache_free+0xcd/0x3d0 mm/slub.c:3196
         mempool_free+0xe3/0x3b0 mm/mempool.c:500
         bio_free+0xe2/0x140 block/bio.c:266
         bio_put+0x58/0x80 block/bio.c:651
         raid10_end_write_request+0x885/0xb60 drivers/md/raid10.c:516
         bio_endio+0x376/0x6a0 block/bio.c:1465
         req_bio_endio block/blk-core.c:289 [inline]
         blk_update_request+0x5f5/0xf40 block/blk-core.c:1525
         blk_mq_end_request+0x4c/0x510 block/blk-mq.c:654
         blk_flush_complete_seq+0x835/0xd80 block/blk-flush.c:204
         flush_end_io+0x7b7/0xb90 block/blk-flush.c:261
         __blk_mq_end_request+0x282/0x4c0 block/blk-mq.c:645
         scsi_end_request+0x3a8/0x850 drivers/scsi/scsi_lib.c:607
         scsi_io_completion+0x3f5/0x1320 drivers/scsi/scsi_lib.c:970
         scsi_softirq_done+0x11b/0x490 drivers/scsi/scsi_lib.c:1448
         blk_mq_complete_request block/blk-mq.c:788 [inline]
         blk_mq_complete_request+0x84/0xb0 block/blk-mq.c:785
         scsi_mq_done+0x155/0x360 drivers/scsi/scsi_lib.c:1603
         virtscsi_vq_done drivers/scsi/virtio_scsi.c:184 [inline]
         virtscsi_req_done+0x14c/0x220 drivers/scsi/virtio_scsi.c:199
         vring_interrupt drivers/virtio/virtio_ring.c:2061 [inline]
         vring_interrupt+0x27a/0x300 drivers/virtio/virtio_ring.c:2047
         __handle_irq_event_percpu+0x2f8/0x830 kernel/irq/handle.c:156
         handle_irq_event_percpu kernel/irq/handle.c:196 [inline]
         handle_irq_event+0x105/0x280 kernel/irq/handle.c:213
         handle_edge_irq+0x258/0xd20 kernel/irq/chip.c:828
         asm_call_irq_on_stack+0xf/0x20
         __run_irq_on_irqstack arch/x86/include/asm/irq_stack.h:48 [inline]
         run_irq_on_irqstack_cond arch/x86/include/asm/irq_stack.h:101 [inline]
         handle_irq arch/x86/kernel/irq.c:230 [inline]
         __common_interrupt arch/x86/kernel/irq.c:249 [inline]
         common_interrupt+0xe2/0x190 arch/x86/kernel/irq.c:239
         asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:626
      
      Fixes: 4ca40c2c ("md/raid10: Allow replacement device to be replace old drive.")
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit af959500)
      a7cc3cf3
    • L
      md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request · 02fd87d7
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188527, https://gitee.com/openeuler/kernel/issues/I6O3HO
      CVE: NA
      
      --------------------------------
      
      need_replace is set to 1 if a non-Faulty mreplace exists, and mreplace
      is dereferenced later. However, the later check of mreplace might set
      mreplace to NULL, and a null-ptr-deref occurs if need_replace is 1 at
      that time.

      Fix it by merging the two checks into one.
      
      Fixes: ee37d731 ("md/raid10: Fix raid10 replace hang when new added disk faulty")
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit 7718714e)
      02fd87d7
    • L
      md/raid10: fix io loss while replacement replace rdev · f76a47d5
      Committed by Li Nan
      hulk inclusion
      category: bugfix
      bugzilla: 188787, https://gitee.com/openeuler/kernel/issues/I78YIW
      CVE: NA
      
      --------------------------------
      
      When we remove a disk that has a replacement, we first set rdev to NULL,
      then set rdev to the replacement, and finally set replacement to NULL
      (see raid10_remove_disk()). If I/O is submitted at the same time, it
      might read both rdev and replacement as NULL, and the I/O will not be
      submitted.
      
        rdev -> NULL
                              read rdev
        replacement -> NULL
                              read replacement
      
      Fix it by reading replacement first and rdev later, meanwhile, use smp_mb()
      to prevent memory reordering.
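
Why the read order matters can be checked with a small deterministic model rather than real threads. The sketch below is an illustration only: all names are hypothetical, device identities are plain strings, and it assumes loads and stores take effect in program order, which is the property the smp_mb() pairing is meant to provide. It enumerates every point at which the reader's two loads can fall relative to the three stores of the removal sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One mirror slot; device identity is modelled as a string. */
struct slot_sketch {
	const char *rdev;
	const char *replacement;
};

/* State of the slot after `step` stores of the removal sequence:
 * 1) rdev = NULL, 2) rdev = replacement, 3) replacement = NULL. */
static struct slot_sketch slot_after(int step)
{
	struct slot_sketch s = { "A", "B" };

	assert(step >= 0 && step <= 3);
	if (step >= 1)
		s.rdev = NULL;
	if (step >= 2)
		s.rdev = s.replacement;
	if (step >= 3)
		s.replacement = NULL;
	return s;
}

/* Returns true if some interleaving lets the reader see both pointers
 * as NULL. The first load observes state i, the second state j >= i. */
static bool io_can_be_lost(bool replacement_first)
{
	for (int i = 0; i <= 3; i++) {
		for (int j = i; j <= 3; j++) {
			struct slot_sketch first = slot_after(i);
			struct slot_sketch second = slot_after(j);
			const char *rdev, *repl;

			if (replacement_first) {
				repl = first.replacement;
				rdev = second.rdev;
			} else {
				rdev = first.rdev;
				repl = second.replacement;
			}
			if (rdev == NULL && repl == NULL)
				return true;
		}
	}
	return false;
}
```

Reading rdev first can observe the NULL window after the first store and then the NULL replacement after the last store. Reading replacement first cannot: a NULL replacement implies the removal has finished, by which time rdev already points at the former replacement.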
      
      Fixes: 475b0321 ("md/raid10: writes should get directed to replacement as well as original.")
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      (cherry picked from commit e8025850)
      f76a47d5