1. 09 Jun, 2023 18 commits
  2. 08 Jun, 2023 22 commits
    • H
      scripts: Fix issue of module signing with openssl 3.x · 978f3573
      Huaxin Lu committed
      openEuler inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BZZ1
      CVE: NA
      
      --------------------------------------------------------
      
      The SM2 signature for module signing is only supported
      with openEuler openssl 1.1.1. Fix the compile option to
      avoid compilation failure with openssl 3.x.
      
      Fixes: c1ad2f07 ("sign-file: Support SM signature")
      Signed-off-by: Huaxin Lu <luhuaxin1@huawei.com>
      (cherry picked from commit 78568d28)
      978f3573
    • D
      spi: dw: Add support for 32-bits max xfer size · 19eafebe
      Damien Le Moal committed
      mainline inclusion
      from mainline-v5.11-rc1
      commit a51acc24
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BZ1B
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a51acc2400d47
      
      ----------------------------------------------------------------------
      
      The Synopsys DesignWare DW_apb_ssi specifications version 3.23 onward
      define a 32-bits maximum transfer size synthesis parameter
      (SSI_MAX_XFER_SIZE=32) in addition to the legacy 16-bits configuration
      (SSI_MAX_XFER_SIZE=16) for SPI controllers. When SSI_MAX_XFER_SIZE=32,
      the layout of the ctrlr0 register changes, moving the data frame format
      field from bits [3..0] to bits [16..20], and the RX/TX FIFO word size
      can be up to 32-bits.
      
      To support this new format, introduce the DW SPI capability flag
      DW_SPI_CAP_DFS32 to indicate that a controller is configured with
      SSI_MAX_XFER_SIZE=32. Since SSI_MAX_XFER_SIZE is a controller synthesis
      parameter not accessible through a register, the detection of this
      parameter value is done in spi_hw_init() by writing and reading the
      ctrlr0 register and testing the value of bits [3..0]. These bits are
      ignored (unchanged) for SSI_MAX_XFER_SIZE=16, allowing the detection.
      If a DFS32 capable SPI controller is detected, the new field dfs_offset
      in struct dw_spi is set to SPI_DFS32_OFFSET (16).
      
      dw_spi_update_config() is modified to set the data frame size field at
      the correct position in the CTRLR0 register, as indicated by the
      dfs_offset field of the dw_spi structure.
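
As a rough illustration of the field placement described above, the following sketch writes a data frame size into a CTRLR0 image at a configurable offset. The helper name, the constant names, and the "size minus one" encoding are assumptions made for this example; only the offset value 16 comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; only the value 16 is stated above. */
#define DFS16_OFFSET 0u
#define DFS32_OFFSET 16u

/*
 * Place the data frame size at the given bit offset of a CTRLR0 image.
 * The "bits_per_word - 1" encoding is an assumption of this sketch.
 */
static uint32_t ctrlr0_set_dfs(uint32_t cr0, unsigned int bits_per_word,
                               unsigned int dfs_offset)
{
    assert(bits_per_word >= 1 && bits_per_word <= 32);
    return cr0 | ((uint32_t)(bits_per_word - 1) << dfs_offset);
}
```

With an offset of 0 the value lands in the low bits of the register image, and with an offset of 16 the same value lands in bits [16..20], which is the only difference between the two layouts in this sketch.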
      
      The DW_SPI_CAP_DFS32 flag is also unconditionally set for SPI slave
      controllers, e.g. controllers that have the DW_SPI_CAP_DWC_SSI
      capability flag set. However, for these ssi controllers, the dfs_offset
      field is set to 0 as before (as per specifications).
      
      Finally, for any controller with the DW_SPI_CAP_DFS32 capability flag
      set, dw_spi_add_host() extends the value of bits_per_word_mask from
      16-bits to 32-bits. dw_reader() and dw_writer() are also modified to
      handle 32-bits TX/RX FIFO words.
      Suggested-by: Sean Anderson <seanga2@gmail.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Acked-by: Serge Semin <fancer.lancer@gmail.com>
      Link: https://lore.kernel.org/r/20201206011817.11700-3-damien.lemoal@wdc.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Zhou Juan <nnuzj07170227@163.com>
      19eafebe
    • J
      perf: hisi: delete global enable pmu from xxx_write_counter() · 8ee6f013
      Junhao He committed
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7C2U9
      CVE: NA
      
      ----------------------------------------------------------------------
      
      The UC PMU global enable register is set up in the pmu::enable()
      callback, and it is also set up in pmu::start()->xxx_write_counter().
      As a result, counting starts when pmu::start() returns rather than
      when pmu::enable() returns, so the driver counters count more data
      than they should.
      
      Fixes: 5ed05cb2 ("drivers/perf: hisi: Add support for HiSilicon UC PMU driver")
      Signed-off-by: Junhao He <hejunhao3@huawei.com>
      8ee6f013
    • Y
      scsi: hisi_sas: Check usage count only when the runtime PM status is RPM_SUSPENDING · 41674625
      Yihang Li committed
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BNF8
      CVE: NA
      
      ----------------------------------------------------------------------
      
      Users can suspend the machine with 'echo disk > /sys/power/state', but the
      suspend will fail because the SAS controller cannot be suspended:
      
      [root@localhost ~]# echo freeze > /sys/power/state
      -bash: echo: write error: Device or resource busy
      [15104.142955] PM: suspend entry (s2idle)
      ...
      [15104.283465] hisi_sas_v3_hw 0000:32:04.0: entering suspend state
      [15104.283480] hisi_sas_v3_hw 0000:30:04.0: entering suspend state
      [15104.283500] hisi_sas_v3_hw 0000:32:04.0: PM suspend: host status cannot be suspended
      [15104.283508] hisi_sas_v3_hw 0000:30:04.0: PM suspend: host status cannot be suspended
      [15104.283516] hisi_sas_v3_hw 0000:32:04.0: PM: pci_pm_suspend(): suspend_v3_hw+0x0/0x210 [hisi_sas_v3_hw] returns -16
      [15104.283527] hisi_sas_v3_hw 0000:32:04.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x1c0 returns -16
      [15104.283524] hisi_sas_v3_hw 0000:30:04.0: PM: pci_pm_suspend(): suspend_v3_hw+0x0/0x210 [hisi_sas_v3_hw] returns -16
      [15104.283533] hisi_sas_v3_hw 0000:32:04.0: PM: failed to suspend async: error -16
      [15104.283536] hisi_sas_v3_hw 0000:30:04.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x1c0 returns -16
      [15104.283542] hisi_sas_v3_hw 0000:30:04.0: PM: failed to suspend async: error -16
      
      The problem is that when the ->runtime_suspend() callback suspend_v3_hw()
      is executing, the current runtime PM status is RPM_ACTIVE and the usage
      count of the controller is not 0, so the callback returns immediately.
      
      To fix it, check the device usage count only when the runtime PM status is
      RPM_SUSPENDING.
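
The condition described above can be sketched as a small precheck. This is a simplified model, not the driver code: the enum, the struct, and the function name are hypothetical stand-ins for the kernel's runtime PM state, and the return value mirrors the -16 (EBUSY) seen in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's runtime PM state. */
enum rpm_status_sketch {
    SK_RPM_ACTIVE,
    SK_RPM_RESUMING,
    SK_RPM_SUSPENDED,
    SK_RPM_SUSPENDING
};

struct pm_info_sketch {
    enum rpm_status_sketch runtime_status;
    int usage_count;
};

/*
 * Refuse to suspend only when a runtime suspend is in progress and the
 * device is still in use; any other status lets the suspend continue.
 */
static int suspend_precheck(const struct pm_info_sketch *pm)
{
    assert(pm != NULL);
    if (pm->runtime_status == SK_RPM_SUSPENDING && pm->usage_count > 0)
        return -EBUSY;
    return 0;
}
```

In this model a device with status RPM_ACTIVE and a non-zero usage count is no longer rejected, which is the case that made the system suspend fail in the log.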
      Signed-off-by: Yihang Li <liyihang9@huawei.com>
      Signed-off-by: xiabing <xiabing12@h-partners.com>
      41674625
    • A
      scsi: hisi_sas: Work around build failure in suspend function · c3500887
      Arnd Bergmann committed
      mainline inclusion
      from mainline-v6.4-rc1
      commit e01e2290
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BNF8
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e01e2290f0948ea6d383a5b715738911308b4d2b
      
      ----------------------------------------------------------------------
      
      The suspend/resume functions in this driver seem to have multiple problems,
      the latest one just got introduced by a bugfix:
      
      drivers/scsi/hisi_sas/hisi_sas_v3_hw.c: In function '_suspend_v3_hw':
      drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:5142:39: error: 'struct dev_pm_info' has no member named 'usage_count'
       5142 |         if (atomic_read(&device->power.usage_count)) {
      
      As far as I can tell, the 'usage_count' is not meant to be accessed by
      device drivers at all, though I don't know what the driver is supposed to
      do instead.
      
      Another problem is the use of the deprecated UNIVERSAL_DEV_PM_OPS(), and
      marking functions as __maybe_unused to avoid warnings about unused
      functions.  This should probably be changed to using
      DEFINE_RUNTIME_DEV_PM_OPS().
      
      Both changes require actually understanding what the driver needs to do,
      and being able to test this, so instead here is the simplest patch to make
      it pass the randconfig builds instead.
      
      Fixes: e368d38c ("scsi: hisi_sas: Exit suspend state when usage count is greater than 0")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20230405083611.3376739-1-arnd@kernel.org
      Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: xiabing <xiabing12@h-partners.com>
      c3500887
    • Y
      scsi: hisi_sas: Block requests before take debugfs snapshot · fc7dbc50
      Yihang Li committed
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BNF8
      CVE: NA
      
      ----------------------------------------------------------------------
      
      When the FIO is running and the dump is triggered continuously, some SATA
      I/Os fail to be returned to the upper layer due to the setting of
      HISI_SAS_REJECT_CMD_BIT. The SCSI layer invokes the error processing
      thread. However, sas_ata_hard_reset() also fails to be reset due to the
      setting of HISI_SAS_REJECT_CMD_BIT. As a result, the device is disabled.
      Call scsi_block_requests() and wait for commands to complete before
      setting HISI_SAS_REJECT_CMD_BIT to avoid SATA I/O failures.
      Signed-off-by: Yihang Li <liyihang9@huawei.com>
      Signed-off-by: xiabing <xiabing12@h-partners.com>
      fc7dbc50
    • Q
      scsi: hisi_sas: Add slave_destroy interface for v3 hw · 63184f79
      Qi Liu committed
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BNF8
      CVE: NA
      
      ----------------------------------------------------------------------
      
      A WARNING is triggered when executing link reset of remote PHY
      and rmmod SAS driver simultaneously. Following is the WARNING log:
      
      WARNING: CPU: 61 PID: 21818 at drivers/base/core.c:1347 __device_links_no_driver+0xb4/0xc0
       Call trace:
        __device_links_no_driver+0xb4/0xc0
        device_links_driver_cleanup+0xb0/0xfc
        __device_release_driver+0x198/0x23c
        device_release_driver+0x38/0x50
        bus_remove_device+0x130/0x140
        device_del+0x184/0x434
        __scsi_remove_device+0x118/0x150
        scsi_remove_target+0x1bc/0x240
        sas_rphy_remove+0x90/0x94
        sas_rphy_delete+0x24/0x3c
        sas_destruct_devices+0x64/0xa0 [libsas]
        sas_revalidate_domain+0xe4/0x150 [libsas]
        process_one_work+0x1e0/0x46c
        worker_thread+0x15c/0x464
        kthread+0x160/0x170
        ret_from_fork+0x10/0x20
       ---[ end trace 71e059eb58f85d4a ]---
      
      During SAS phy up, link->status is set to DL_STATE_AVAILABLE in
      device_links_driver_bound(). This setting then affects
      __device_links_no_driver() before driver rmmod and causes the WARNING.
      
      So we add the slave_destroy interface to make sure the link is removed
      after the workqueue is flushed.
      
      Fixes: 16fd4a7c ("scsi: hisi_sas: Add device link between SCSI devices and hisi_hba")
      Signed-off-by: Qi Liu <liuqi115@huawei.com>
      Signed-off-by: John Garry <john.garry@huawei.com>
      Signed-off-by: xiabing <xiabing12@h-partners.com>
      63184f79
    • O
      !996 [sync] PR-990: ubi: Fix deadlock caused by recursively holding work_sem · 5ab0cca4
      openeuler-ci-bot committed
      Merge Pull Request from: @openeuler-sync-bot 
       
      
      Origin pull request: 
      https://gitee.com/openeuler/kernel/pulls/990 
       
      PR sync from:  ZhaoLong Wang <wangzhaolong1@huawei.com>
       https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/3EWABXFSAAORW3LZWZCYAYWH3W3EEKZU/ 
      Fix deadlock caused by recursively holding work_sem
      
      Lee Jones (1):
        mtd: ubi: wl: Fix a couple of kernel-doc issues
      
      ZhaoLong Wang (1):
        ubi: Fix deadlock caused by recursively holding work_sem
      
      
      Link:https://gitee.com/openeuler/kernel/pulls/996 
      
      Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      5ab0cca4
    • O
      !1001 [sync] PR-928: hikey9xx: Fixed incorrect use of kfree to free sreg · 58d4723f
      openeuler-ci-bot committed
      Merge Pull Request from: @openeuler-sync-bot 
       
      
      Origin pull request: 
      https://gitee.com/openeuler/kernel/pulls/928 
       
      PR sync from:  ZhaoLong Wang <wangzhaolong1@huawei.com>
       https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/TB3Q7BQ6ZAGBRI7WS6JPCAF77IWURUIW/ 
       
       
      Link:https://gitee.com/openeuler/kernel/pulls/1001 
      
      Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      58d4723f
    • O
      !1018 [sync] PR-944: nbd: get config_lock before sock_shutdown · f384ae3d
      openeuler-ci-bot committed
      Merge Pull Request from: @openeuler-sync-bot 
       
      
      Origin pull request: 
      https://gitee.com/openeuler/kernel/pulls/944 
       
      PR sync from:  Zhong Jinghua <zhongjinghua@huawei.com>
       https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/SH7IMWAMZDZBA352XEQZ4WD6CFY3EGCW/ 
       
       
      Link:https://gitee.com/openeuler/kernel/pulls/1018 
      
      Reviewed-by: Hou Tao <houtao1@huawei.com> 
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      f384ae3d
    • O
      !1033 perf: hns3: add event suppport for ROH and default use hardware event 0 as group leader event · d3394d89
      openeuler-ci-bot committed
      Merge Pull Request from: @svishen 
       
      This pull request adds the ROH event and uses hardware event 0 as the default group leader event:
      
      (1)perf: hns3: add event suppport for ROH
      (2)perf: hns3: default use hardware event 0 as group leader event
      
      issue:
      https://gitee.com/openeuler/kernel/issues/I7BY7T 
       
      Link:https://gitee.com/openeuler/kernel/pulls/1033 
      
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> 
      d3394d89
    • L
      vfio/migration: bugfix lost interruption after live migration · c796499d
      Longfang Liu committed
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BTMW
      CVE: NA
      
      ----------------------------------------------------------------------
      
      During repeated live migrations, interrupts may go missing. In this
      case, re-send the doorbell at the end of the migration so that the QM
      reports the interrupt again and the interrupt is migrated to the
      destination as well. This prevents interrupt loss.
      
      Fixes: a0464f0b ("vfio/hisilicon: add acc live migration driver")
      Signed-off-by: Longfang Liu <liulongfang@huawei.com>
      Signed-off-by: JiangShui Yang <yangjiangshui@h-partners.com>
      (cherry picked from commit a60b29d3)
      c796499d
    • L
      crypto: hisilicon/qm - fix EQ/AEQ interrupt issue · f325b4da
      Longfang Liu committed
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BTMW
      CVE: NA
      
      ----------------------------------------------------------------------
      
      During a live migration operation, to prevent the loss of EQ/AEQ
      interrupts, the migration driver triggers an EQ/AEQ doorbell at the
      end of the migration.
      This operation may cause a duplicate EQ/AEQ interrupt. To keep
      interrupt processing correct, the EQ/AEQ interrupt handling needs to
      be updated so that a duplicate interrupt is handled safely.
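
One common way to make an event-queue handler tolerate a duplicate interrupt is to drain only entries that are actually marked as posted, so a second interrupt with nothing new finds no work. The sketch below illustrates that general idea only; it is not the QM driver's implementation, and the structure, the valid flags, and the function name are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define EQ_DEPTH_SKETCH 4

/* Hypothetical event queue: valid[] is set when an entry is posted. */
struct eq_sketch {
    bool valid[EQ_DEPTH_SKETCH];
    unsigned int head;
};

/*
 * Drain posted entries starting at head. Returns the number of entries
 * handled, which is 0 when the interrupt was a duplicate with no new work.
 */
static unsigned int eq_irq_sketch(struct eq_sketch *eq)
{
    unsigned int handled = 0;

    assert(eq->head < EQ_DEPTH_SKETCH);
    while (handled < EQ_DEPTH_SKETCH && eq->valid[eq->head]) {
        eq->valid[eq->head] = false;                  /* consume the entry */
        eq->head = (eq->head + 1) % EQ_DEPTH_SKETCH;  /* advance with wrap */
        handled++;
    }
    return handled;
}
```

Calling the handler twice in a row processes the pending entries once and then returns 0, so an extra doorbell at the end of a migration does not cause entries to be processed twice in this model.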
      
      Fixes: a0464f0b ("vfio/hisilicon: add acc live migration driver")
      Signed-off-by: Longfang Liu <liulongfang@huawei.com>
      Signed-off-by: JiangShui Yang <yangjiangshui@h-partners.com>
      (cherry picked from commit 24a30ee7)
      f325b4da
    • Z
      xfs: atomic drop extent entries when inactiving attr · ac00cf70
      Zhang Yi committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
      
      --------------------------------
      
      When inactivating an unlinked inode and its attrs, if the xlog is shut
      down either during or just after the recursive deletion of attribute
      nodes/leaves in xfs_attr3_root_inactive(), the log records some buffer
      cancel items but does not contain the corresponding extent entries and
      inode updates, which is incomplete and inconsistent. Because the
      inactivation did not complete and the unlinked inode is still in the
      agi_unlinked table, it continues to be inactivated after the log is
      replayed on the next mount. The records that created the attr
      node/leaf blocks before the cancel items cannot be replayed, but the
      inode records are. So we could get corrupted data when reading the
      canceled blocks.
      
       XFS (pmem0): Metadata corruption detected at
       xfs_da3_node_read_verify+0x53/0x220, xfs_da3_node block 0x78
       XFS (pmem0): Unmount and run xfs_repair
       XFS (pmem0): First 128 bytes of corrupted metadata buffer:
       00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       XFS (pmem0): metadata I/O error in "xfs_da_read_buf+0x104/0x190" at daddr 0x78 len 8 error 117
      
      To fix the issue, we need to remove the extent entries and update the
      inode and attr btree atomically when staling attr node/leaf blocks.
      Note that we may also need to log and update the parent attr node
      entry when removing a child or leaf attr block. Fortunately, it does
      not have to be so complicated: we can leave the removed entries as
      holes and skip them if we need to re-inactivate, since the whole node
      tree will be removed completely in the end.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      ac00cf70
    • Z
      xfs: factor out __xfs_da3_node_read() · 7a7842a1
      Zhang Yi committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
      
      --------------------------------
      
      Factor out a wrapper __xfs_da3_node_read() from xfs_da3_node_read()
      that accepts a flags parameter.
      Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      7a7842a1
    • L
      xfs: fix a UAF in xfs_iflush_abort_clean · 1f403b90
      Long Li committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I6I11V
      
      --------------------------------
      
      KASAN reported a UAF bug during a fault injection test:
      
       ==================================================================
       BUG: KASAN: use-after-free in __list_del_entry_valid+0x2b1/0x2c0
       Read of size 8 at addr ffff888023edb888 by task kworker/0:1/34
      
       CPU: 0 PID: 34 Comm: kworker/0:1 Not tainted 5.10.0-07305-g4c00b418452b-dirty #369
       Workqueue: xfs-reclaim/sda xfs_reclaim_worker
       Call Trace:
        dump_stack+0x115/0x16b
        print_address_description.constprop.0+0x2c/0x450
        kasan_report.cold+0x5d/0xdb
        __asan_report_load8_noabort+0x20/0x30
        __list_del_entry_valid+0x2b1/0x2c0
        xfs_iflush_abort_clean+0x11c/0x290
        xfs_iflush_abort+0xd2/0x2c0
        xfs_iflush_shutdown_abort+0x2e3/0x580
        xfs_icwalk_ag+0xe9d/0x1a00
        xfs_reclaim_worker+0x29/0x50
        process_one_work+0x71f/0x11d0
        worker_thread+0x5cb/0x10a0
        kthread+0x35b/0x490
        ret_from_fork+0x1f/0x30
      
       Allocated by task 642:
        kasan_save_stack+0x23/0x60
        __kasan_kmalloc.constprop.0+0xd9/0x140
        kasan_slab_alloc+0x12/0x20
        kmem_cache_alloc+0x1c4/0xa50
        _xfs_buf_alloc+0x72/0xd50
        xfs_buf_get_map+0x156/0x7c0
        xfs_trans_get_buf_map+0x41c/0x8c0
        xfs_ialloc_inode_init+0x455/0xaf0
        xfs_ialloc_ag_alloc+0x71f/0x1790
        xfs_dialloc+0x3f9/0x8a0
        xfs_ialloc+0x12e/0x1970
        xfs_dir_ialloc+0x144/0x730
        xfs_create+0x623/0xe80
        xfs_generic_create+0x571/0x820
        xfs_vn_create+0x31/0x40
        path_openat+0x209d/0x3b10
        do_filp_open+0x1c2/0x2e0
        do_sys_openat2+0x4fc/0x900
        do_sys_open+0xd8/0x150
        __x64_sys_open+0x87/0xd0
        do_syscall_64+0x45/0x70
        entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
       The buggy address belongs to the object at ffff888023edb780
        which belongs to the cache xfs_buf of size 392
       The buggy address is located 264 bytes inside of
        392-byte region [ffff888023edb780, ffff888023edb908)
       The buggy address belongs to the page:
       page:ffffea00008fb600 refcount:1 mapcount:0 mapping:0000000000000000
       index:0xffff888023edb780 pfn:0x23ed8
       head:ffffea00008fb600 order:2 compound_mapcount:0 compound_pincount:0
       flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
       raw: 001fffff80010200 ffffea00008fb008 ffff888019016050 ffff888018ff4300
       raw: ffff888023edb780 000000000013000d 00000001ffffffff 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
        ffff888023edb780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        ffff888023edb800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       >ffff888023edb880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                             ^
        ffff888023edb900: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffff888023edb980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ==================================================================
      
      This is a low-probability problem, and it took me a long time to find
      the sequence of events that triggers it:
      
      1. When creating a new file, if there are no free inodes, we need to
      allocate a new chunk.  The buf item and the inode items associated with
      the inode are submitted to the CIL independently. If all goes well, both
      the buf item and the inode item are inserted into the AIL, and the buf
      item is in front of the inode item.
      
      2. The first time, xfsaild pushes only the buf item. If an error occurs
      while writing back the inode buffer, the inode item is marked
      XFS_LI_FAILED in xfs_buf_inode_io_fail() at buf I/O completion, and the
      buf item remains in the AIL.
      
      3. The second time, xfsaild again pushes only the buf item. The inode
      buffer is written back while the log has shut down, so the inode buffer
      is marked XBF_STALE and the buf item is removed from the AIL at buf I/O
      completion. Because the inode is not flushed, ili_last_fields in
      xfs_inode is still 0, so the inode item is left in the AIL.
      
      4. Concurrently, a new transaction logs an inode in the same cluster as
      the previous inode. It gets the same inode buffer in xfs_buf_find(), and
      the _XBF_INODES flag is cleared in xfs_buf_find() because the buffer is
      stale.
      
      5. The third time, xfsaild pushes the inode item marked XFS_LI_FAILED,
      and the AIL resubmits the inode item in xfsaild_resubmit_item(). It
      takes the wrong code path because the inode buffer is missing the
      _XBF_INODES flag: every inode item on bp->b_li_list drops its reference
      to the buffer and has its li_buf set to null, but the inode items remain
      on bp->b_li_list. Once all references are dropped, the inode buffer is
      freed.
      
      6. When xfs reclaims the inode, removing the inode item from
      bp->b_li_list causes a UAF in xfs_iflush_abort_clean().
      
      Fix it by adding an xfs shutdown check in xfs_buf_find(): if the
      filesystem has been shut down, it is useless to get the buffer. While an
      inode item still references the inode buffer, the _XBF_INODES flag will
      not go missing.
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      1f403b90
    • L
      xfs: fix a UAF when inode item push · 8fdd98e8
      Long Li committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
      
      --------------------------------
      
      KASAN reported a UAF bug during a fault injection test:
      
        ==================================================================
        BUG: KASAN: use-after-free in xfs_inode_item_push+0x2db/0x2f0
        Read of size 8 at addr ffff888022f74788 by task xfsaild/sda/479
      
        CPU: 0 PID: 479 Comm: xfsaild/sda Not tainted 6.2.0-rc7-00003-ga8a43e2eb5f6 #89
        Call Trace:
         <TASK>
         dump_stack_lvl+0x51/0x6a
         print_report+0x171/0x4a6
         kasan_report+0xb7/0x130
         xfs_inode_item_push+0x2db/0x2f0
         xfsaild+0x729/0x1f70
         kthread+0x290/0x340
         ret_from_fork+0x1f/0x30
         </TASK>
      
        Allocated by task 494:
         kasan_save_stack+0x22/0x40
         kasan_set_track+0x25/0x30
         __kasan_slab_alloc+0x58/0x70
         kmem_cache_alloc+0x197/0x5d0
         xfs_inode_item_init+0x62/0x170
         xfs_trans_ijoin+0x15e/0x240
         xfs_init_new_inode+0x573/0x1820
         xfs_create+0x6a1/0x1020
         xfs_generic_create+0x544/0x5d0
         vfs_mkdir+0x5d0/0x980
         do_mkdirat+0x14e/0x220
         __x64_sys_mkdir+0x6a/0x80
         do_syscall_64+0x39/0x80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
        Freed by task 14:
         kasan_save_stack+0x22/0x40
         kasan_set_track+0x25/0x30
         kasan_save_free_info+0x2e/0x40
         __kasan_slab_free+0x114/0x1b0
         kmem_cache_free+0xee/0x4e0
         xfs_inode_free_callback+0x187/0x2a0
         rcu_do_batch+0x317/0xce0
         rcu_core+0x686/0xa90
         __do_softirq+0x1b6/0x626
      
        The buggy address belongs to the object at ffff888022f74758
         which belongs to the cache xfs_ili of size 200
        The buggy address is located 48 bytes inside of
         200-byte region [ffff888022f74758, ffff888022f74820)
      
        The buggy address belongs to the physical page:
        page:ffffea00008bdd00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x22f74
        head:ffffea00008bdd00 order:1 compound_mapcount:0 subpages_mapcount:0 compound_pincount:0
        flags: 0x1fffff80010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
        raw: 001fffff80010200 ffff888010ed4040 ffffea00008b2510 ffffea00008bde10
        raw: 0000000000000000 00000000001a001a 00000001ffffffff 0000000000000000
        page dumped because: kasan: bad access detected
      
        Memory state around the buggy address:
         ffff888022f74680: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
         ffff888022f74700: fc fc fc fc fc fc fc fc fc fc fc fa fb fb fb fb
        >ffff888022f74780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                              ^
         ffff888022f74800: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
         ffff888022f74880: fc fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ==================================================================
      
      When xfsaild pushes an inode item, it can race with the inode reclaim
      task. Consider the following call graph, where both tasks deal with the
      same inode. While flushing the cluster, xfsaild enters
      xfs_iflush_abort() under shutdown conditions; the inode's XFS_IFLUSHING
      flag is cleared and lip->li_buf is set to null. Concurrently, the inode
      is reclaimed under shutdown conditions. There is no need to wait for the
      xfs buf lock because lip->li_buf is null at this time, so the inode is
      freed via an RCU callback if the xfsaild task is scheduled out while
      flushing the cluster. It is therefore unsafe to reference lip after
      flushing the cluster in xfs_inode_item_push().
      
      			<log item is in AIL>
      			<filesystem shutdown>
      spin_lock(&ailp->ail_lock)
      xfs_inode_item_push(lip)
        xfs_buf_trylock(bp)
        spin_unlock(&lip->li_ailp->ail_lock)
        xfs_iflush_cluster(bp)
          if (xfs_is_shutdown())
            xfs_iflush_abort(ip)
      	xfs_trans_ail_delete(ip)
      	  spin_lock(&ailp->ail_lock)
      	  spin_unlock(&ailp->ail_lock)
      	xfs_iflush_abort_clean(ip)
            error = -EIO
      			<log item removed from AIL>
      			<log item li_buf set to null>
          if (error)
            xfs_force_shutdown()
      	xlog_shutdown_wait(mp->m_log)
      	  might_sleep()
      					xfs_reclaim_inode(ip)
      					if (shutdown)
      					  xfs_iflush_shutdown_abort(ip)
      					    if (!bp)
      					      xfs_iflush_abort(ip)
      					      return
      				        __xfs_inode_free(ip)
      					   call_rcu(ip, xfs_inode_free_callback)
      			......
      			<rcu grace period expires>
      			<rcu free callbacks run somewhere>
      			  xfs_inode_free_callback(ip)
      			    kmem_cache_free(ip->i_itemp)
      			......
      <starts running again>
          xfs_buf_ioend_fail(bp);
            xfs_buf_ioend(bp)
              xfs_buf_relse(bp);
          return error
      spin_lock(&lip->li_ailp->ail_lock)
        <UAF on log item>
      
      Fix the UAF by taking the XFS_ILOCK_SHARED lock in
      xfs_inode_item_push(); this prevents the race between inode item push
      and inode reclaim.
      
      Fixes: 90c60e16 ("xfs: xfs_iflush() is no longer necessary")
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      8fdd98e8
    • W
      xfs: fix the problem of mount failure caused by not refreshing mp->m_sb · c85caba9
      Wu Guanghao committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I77NBB
      
      --------------------------------
      
      After testing xfs_growfs + fsstress + fault injection, the following stack
      appeared when mounting the filesystem:
      
      [  149.902032] XFS (loop0): xfs_buf_map_verify: daddr 0x200001 out of range, EOFS 0x200000
      [  149.902072] WARNING: CPU: 12 PID: 3045 at fs/xfs/xfs_buf.c:535 xfs_buf_get_map+0x5ae/0x650 [xfs]
      ...
      [  149.902473]  xfs_buf_read_map+0x59/0x330 [xfs]
      [  149.902621]  ? xlog_recover_items_pass2+0x55/0xd0 [xfs]
      [  149.902809]  xlog_recover_buf_commit_pass2+0xff/0x640 [xfs]
      [  149.902959]  ? xlog_recover_items_pass2+0x55/0xd0 [xfs]
      [  149.903104]  xlog_recover_items_pass2+0x55/0xd0 [xfs]
      [  149.903247]  xlog_recover_commit_trans+0x2e0/0x330 [xfs]
      [  149.903390]  xlog_recovery_process_trans+0x8e/0xf0 [xfs]
      [  149.903531]  xlog_recover_process_data+0x9c/0x130 [xfs]
      [  149.903687]  xlog_do_recovery_pass+0x3cc/0x5d0 [xfs]
      [  149.903843]  xlog_do_log_recovery+0x5c/0x80 [xfs]
      [  149.903984]  xlog_do_recover+0x33/0x1c0 [xfs]
      [  149.904125]  xlog_recover+0xdd/0x190 [xfs]
      [  149.904265]  xfs_log_mount+0x125/0x2f0 [xfs]
      [  149.904410]  xfs_mountfs+0x41a/0x910 [xfs]
      [  149.904558]  ? __pfx_xfs_fstrm_free_func+0x10/0x10 [xfs]
      [  149.904725]  xfs_fs_fill_super+0x4b7/0x940 [xfs]
      [  149.904873]  ? __pfx_xfs_fs_fill_super+0x10/0x10 [xfs]
      [  149.905016]  get_tree_bdev+0x19a/0x280
      [  149.905020]  vfs_get_tree+0x29/0xd0
      [  149.905023]  path_mount+0x69e/0x9b0
      [  149.905026]  do_mount+0x7d/0xa0
      [  149.905029]  __x64_sys_mount+0xdc/0x100
      [  149.905032]  do_syscall_64+0x3e/0x90
      [  149.905035]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      The trigger process is as follows:
      
      1. Grow the filesystem from 0x200000 to 0x300000 with growfs.
      2. Use space in the range 0x200000~0x300000.
      3. Both operations have been written only to the log area on disk.
      4. Inject a fault and shut down the filesystem.
      5. Mount the filesystem and replay the growfs log record. Replay modifies
         only the superblock buffer, not the in-memory mp->m_sb structure.
      6. Log replay continues with operation 2 and finds that the blocks in
         use exceed mp->m_sb.sb_dblocks.
      
      Therefore, if log replay modifies the superblock, we should refresh the
      information recorded in mp->m_sb.
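
The stale-copy problem in steps 5 and 6 can be sketched in userspace C. This is a minimal toy model, not the kernel code: `toy_sb`, `toy_mount`, `toy_block_in_range()` and `toy_replay_growfs()` are hypothetical names, and the range check only imitates the kind of bounds test that produced the "out of range" warning above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the superblock buffer and the in-memory mp->m_sb. */
struct toy_sb {
    uint64_t sb_dblocks;
};

struct toy_mount {
    struct toy_sb disk_sb; /* contents of the superblock buffer */
    struct toy_sb m_sb;    /* cached in-memory copy used by range checks */
};

/* Bounds check that, like the failing check, consults the cached copy. */
static bool toy_block_in_range(const struct toy_mount *mp, uint64_t blkno)
{
    return blkno < mp->m_sb.sb_dblocks;
}

/*
 * Replay a growfs record. The buffer is always updated; the cached copy
 * is refreshed only when 'refresh' is true (the behaviour the fix asks for).
 */
static void toy_replay_growfs(struct toy_mount *mp, uint64_t new_dblocks,
                              bool refresh)
{
    assert(new_dblocks >= mp->disk_sb.sb_dblocks);
    mp->disk_sb.sb_dblocks = new_dblocks;
    if (refresh)
        mp->m_sb = mp->disk_sb;
}
```

Without the refresh, a block at 0x200001 is rejected even though the replayed superblock buffer already describes 0x300000 blocks; with the refresh, the same block passes the check.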
      Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      c85caba9
    • D
      iomap: iomap: fix memory corruption when recording errors during writeback · db8520d4
      Darrick J. Wong committed
      mainline inclusion
      from mainline-v6.0-rc7
      commit 3d5f3ba1
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3d5f3ba1ac28059bdf7000cae2403e4e984308d2
      
      --------------------------------
      
      Every now and then I see this crash on arm64:
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000000000000f8
      Buffer I/O error on dev dm-0, logical block 8733687, async page read
      Mem abort info:
        ESR = 0x0000000096000006
        EC = 0x25: DABT (current EL), IL = 32 bits
        SET = 0, FnV = 0
        EA = 0, S1PTW = 0
        FSC = 0x06: level 2 translation fault
      Data abort info:
        ISV = 0, ISS = 0x00000006
        CM = 0, WnR = 0
      user pgtable: 64k pages, 42-bit VAs, pgdp=0000000139750000
      [00000000000000f8] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000, pmd=0000000000000000
      Internal error: Oops: 96000006 [#1] PREEMPT SMP
      Buffer I/O error on dev dm-0, logical block 8733688, async page read
      Dumping ftrace buffer:
      Buffer I/O error on dev dm-0, logical block 8733689, async page read
         (ftrace buffer empty)
      XFS (dm-0): log I/O error -5
      Modules linked in: dm_thin_pool dm_persistent_data
      XFS (dm-0): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x1ec/0x590 [xfs] (fs/xfs/xfs_trans_buf.c:296).
       dm_bio_prison
      XFS (dm-0): Please unmount the filesystem and rectify the problem(s)
      XFS (dm-0): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -5, agno 0
       dm_bufio dm_log_writes xfs nft_chain_nat xt_REDIRECT nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_REJECT
      potentially unexpected fatal signal 6.
       nf_reject_ipv6
      potentially unexpected fatal signal 6.
       ipt_REJECT nf_reject_ipv4
      CPU: 1 PID: 122166 Comm: fsstress Tainted: G        W          6.0.0-rc5-djwa #rc5 3004c9f1de887ebae86015f2677638ce51ee7
       rpcsec_gss_krb5 auth_rpcgss xt_tcpudp ip_set_hash_ip ip_set_hash_net xt_set nft_compat ip_set_hash_mac ip_set nf_tables
      Hardware name: QEMU KVM Virtual Machine, BIOS 1.5.1 06/16/2021
      pstate: 60001000 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
       ip_tables
      pc : 000003fd6d7df200
       x_tables
      lr : 000003fd6d7df1ec
       overlay nfsv4
      CPU: 0 PID: 54031 Comm: u4:3 Tainted: G        W          6.0.0-rc5-djwa #rc5 3004c9f1de887ebae86015f2677638ce51ee7405
      Hardware name: QEMU KVM Virtual Machine, BIOS 1.5.1 06/16/2021
      Workqueue: writeback wb_workfn
      sp : 000003ffd9522fd0
       (flush-253:0)
      pstate: 60401005 (nZCv daif +PAN -UAO -TCO -DIT +SSBS BTYPE=--)
      pc : errseq_set+0x1c/0x100
      x29: 000003ffd9522fd0 x28: 0000000000000023 x27: 000002acefeb6780
      x26: 0000000000000005 x25: 0000000000000001 x24: 0000000000000000
      x23: 00000000ffffffff x22: 0000000000000005
      lr : __filemap_set_wb_err+0x24/0xe0
       x21: 0000000000000006
      sp : fffffe000f80f760
      x29: fffffe000f80f760 x28: 0000000000000003 x27: fffffe000f80f9f8
      x26: 0000000002523000 x25: 00000000fffffffb x24: fffffe000f80f868
      x23: fffffe000f80fbb0 x22: fffffc0180c26a78 x21: 0000000002530000
      x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000000
      
      x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
      x14: 0000000000000001 x13: 0000000000470af3 x12: fffffc0058f70000
      x11: 0000000000000040 x10: 0000000000001b20 x9 : fffffe000836b288
      x8 : fffffc00eb9fd480 x7 : 0000000000f83659 x6 : 0000000000000000
      x5 : 0000000000000869 x4 : 0000000000000005 x3 : 00000000000000f8
      x20: 000003fd6d740020 x19: 000000000001dd36 x18: 0000000000000001
      x17: 000003fd6d78704c x16: 0000000000000001 x15: 000002acfac87668
      x2 : 0000000000000ffa x1 : 00000000fffffffb x0 : 00000000000000f8
      Call trace:
       errseq_set+0x1c/0x100
       __filemap_set_wb_err+0x24/0xe0
       iomap_do_writepage+0x5e4/0xd5c
       write_cache_pages+0x208/0x674
       iomap_writepages+0x34/0x60
       xfs_vm_writepages+0x8c/0xcc [xfs 7a861f39c43631f15d3a5884246ba5035d4ca78b]
      x14: 0000000000000000 x13: 2064656e72757465 x12: 0000000000002180
      x11: 000003fd6d8a82d0 x10: 0000000000000000 x9 : 000003fd6d8ae288
      x8 : 0000000000000083 x7 : 00000000ffffffff x6 : 00000000ffffffee
      x5 : 00000000fbad2887 x4 : 000003fd6d9abb58 x3 : 000003fd6d740020
      x2 : 0000000000000006 x1 : 000000000001dd36 x0 : 0000000000000000
      CPU: 1 PID: 122167 Comm: fsstress Tainted: G        W          6.0.0-rc5-djwa #rc5 3004c9f1de887ebae86015f2677638ce51ee7
       do_writepages+0x90/0x1c4
       __writeback_single_inode+0x4c/0x4ac
      Hardware name: QEMU KVM Virtual Machine, BIOS 1.5.1 06/16/2021
       writeback_sb_inodes+0x214/0x4ac
       wb_writeback+0xf4/0x3b0
      pstate: 60001000 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
       wb_workfn+0xfc/0x580
       process_one_work+0x1e8/0x480
      pc : 000003fd6d7df200
       worker_thread+0x78/0x430
      
      This crash is a result of iomap_writepage_map encountering some sort of
      error during writeback and wanting to set that error code in the file
      mapping so that fsync will report it.  Unfortunately, the code
      dereferences folio->mapping after unlocking the folio, which means that
      another thread could have removed the page from the page cache
      (writeback doesn't hold the invalidation lock) and give it to somebody
      else.
      
      At best we crash the system like above; at worst, we corrupt memory or
      set an error on some other unsuspecting file while failing to record the
      problems with *this* file.  Regardless, fix the problem by reporting the
      error to the inode mapping.
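
The difference between the two pointers can be illustrated with a small userspace model. It is a sketch under stated assumptions: every `toy_*` type and function is hypothetical, and the concurrent removal of the page from the page cache is modelled by simply clearing the folio's mapping pointer at unlock time.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mapping {
    int wb_err; /* last writeback error recorded for this file */
};

struct toy_inode {
    struct toy_mapping *i_mapping; /* stable while the inode is held */
};

struct toy_folio {
    struct toy_mapping *mapping; /* only trustworthy while locked */
    int locked;
};

static void toy_mapping_set_error(struct toy_mapping *m, int error)
{
    assert(m != NULL);
    if (error)
        m->wb_err = error;
}

/*
 * Once unlocked, the folio may leave the page cache at any time.
 * The worst case is modelled by clearing the pointer immediately.
 */
static void toy_folio_unlock(struct toy_folio *folio)
{
    folio->locked = 0;
    folio->mapping = NULL;
}

/* Error path that records the error through the inode, not the folio. */
static void toy_writepage_fail(struct toy_inode *inode,
                               struct toy_folio *folio, int error)
{
    toy_folio_unlock(folio);
    toy_mapping_set_error(inode->i_mapping, error);
}
```

Passing `folio->mapping` to `toy_mapping_set_error()` after the unlock would hit the NULL assertion in this model, which loosely mirrors the NULL pointer dereference in the oops above.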
      
      NOTE: Commit 598ecfba lifted the XFS writeback code to iomap, so
      this fix should be backported to XFS in the 4.6-5.4 kernels in addition
      to iomap in the 5.5-5.19 kernels.
      
      Fixes: e735c007 ("iomap: Convert iomap_add_to_ioend() to take a folio") # 5.17 onward
      Fixes: 598ecfba ("iomap: lift the xfs writeback code to iomap") # 5.5-5.16, needs backporting
      Fixes: 150d5be0 ("xfs: remove xfs_cancel_ioend") # 4.6-5.4, needs backporting
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      
      conflicts:
      fs/iomap/buffered-io.c
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      db8520d4
    • L
      xfs: fix hung when transaction commit fail in xfs_inactive_ifree · b2c4bff9
      Long Li committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
      
      --------------------------------
      
      After running a disk-unplug test and then unmounting the filesystem, the
      umount thread hung indefinitely.
      
       crash> dmesg
       sd 0:0:0:0: rejecting I/O to offline device
       XFS (sda): log I/O error -5
       XFS (sda): Corruption of in-memory data (0x8) detected at
       xfs_defer_finish_noroll+0x12e0/0x1cf0 (fs/xfs/libxfs/xfs_defer.c:504).
      	Shutting down filesystem.
       XFS (sda): Please unmount the filesystem and rectify the problem(s)
       XFS (sda): xfs_inactive_ifree: xfs_trans_commit returned error -5
       XFS (sda): Unmounting Filesystem
      
       crash> bt 3368
       PID: 3368   TASK: ffff88801bcd8040  CPU: 3   COMMAND: "umount"
        #0 [ffffc900086a7ae0] __schedule at ffffffff83d3fd25
        #1 [ffffc900086a7be8] schedule at ffffffff83d414dd
        #2 [ffffc900086a7c10] xfs_ail_push_all_sync at ffffffff8256db24
        #3 [ffffc900086a7d18] xfs_unmount_flush_inodes at ffffffff824ee7e2
        #4 [ffffc900086a7d28] xfs_unmountfs at ffffffff824f2eff
        #5 [ffffc900086a7da8] xfs_fs_put_super at ffffffff82503e69
        #6 [ffffc900086a7de8] generic_shutdown_super at ffffffff81aeb8cd
        #7 [ffffc900086a7e10] kill_block_super at ffffffff81aefcfa
        #8 [ffffc900086a7e30] deactivate_locked_super at ffffffff81aeb2da
        #9 [ffffc900086a7e48] deactivate_super at ffffffff81aeb639
       #10 [ffffc900086a7e68] cleanup_mnt at ffffffff81b6ddd5
       #11 [ffffc900086a7ea0] __cleanup_mnt at ffffffff81b6dfdf
       #12 [ffffc900086a7eb0] task_work_run at ffffffff8126e5cf
       #13 [ffffc900086a7ef8] exit_to_user_mode_prepare at ffffffff813fa136
       #14 [ffffc900086a7f28] syscall_exit_to_user_mode at ffffffff83d25dbb
       #15 [ffffc900086a7f40] do_syscall_64 at ffffffff83d1f8d9
       #16 [ffffc900086a7f50] entry_SYSCALL_64_after_hwframe at ffffffff83e00085
      
      When we free a cluster buffer from xfs_ifree_cluster(), all the inodes in
      cache are marked XFS_ISTALE. On journal commit, dirty stale inodes are
      handled by both buffer and inode log items, so inodes marked XFS_ISTALE
      in the AIL are removed from the AIL when the buffer log item cleans them
      up. If the transaction commit fails in xfs_inactive_ifree(), the inodes
      marked XFS_ISTALE are left in the AIL because the buf log item is never
      committed, and the unmount thread above blocks forever. Fix this by
      aborting the inode items associated with the stale buffer after the buf
      item is released, so that the AIL cleans them up instead of being left
      with inode items that can never be pushed.
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      b2c4bff9
    • Y
      xfs: fix dead loop when do mount with IO fault injection · 729a6872
      Ye Bin committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
      
      --------------------------------
      
      During IO fault injection, mount may hang:
      blk_update_request: I/O error, dev dm-4, sector 2128216 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
      XFS (dm-4): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0x190/0x200 [xfs]" at daddr 0x207958 len 8 error 5
      blk_update_request: I/O error, dev dm-4, sector 2108042 op 0x1:(WRITE) flags 0x29800 phys_seg 1 prio class 0
      XFS (dm-4): log I/O error -5
      XFS (dm-4): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x2b6/0x510 [xfs] (fs/xfs/xfs_trans_buf.c:296).  Shutting down filesystem.
      sd 6:0:0:3: [sdh] Synchronizing SCSI cache
      XFS (dm-4): Please unmount the filesystem and rectify the problem(s)
      XFS (dm-4): Failed to recover intents
      XFS (dm-4): Ending recovery (logdev: internal)
      
      PID: 2489297  TASK: ffff8880355c1b00  CPU: 0   COMMAND: "mount"
      __schedule at ffffffff93aa03c1
      schedule at ffffffff93aa0c6f
      schedule_timeout at ffffffff93aa63c0
      xfs_wait_buftarg at ffffffffc1170ff0 [xfs]
      xfs_log_mount_finish at ffffffffc11bddc4 [xfs]
      xfs_mountfs at ffffffffc11a4492 [xfs]
      xfs_fc_fill_super at ffffffffc11ae01c [xfs]
      get_tree_bdev at ffffffff92c62a79
      vfs_get_tree at ffffffff92c60fe0
      do_new_mount at ffffffff92caaca0
      path_mount at ffffffff92cabf83
      __se_sys_mount at ffffffff92cac352
      do_syscall_64 at ffffffff93a8b153
      entry_SYSCALL_64_after_hwframe at ffffffff93c00099
      
      Ftrace log:
      mount-2489297 [002] .... 337330.575879: xfs_buf_wait_buftarg: dev 253:4 bno 0x3220 nblks 0x8 hold 2 pincount 0 lock 1 flags DONE|PAGES caller __list_l0
      
      The issue happens because an xfs_buf log item is still on the AIL list
      while the xlog is already shut down, so xfs_log_worker() never wakes up
      xfsaild to push the AIL list. The last 'b_hold' reference is therefore
      never dropped, and xfs_wait_buftarg() loops forever waiting to free the
      xfs_buf.
      To solve this, the AIL list must be pushed before xfs_wait_buftarg() is
      called. When xfs_log_mount_finish() returns an error, xfs_mountfs() calls
      xfs_log_mount_cancel() to clean the AIL list and then calls
      xfs_wait_buftarg() to make sure every xfs_buf has been reclaimed. So
      xfs_log_mount_finish() only needs to call xfs_wait_buftarg() when
      'error == 0'.
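
The ordering argument can be sketched as a toy state machine in C. This is an illustrative model only: the `toy_*` functions are hypothetical stand-ins for the kernel functions named above, a wait that could never finish is recorded in a flag instead of actually spinning, and the `fixed` parameter is an assumption added so both behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_mount {
    int ail_items;   /* log items on the AIL, each pinning one buffer ref */
    int buf_hold;    /* outstanding buffer references */
    bool would_spin; /* set when a wait starts while refs can't be dropped */
};

/* Cleaning the AIL drops the references held by its items. */
static void toy_log_mount_cancel(struct toy_mount *mp)
{
    assert(mp->buf_hold >= mp->ail_items);
    mp->buf_hold -= mp->ail_items;
    mp->ail_items = 0;
}

/* The real function loops until all buffers are freed; here we only flag. */
static void toy_wait_buftarg(struct toy_mount *mp)
{
    if (mp->buf_hold > 0)
        mp->would_spin = true;
}

static int toy_log_mount_finish(struct toy_mount *mp, int error, bool fixed)
{
    /* Old behaviour: always wait. Fixed behaviour: wait only on success. */
    if (!fixed || error == 0)
        toy_wait_buftarg(mp);
    return error;
}

static int toy_mountfs(struct toy_mount *mp, int recovery_error, bool fixed)
{
    int error = toy_log_mount_finish(mp, recovery_error, fixed);

    if (error) {
        toy_log_mount_cancel(mp); /* clean the AIL first ... */
        toy_wait_buftarg(mp);     /* ... then waiting can complete */
    }
    return error;
}
```

In the model, a failed recovery with one item left on the AIL sets `would_spin` under the old ordering but not under the fixed one, because the cancel path drops the reference before any wait happens.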
      Signed-off-by: Ye Bin <yebin@huawei.com>
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      729a6872
    • L
      xfs: fix ag count overflow during growfs · 2b06545e
      Long Li committed
      Offering: HULK
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I73AXQ
      
      --------------------------------
      
      syzkaller found a UAF:
      
      ==================================================================
      BUG: KASAN: use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:225 [inline]
      BUG: KASAN: use-after-free in rb_erase+0x16e/0x690 lib/rbtree.c:443
      Write of size 8 at addr ffff888101990a40 by task kworker/1:1H/114
      CPU: 1 PID: 114 Comm: kworker/1:1H Not tainted 5.10.0-00734-gc980ff0a1f18-dirty #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      	 BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
      Workqueue: xfs-log/sda xlog_ioend_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xbe/0xfd lib/dump_stack.c:118
       kasan_report+0x3a/0x50 mm/kasan/report.c:559
       __rb_erase_augmented include/linux/rbtree_augmented.h:225 [inline]
       rb_erase+0x16e/0x690 lib/rbtree.c:443
       xfs_extent_busy_clear_one+0x5a/0x1c0 fs/xfs/xfs_extent_busy.c:517
       xfs_extent_busy_clear+0x18b/0x1d0 fs/xfs/xfs_extent_busy.c:569
       xlog_cil_committed+0x12a/0x370 fs/xfs/xfs_log_cil.c:659
       xlog_cil_process_committed+0xbc/0xe0 fs/xfs/xfs_log_cil.c:683
       xlog_state_do_iclog_callbacks+0x30c/0x4b0 fs/xfs/xfs_log.c:2777
       xlog_state_do_callback+0x99/0x150 fs/xfs/xfs_log.c:2802
       xlog_ioend_work+0x57/0xc0 fs/xfs/xfs_log.c:1308
       process_one_work+0x406/0x810 kernel/workqueue.c:2280
       worker_thread+0x96/0x720 kernel/workqueue.c:2426
       kthread+0x1f4/0x250 kernel/kthread.c:313
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299
      
      Allocated by task 22679:
       kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
       kasan_set_track mm/kasan/common.c:56 [inline]
       set_alloc_info mm/kasan/common.c:498 [inline]
       __kasan_kmalloc mm/kasan/common.c:530 [inline]
       __kasan_kmalloc.constprop.0+0xf0/0x130 mm/kasan/common.c:501
       kmalloc include/linux/slab.h:568 [inline]
       kmem_alloc+0xc2/0x230 fs/xfs/kmem.c:21
       kmem_zalloc fs/xfs/kmem.h:69 [inline]
       xfs_extent_busy_insert+0x3c/0x370 fs/xfs/xfs_extent_busy.c:36
       __xfs_free_extent+0x268/0x340 fs/xfs/libxfs/xfs_alloc.c:3327
       xfs_free_extent fs/xfs/libxfs/xfs_alloc.h:183 [inline]
       xfs_ag_extend_space+0x26e/0x280 fs/xfs/libxfs/xfs_ag.c:540
       xfs_growfs_data_private.isra.0+0x64e/0x6f0 fs/xfs/xfs_fsops.c:112
       xfs_growfs_data+0x287/0x360 fs/xfs/xfs_fsops.c:239
       xfs_file_ioctl+0x9f2/0x1320 fs/xfs/xfs_ioctl.c:2274
       vfs_ioctl fs/ioctl.c:48 [inline]
       __do_sys_ioctl fs/ioctl.c:753 [inline]
       __se_sys_ioctl+0x111/0x160 fs/ioctl.c:739
       do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
      Freed by task 114:
       kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
       kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
       kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:361
       __kasan_slab_free.part.0+0x13f/0x1b0 mm/kasan/common.c:482
       slab_free_hook mm/slub.c:1569 [inline]
       slab_free_freelist_hook mm/slub.c:1608 [inline]
       slab_free mm/slub.c:3179 [inline]
       kfree+0xce/0x860 mm/slub.c:4176
       kvfree+0x47/0x50 mm/util.c:647
       xfs_extent_busy_clear+0x18b/0x1d0 fs/xfs/xfs_extent_busy.c:569
       xlog_cil_committed+0x12a/0x370 fs/xfs/xfs_log_cil.c:659
       xlog_cil_process_committed+0xbc/0xe0 fs/xfs/xfs_log_cil.c:683
       xlog_state_do_iclog_callbacks+0x30c/0x4b0 fs/xfs/xfs_log.c:2777
       xlog_state_do_callback+0x99/0x150 fs/xfs/xfs_log.c:2802
       xlog_ioend_work+0x57/0xc0 fs/xfs/xfs_log.c:1308
       process_one_work+0x406/0x810 kernel/workqueue.c:2280
       worker_thread+0x96/0x720 kernel/workqueue.c:2426
       kthread+0x1f4/0x250 kernel/kthread.c:313
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:299
      
      The buggy address belongs to the object at ffff888101990a40
       which belongs to the cache kmalloc-64 of size 64
      The buggy address is located 0 bytes inside of
       64-byte region [ffff888101990a40, ffff888101990a80)
      The buggy address belongs to the page:
      page:ffffea0004066400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x101990
      head:ffffea0004066400 order:1 compound_mapcount:0
      flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
      raw: 0017ffffc0010200 ffffea00009d5d88 ffff888100040a70 ffff88810004d500
      raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888101990900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff888101990980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff888101990a00: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
                                                 ^
       ffff888101990a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff888101990b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      ==================================================================
      
      The bug can be reproduced with the following sequence:
      
       # truncate -s  1073741824 xfs_test.img
       # mkfs.xfs -f -b size=1024 -d agcount=4 xfs_test.img
       # truncate -s 2305843009213693952  xfs_test.img
       # mount -o loop xfs_test.img /mnt/test
       # fsstress -d /mnt/test -l 0 -n 10000 >/dev/null &
       # xfs_growfs -D  1125899907891200  /mnt/test
      
      The root cause is that during growfs, user space passes a very large
      newblocks value to xfs_growfs_data_private(). Because the current
      sb_agblocks is small, the new AG count exceeds UINT_MAX. The AG number
      type is unsigned int, so the count overflows, which leaves nagcount much
      smaller than the actual value and the number of new blocks in the old
      last AG very large. When the old last AG is extended, the length is an
      xfs_extlen_t, which is also unsigned int, so it overflows again if the
      new block count exceeds UINT_MAX and its lower 32 bits are zero. A busy
      extent of length zero is then inserted into the rbtree;
      xfs_extent_busy_clear_one() frees it abnormally without removing it from
      the rbtree, and the next rbtree access triggers the UAF.
      Fix it by adding a check for nagcount overflow in
      xfs_growfs_data_private().
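
The two truncations can be reproduced with plain integer arithmetic. The sketch below is a simplified model, not the kernel code: `toy_growfs_math()` and `toy_check_agcount()` are hypothetical helpers, and the model assumes that all new blocks land in the old last AG, which matches the case where the truncated AG count equals the old one.

```c
#include <assert.h>
#include <stdint.h>

struct toy_grow {
    uint64_t true_agcount; /* AG count computed in 64 bits */
    uint32_t nagcount;     /* same value narrowed to unsigned int */
    uint32_t extend_len;   /* new blocks for the old last AG, narrowed */
};

static struct toy_grow toy_growfs_math(uint64_t nb, uint64_t dblocks,
                                       uint32_t agblocks)
{
    struct toy_grow g;

    assert(agblocks != 0 && nb >= dblocks);
    g.true_agcount = nb / agblocks + (nb % agblocks != 0);
    g.nagcount = (uint32_t)g.true_agcount;   /* first truncation */
    g.extend_len = (uint32_t)(nb - dblocks); /* second truncation */
    return g;
}

/* Hypothetical guard: reject counts that do not fit in 32 bits. */
static int toy_check_agcount(uint64_t true_agcount)
{
    return true_agcount > UINT32_MAX ? -1 : 0;
}
```

With the reproducer's numbers, and assuming mkfs split the 1048576-block image into four AGs of 262144 blocks each, `toy_growfs_math(1125899907891200ULL, 1048576ULL, 262144U)` gives a true AG count of 4294967300 that narrows to 4, and an extension length of 2^50 blocks that narrows to 0, which is the zero-length busy extent described above.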
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      2b06545e