1. 26 April 2023, 1 commit
  2. 19 April 2023, 1 commit
      xfs: invalidate block device page cache during unmount · 235b15be
      Committed by Darrick J. Wong
      mainline inclusion
      from mainline-v6.2-rc1
      commit 032e1603
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=032e160305f6872e590c77f11896fb28365c6d6c
      
      --------------------------------
      
      Every now and then I see fstests failures on aarch64 (64k pages) that
      trigger on the following sequence:
      
      mkfs.xfs $dev
      mount $dev $mnt
      touch $mnt/a
      umount $mnt
      xfs_db -c 'path /a' -c 'print' $dev
      
      99% of the time this succeeds, but every now and then xfs_db cannot find
      /a and fails.  This turns out to be a race involving udev/blkid, the
      page cache for the block device, and the xfs_db process.
      
      udev is triggered whenever anyone closes a block device or unmounts it.
      The default udev rules invoke blkid to read the fs super and create
      symlinks to the bdev under /dev/disk.  For this, it uses buffered reads
      through the page cache.
      
      xfs_db also uses buffered reads to examine metadata.  There is no
      coordination between xfs_db and udev, which means that they can run
      concurrently.  Note there is no coordination between the kernel and
      blkid either.
      
      On a system with 64k pages, the page cache can cache the superblock and
      the root inode (and hence the root dir) with the same 64k page.  If
      udev spawns blkid after the mkfs and the system is busy enough that it
      is still running when xfs_db starts up, they'll both read from the same
      page in the pagecache.
      
      The unmount writes updated inode metadata to disk directly.  The XFS
      buffer cache does not use the bdev pagecache, nor does it invalidate the
      pagecache on umount.  If the above scenario occurs, the pagecache no
      longer reflects what's on disk, xfs_db reads the stale metadata, and
      fails to find /a.  Most of the time this succeeds because closing a bdev
      invalidates the page cache, but when processes race, everyone loses.
      
      Fix the problem by invalidating the bdev pagecache after flushing the
      bdev, so that xfs_db will see up to date metadata.
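      
      A minimal sketch of the fix described above, assuming the buftarg flush/teardown path in fs/xfs/xfs_buf.c; the helper name below is hypothetical and the exact call site in the backport may differ:
      
      	#include <linux/blkdev.h>	/* sync_blockdev(), invalidate_bdev(); <linux/fs.h> on older kernels */
      
      	/*
      	 * Hypothetical helper: once the XFS buffer cache for this target has
      	 * been flushed and drained, write back and then invalidate the bdev
      	 * page cache so buffered readers (blkid, xfs_db) cannot see stale
      	 * copies of metadata that XFS wrote through its own uncached path.
      	 */
      	static void xfs_flush_and_invalidate_bdev(struct xfs_buftarg *btp)
      	{
      		sync_blockdev(btp->bt_bdev);	/* flush dirty bdev pagecache pages */
      		invalidate_bdev(btp->bt_bdev);	/* drop the now-stale cached pages */
      	}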
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      
      Conflicts:
      	fs/xfs/xfs_buf.c
      Signed-off-by: yangerkun <yangerkun@huaweicloud.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
      235b15be
  3. 12 April 2023, 1 commit
      xfs: check buffer pin state after locking in delwri_submit · 9781b974
      Committed by Dave Chinner
      mainline inclusion
      from mainline-v5.17-rc6
      commit dbd0f529
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dbd0f5299302f8506637592e2373891a748c6990
      
      --------------------------------
      
      AIL flushing can get stuck here:
      
      [316649.005769] INFO: task xfsaild/pmem1:324525 blocked for more than 123 seconds.
      [316649.007807]       Not tainted 5.17.0-rc6-dgc+ #975
      [316649.009186] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [316649.011720] task:xfsaild/pmem1   state:D stack:14544 pid:324525 ppid:     2 flags:0x00004000
      [316649.014112] Call Trace:
      [316649.014841]  <TASK>
      [316649.015492]  __schedule+0x30d/0x9e0
      [316649.017745]  schedule+0x55/0xd0
      [316649.018681]  io_schedule+0x4b/0x80
      [316649.019683]  xfs_buf_wait_unpin+0x9e/0xf0
      [316649.021850]  __xfs_buf_submit+0x14a/0x230
      [316649.023033]  xfs_buf_delwri_submit_buffers+0x107/0x280
      [316649.024511]  xfs_buf_delwri_submit_nowait+0x10/0x20
      [316649.025931]  xfsaild+0x27e/0x9d0
      [316649.028283]  kthread+0xf6/0x120
      [316649.030602]  ret_from_fork+0x1f/0x30
      
      in the situation where flushing gets preempted between the unpin
      check and the buffer trylock under nowait conditions:
      
      	blk_start_plug(&plug);
      	list_for_each_entry_safe(bp, n, buffer_list, b_list) {
      		if (!wait_list) {
      			if (xfs_buf_ispinned(bp)) {
      				pinned++;
      				continue;
      			}
      Here >>>>>>
      			if (!xfs_buf_trylock(bp))
      				continue;
      
      This means submission is stuck until something else triggers a log
      force to unpin the buffer.
      
      To get onto the delwri list to begin with, the buffer pin state has
      already been checked, and hence it's relatively rare we get a race
      between flushing and encountering a pinned buffer in delwri
      submission to begin with. Further, to increase the pin count the
      buffer has to be locked, so the only way we can hit this race
      without failing the trylock is to be preempted between the pincount
      check seeing zero and the trylock being run.
      
      Hence to avoid this problem, just invert the order of trylock vs
      pin check. We shouldn't hit that many pinned buffers here, so
      optimising away the trylock for pinned buffers should not matter for
      performance at all.
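      
      A sketch of the reordered nowait path (paraphrased from the description above, not the verbatim diff): take the trylock first, and only then check the pin count, dropping the lock again if the buffer turns out to be pinned:
      
      	blk_start_plug(&plug);
      	list_for_each_entry_safe(bp, n, buffer_list, b_list) {
      		if (!wait_list) {
      			if (!xfs_buf_trylock(bp))
      				continue;
      			if (xfs_buf_ispinned(bp)) {
      				/* raced: pinned after we got the lock */
      				xfs_buf_unlock(bp);
      				pinned++;
      				continue;
      			}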
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      Reviewed-by: Yang Erkun <yangerkun@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
      9781b974
  4. 27 December 2021, 1 commit
  5. 15 November 2021, 1 commit
  6. 16 September 2020, 11 commits
  7. 29 July 2020, 1 commit
  8. 07 July 2020, 4 commits
  9. 03 June 2020, 1 commit
      mm: remove the prot argument from vm_map_ram · d4efd79a
      Committed by Christoph Hellwig
      This is always PAGE_KERNEL - for long term mappings with other properties
      vmap should be used.
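      
      For reference, a sketch of the interface change being described (signatures as understood from the surrounding series; callers that need a non-default protection are expected to move to vmap()):
      
      	/* before: callers passed a pgprot_t that was always PAGE_KERNEL */
      	void *vm_map_ram(struct page **pages, unsigned int count, int node,
      			 pgprot_t prot);
      
      	/* after: the prot argument is dropped, PAGE_KERNEL is implied */
      	void *vm_map_ram(struct page **pages, unsigned int count, int node);
      
      	/* long-term mappings with other properties should use vmap() instead */
      	void *vmap(struct page **pages, unsigned int count, unsigned long flags,
      		   pgprot_t prot);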
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Gao Xiang <xiang@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Kelley <mikelley@microsoft.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Wei Liu <wei.liu@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200414131348.444715-19-hch@lst.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d4efd79a
  10. 08 May 2020, 1 commit
  11. 07 May 2020, 5 commits
  12. 27 March 2020, 1 commit
  13. 12 March 2020, 2 commits
  14. 03 March 2020, 2 commits
  15. 27 January 2020, 5 commits
  16. 19 November 2019, 2 commits