1. 20 5月, 2020 2 次提交
  2. 07 5月, 2020 5 次提交
  3. 05 5月, 2020 1 次提交
  4. 06 4月, 2020 1 次提交
  5. 02 4月, 2020 1 次提交
    • B
      xfs: fix inode number overflow in ifree cluster helper · d9fdd0ad
      Brian Foster 提交于
      Qian Cai reports seemingly random buffer read verifier errors during
      filesystem writeback. This was isolated to a recent patch that
      factored out some inode cluster freeing code and happened to cast an
      unsigned inode number type to a signed value. If the inode number
      value overflows, we can skip marking in-core inodes associated with
      the underlying buffer stale at the time the physical inodes are
      freed. If such an inode happens to be dirty, xfsaild will eventually
      attempt to write it back over non-inode blocks. The invalidation of
      the underlying inode buffer causes writeback to read the buffer from
      disk. This fails the read verifier (preventing eventual corruption)
      if the buffer no longer looks like an inode cluster. Analysis by
      Dave Chinner.
      
      Fix up the helper to use the proper type for inode number values.
      
      Fixes: 5806165a ("xfs: factor inode lookup from xfs_ifree_cluster")
      Reported-by: NQian Cai <cai@lca.pw>
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      d9fdd0ad
  6. 29 3月, 2020 1 次提交
  7. 27 3月, 2020 1 次提交
  8. 19 3月, 2020 2 次提交
  9. 12 3月, 2020 2 次提交
  10. 03 3月, 2020 2 次提交
  11. 27 1月, 2020 1 次提交
  12. 24 1月, 2020 1 次提交
  13. 15 1月, 2020 1 次提交
  14. 14 11月, 2019 3 次提交
  15. 05 11月, 2019 1 次提交
  16. 22 10月, 2019 2 次提交
  17. 04 9月, 2019 1 次提交
    • K
      xfs: Fix deadlock between AGI and AGF with RENAME_WHITEOUT · bc56ad8c
      kaixuxia 提交于
      When performing rename operation with RENAME_WHITEOUT flag, we will
      hold AGF lock to allocate or free extents in manipulating the dirents
      firstly, and then doing the xfs_iunlink_remove() call last to hold
      AGI lock to modify the tmpfile info, so we the lock order AGI->AGF.
      
      The big problem here is that we have an ordering constraint on AGF
      and AGI locking - inode allocation locks the AGI, then can allocate
      a new extent for new inodes, locking the AGF after the AGI. Hence
      the ordering that is imposed by other parts of the code is AGI before
      AGF. So we get an ABBA deadlock between the AGI and AGF here.
      
      Process A:
      Call trace:
       ? __schedule+0x2bd/0x620
       schedule+0x33/0x90
       schedule_timeout+0x17d/0x290
       __down_common+0xef/0x125
       ? xfs_buf_find+0x215/0x6c0 [xfs]
       down+0x3b/0x50
       xfs_buf_lock+0x34/0xf0 [xfs]
       xfs_buf_find+0x215/0x6c0 [xfs]
       xfs_buf_get_map+0x37/0x230 [xfs]
       xfs_buf_read_map+0x29/0x190 [xfs]
       xfs_trans_read_buf_map+0x13d/0x520 [xfs]
       xfs_read_agf+0xa6/0x180 [xfs]
       ? schedule_timeout+0x17d/0x290
       xfs_alloc_read_agf+0x52/0x1f0 [xfs]
       xfs_alloc_fix_freelist+0x432/0x590 [xfs]
       ? down+0x3b/0x50
       ? xfs_buf_lock+0x34/0xf0 [xfs]
       ? xfs_buf_find+0x215/0x6c0 [xfs]
       xfs_alloc_vextent+0x301/0x6c0 [xfs]
       xfs_ialloc_ag_alloc+0x182/0x700 [xfs]
       ? _xfs_trans_bjoin+0x72/0xf0 [xfs]
       xfs_dialloc+0x116/0x290 [xfs]
       xfs_ialloc+0x6d/0x5e0 [xfs]
       ? xfs_log_reserve+0x165/0x280 [xfs]
       xfs_dir_ialloc+0x8c/0x240 [xfs]
       xfs_create+0x35a/0x610 [xfs]
       xfs_generic_create+0x1f1/0x2f0 [xfs]
       ...
      
      Process B:
      Call trace:
       ? __schedule+0x2bd/0x620
       ? xfs_bmapi_allocate+0x245/0x380 [xfs]
       schedule+0x33/0x90
       schedule_timeout+0x17d/0x290
       ? xfs_buf_find+0x1fd/0x6c0 [xfs]
       __down_common+0xef/0x125
       ? xfs_buf_get_map+0x37/0x230 [xfs]
       ? xfs_buf_find+0x215/0x6c0 [xfs]
       down+0x3b/0x50
       xfs_buf_lock+0x34/0xf0 [xfs]
       xfs_buf_find+0x215/0x6c0 [xfs]
       xfs_buf_get_map+0x37/0x230 [xfs]
       xfs_buf_read_map+0x29/0x190 [xfs]
       xfs_trans_read_buf_map+0x13d/0x520 [xfs]
       xfs_read_agi+0xa8/0x160 [xfs]
       xfs_iunlink_remove+0x6f/0x2a0 [xfs]
       ? current_time+0x46/0x80
       ? xfs_trans_ichgtime+0x39/0xb0 [xfs]
       xfs_rename+0x57a/0xae0 [xfs]
       xfs_vn_rename+0xe4/0x150 [xfs]
       ...
      
      In this patch we move the xfs_iunlink_remove() call to
      before acquiring the AGF lock to preserve correct AGI/AGF locking
      order.
      Signed-off-by: Nkaixuxia <kaixuxia@tencent.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      bc56ad8c
  18. 27 8月, 2019 1 次提交
  19. 29 6月, 2019 3 次提交
  20. 12 6月, 2019 2 次提交
  21. 02 5月, 2019 1 次提交
  22. 15 4月, 2019 1 次提交
    • B
      xfs: shutdown after buf release in iflush cluster abort path · 22fedd80
      Brian Foster 提交于
      If xfs_iflush_cluster() fails due to corruption, the error path
      issues a shutdown and simulates an I/O completion to release the
      buffer. This code has a couple small problems. First, the shutdown
      sequence can issue a synchronous log force, which is unsafe to do
      with buffer locks held. Second, the simulated I/O completion does not
      guarantee the buffer is async and thus is unlocked and released.
      
      For example, if the last operation on the buffer was a read off disk
      prior to the corruption event, XBF_ASYNC is not set and the buffer
      is left locked and held upon return. This results in a memory leak
      as shown by the following message on module unload:
      
       BUG xfs_buf (...): Objects remaining in xfs_buf on __kmem_cache_shutdown()
      
      Fix both of these problems by setting XBF_ASYNC on the buffer prior
      to the simulated I/O error and performing the shutdown immediately
      after ioend processing when the buffer has been released.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      22fedd80
  23. 15 2月, 2019 2 次提交
  24. 12 2月, 2019 2 次提交
    • D
      xfs: cache unlinked pointers in an rhashtable · 9b247179
      Darrick J. Wong 提交于
      Use a rhashtable to cache the unlinked list incore.  This should speed
      up unlinked processing considerably when there are a lot of inodes on
      the unlinked list because iunlink_remove no longer has to traverse an
      entire bucket list to find which inode points to the one being removed.
      
      The incore list structure records "X.next_unlinked = Y" relations, with
      the rhashtable using Y to index the records.  This makes finding the
      inode X that points to a inode Y very quick.  If our cache fails to find
      anything we can always fall back on the old method.
      
      FWIW this drastically reduces the amount of time it takes to remove
      inodes from the unlinked list.  I wrote a program to open a lot of
      O_TMPFILE files and then close them in the same order, which takes
      a very long time if we have to traverse the unlinked lists.  With the
      ptach, I see:
      
      + /d/t/tmpfile/tmpfile
      Opened 193531 files in 6.33s.
      Closed 193531 files in 5.86s
      
      real    0m12.192s
      user    0m0.064s
      sys     0m11.619s
      + cd /
      + umount /mnt
      
      real    0m0.050s
      user    0m0.004s
      sys     0m0.030s
      
      And without the patch:
      
      + /d/t/tmpfile/tmpfile
      Opened 193588 files in 6.35s.
      Closed 193588 files in 751.61s
      
      real    12m38.853s
      user    0m0.084s
      sys     12m34.470s
      + cd /
      + umount /mnt
      
      real    0m0.086s
      user    0m0.000s
      sys     0m0.060s
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      9b247179
    • D
      xfs: add tracepoints for high level iunlink operations · 4664c66c
      Darrick J. Wong 提交于
      Add tracepoints so we can associate high level operations with low level
      updates.  No functional changes.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      4664c66c