1. 07 Jul, 2022 4 commits
  2. 27 May, 2022 1 commit
  3. 04 May, 2022 1 commit
    • xfs: Set up infrastructure for log attribute replay · fd920008
      Committed by Allison Henderson
      Currently attributes are modified directly across one or more
      transactions. But they are not logged or replayed in the event of an
      error. The goal of log attr replay is to enable logging and replaying
      of attribute operations using the existing delayed operations
      infrastructure.  This will later enable the attributes to become part of
      larger multi part operations that also must first be recorded to the
      log.  This is mostly of interest in the scheme of parent pointers which
      would need to maintain an attribute containing parent inode information
      any time an inode is moved, created, or removed.  Parent pointers would
      then be of interest to any feature that would need to quickly derive an
      inode path from the mount point. Online scrub, nfs lookups and fs grow
      or shrink operations are all features that could take advantage of this.
      
      This patch adds two new log item types for setting or removing
      attributes as deferred operations.  The xfs_attri_log_item will log an
      intent to set or remove an attribute.  The corresponding
      xfs_attrd_log_item holds a reference to the xfs_attri_log_item and is
      freed once the transaction is done.  Both log items use a generic
      xfs_attr_log_format structure that contains the attribute name, value,
      flags, inode, and an op_flag that indicates if the operation is a set
      or remove.
      
      [dchinner: added extra little bits needed for intent whiteouts]
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
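
      The intent/done pairing described in this commit can be pictured with a small userspace sketch. Everything below is hypothetical: the `sketch_` names, field layout, and op code values are assumptions for illustration and are not the kernel's `xfs_attri_log_item`, `xfs_attrd_log_item`, or `xfs_attr_log_format` definitions. It only shows a shared format header, an intent item, and a done item that holds a reference to the intent until the operation completes.

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdlib.h>
      #include <string.h>

      /* Hypothetical op codes; the commit message does not give real values. */
      enum sketch_attr_op { SKETCH_ATTR_SET = 1, SKETCH_ATTR_REMOVE = 2 };

      /* Fixed-size part of the shared format: inode, flags, op, and lengths. */
      struct sketch_attr_format {
          uint64_t ino;        /* inode that owns the attribute */
          uint32_t op_flag;    /* set or remove */
          uint32_t attr_flags; /* attribute flags */
          uint32_t name_len;
          uint32_t value_len;  /* stays 0 when no value is supplied */
      };

      /* Intent item: records what is about to be done. */
      struct sketch_attri {
          struct sketch_attr_format fmt;
          char *name;
          char *value;
          int refcount;
      };

      /* Done item: holds a reference to the intent it completes. */
      struct sketch_attrd {
          struct sketch_attri *intent;
      };

      /* Copy len bytes and NUL-terminate; returns NULL if malloc fails. */
      static char *sketch_copy(const char *src, uint32_t len)
      {
          char *dst = malloc((size_t)len + 1);

          if (dst) {
              memcpy(dst, src, len);
              dst[len] = '\0';
          }
          return dst;
      }

      /* Create an intent with one reference. Copy failures are not handled here. */
      static struct sketch_attri *sketch_attri_new(uint64_t ino,
              enum sketch_attr_op op, const char *name, const char *value)
      {
          struct sketch_attri *ip = calloc(1, sizeof(*ip));

          assert(name != NULL);
          if (!ip)
              return NULL;
          ip->fmt.ino = ino;
          ip->fmt.op_flag = (uint32_t)op;
          ip->fmt.name_len = (uint32_t)strlen(name);
          ip->name = sketch_copy(name, ip->fmt.name_len);
          if (value) {
              ip->fmt.value_len = (uint32_t)strlen(value);
              ip->value = sketch_copy(value, ip->fmt.value_len);
          }
          ip->refcount = 1;
          return ip;
      }

      /* Drop one reference; frees the intent at zero. Returns references left. */
      static int sketch_attri_put(struct sketch_attri *ip)
      {
          int left = --ip->refcount;

          if (left == 0) {
              free(ip->name);
              free(ip->value);
              free(ip);
          }
          return left;
      }

      /* Create a done item that takes its own reference on the intent. */
      static struct sketch_attrd *sketch_attrd_new(struct sketch_attri *ip)
      {
          struct sketch_attrd *dp = calloc(1, sizeof(*dp));

          if (dp) {
              dp->intent = ip;
              ip->refcount++;
          }
          return dp;
      }

      /* Finish the operation: release the intent reference and the done item. */
      static int sketch_attrd_done(struct sketch_attrd *dp)
      {
          int left = sketch_attri_put(dp->intent);

          free(dp);
          return left;
      }
      ```

      In this sketch the intent stays alive for as long as either the caller or the done item holds a reference, which is the lifetime relationship the commit message describes.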
  4. 28 Apr, 2022 1 commit
  5. 12 Apr, 2022 1 commit
  6. 11 Apr, 2022 3 commits
  7. 17 Feb, 2022 1 commit
  8. 13 Jan, 2022 1 commit
    • xfs: fix online fsck handling of v5 feature bits on secondary supers · 4a9bca86
      Committed by Darrick J. Wong
      While I was auditing the code in xfs_repair that adds feature bits to
      existing V5 filesystems, I decided to have a look at how online fsck
      handles feature bits, and I found a few problems:
      
      1) ATTR2 is added to the primary super when an xattr is set on a file,
      but that isn't consistently propagated to secondary supers.  This isn't
      a corruption, merely a discrepancy that repair will fix if it ever has
      to restore the primary from a secondary.  Hence, if we find a mismatch
      on a secondary, this is a preen condition, not a corruption.
      
      2) There are more compat and ro_compat features now than there used to
      be, but we mask off the newer features from testing.  This means we
      ignore inconsistencies in the INOBTCOUNT and BIGTIME features, which is
      wrong.  Get rid of the masking and compare directly.
      
      3) NEEDSREPAIR, when set on a secondary, is ignored by everyone.  Hence
      a mismatch here should also be flagged for preening, and online repair
      should clear the flag.  Right now we ignore it due to (2).
      
      4) log_incompat features are ephemeral, since we can clear the feature
      bit as soon as the log no longer contains live records for a particular
      log feature.  As such, the only copy we care about is the one in the
      primary super.  If we find any bits set in the secondary super, we
      should flag that for preening, and clear the bits if the user elects to
      repair it.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
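
      The four rules in this commit message can be sketched as a single comparison function. This is an illustration only: the structure, the bit values, and the `sketch_` names are hypothetical and are not the kernel's superblock scrubber. It also assumes that a mismatch in compat, ro_compat, or the remaining incompat bits counts as corruption, which the message implies but does not state outright.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical bit values, chosen only for this sketch. */
      #define SKETCH_VERSION_ATTR2        (1u << 0)
      #define SKETCH_INCOMPAT_NEEDSREPAIR (1u << 1)

      enum sketch_verdict { SKETCH_OK, SKETCH_PREEN, SKETCH_CORRUPT };

      /* Only the feature words that matter for the comparison. */
      struct sketch_sb {
          uint32_t version_flags;
          uint32_t compat;
          uint32_t ro_compat;
          uint32_t incompat;
          uint32_t log_incompat;
      };

      static enum sketch_verdict
      sketch_check_secondary(const struct sketch_sb *primary,
                             const struct sketch_sb *secondary)
      {
          enum sketch_verdict verdict = SKETCH_OK;
          uint32_t diff;

          assert(primary != NULL && secondary != NULL);

          /* Rule 2: compare whole words, with no masking of newer features. */
          if (primary->compat != secondary->compat)
              return SKETCH_CORRUPT;
          if (primary->ro_compat != secondary->ro_compat)
              return SKETCH_CORRUPT;

          /* Rule 3: a NEEDSREPAIR mismatch is only worth preening. */
          diff = primary->incompat ^ secondary->incompat;
          if (diff & ~SKETCH_INCOMPAT_NEEDSREPAIR)
              return SKETCH_CORRUPT;
          if (diff & SKETCH_INCOMPAT_NEEDSREPAIR)
              verdict = SKETCH_PREEN;

          /* Rule 1: an ATTR2 mismatch is a discrepancy, not a corruption. */
          if ((primary->version_flags ^ secondary->version_flags) &
              SKETCH_VERSION_ATTR2)
              verdict = SKETCH_PREEN;

          /* Rule 4: log_incompat only matters on the primary super. */
          if (secondary->log_incompat != 0)
              verdict = SKETCH_PREEN;

          return verdict;
      }
      ```

      Checking the hard failures first means a preen condition never hides a corruption in the same superblock.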
  9. 07 Jan, 2022 1 commit
  10. 22 Dec, 2021 2 commits
    • xfs: fix a bug in the online fsck directory leaf1 bestcount check · e5d1802c
      Committed by Darrick J. Wong
      When xfs_scrub encounters a directory with a leaf1 block, it tries to
      validate that the leaf1 block's bestcount (aka the best free count of
      each directory data block) is the correct size.  Previously, this author
      believed that comparing bestcount to the directory isize (since
      directory data blocks are under isize, and leaf/bestfree blocks are
      above it) was sufficient.
      
      Unfortunately during testing of online repair, it was discovered that it
      is possible to create a directory with a hole between the last directory
      block and isize.  The directory code seems to handle this situation just
      fine and xfs_repair doesn't complain, which effectively makes this quirk
      part of the disk format.
      
      Fix the check to work properly.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
    • xfs: fix quotaoff mutex usage now that we don't support disabling it · 59d7fab2
      Committed by Darrick J. Wong
      Prior to commit 40b52225 ("xfs: remove support for disabling quota
      accounting on a mounted file system"), we used the quotaoff mutex to
      protect dquot operations against quotaoff trying to pull down dquots as
      part of disabling quota.
      
      Now that we only support turning off quota enforcement, the quotaoff
      mutex only protects changes in m_qflags/sb_qflags.  We don't need it to
      protect dquots, which means we can remove it from setqlimits and the
      dquot scrub code.  While we're at it, fix the function that forces
      quotacheck, since it should have been taking the quotaoff mutex.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
  11. 20 Oct, 2021 5 commits
    • xfs: rename m_ag_maxlevels to m_allocbt_maxlevels · 7cb3efb4
      Committed by Darrick J. Wong
      Years ago when XFS was thought to be much more simple, we introduced
      m_ag_maxlevels to specify the maximum btree height of per-AG btrees for
      a given filesystem mount.  Then we observed that inode btrees don't
      actually have the same height and split that off; and now we have rmap
      and refcount btrees with much different geometries and separate
      maxlevels variables.
      
      The 'ag' part of the name doesn't make much sense anymore, so rename
      this to m_alloc_maxlevels to reinforce that this is the maximum height
      of the *free space* btrees.  This sets us up for the next patch, which
      will add a variable to track the maximum height of all AG btrees.
      
      (Also take the opportunity to improve adjacent comments and fix minor
      style problems.)
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
    • xfs: prepare xfs_btree_cur for dynamic cursor heights · 6ca444cf
      Committed by Darrick J. Wong
      Split out the btree level information into a separate struct and put it
      at the end of the cursor structure as a VLA.  Files with huge data forks
      (and in the future, the realtime rmap btree) will require the ability to
      support many more levels than a per-AG btree cursor, which means that
      we're going to create per-btree type cursor caches to conserve memory
      for the more common case.
      
      Note that a subsequent patch actually introduces dynamic cursor heights.
      This one merely rearranges the structure to prepare for that.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
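
      The layout this commit prepares for, per-level state placed at the end of the cursor as a trailing array, can be sketched in plain C with a flexible array member. The structure and function names below are hypothetical and much smaller than the real `xfs_btree_cur`; the point is only that the allocation size depends on how many levels a given btree type can need.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>
      #include <stdlib.h>

      /* Per-level state, split out of the cursor (hypothetical fields). */
      struct sketch_level {
          void *bp;      /* buffer for this level, if any */
          uint16_t ptr;  /* position within the block */
          uint16_t ra;   /* readahead state */
      };

      /* Cursor with the level array at the very end. */
      struct sketch_cur {
          uint8_t nlevels;    /* current height of the tree */
          uint8_t maxlevels;  /* capacity of levels[] */
          struct sketch_level levels[];  /* flexible array member */
      };

      /* Bytes needed for a cursor that can hold maxlevels levels. */
      static size_t sketch_cur_sizeof(unsigned int maxlevels)
      {
          return offsetof(struct sketch_cur, levels) +
                 (size_t)maxlevels * sizeof(struct sketch_level);
      }

      /* Allocate a zeroed cursor sized for one btree type; NULL on failure. */
      static struct sketch_cur *sketch_cur_alloc(unsigned int maxlevels)
      {
          struct sketch_cur *cur;

          assert(maxlevels > 0 && maxlevels <= UINT8_MAX);
          cur = calloc(1, sketch_cur_sizeof(maxlevels));
          if (!cur)
              return NULL;
          cur->maxlevels = (uint8_t)maxlevels;
          return cur;
      }
      ```

      Because the size is a function of `maxlevels`, a short per-AG btree and a tall data fork btree can each get a cursor of the right size, which is what makes separate per-type caches worthwhile.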
    • xfs: dynamically allocate btree scrub context structure · eae5db47
      Committed by Darrick J. Wong
      Reorganize struct xchk_btree so that we can dynamically size the context
      structure to fit the type of btree cursor that we have.  This will
      enable us to use memory more efficiently once we start adding very tall
      btree types.  Right-size the lastkey array to match the number of *node*
      levels in the tree so that we stop wasting space.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
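
      A rough sketch of right-sizing the scrub context follows. It assumes that a tree with `nlevels` levels has exactly one leaf level, so only `nlevels - 1` node levels need a remembered key; that arithmetic is an inference from the message, not something it spells out. The types and names are hypothetical stand-ins for `struct xchk_btree`.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>
      #include <stdlib.h>

      /* Stand-in for a btree key or record. */
      struct sketch_key {
          uint64_t lo;
          uint64_t hi;
      };

      /* Scrub context: one last-seen record, plus one key per node level. */
      struct sketch_scrub_btree {
          unsigned int nlevels;
          struct sketch_key lastrec;    /* last record seen in a leaf */
          struct sketch_key lastkey[];  /* last key seen, per node level */
      };

      /* Size of a context for a tree nlevels tall (assumes one leaf level). */
      static size_t sketch_scrub_sizeof(unsigned int nlevels)
      {
          assert(nlevels >= 1);
          return offsetof(struct sketch_scrub_btree, lastkey) +
                 (size_t)(nlevels - 1) * sizeof(struct sketch_key);
      }

      /* Allocate a zeroed context that fits this tree; NULL on failure. */
      static struct sketch_scrub_btree *sketch_scrub_alloc(unsigned int nlevels)
      {
          struct sketch_scrub_btree *bs = calloc(1, sketch_scrub_sizeof(nlevels));

          if (bs)
              bs->nlevels = nlevels;
          return bs;
      }
      ```

      A single-level tree therefore pays for no `lastkey` slots at all, and taller trees pay only for the node levels they actually have.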
    • xfs: don't track firstrec/firstkey separately in xchk_btree · d47fef93
      Committed by Darrick J. Wong
      The btree scrubbing code checks that the records (or keys) that it finds
      in a btree block are all in order by calling the btree cursor's
      ->recs_inorder function.  This of course makes no sense for the first
      item in the block, so we switch that off with a separate variable in
      struct xchk_btree.
      
      Christoph helped me figure out that the variable is unnecessary, since
      we just accessed bc_ptrs[level] and can compare that against zero.  Use
      that, and save ourselves some memory space.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
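
      The idea of dropping the separate "first item" flag can be shown with a toy ordering check. This sketch uses a zero-based index into a plain array, which is an assumption; the real cursor position in `bc_ptrs[level]` may be numbered differently. The helper names are hypothetical and the comparison stands in for the cursor's `->recs_inorder` function.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Stand-in for ->recs_inorder: records must be strictly increasing. */
      static bool sketch_recs_inorder(uint64_t prev, uint64_t next)
      {
          return prev < next;
      }

      /* Check one block's records without keeping a "firstrec" flag. */
      static bool sketch_block_in_order(const uint64_t *recs, size_t nrecs)
      {
          uint64_t last = 0;
          size_t ptr;

          assert(recs != NULL || nrecs == 0);
          for (ptr = 0; ptr < nrecs; ptr++) {
              /* The position already says whether this is the first item. */
              if (ptr > 0 && !sketch_recs_inorder(last, recs[ptr]))
                  return false;
              last = recs[ptr];
          }
          return true;
      }
      ```

      Note that a block whose first record is zero still passes, because the comparison against the stale `last` value is skipped whenever the position shows there is no predecessor.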
    • xfs: fix incorrect decoding in xchk_btree_cur_fsbno · 94a14cfd
      Committed by Darrick J. Wong
      During review of subsequent patches, Dave and I noticed that this
      function doesn't work quite right -- accessing cur->bc_ino depends on
      the ROOT_IN_INODE flag, not LONG_PTRS.  Fix that and the parentheses
      issue.  While we're at it, remove the piece that accesses cur->bc_ag,
      because block 0 of an AG is never part of a btree.
      
      Note: This changes the btree scrubber tracepoints behavior -- if the
      cursor has no buffer for a certain level, it will always report
      NULLFSBLOCK.  It is assumed that anyone tracing the online fsck code
      will also be tracing xchk_start/xchk_done or otherwise be aware of what
      exactly is being scrubbed.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
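
      One way to read the corrected decoding is sketched below. It is a guess at the shape of the logic rather than the kernel function: the types, flag values, and `sketch_` names are hypothetical, and the assumption that the inode root is reported only at the top level is mine. What it does take from the message is that the inode branch keys off the root-in-inode flag rather than the long-pointers flag, and that a level with no buffer otherwise reports a null block.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      #define SKETCH_NULLFSBLOCK   UINT64_MAX
      #define SKETCH_ROOT_IN_INODE (1u << 0)  /* hypothetical flag values */
      #define SKETCH_LONG_PTRS     (1u << 1)
      #define SKETCH_MAXLEVELS     9

      struct sketch_buf {
          uint64_t fsbno;  /* filesystem block this buffer covers */
      };

      struct sketch_fsb_cur {
          unsigned int flags;
          unsigned int nlevels;
          uint64_t inode_fsbno;  /* only meaningful with ROOT_IN_INODE */
          const struct sketch_buf *bufs[SKETCH_MAXLEVELS];
      };

      /* Report the block being examined at a level, for tracing purposes. */
      static uint64_t sketch_cur_fsbno(const struct sketch_fsb_cur *cur,
                                       unsigned int level)
      {
          assert(cur != NULL);

          /* A buffer at this level always wins. */
          if (level < cur->nlevels && level < SKETCH_MAXLEVELS &&
              cur->bufs[level] != NULL)
              return cur->bufs[level]->fsbno;

          /* The inode branch depends on ROOT_IN_INODE, not on LONG_PTRS. */
          if (cur->nlevels > 0 && level == cur->nlevels - 1 &&
              (cur->flags & SKETCH_ROOT_IN_INODE))
              return cur->inode_fsbno;

          /* No buffer and no inode root: nothing sensible to report. */
          return SKETCH_NULLFSBLOCK;
      }
      ```

      A cursor that uses long pointers but keeps its root in a disk block never touches the inode field in this version, which is the failure mode the commit describes.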
  12. 15 Oct, 2021 3 commits
  13. 21 Aug, 2021 1 commit
    • xfs: fix perag structure refcounting error when scrub fails · 61e0d0cc
      Committed by Darrick J. Wong
      The kernel test robot found the following bug when running xfs/355 to
      scrub a bmap btree:
      
      XFS: Assertion failed: !sa->pag, file: fs/xfs/scrub/common.c, line: 412
      ------------[ cut here ]------------
      kernel BUG at fs/xfs/xfs_message.c:110!
      invalid opcode: 0000 [#1] SMP PTI
      CPU: 2 PID: 1415 Comm: xfs_scrub Not tainted 5.14.0-rc4-00021-g48c6615c #1
      Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013
      RIP: 0010:assfail+0x23/0x28 [xfs]
      RSP: 0018:ffffc9000aacb890 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: ffffc9000aacbcc8 RCX: 0000000000000000
      RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffc09e7dcd
      RBP: ffffc9000aacbc80 R08: ffff8881fdf17d50 R09: 0000000000000000
      R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000
      R13: ffff88820c7ed000 R14: 0000000000000001 R15: ffffc9000aacb980
      FS:  00007f185b955700(0000) GS:ffff8881fdf00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f7f6ef43000 CR3: 000000020de38002 CR4: 00000000001706e0
      Call Trace:
       xchk_ag_read_headers+0xda/0x100 [xfs]
       xchk_ag_init+0x15/0x40 [xfs]
       xchk_btree_check_block_owner+0x76/0x180 [xfs]
       xchk_btree_get_block+0xd0/0x140 [xfs]
       xchk_btree+0x32e/0x440 [xfs]
       xchk_bmap_btree+0xd4/0x140 [xfs]
       xchk_bmap+0x1eb/0x3c0 [xfs]
       xfs_scrub_metadata+0x227/0x4c0 [xfs]
       xfs_ioc_scrub_metadata+0x50/0xc0 [xfs]
       xfs_file_ioctl+0x90c/0xc40 [xfs]
       __x64_sys_ioctl+0x83/0xc0
       do_syscall_64+0x3b/0xc0
      
      The unusual handling of errors while initializing struct xchk_ag is the
      root cause here.  Since the beginning of xfs_scrub, the goal of
      xchk_ag_read_headers has been to read all three AG header buffers and
      attach them both to the xchk_ag structure and the scrub transaction.
      Corruption errors on any of the three headers don't necessarily
      trigger an immediate return to userspace, because xfs_scrub can also
      tell us to /fix/ the problem.
      
      In other words, it's possible for the xchk_ag init functions to return
      an error code and a partially filled out structure so that scrub can use
      however much information it managed to pull.  Before 5.15, it was
      sufficient to cancel (or commit) the scrub transaction on the way out of
      the scrub code to release the buffers.
      
      Commit 48c6615c added a reference to the perag structure to struct
      xchk_ag.  Since perag structures are not attached to transactions like
      buffers are, this adds the requirement that the perag ref be released
      explicitly.  The scrub teardown function xchk_teardown was amended to do
      this for the xchk_ag embedded in struct xfs_scrub.
      
      Unfortunately, I forgot that certain parts of the scrub code probe
      multiple AGs and therefore handle the initialization and cleanup on
      their own.  Specifically, the bmbt scrubber will initialize it long
      enough to cross-reference AG metadata for btree blocks and for the
      extent mappings in the bmbt.
      
      If one of the AG headers is corrupt, the init function returns with a
      live perag structure reference and some of the AG header buffers.  If an
      error occurs, the cross referencing will be noted as XCORRUPTion and
      skipped, but the main scrub process will move on to the next record.
      It is now necessary to release the perag reference before we try to
      analyze something from a different AG, or else we'll trip over the
      assertion noted above.
      
      Fixes: 48c6615c ("xfs: grab active perag ref when reading AG headers")
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
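
      The failure mode and the fix can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions: the `sketch_` structures, the error value, and the loop are hypothetical, and the AG header buffers are left out entirely. It shows an init function that returns an error while still holding a reference, and a caller that releases that reference after every record, even when init failed.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      #define SKETCH_ECORRUPT (-1)  /* hypothetical error value */

      struct sketch_perag {
          unsigned int agno;
          int refcount;
      };

      /* Per-AG scrub state; header buffers are omitted in this sketch. */
      struct sketch_ag {
          struct sketch_perag *pag;
      };

      /* May fail, but still leaves a live perag reference behind. */
      static int sketch_ag_init(struct sketch_ag *sa, struct sketch_perag *pag,
                                bool header_corrupt)
      {
          assert(sa->pag == NULL);  /* the assertion that fired in the trace */
          pag->refcount++;
          sa->pag = pag;
          return header_corrupt ? SKETCH_ECORRUPT : 0;
      }

      /* Release whatever init managed to grab. */
      static void sketch_ag_free(struct sketch_ag *sa)
      {
          if (sa->pag != NULL) {
              sa->pag->refcount--;
              sa->pag = NULL;
          }
      }

      /* Cross-reference records from several AGs; returns how many were skipped. */
      static size_t sketch_xref_records(struct sketch_perag *pags,
                                        const bool *corrupt,
                                        const unsigned int *agnos, size_t nrecs)
      {
          struct sketch_ag sa = { NULL };
          size_t skipped = 0;
          size_t i;

          for (i = 0; i < nrecs; i++) {
              unsigned int agno = agnos[i];

              if (sketch_ag_init(&sa, &pags[agno], corrupt[agno]) != 0)
                  skipped++;  /* note the problem, move on to the next record */

              /* Release on success and on error, before touching another AG. */
              sketch_ag_free(&sa);
          }
          return skipped;
      }
      ```

      Without the unconditional `sketch_ag_free` call, the record that follows a corrupt AG would enter `sketch_ag_init` with `sa->pag` still set and trip the assertion, which matches the crash in the report above.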
  14. 20 Aug, 2021 15 commits