1. 12 May, 2022 5 commits
    • D
      xfs: XFS_DAS_LEAF_REPLACE state only needed if !LARP · 411b434a
      Dave Chinner committed
      We can skip the REPLACE state when LARP is enabled, but that means
      the XFS_DAS_FLIP_LFLAG state is now poorly named - it indicates
      something that has been done rather than what the state is going to
      do. Rename it to "REMOVE_OLD" to indicate that we are now going to
      perform removal of the old attr.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • D
      xfs: split remote attr setting out from replace path · 7d035336
      Dave Chinner committed
      When we set a new xattr, we have three exit paths:
      
      	1. nothing else to do
      	2. allocate and set the remote xattr value
      	3. perform the rest of a replace operation
      
      Currently we push both 2 and 3 into the same state, regardless of
      whether we just set a remote attribute or not. Once we've set the
      remote xattr, we have two exit states:
      
      	1. nothing else to do
      	2. perform the rest of a replace operation
      
      Hence we can split the remote xattr allocation and setting into
      their own states and factor it out of xfs_attr_set_iter() to further
      clean up the state machine and the implementation of the state
      machine.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • D
      xfs: consolidate leaf/node states in xfs_attr_set_iter · 251b29c8
      Dave Chinner committed
      The operations performed from XFS_DAS_FOUND_LBLK through to
      XFS_DAS_RM_LBLK are now identical to XFS_DAS_FOUND_NBLK through to
      XFS_DAS_RM_NBLK. We can collapse these down into a single set of
      code.
      
      To do this, define the states that leaf and node run through as
      separate sets of sequential states. Then as we move to the next
      state, we can use increments rather than specific state assignments
      to move through the states. This means the state progression is set
      by the initial state that enters the series and we don't need to
      duplicate the code anymore.
      
      At the exit point of the series we need to select the correct leaf
      or node state, but that can also be done by state increment rather
      than assignment.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
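The increment-based progression described in this commit can be sketched in a few lines of C. This is only an illustration: the enum names below are hypothetical stand-ins (the real states are defined in xfs_attr.h), and the point is that the leaf and node series are laid out in the same order, so one piece of code can advance either series with `s + 1`.

```c
#include <assert.h>

/* Hypothetical state names; the real enum lives in xfs_attr.h. */
enum das_state {
	DAS_LEAF_SET_RMT,	/* leaf series starts here */
	DAS_LEAF_ALLOC_RMT,
	DAS_LEAF_REPLACE,
	DAS_NODE_SET_RMT,	/* node series starts here, same order */
	DAS_NODE_ALLOC_RMT,
	DAS_NODE_REPLACE,
	DAS_DONE,
};

/*
 * Advance one step. Leaf and node cases share the same code because
 * the offset of each step within its series is identical, so the
 * series that was entered determines where "s + 1" lands.
 */
static enum das_state das_step(enum das_state s)
{
	assert(s <= DAS_DONE);

	switch (s) {
	case DAS_LEAF_SET_RMT:
	case DAS_NODE_SET_RMT:
	case DAS_LEAF_ALLOC_RMT:
	case DAS_NODE_ALLOC_RMT:
		/* shared work for this step would go here */
		return s + 1;
	case DAS_LEAF_REPLACE:
	case DAS_NODE_REPLACE:
		/* exit point of either series */
		return DAS_DONE;
	default:
		return DAS_DONE;
	}
}
```

Entering at DAS_NODE_SET_RMT walks the node series and entering at DAS_LEAF_SET_RMT walks the leaf series, with no duplicated case bodies.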
    • D
      xfs: kill XFS_DAC_LEAF_ADDNAME_INIT · 2157d169
      Dave Chinner committed
      We re-enter the XFS_DAS_FOUND_LBLK state when we have to allocate
      multiple extents for a remote xattr. We currently have a flag
      called XFS_DAC_LEAF_ADDNAME_INIT to avoid running the remote attr
      hole finding code more than once.
      
      However, for the node format tree, we have a separate state for this
      so we never reenter the state machine at XFS_DAS_FOUND_NBLK and so
      it does not need a special flag to skip over the remote attr hole
      finding code.
      
      Convert the leaf block code to use the same state machine as the
      node blocks and kill the XFS_DAC_LEAF_ADDNAME_INIT flag.
      
      This further points out that this "ALLOC" state is only traversed
      if we have remote xattrs or we are doing a rename operation. Rename
      both the leaf and node alloc states to _ALLOC_RMT to indicate they
      are iterating to do allocation of remote xattr blocks.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • D
      xfs: separate out initial attr_set states · e0c41089
      Dave Chinner committed
      We currently use XFS_DAS_UNINIT for several steps in the attr_set
      state machine. We use it for setting shortform xattrs, converting
      from shortform to leaf, leaf add, leaf-to-node and node add. All of
      these things are essentially known before we start the state machine
      iterating, so we really should separate them out:
      
      XFS_DAS_SF_ADD:
      	- tries to do a shortform add
      	- on success -> done
      	- on ENOSPC converts to leaf, -> XFS_DAS_LEAF_ADD
      	- on error, dies.
      
      XFS_DAS_LEAF_ADD:
      	- tries to do leaf add
      	- on success:
      		- inline attr -> done
      		- remote xattr || REPLACE -> XFS_DAS_FOUND_LBLK
      	- on ENOSPC converts to node, -> XFS_DAS_NODE_ADD
      	- on error, dies
      
      XFS_DAS_NODE_ADD:
      	- tries to do node add
      	- on success:
      		- inline attr -> done
      		- remote xattr || REPLACE -> XFS_DAS_FOUND_NBLK
      	- on error, dies
      
      This makes it easier to understand how the state machine starts
      up and sets us up on the path to further state machine
      simplifications.
      
      This also converts the DAS state tracepoints to use strings rather
      than numbers, as converting between enums and numbers requires
      manual counting rather than just reading the name.
      
      This also introduces a XFS_DAS_DONE state so that we can trace
      successful operation completions easily.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
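The start-up transitions listed in this commit message can be modelled as a small transition function. The sketch below is a toy under stated assumptions: the `DAS_` names mirror the states named above, but `DAS_ERROR` (standing in for "dies"), `das_next()`, `das_name()` and the `more_work` parameter (meaning "remote xattr || REPLACE") are illustrative inventions, not kernel APIs. The name table shows the idea behind tracing states as strings rather than numbers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum das_state {
	DAS_SF_ADD,
	DAS_LEAF_ADD,
	DAS_NODE_ADD,
	DAS_FOUND_LBLK,
	DAS_FOUND_NBLK,
	DAS_DONE,
	DAS_ERROR,		/* assumption: stands in for "dies" */
	DAS_NR_STATES,
};

/* State names as strings, so a trace shows the name, not a number. */
static const char *const das_names[DAS_NR_STATES] = {
	[DAS_SF_ADD]	 = "XFS_DAS_SF_ADD",
	[DAS_LEAF_ADD]	 = "XFS_DAS_LEAF_ADD",
	[DAS_NODE_ADD]	 = "XFS_DAS_NODE_ADD",
	[DAS_FOUND_LBLK] = "XFS_DAS_FOUND_LBLK",
	[DAS_FOUND_NBLK] = "XFS_DAS_FOUND_NBLK",
	[DAS_DONE]	 = "XFS_DAS_DONE",
	[DAS_ERROR]	 = "(error)",
};

static const char *das_name(enum das_state s)
{
	assert((unsigned int)s < (unsigned int)DAS_NR_STATES);
	return das_names[s];
}

/*
 * err is the result of the add attempt in state s (0, -ENOSPC, or
 * another negative errno); more_work means "remote xattr || REPLACE".
 */
static enum das_state das_next(enum das_state s, int err, bool more_work)
{
	switch (s) {
	case DAS_SF_ADD:
		if (err == 0)
			return DAS_DONE;
		return err == -ENOSPC ? DAS_LEAF_ADD : DAS_ERROR;
	case DAS_LEAF_ADD:
		if (err == 0)
			return more_work ? DAS_FOUND_LBLK : DAS_DONE;
		return err == -ENOSPC ? DAS_NODE_ADD : DAS_ERROR;
	case DAS_NODE_ADD:
		if (err == 0)
			return more_work ? DAS_FOUND_NBLK : DAS_DONE;
		return DAS_ERROR;
	default:
		return s;	/* later states are outside this sketch */
	}
}
```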
  2. 11 May, 2022 6 commits
  3. 04 May, 2022 1 commit
    • A
      xfs: Set up infrastructure for log attribute replay · fd920008
      Allison Henderson committed
      Currently attributes are modified directly across one or more
      transactions. But they are not logged or replayed in the event of an
      error. The goal of log attr replay is to enable logging and replaying
      of attribute operations using the existing delayed operations
      infrastructure.  This will later enable the attributes to become part of
      larger multi part operations that also must first be recorded to the
      log.  This is mostly of interest in the scheme of parent pointers which
      would need to maintain an attribute containing parent inode information
      any time an inode is moved, created, or removed.  Parent pointers would
      then be of interest to any feature that would need to quickly derive an
      inode path from the mount point. Online scrub, nfs lookups and fs grow
      or shrink operations are all features that could take advantage of this.
      
      This patch adds two new log item types for setting or removing
      attributes as deferred operations.  The xfs_attri_log_item will log an
      intent to set or remove an attribute.  The corresponding
      xfs_attrd_log_item holds a reference to the xfs_attri_log_item and is
      freed once the transaction is done.  Both log items use a generic
      xfs_attr_log_format structure that contains the attribute name, value,
      flags, inode, and an op_flag that indicates if the operation is a set
      or remove.
      
      [dchinner: added extra little bits needed for intent whiteouts]
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
  4. 20 August, 2021 1 commit
    • D
      xfs: rename xfs_has_attr() · 51b495eb
      Dave Chinner committed
      xfs_has_attr() is poorly named. It has global scope as it is defined
      in a header file, but it has no namespace scope that tells us what
      it is checking has attributes. It's not even clear what "has_attr"
      means, because what it is actually doing is an attribute fork lookup
      to see if the attribute exists.
      
      Upcoming patches use this "xfs_has_<foo>" namespace for global
      filesystem features, which conflicts with this function.
      
      Rename xfs_has_attr() to xfs_attr_lookup() and make it a static
      function, freeing up the "xfs_has_" namespace for global scope
      usage.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
  5. 02 June, 2021 2 commits
    • A
      xfs: Add delay ready attr set routines · 8f502a40
      Allison Henderson committed
      This patch modifies the attr set routines to be delay ready. This means
      they no longer roll or commit transactions, but instead return -EAGAIN
      to have the calling routine roll and refresh the transaction.  In this
      series, xfs_attr_set_args has become xfs_attr_set_iter, which uses a
      state machine like switch to keep track of where it was when EAGAIN was
      returned. See xfs_attr.h for a more detailed diagram of the states.
      
      Two new helper functions have been added: xfs_attr_rmtval_find_space and
      xfs_attr_rmtval_set_blk.  They provide a subset of logic similar to
      xfs_attr_rmtval_set, but they store the current block in the delay attr
      context to allow the caller to roll the transaction between allocations.
      This helps to simplify and consolidate code used by
      xfs_attr_leaf_addname and xfs_attr_node_addname. xfs_attr_set_args has
      now become a simple loop to refresh the transaction until the operation
      is completed.  Lastly, xfs_attr_rmtval_remove is no longer used, and is
      removed.
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
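The idea of keeping the current remote-value block in a context, so the caller can roll the transaction between allocations, might look roughly like the toy below. Everything here is an assumption made for illustration: `struct rmt_ctx`, `rmt_set_step()` and `rmt_set()` are hypothetical names, "allocating" a block is just a counter increment, and the transaction roll is represented by counting.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical context: remembers which block comes next across calls. */
struct rmt_ctx {
	unsigned int blkcnt;	/* blocks the remote value needs */
	unsigned int next_blk;	/* next block to allocate */
};

/*
 * Allocate at most one block per call. Returns -EAGAIN while more
 * blocks remain so the caller can roll the transaction, 0 when done.
 */
static int rmt_set_step(struct rmt_ctx *ctx)
{
	assert(ctx->next_blk <= ctx->blkcnt);

	if (ctx->next_blk == ctx->blkcnt)
		return 0;

	ctx->next_blk++;	/* stand-in for the real allocation */
	return ctx->next_blk < ctx->blkcnt ? -EAGAIN : 0;
}

/* Caller-side loop: "roll" (here, just count) on every -EAGAIN. */
static int rmt_set(struct rmt_ctx *ctx, unsigned int *rolls)
{
	int error;

	while ((error = rmt_set_step(ctx)) == -EAGAIN)
		(*rolls)++;	/* stand-in for rolling the transaction */

	return error;
}
```

Because the cursor lives in the context rather than in a local variable, each call to the step function resumes exactly where the previous one stopped.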
    • A
      xfs: Add delay ready attr remove routines · 2b74b03c
      Allison Henderson committed
      This patch modifies the attr remove routines to be delay ready. This
      means they no longer roll or commit transactions, but instead return
      -EAGAIN to have the calling routine roll and refresh the transaction. In
      this series, xfs_attr_remove_args is merged with
      xfs_attr_node_removename to become a new function, xfs_attr_remove_iter.
      This new version uses a sort of state machine like switch to keep track
      of where it was when EAGAIN was returned. A new version of
      xfs_attr_remove_args consists of a simple loop to refresh the
      transaction until the operation is completed. A new XFS_DAC_DEFER_FINISH
      flag is used to finish the transaction wherever the existing code used
      to.
      
      Calls to xfs_attr_rmtval_remove are replaced with the delay ready
      version __xfs_attr_rmtval_remove. We will rename
      __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
      done.
      
      xfs_attr_rmtval_remove itself is still in use by the set routines (used
      during a rename).  To preserve existing behavior, we modify
      xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is set,
      similar to what xfs_attr_remove_args does here.  Once we transition
      the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
      used and will be removed.
      
      This patch also adds a new struct xfs_delattr_context, which we will use
      to keep track of the current state of an attribute operation. The new
      xfs_delattr_state enum is used to track various operations that are in
      progress so that we know not to repeat them, and resume where we left
      off before EAGAIN was returned to cycle out the transaction. Other
      members take the place of local variables that need to retain their
      values across multiple function calls.  See xfs_attr.h for a more
      detailed diagram of the states.
      Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
      Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
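A rough model of the "delay ready" pattern this commit describes: an iterator that never rolls the transaction itself, records its progress in a context, and returns -EAGAIN so that a simple caller loop can roll and call again. The names below (`struct delattr_ctx`, the `DELA_` states, `attr_remove_iter()`, `attr_remove_args()`) are hypothetical simplifications and do not reproduce the kernel's actual states or signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical states; the real ones are documented in xfs_attr.h. */
enum dela_state {
	DELA_START,		/* still removing remote value blocks */
	DELA_RMT_DONE,		/* remote blocks gone, name entry remains */
	DELA_FINISHED,
};

/* Holds what used to be local variables, so it survives re-entry. */
struct delattr_ctx {
	enum dela_state state;
	int rmt_blocks_left;
};

/* One unit of work per call; -EAGAIN asks the caller to roll and retry. */
static int attr_remove_iter(struct delattr_ctx *dac)
{
	assert(dac != NULL);

	switch (dac->state) {
	case DELA_START:
		if (dac->rmt_blocks_left > 0) {
			dac->rmt_blocks_left--;
			return -EAGAIN;	/* re-enter the same state */
		}
		dac->state = DELA_RMT_DONE;
		return -EAGAIN;
	case DELA_RMT_DONE:
		/* removing the name entry would happen here */
		dac->state = DELA_FINISHED;
		return 0;
	default:
		return -EINVAL;
	}
}

/* The caller is reduced to a loop that refreshes the transaction. */
static int attr_remove_args(struct delattr_ctx *dac, int *rolls)
{
	int error;

	do {
		error = attr_remove_iter(dac);
		if (error == -EAGAIN)
			(*rolls)++;	/* stand-in for the transaction roll */
	} while (error == -EAGAIN);

	return error;
}
```

The state field is what prevents completed steps from being repeated when the iterator is re-entered after a roll.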
  6. 16 April, 2021 1 commit
  7. 29 July, 2020 1 commit
  8. 14 May, 2020 1 commit
  9. 03 March, 2020 14 commits
  10. 10 January, 2020 2 commits
  11. 31 August, 2019 1 commit
    • D
      xfs: allocate xattr buffer on demand · ddbca70c
      Dave Chinner committed
      When doing file lookups and checking for permissions, we end up in
      xfs_get_acl() to see if there are any ACLs on the inode. This
      requires an xattr lookup, and to do that we have to supply a buffer
      large enough to hold a maximum-sized xattr.
      
      On workloads where we are accessing a wide range of cache cold files
      under memory pressure (e.g. NFS fileservers) we end up spending a
      lot of time allocating the buffer. The buffer is 64k in length, so
      is a contiguous multi-page allocation, and if that then fails we
      fall back to vmalloc(). Hence the allocation here is /expensive/
      when we are looking up hundreds of thousands of files a second.
      
      Initial numbers from a bpf trace show average time in xfs_get_acl()
      is ~32us, with ~19us of that in the memory allocation. Note these
      are average times, so they are going to be affected by the worst
      case allocations more than the common fast case...
      
      To avoid this, we could just do a "null" lookup to see if the ACL
      xattr exists and then only do the allocation if it exists. This,
      however, optimises the path for the "no ACL present" case at the
      expense of the "ACL present" case. i.e. we can halve the time in
      xfs_get_acl() for the no ACL case (i.e. down to ~10-15us), but that
      then increases the ACL case by 30% (i.e. up to 40-45us).
      
      To solve this and speed up both cases, drive the xattr buffer
      allocation into the attribute code once we know what the actual
      xattr length is. For the no-xattr case, we avoid the allocation
      completely, speeding up that case. For the common ACL case, we'll
      end up with a fast heap allocation (because it'll be smaller than a
      page), and only for the rarer "we have a remote xattr" will we have
      a multi-page allocation occur. Hence the common ACL case will be
      much faster, too.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
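The on-demand allocation strategy in this commit can be illustrated with a userspace toy: the lookup allocates the value buffer itself, and only after it knows the attribute exists and how long it is. `struct toy_attr` and `attr_get()` are hypothetical; the real change lives inside the XFS attribute code and uses kernel allocators rather than malloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory attribute, standing in for on-disk xattrs. */
struct toy_attr {
	const char *name;
	const void *value;
	size_t len;
};

/*
 * Look up name. The buffer is allocated here, sized to the actual
 * value, instead of the caller pre-allocating a maximum-sized one.
 * A missing attribute costs no allocation at all.
 */
static int attr_get(const struct toy_attr *attrs, size_t nattrs,
		    const char *name, void **bufp, size_t *lenp)
{
	size_t i;

	assert(bufp != NULL && lenp != NULL);

	for (i = 0; i < nattrs; i++) {
		if (strcmp(attrs[i].name, name) != 0)
			continue;

		/* malloc(0) may return NULL, so ask for at least 1 byte */
		*bufp = malloc(attrs[i].len ? attrs[i].len : 1);
		if (*bufp == NULL)
			return -ENOMEM;
		memcpy(*bufp, attrs[i].value, attrs[i].len);
		*lenp = attrs[i].len;
		return 0;
	}
	return -ENODATA;	/* not found: nothing was allocated */
}
```

In this sketch the caller owns the returned buffer and must free() it; the "not found" path returns before any allocation happens, which is the case the commit is most concerned with.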
  12. 06 July, 2019 1 commit
  13. 30 April, 2019 1 commit
    • D
      xfs: always rejoin held resources during defer roll · 710d707d
      Darrick J. Wong committed
      During testing of xfs/141 on a V4 filesystem, I observed some
      inconsistent behavior with regard to resources that are held (i.e.
      remain locked) across a defer roll.  The transaction roll always gives
      the defer roll function a new transaction, even if committing the old
      transaction fails.  However, the defer roll function only rejoins the
      held resources if the transaction commit succeeded.  This means that
      callers of defer roll have to figure out whether the held resources are
      attached to the transaction being passed back.
      
      Worse yet, if the defer roll was part of a defer finish call, we have a
      third possibility: the defer finish could pass back a dirty transaction
      with dirty held resources and an error code.
      
      The only sane way to handle all of these scenarios is to require that
      the code that held the resource either cancel the transaction before
      unlocking and releasing the resources, or use functions that detach
      resources from a transaction properly (e.g.  xfs_trans_brelse) if they
      need to drop the reference before committing or cancelling the
      transaction.
      
      In order to make this so, change the defer roll code to join held
      resources to the new transaction unconditionally and fix all the bhold
      callers to release the held buffers correctly.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
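The invariant this commit establishes, that a held resource always ends up attached to whichever transaction the caller gets back, can be shown with a very small toy. All names (`struct toy_trans`, `struct toy_buf`, `toy_defer_roll()`) and the `commit_error` parameter are hypothetical; the sketch models only the ownership rule, not real transaction or buffer handling.

```c
#include <assert.h>
#include <stddef.h>

struct toy_trans {
	int id;
};

/* A held resource remembers which transaction it is joined to. */
struct toy_buf {
	struct toy_trans *owner;
};

/*
 * Roll from *tpp to next. commit_error simulates the result of
 * committing the old transaction. The caller always receives the new
 * transaction, and the held buffer is rejoined to it unconditionally,
 * so the caller never has to guess where the buffer is attached.
 */
static int toy_defer_roll(struct toy_trans **tpp, struct toy_trans *next,
			  struct toy_buf *held, int commit_error)
{
	assert(tpp != NULL && *tpp != NULL && next != NULL && held != NULL);

	*tpp = next;		/* new transaction even on commit failure */
	held->owner = next;	/* rejoin regardless of commit_error */
	return commit_error;
}
```

With this rule the error path is uniform: on failure the caller cancels the transaction it was handed back, and the held buffer is known to be attached to it.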
  14. 12 February, 2019 1 commit
  15. 18 October, 2018 2 commits