1. 24 2月, 2015 1 次提交
  2. 23 2月, 2015 2 次提交
  3. 22 1月, 2015 4 次提交
    • D
      xfs: set buf types when converting extent formats · fe22d552
      Dave Chinner 提交于
      Conversion from local to extent format does not set the buffer type
      correctly on the new extent buffer when a symlink data is moved out
      of line.
      
      Fix the symlink code and leave a comment in the generic bmap code
      reminding us that the format-specific data copy needs to set the
      destination buffer type appropriately.
      
      cc: <stable@vger.kernel.org> # 3.10 to current
      Tested-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      fe22d552
    • D
      xfs: sanitise sb_bad_features2 handling · 074e427b
      Dave Chinner 提交于
      We currently have to ensure that every time we update sb_features2
      that we update sb_bad_features2. Now that we log and format the
      superblock in it's entirety we actually don't have to care because
      we can simply update the sb_bad_features2 when we format it into the
      buffer. This removes the need for anything but the mount and
      superblock formatting code to care about sb_bad_features2, and
      hence removes the possibility that we forget to update bad_features2
      when necessary in the future.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      074e427b
    • D
      xfs: consolidate superblock logging functions · 61e63ecb
      Dave Chinner 提交于
      We now have several superblock loggin functions that are identical
      except for the transaction reservation and whether it shoul dbe a
      synchronous transaction or not. Consolidate these all into a single
      function, a single reserveration and a sync flag and call it
      xfs_sync_sb().
      
      Also, xfs_mod_sb() is not really a modification function - it's the
      operation of logging the superblock buffer. hence change the name of
      it to reflect this.
      
      Note that we have to change the mp->m_update_flags that are passed
      around at mount time to a boolean simply to indicate a superblock
      update is needed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      61e63ecb
    • D
      xfs: remove bitfield based superblock updates · 4d11a402
      Dave Chinner 提交于
      When we log changes to the superblock, we first have to write them
      to the on-disk buffer, and then log that. Right now we have a
      complex bitfield based arrangement to only write the modified field
      to the buffer before we log it.
      
      This used to be necessary as a performance optimisation because we
      logged the superblock buffer in every extent or inode allocation or
      freeing, and so performance was extremely important. We haven't done
      this for years, however, ever since the lazy superblock counters
      pulled the superblock logging out of the transaction commit
      fast path.
      
      Hence we have a bunch of complexity that is not necessary that makes
      writing the in-core superblock to disk much more complex than it
      needs to be. We only need to log the superblock now during
      management operations (e.g. during mount, unmount or quota control
      operations) so it is not a performance critical path anymore.
      
      As such, remove the complex field based logging mechanism and
      replace it with a simple conversion function similar to what we use
      for all other on-disk structures.
      
      This means we always log the entirity of the superblock, but again
      because we rarely modify the superblock this is not an issue for log
      bandwidth or CPU time. Indeed, if we do log the superblock
      frequently, delayed logging will minimise the impact of this
      overhead.
      
      [Fixed gquota/pquota inode sharing regression noticed by bfoster.]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      4d11a402
  4. 09 1月, 2015 4 次提交
  5. 24 12月, 2014 1 次提交
    • J
      xfs: Keep sb_bad_features2 consistent with sb_features2 · 1a43ec03
      Jan Kara 提交于
      Currently when we modify sb_features2, we store the same value also in
      sb_bad_features2. However in most places we forget to mark field
      sb_bad_features2 for logging and thus it can happen that a change to it
      is lost. This results in an inconsistent sb_features2 and
      sb_bad_features2 fields e.g. after xfstests test xfs/187.
      
      Fix the problem by changing XFS_SB_FEATURES2 to actually mean both
      sb_features2 and sb_bad_features2 fields since this is always what we
      want to log. This isn't ideal because the fact that XFS_SB_FEATURES2
      means two fields could cause some problem in future however the code is
      hopefully less error prone that it is now.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      1a43ec03
  6. 04 12月, 2014 6 次提交
    • D
      xfs: fix set-but-unused warnings · 32296f86
      Dave Chinner 提交于
      The kernel compile doesn't turn on these checks by default, so it's
      only when I do a kernel-user sync that I find that there are lots of
      compiler warnings waiting to be fixed. Fix up these set-but-unused
      warnings.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      32296f86
    • D
      xfs: move type conversion functions to xfs_dir.h · 9a2cc41c
      Dave Chinner 提交于
      These are currently considered private to libxfs, but they are
      widely used by the userspace code to decode, walk and check
      directory structures. Hence they really form part of the external
      API and as such need to bemoved to xfs_dir2.h.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      9a2cc41c
    • D
      xfs: move ftype conversion functions to libxfs · 1b767ee3
      Dave Chinner 提交于
      These functions are needed in userspace for repair and mkfs to
      do the right thing. Move them to libxfs so they can be easily
      shared.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      1b767ee3
    • D
      xfs: cleanup xfs_bmse_merge returns · 4db431f5
      Dave Chinner 提交于
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      
      xfs_bmse_merge() has a jump label for return that just returns the
      error value. Convert all the code to just return the error directly
      and use XFS_WANT_CORRUPTED_RETURN. This also allows the final call
      to xfs_bmbt_update() to return directly.
      
      Noticed while reviewing coccinelle return cleanup patches and
      wondering why the same return pattern as in xfs_bmse_shift_one()
      wasn't picked up by the checker pattern...
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      4db431f5
    • D
      xfs: cleanup xfs_bmse_shift_one goto mess · b11bd671
      Dave Chinner 提交于
      xfs_bmse_shift_one() jumps around determining whether to shift or
      merge, making the code flow difficult to follow. Clean it up and
      use direct error returns (including XFS_WANT_CORRUPTED_RETURN) to
      make the code flow better and be easier to read.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      b11bd671
    • D
      xfs: fix premature enospc on inode allocation · 7a1df156
      Dave Chinner 提交于
      After growing a filesystem, XFS can fail to allocate inodes even
      though there is a large amount of space available in the filesystem
      for inodes. The issue is caused by a nearly full allocation group
      having enough free space in it to be considered for inode
      allocation, but not enough contiguous free space to actually
      allocation inodes.  This situation results in successful selection
      of the AG for allocation, then failure of the allocation resulting
      in ENOSPC being reported to the caller.
      
      It is caused by two possible issues. Firstly, we only consider the
      lognest free extent and whether it would fit an inode chunk. If the
      extent is not correctly aligned, then we can't allocate an inode
      chunk in it regardless of the fact that it is large enough. This
      tends to be a permanent error until space in the AG is freed.
      
      The second issue is that we don't actually lock the AGI or AGF when
      we are doing these checks, and so by the time we get to actually
      allocating the inode chunk the space we thought we had in the AG may
      have been allocated. This tends to be a spurious error as it
      requires a race to trigger. Hence this case is ignored in this patch
      as the reported problem is for permanent errors.
      
      The first issue could be addressed by simply taking into account the
      alignment when checking the longest extent. This, however, would
      prevent allocation in AGs that have aligned, exact sized extents
      free. However, this case should be fairly rare compared to the
      number of allocations that occur near ENOSPC that would trigger this
      condition.
      
      Hence, when selecting the inode AG, take into account the inode
      cluster alignment when checking the lognest free extent in the AG.
      If we can't find any AGs with a contiguous free space large
      enough to be aligned, drop the alignment addition and just try for
      an AG that has enough contiguous free space available for an inode
      chunk. This won't prevent issues from occurring, but should avoid
      situations where other AGs have lots of free space but the selected
      AG can't allocate due to alignment constraints.
      Reported-by: NArkadiusz Miskiewicz <arekm@maven.pl>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      7a1df156
  7. 01 12月, 2014 2 次提交
  8. 28 11月, 2014 5 次提交
  9. 02 10月, 2014 2 次提交
  10. 29 9月, 2014 1 次提交
  11. 23 9月, 2014 5 次提交
    • E
      xfs: don't ASSERT on corrupt ftype · fb040131
      Eric Sandeen 提交于
      xfs_dir3_data_get_ftype() gets the file type off disk, but ASSERTs
      if it's invalid:
      
           ASSERT(type < XFS_DIR3_FT_MAX);
      
      We shouldn't ASSERT on bad values read from disk.  V3 dirs are
      CRC-protected, but V2 dirs + ftype are not.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      fb040131
    • B
      xfs: writeback and inval. file range to be shifted by collapse · f71721d0
      Brian Foster 提交于
      The collapse range operation currently writes the entire file before
      starting the collapse to avoid changes in the in-core extent list due to
      writeback causing the extent count to change. Now that collapse range is
      fsb based rather than extent index based it can sustain changes in the
      extent list during the shift sequence without disruption.
      
      Modify xfs_collapse_file_space() to writeback and invalidate pages
      associated with the range of the file to be shifted.
      xfs_free_file_space() currently has similar behavior, but the space free
      need only affect the region of the file that is freed and this could
      change in the future.
      
      Also update the comments to reflect the current implementation. We
      retain the eofblocks trim permanently as a best option for dealing with
      delalloc extents. We don't shift delalloc extents because this scenario
      only occurs with post-eof preallocation (since data must be flushed such
      that the cache can be invalidated and data can be shifted). That means
      said space must also be initialized before being shifted into the
      accessible region of the file only to be immediately truncated off as
      the last part of the collapse. In other words, the eofblocks trim will
      happen anyways, we just run it first to ensure the file remains in a
      consistent state throughout the collapse.
      
      Finally, detect and fail explicitly in the event of a delalloc extent
      during the extent shift. The implementation does not support delalloc
      extents and the caller is expected to prevent this scenario in advance
      as is done by collapse.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      f71721d0
    • B
      xfs: refactor single extent shift into xfs_bmse_shift_one() helper · a979bdfe
      Brian Foster 提交于
      xfs_bmap_shift_extents() has a variety of conditions and error checks
      that make the logic difficult to follow and indent heavy. Refactor the
      loop body of this function into a new xfs_bmse_shift_one() helper. This
      simplifies the error checks, eliminates index decrement on merge hack by
      pushing the index increment down into the helper, and makes the code
      more readable by reducing multiple levels of indentation.
      
      This is a code refactor only. The behavior of extent shift and collapse
      range is not modified.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      a979bdfe
    • B
      xfs: refactor shift-by-merge into xfs_bmse_merge() helper · ddb19e31
      Brian Foster 提交于
      The extent shift mechanism in xfs_bmap_shift_extents() is complicated
      and handles several different, non-deterministic scenarios. These
      include extent shifts, extent merges and potential btree updates in
      either of the former scenarios.
      
      Refactor the code to be more linear and readable. The loop logic in
      xfs_bmap_shift_extents() and some initial error checking is adjusted
      slightly. The associated btree lookup and update/delete operations are
      condensed into single blocks of code. This reduces the number of
      btree-specific blocks and facilitates the separation of the merge
      operation into a new xfs_bmse_merge() and xfs_bmse_can_merge() helpers.
      
      This is a code refactor only. The behavior of extent shift and collapse
      range is not modified.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      ddb19e31
    • B
      xfs: track collapse via file offset rather than extent index · 2c845f5a
      Brian Foster 提交于
      The collapse range implementation uses a transaction per extent shift.
      The progress of the overall operation is tracked via the current extent
      index of the in-core extent list. This is racy because the ilock must be
      dropped and reacquired for each transaction according to locking and log
      reservation rules. Therefore, writeback to prior regions of the file is
      possible and can change the extent count. This changes the extent to
      which the current index refers and causes the collapse to fail mid
      operation. To avoid this problem, the entire file is currently written
      back before the collapse operation starts.
      
      To eliminate the need to flush the entire file, use the file offset
      (fsb) to track the progress of the overall extent shift operation rather
      than the extent index. Modify xfs_bmap_shift_extents() to
      unconditionally convert the start_fsb parameter to an extent index and
      return the file offset of the extent where the shift left off, if
      further extents exist. The bulk of ths function can remain based on
      extent index as ilock is held by the caller. xfs_collapse_file_space()
      now uses the fsb output as the starting point for the subsequent shift.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      2c845f5a
  12. 09 9月, 2014 5 次提交
  13. 02 9月, 2014 1 次提交
    • B
      xfs: don't log inode unless extent shift makes extent modifications · ca446d88
      Brian Foster 提交于
      The file collapse mechanism uses xfs_bmap_shift_extents() to collapse
      all subsequent extents down into the specified, previously punched out,
      region. This function performs some validation, such as whether a
      sufficient hole exists in the target region of the collapse, then shifts
      the remaining exents downward.
      
      The exit path of the function currently logs the inode unconditionally.
      While we must log the inode (and abort) if an error occurs and the
      transaction is dirty, the initial validation paths can generate errors
      before the transaction has been dirtied. This creates an unnecessary
      filesystem shutdown scenario, as the caller will cancel a transaction
      that has been marked dirty.
      
      Modify xfs_bmap_shift_extents() to OR the logflags bits as modifications
      are made to the inode bmap. Only log the inode in the exit path if
      logflags has been set. This ensures we only have to cancel a dirty
      transaction if modifications have been made and prevents an unnecessary
      filesystem shutdown otherwise.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      ca446d88
  14. 04 8月, 2014 1 次提交
    • E
      xfs: avoid false quotacheck after unclean shutdown · 5ef828c4
      Eric Sandeen 提交于
      The commit
      
      83e782e1 xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD
      
      added a new function xfs_sb_quota_from_disk() which swaps
      on-disk XFS_OQUOTA_* flags for in-core XFS_GQUOTA_* and XFS_PQUOTA_*
      flags after the superblock is read.
      
      However, if log recovery is required, the superblock is read again,
      and the modified in-core flags are re-read from disk, so we have
      XFS_OQUOTA_* flags in memory again.  This causes the
      XFS_QM_NEED_QUOTACHECK() test to be true, because the XFS_OQUOTA_CHKD
      is still set, and not XFS_GQUOTA_CHKD or XFS_PQUOTA_CHKD.
      
      Change xfs_sb_from_disk to call xfs_sb_quota_from disk and always
      convert the disk flags to in-memory flags.
      
      Add a lower-level function which can be called with "false" to
      not convert the flags, so that the sb verifier can verify
      exactly what was on disk, per Brian Foster's suggestion.
      Reported-by: NCyril B. <cbay@excellency.fr>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      5ef828c4