1. 29 5月, 2015 3 次提交
    • B
      xfs: allocate sparse inode chunks on full chunk allocation failure · 56d1115c
      Brian Foster 提交于
      xfs_ialloc_ag_alloc() makes several attempts to allocate a full inode
      chunk. If all else fails, reduce the allocation to the sparse length and
      alignment and attempt to allocate a sparse inode chunk.
      
      If sparse chunk allocation succeeds, check whether an inobt record
      already exists that can track the chunk. If so, inherit and update the
      existing record. Otherwise, insert a new record for the sparse chunk.
      
      Create helpers to align sparse chunk inode records and insert or update
      existing records in the inode btrees. The xfs_inobt_insert_sprec()
      helper implements the merge or update semantics required for sparse
      inode records with respect to both the inobt and finobt. To update the
      inobt, either insert a new record or merge with an existing record. To
      update the finobt, use the updated inobt record to either insert or
      replace an existing record.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      56d1115c
    • B
      xfs: helper to convert holemask to inode alloc. bitmap · 4148c347
      Brian Foster 提交于
      The inobt record holemask field is a condensed data type designed to fit
      into the existing on-disk record and is zero based (allocated regions
      are set to 0, sparse regions are set to 1) to provide backwards
      compatibility. This makes the type somewhat complex for use in higher
      level inode manipulations such as individual inode allocation, etc.
      
      Rather than foist the complexity of dealing with this field to every bit
      of logic that requires inode granular information, create a helper to
      convert the holemask to an inode allocation bitmap. The inode allocation
      bitmap is inode granularity similar to the inobt record free mask and
      indicates which inodes of the chunk are physically allocated on disk,
      irrespective of whether the inode is considered allocated or free by the
      filesystem.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      4148c347
    • B
      xfs: introduce inode record hole mask for sparse inode chunks · 5419040f
      Brian Foster 提交于
      The inode btrees track 64 inodes per record regardless of inode size.
      Thus, inode chunks on disk vary in size depending on the size of the
      inodes. This creates a contiguous allocation requirement for new inode
      chunks that can be difficult to satisfy on an aged and fragmented (free
      space) filesystems.
      
      The inode record freecount currently uses 4 bytes on disk to track the
      free inode count. With a maximum freecount value of 64, only one byte is
      required. Convert the freecount field to a single byte and use two of
      the remaining 3 higher order bytes left for the hole mask field. Use the
      final leftover byte for the total count field.
      
      The hole mask field tracks holes in the chunks of physical space that
      the inode record refers to. This facilitates the sparse allocation of
      inode chunks when contiguous chunks are not available and allows the
      inode btrees to identify what portions of the chunk contain valid
      inodes. The total count field contains the total number of valid inodes
      referred to by the record. This can also be deduced from the hole mask.
      The count field provides clarity and redundancy for internal record
      verification.
      
      Note that neither of the new fields can be written to disk on fs'
      without sparse inode support. Doing so writes to the high-order bytes of
      freecount and causes corruption from the perspective of older kernels.
      The on-disk inobt record data structure is updated with a union to
      distinguish between the original, "full" format and the new, "sparse"
      format. The conversion routines to get, insert and update records are
      updated to translate to and from the on-disk record accordingly such
      that freecount remains a 4-byte value on non-supported fs, yet the new
      fields of the in-core record are always valid with respect to the
      record. This means that higher level code can refer to the current
      in-core record format unconditionally and lower level code ensures that
      records are translated to/from disk according to the capabilities of the
      fs.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      5419040f
  2. 28 11月, 2014 2 次提交
  3. 25 6月, 2014 2 次提交
  4. 24 4月, 2014 2 次提交
  5. 14 4月, 2014 1 次提交
  6. 27 2月, 2014 2 次提交
  7. 31 10月, 2013 1 次提交
  8. 24 10月, 2013 2 次提交
    • D
      xfs: decouple inode and bmap btree header files · a4fbe6ab
      Dave Chinner 提交于
      Currently the xfs_inode.h header has a dependency on the definition
      of the BMAP btree records as the inode fork includes an array of
      xfs_bmbt_rec_host_t objects in it's definition.
      
      Move all the btree format definitions from xfs_btree.h,
      xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to
      xfs_format.h to continue the process of centralising the on-disk
      format definitions. With this done, the xfs inode definitions are no
      longer dependent on btree header files.
      
      The enables a massive culling of unnecessary includes, with close to
      200 #include directives removed from the XFS kernel code base.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      a4fbe6ab
    • D
      xfs: decouple log and transaction headers · 239880ef
      Dave Chinner 提交于
      xfs_trans.h has a dependency on xfs_log.h for a couple of
      structures. Most code that does transactions doesn't need to know
      anything about the log, but this dependency means that they have to
      include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
      files and clean up the includes to be in dependency order.
      
      In doing this, remove the direct include of xfs_trans_reserve.h from
      xfs_trans.h so that we remove the dependency between xfs_trans.h and
      xfs_mount.h. Hence the xfs_trans.h include can be moved to the
      indicate the actual dependencies other header files have on it.
      
      Note that these are kernel only header files, so this does not
      translate to any userspace changes at all.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      239880ef
  9. 08 5月, 2013 1 次提交
    • D
      xfs: introduce CONFIG_XFS_WARN · 742ae1e3
      Dave Chinner 提交于
      Running a CONFIG_XFS_DEBUG kernel in production environments is not
      the best idea as it introduces significant overhead, can change
      the behaviour of algorithms (such as allocation) to improve test
      coverage, and (most importantly) panic the machine on non-fatal
      errors.
      
      There are many cases where all we want to do is run a
      kernel with more bounds checking enabled, such as is provided by the
      ASSERT() statements throughout the code, but without all the
      potential overhead and drawbacks.
      
      This patch converts all the ASSERT statements to evaluate as
      WARN_ON(1) statements and hence if they fail dump a warning and a
      stack trace to the log. This has minimal overhead and does not
      change any algorithms, and will allow us to find strange "out of
      bounds" problems more easily on production machines.
      
      There are a few places where assert statements contain debug only
      code. These are converted to be debug-or-warn only code so that we
      still get all the assert checks in the code.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      742ae1e3
  10. 22 4月, 2013 1 次提交
    • C
      xfs: add support for large btree blocks · ee1a47ab
      Christoph Hellwig 提交于
      Add support for larger btree blocks that contains a CRC32C checksum,
      a filesystem uuid and block number for detecting filesystem
      consistency and out of place writes.
      
      [dchinner@redhat.com] Also include an owner field to allow reverse
      mappings to be implemented for improved repairability and a LSN
      field to so that log recovery can easily determine the last
      modification that made it to disk for each buffer.
      
      [dchinner@redhat.com] Add buffer log format flags to indicate the
      type of buffer to recovery so that we don't have to do blind magic
      number tests to determine what the buffer is.
      
      [dchinner@redhat.com] Modified to fit into the verifier structure.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      ee1a47ab
  11. 16 11月, 2012 4 次提交
    • D
      xfs: convert buffer verifiers to an ops structure. · 1813dd64
      Dave Chinner 提交于
      To separate the verifiers from iodone functions and associate read
      and write verifiers at the same time, introduce a buffer verifier
      operations structure to the xfs_buf.
      
      This avoids the need for assigning the write verifier, clearing the
      iodone function and re-running ioend processing in the read
      verifier, and gets rid of the nasty "b_pre_io" name for the write
      verifier function pointer. If we ever need to, it will also be
      easier to add further content specific callbacks to a buffer with an
      ops structure in place.
      
      We also avoid needing to export verifier functions, instead we
      can simply export the ops structures for those that are needed
      outside the function they are defined in.
      
      This patch also fixes a directory block readahead verifier issue
      it exposed.
      
      This patch also adds ops callbacks to the inode/alloc btree blocks
      initialised by growfs. These will need more work before they will
      work with CRCs.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NPhil White <pwhite@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      1813dd64
    • D
      xfs: connect up write verifiers to new buffers · b0f539de
      Dave Chinner 提交于
      Metadata buffers that are read from disk have write verifiers
      already attached to them, but newly allocated buffers do not. Add
      appropriate write verifiers to all new metadata buffers.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      b0f539de
    • D
      xfs: add pre-write metadata buffer verifier callbacks · 612cfbfe
      Dave Chinner 提交于
      These verifiers are essentially the same code as the read verifiers,
      but do not require ioend processing. Hence factor the read verifier
      functions and add a new write verifier wrapper that is used as the
      callback.
      
      This is done as one large patch for all verifiers rather than one
      patch per verifier as the change is largely mechanical. This
      includes hooking up the write verifier via the read verifier
      function.
      
      Hooking up the write verifier for buffers obtained via
      xfs_trans_get_buf() will be done in a separate patch as that touches
      code in many different places rather than just the verifier
      functions.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      612cfbfe
    • D
      xfs: verify btree blocks as they are read from disk · 3d3e6f64
      Dave Chinner 提交于
      Add an btree block verify callback function and pass it into the
      buffer read functions. Because each different btree block type
      requires different verification, add a function to the ops structure
      that is called from the generic code.
      
      Also, propagate the verification callback functions through the
      readahead functions, and into the external bmap and bulkstat inode
      readahead code that uses the generic btree buffer read functions.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NPhil White <pwhite@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      3d3e6f64
  12. 15 5月, 2012 1 次提交
  13. 13 7月, 2011 1 次提交
  14. 19 10月, 2010 1 次提交
    • C
      xfs: remove the ->kill_root btree operation · c0e59e1a
      Christoph Hellwig 提交于
      The implementation os ->kill_root only differ by either simply
      zeroing out the now unused buffer in the btree cursor in the inode
      allocation btree or using xfs_btree_setbuf in the allocation btree.
      
      Initially both of them used xfs_btree_setbuf, but the use in the
      ialloc btree was removed early on because it interacted badly with
      xfs_trans_binval.
      
      In addition to zeroing out the buffer in the cursor xfs_btree_setbuf
      updates the bc_ra array in the btree cursor, and calls
      xfs_trans_brelse on the buffer previous occupying the slot.
      
      The bc_ra update should be done for the alloc btree updated too,
      although the lack of it does not cause serious problems.  The
      xfs_trans_brelse call on the other hand is effectively a no-op in
      the end - it keeps decrementing the bli_recur refcount until it hits
      zero, and then just skips out because the buffer will always be
      dirty at this point.  So removing it for the allocation btree is
      just fine.
      
      So unify the code and move it to xfs_btree.c.  While we're at it
      also replace the call to xfs_btree_setbuf with a NULL bp argument in
      xfs_btree_del_cursor with a direct call to xfs_trans_brelse given
      that the cursor is beeing freed just after this and the state
      updates are superflous.  After this xfs_btree_setbuf is only used
      with a non-NULL bp argument and can thus be simplified.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      c0e59e1a
  15. 27 7月, 2010 2 次提交
  16. 29 3月, 2009 1 次提交
  17. 30 10月, 2008 13 次提交