1. 24 Sep, 2012 15 commits
    • GFS2: Stop block extents at the end of bitmaps · 0688a5ec
      Committed by Bob Peterson
      This patch stops multiple block allocations if a nonzero
      return code is received from gfs2_rbm_from_block. Without
      this patch, if enough pressure is put on the file system,
      you get a kernel warning quickly followed by:
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffffa04f47e8>] gfs2_alloc_blocks+0x2c8/0x880 [gfs2]
      With this patch, things run normally.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
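The failure mode described in this commit can be illustrated with a small userspace model. This is only a sketch: `demo_pos_from_block()` is a hypothetical stand-in for gfs2_rbm_from_block(), and the flat `used[]` array, `DEMO_BLOCKS` and `demo_alloc_extent()` are invented for illustration rather than taken from the kernel. It shows a multi-block allocation that stops extending the extent as soon as the position lookup returns nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BLOCKS 8u /* hypothetical size of the region covered by the bitmaps */

/* Hypothetical stand-in for gfs2_rbm_from_block(): returns 0 and fills *pos
 * on success, nonzero when the block lies beyond the end of the region. */
static int demo_pos_from_block(uint32_t block, uint32_t *pos)
{
    if (block >= DEMO_BLOCKS)
        return -1;
    *pos = block;
    return 0;
}

/* Try to claim up to `want` contiguous blocks starting at `first`.
 * Returns the number of blocks actually claimed. */
static uint32_t demo_alloc_extent(bool *used, uint32_t first, uint32_t want)
{
    uint32_t got = 0;

    assert(used);
    while (got < want) {
        uint32_t pos;

        if (demo_pos_from_block(first + got, &pos) != 0)
            break; /* ran off the end: stop the extent here */
        if (used[pos])
            break; /* next block is already in use */
        used[pos] = true;
        got++;
    }
    return got;
}
```

In this model, asking for four blocks starting at block 6 of an 8-block region claims only blocks 6 and 7 instead of walking past the end.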
    • GFS2: Fix unclaimed_blocks() wrapping bug and clean up · c743ffd0
      Committed by Steven Whitehouse
      When rgd->rd_free_clone is less than rgd->rd_reserved, the
      unclaimed_blocks() calculation would wrap and produce
      incorrect results. This patch checks for this condition
      when this function is called from gfs2_mblk_search().
      
      In addition, the use of this particular function in other
      places in the code has been dropped by means of a general
      clean up of gfs2_inplace_reserve(). This function is now
      much easier to follow.
      
      Also the setting of the rgd->rd_last_alloc field is corrected.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
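The wrapping problem in this commit is ordinary unsigned arithmetic, which a short userspace sketch can make concrete. The struct and both functions below are hypothetical simplifications (only the field names rd_free_clone and rd_reserved come from the commit message), and the guard shown is one plausible way to check the condition, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the two rgrp counters named above. */
struct demo_rgrp {
    uint32_t rd_free_clone; /* free-block counter */
    uint32_t rd_reserved;   /* blocks currently reserved */
};

/* Plain subtraction: wraps to a huge value if rd_reserved > rd_free_clone. */
static uint32_t demo_unclaimed(const struct demo_rgrp *rgd)
{
    assert(rgd);
    return rgd->rd_free_clone - rgd->rd_reserved;
}

/* Caller-side guard: test for the wrapping condition before trusting the
 * subtraction. Returns 1 if `want` unclaimed blocks are available. */
static int demo_can_claim(const struct demo_rgrp *rgd, uint32_t want)
{
    assert(rgd);
    if (rgd->rd_free_clone < rgd->rd_reserved)
        return 0; /* subtraction would wrap, so nothing is unclaimed */
    return demo_unclaimed(rgd) >= want;
}
```

With 5 free and 8 reserved blocks, the unguarded subtraction yields 4294967293 on a 32-bit unsigned type, while the guarded check reports that nothing can be claimed.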
    • GFS2: Improve block reservation tracing · 9e733d39
      Committed by Steven Whitehouse
      This patch improves the tracing of block reservations by
      removing some corner cases and also providing more useful
      detail in the traces.
      
      A new field is added to the reservation structure to contain
      the inode number. This is used since in certain contexts it is
      not possible to access the inode itself to obtain this information.
      As a result we can then display the inode number for all tracepoints
      and also in case we dump the resource group.
      
      The "del" tracepoint operation has been removed. This could be called
      with the reservation rgrp set to NULL. That resulted in not printing
      the device number, and thus making the information largely useless
      anyway. Also, the conditional on the rgrp being NULL can then be
      removed from the tracepoint. After this change, all the block
      reservation tracepoint calls will be called with the rgrp information.
      
      The existing ins, clm and tdel calls to the block reservation tracepoint
      are sufficient to track the entire life of the block reservation.
      
      In gfs2_block_alloc() the error detection is updated to print out
      the inode number of the problematic inode. This can then be compared
      against the information in the glock dump, tracepoints, etc.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fall back to ignoring reservations, if there are no other blocks left · 137834a6
      Committed by Steven Whitehouse
      When we get to the stage of allocating blocks, we know that the
      resource group in question must contain enough free blocks, otherwise
      gfs2_inplace_reserve() would have failed. So if we are left with only
      free blocks which are reserved, then we must use those. This can happen
      if another node has sneaked in and used some blocks reserved on this
      node, for example. Generally this will happen very rarely and only
      when the resource group is nearly full.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Use rbm for gfs2_setbit() · 3e6339dd
      Committed by Steven Whitehouse
      Use the rbm structure for gfs2_setbit() in order to simplify the
      arguments to the function. We have to add a bool to control whether
      the clone bitmap should be updated (if it exists) but otherwise it
      is a more or less direct substitution.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Use rbm for gfs2_testbit() · c04a2ef3
      Committed by Steven Whitehouse
      Change the arguments to gfs2_testbit() so that it now just takes an
      rbm specifying the position of the two bit entry to return.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
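The gfs2_setbit() and gfs2_testbit() changes above both address a two-bit entry through a single position argument. A minimal userspace sketch of that idea follows. It assumes four two-bit entries per byte with the lowest-numbered entry in the low bits; `struct demo_rbm`, its fields and the `demo_*` functions are hypothetical and far simpler than the kernel's structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOCKS_PER_BYTE 4u /* assumption: two bits per block */
#define DEMO_BIT_SIZE 2u
#define DEMO_BIT_MASK 0x3u

/* Hypothetical position descriptor: a bitmap buffer plus an offset into it. */
struct demo_rbm {
    uint8_t *bitmap; /* live bitmap buffer */
    uint8_t *clone;  /* optional clone bitmap, NULL if none exists */
    size_t nbytes;   /* size of each buffer in bytes */
    uint32_t offset; /* block offset within the bitmap */
};

/* Return the two-bit entry at the position described by rbm. */
static uint8_t demo_testbit(const struct demo_rbm *rbm)
{
    assert(rbm->offset / DEMO_BLOCKS_PER_BYTE < rbm->nbytes);
    const uint8_t byte = rbm->bitmap[rbm->offset / DEMO_BLOCKS_PER_BYTE];
    const unsigned shift = (rbm->offset % DEMO_BLOCKS_PER_BYTE) * DEMO_BIT_SIZE;

    return (uint8_t)((byte >> shift) & DEMO_BIT_MASK);
}

/* Overwrite one two-bit entry in a single buffer. */
static void demo_set_in(uint8_t *buf, uint32_t offset, uint8_t state)
{
    uint8_t *byte = &buf[offset / DEMO_BLOCKS_PER_BYTE];
    const unsigned shift = (offset % DEMO_BLOCKS_PER_BYTE) * DEMO_BIT_SIZE;

    *byte = (uint8_t)((*byte & ~(DEMO_BIT_MASK << shift)) |
                      ((state & DEMO_BIT_MASK) << shift));
}

/* Set the entry at rbm; the bool controls whether an existing clone
 * bitmap is updated as well. */
static void demo_setbit(const struct demo_rbm *rbm, bool do_clone, uint8_t state)
{
    assert(rbm->offset / DEMO_BLOCKS_PER_BYTE < rbm->nbytes);
    demo_set_in(rbm->bitmap, rbm->offset, state);
    if (do_clone && rbm->clone != NULL)
        demo_set_in(rbm->clone, rbm->offset, state);
}
```

Passing one descriptor keeps the call sites short: the caller fills in the position once and reuses it for both testing and setting.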
    • GFS2: Eliminate unnecessary check for state > 3 in bitfit · 29c05b20
      Committed by Bob Peterson
      Function gfs2_bitfit was checking for state > 3, but that's
      impossible since it is only called from rgblk_search, which receives
      only GFS2_BLKST_ constants.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: rbm code cleanup · 8d8b752a
      Committed by Bob Peterson
      This patch fixes a few small rbm related things. First, it fixes
      a corner case where the rbm needs to switch bitmaps and wasn't
      adjusting its buffer pointer. Second, there's a white space issue
      fixed. Third, the logic in function gfs2_rbm_from_block was optimized
      a bit. Lastly, a check for goal block overflows was added to function
      gfs2_alloc_blocks.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fix case where reservation finished at end of rgrp · 5d50d532
      Committed by Steven Whitehouse
      One corner case which the original patch failed to take into
      account was when there is a reservation which ended such that
      the following block was one beyond the end of the rgrp in
      question. This extra test fixes that case.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Reported-by: Bob Peterson <rpeterso@redhat.com>
      Tested-by: Bob Peterson <rpeterso@redhat.com>
    • GFS2: Use RB_CLEAR_NODE() rather than rb_init_node() · 24d634e8
      Committed by Michel Lespinasse
      gfs2 calls RB_EMPTY_NODE() to check if nodes are not on an rbtree.
      The corresponding initialization function is RB_CLEAR_NODE().
      rb_init_node() was never clearly defined and is going away.
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Update rgblk_free() to use rbm · 3b1d0b9d
      Committed by Steven Whitehouse
      Replace the open-coded version with a call to gfs2_rbm_from_block().
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Update gfs2_get_block_type() to use rbm · 3983903a
      Committed by Steven Whitehouse
      Use the new gfs2_rbm_from_block() function to replace an open
      coded version of the same code.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Replace rgblk_search with gfs2_rbm_find · 5b924ae2
      Committed by Steven Whitehouse
      This is part of a series of patches which are introducing the
      gfs2_rbm structure throughout the block allocation code. The
      main aim of this part is to create a search function which can
      deal directly with struct gfs2_rbm. In this case it specifies
      the initial position at which to start the search and also the
      point at which the search terminates.
      
      The net result of this is to clean up the search code and make
      it rather more readable, and the various possible exceptions which
      may occur during the search are partitioned into their own functions.
      
      There are some bug fixes too. We should not be checking the reservations
      while allocating extents - the time for that is when we are searching
      for where to put the extent, not when we've already made that decision.
      
      Also, rgblk_search had two uses, and in only one of those cases did
      it make sense to check for reservations. This is fixed in the new
      gfs2_rbm_find function, which has a cleaner interface.
      
      The reservation checking has been improved by always checking for
      contiguous reservations, and returning the first free block after
      all contiguous reservations. This is done under the spin lock to
      ensure consistency of the tree.
      
      The allocation of extents is now in all cases done by the existing
      allocation code, and if there is an active reservation, that is updated
      after the fact. Again this is done under the spin lock, since it entails
      changing the lookup key for the reservation in question.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
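The improved reservation check in this commit (skip every contiguous reservation and return the first free block after them) can be sketched in userspace. Note the substitutions: a sorted array replaces the rbtree, no locking is modelled, and `struct demo_resv` and `demo_next_unreserved()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reservation: `len` blocks starting at block `start`. */
struct demo_resv {
    uint64_t start;
    uint32_t len;
};

/* Given reservations sorted by start and not overlapping, return the first
 * block at or after `block` that is not covered by any reservation.
 * Back-to-back reservations are skipped in a single pass. */
static uint64_t demo_next_unreserved(const struct demo_resv *rs, size_t n,
                                     uint64_t block)
{
    assert(rs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (block < rs[i].start)
            break; /* sorted input: no later reservation can cover block */
        if (block < rs[i].start + rs[i].len)
            block = rs[i].start + rs[i].len; /* jump past this reservation */
    }
    return block;
}
```

For reservations covering blocks 10-14 and 15-17, a search starting at block 12 lands on block 18, because the second reservation begins exactly where the first one ends.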
    • GFS2: Add structure to contain rgrp, bitmap, offset tuple · 4a993fb1
      Committed by Steven Whitehouse
      This patch introduces a new structure, gfs2_rbm, which is a
      tuple of a resource group, a bitmap within the resource group
      and an offset within that bitmap. This is designed to make
      manipulating these sets of variables easier. There is also a
      new helper function which converts this representation back
      to a disk block address.
      
      In addition, the rbtree nodes which are used for the reservations
      were not being correctly initialised, which is now fixed. Also,
      the tracing was not passing through the inode where it should
      have been. That is mostly fixed aside from one corner case. This
      needs to be revisited since there can also be a NULL rgrp in
      some cases which results in the device being incorrect in the
      trace.
      
      This is intended to be the first step towards cleaning up some
      of the allocation code, and some further bug fixes.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
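The (resource group, bitmap, offset) tuple and its conversion back to a disk block address can be modelled in a few lines of userspace C. Everything here is a hypothetical simplification: the `demo_*` types and field names are invented, four blocks per bitmap byte is an assumption based on two-bit entries, and `demo_rbm_from_block()` is only a guess at the general shape of the inverse lookup that later commits in this series rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOCKS_PER_BYTE 4u /* assumption: two bits per block */

struct demo_bitmap {
    uint32_t start; /* byte offset of this bitmap within the rgrp's bitmaps */
    uint32_t len;   /* length of this bitmap in bytes */
};

struct demo_rgrp {
    uint64_t data0;           /* first data block of the resource group */
    uint32_t nblocks;         /* number of data blocks */
    struct demo_bitmap *bits; /* array of bitmaps */
    size_t nbits;             /* number of entries in bits[] */
};

/* The tuple: a resource group, one of its bitmaps, and an offset within it. */
struct demo_rbm {
    struct demo_rgrp *rgd;
    struct demo_bitmap *bi;
    uint32_t offset; /* block offset relative to the start of bi */
};

/* Convert the tuple back to a disk block address. */
static uint64_t demo_rbm_to_block(const struct demo_rbm *rbm)
{
    assert(rbm->offset < rbm->bi->len * DEMO_BLOCKS_PER_BYTE);
    return rbm->rgd->data0 +
           (uint64_t)rbm->bi->start * DEMO_BLOCKS_PER_BYTE + rbm->offset;
}

/* Inverse: point rbm (whose rgd is already set) at `block`.
 * Returns 0 on success, nonzero if the block is outside the rgrp. */
static int demo_rbm_from_block(struct demo_rbm *rbm, uint64_t block)
{
    struct demo_rgrp *rgd = rbm->rgd;

    if (block < rgd->data0 || block - rgd->data0 >= rgd->nblocks)
        return -1;

    const uint64_t rel = block - rgd->data0;
    for (size_t i = 0; i < rgd->nbits; i++) {
        struct demo_bitmap *bi = &rgd->bits[i];
        const uint64_t first = (uint64_t)bi->start * DEMO_BLOCKS_PER_BYTE;
        const uint64_t count = (uint64_t)bi->len * DEMO_BLOCKS_PER_BYTE;

        if (rel >= first && rel < first + count) {
            rbm->bi = bi;
            rbm->offset = (uint32_t)(rel - first);
            return 0;
        }
    }
    return -1;
}
```

Keeping the three values in one structure means a position can be handed between search, test and set routines without recomputing which bitmap a block falls in.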
    • GFS2: Remove rs_requested field from reservations · 71f890f7
      Committed by Steven Whitehouse
      The rs_requested field is left over from the original allocation
      code, however this should have been a parameter passed to the
      various functions from gfs2_inplace_reserve() and not a member of the
      reservation structure as the value is not required after the
      initial allocation.
      
      This also helps simplify the code since we no longer need to set
      rs_requested to zero. The gfs2_inplace_release() function can
      also be simplified, since the reservation structure will always
      be defined when it is called, and the only remaining task is to
      unlock the rgrp if required. It can now be called unconditionally
      too, resulting in a further simplification.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  2. 13 Sep, 2012 1 commit
    • GFS2: Take account of blockages when using reserved blocks · 62e252ee
      Committed by Steven Whitehouse
      The claim_reserved_blks() function was not taking account of
      the possibility of "blockages" while performing allocation.
      This can be caused by another node allocating something in
      the same extent which has been reserved locally.
      
      This patch tests for this condition and then skips the remainder
      of the reservation in this case. This is a relatively rare event,
      so that it should not affect the general performance improvement
      which the block reservations provide.
      
      The claim_reserved_blks() function also appears not to be able
      to deal with reservations which cross bitmap boundaries, but
      that can be dealt with in a future patch since we don't generate
      boundary crossing reservations currently.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Reported-by: David Teigland <teigland@redhat.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
  3. 19 Jul, 2012 1 commit
    • GFS2: Reduce file fragmentation · 8e2e0047
      Committed by Bob Peterson
      This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
      resulting improved on disk layout greatly speeds up operations in cases
      which would have resulted in interlaced allocation of blocks previously.
      A typical example of this is 10 parallel dd processes, each writing to a
      file in a common directory.
      
      The implementation uses an rbtree of reservations attached to each
      resource group (and each inode).
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  4. 18 Jul, 2012 1 commit
  5. 14 Jun, 2012 1 commit
  6. 08 Jun, 2012 1 commit
    • GFS2: Use lvbs for storing rgrp information with mount option · 90306c41
      Committed by Benjamin Marzinski
      Instead of reading in the resource groups when gfs2 is checking
      for free space to allocate from, gfs2 can store the necessary information
      in the resource group's lvb.  Also, instead of searching for unlinked
      inodes in every resource group that's checked for free space, gfs2 can
      store the number of unlinked inodes in the lvb, and only check for
      unlinked inodes if it will find some.
      
      The first time a resource group is locked, the lvb must be initialized.
      Since this involves counting the unlinked inodes in the resource group,
      this takes a little extra time.  But after that, if the resource group
      is locked with GL_SKIP, the buffer head won't be read in unless it's
      actually needed.
      
      Enabling the resource group lvbs is done via the rgrplvb mount option.  If
      this option isn't set, the lvbs will still be set and updated, but they won't
      be verified or used by the filesystem.  To safely turn on this option, all of
      the nodes mounting the filesystem must be running code with this patch, and
      the filesystem must have been completely unmounted since they were updated.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  7. 06 Jun, 2012 2 commits
  8. 11 May, 2012 1 commit
    • GFS2: Add rgrp information to block_alloc trace point · 41db1ab9
      Committed by Bob Peterson
      This is a second attempt at a patch that adds rgrp information to the
      block allocation trace point for GFS2. As suggested, the patch was
      modified to list the rgrp information _after_ the fields that exist today.
      
      Again, the reason for this patch is to allow us to trace and debug
      problems with the block reservations patch, which is still in the works.
      We can debug problems with reservations if we can see what block allocations
      result from the block reservations. It may also be handy in figuring out
      if there are problems in rgrp free space accounting. In other words,
      we can use it to track the rgrp and its free space alongside the allocations
      that are taking place.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  9. 27 Apr, 2012 1 commit
  10. 24 Apr, 2012 5 commits
  11. 05 Apr, 2012 1 commit
  12. 26 Mar, 2012 1 commit
  13. 05 Mar, 2012 2 commits
    • GFS2: make sure rgrps are up to date in func gfs2_blk2rgrpd · 58884c4d
      Committed by Bob Peterson
      This patch adds a call to gfs2_rindex_update from function gfs2_blk2rgrpd
      and removes calls to it that are made redundant by it. The problem is
      that a gfs2_grow can add rgrps to the rindex, then put those rgrps into
      use, thus rendering the rindex we read in at mount time incomplete.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Eliminate sd_rindex_mutex · 6aad1c3d
      Committed by Bob Peterson
      Over time, we've slowly eliminated the use of sd_rindex_mutex.
      Up to this point, it was only used in two places: function
      gfs2_ri_total (which totals the file system size by reading
      and parsing the rindex file) and function gfs2_rindex_update
      which updates the rgrps in memory. Both of these functions have
      the rindex glock to protect them, so the mutex is unnecessary.
      Since gfs2_grow writes to the rindex via the meta_fs, the mutex
      is in the wrong order according to the normal rules. This patch
      eliminates the mutex entirely to avoid the problem.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  14. 01 Mar, 2012 1 commit
  15. 29 Feb, 2012 1 commit
    • GFS2: FITRIM ioctl support · 66fc061b
      Committed by Steven Whitehouse
      The FITRIM ioctl provides an alternative way to send discard requests to
      the underlying device. Using the discard mount option results in every
      freed block generating a discard request to the block device. This can
      be slow, since many block devices can only process discard requests of
      larger sizes, and also such operations can be time consuming.
      
      Rather than using the discard mount option, FITRIM allows a sweep of the
      filesystem on an occasional basis, and also to optionally avoid sending
      down discard requests for smaller regions.
      
      In GFS2 FITRIM will work at resource group granularity. There is a flag
      for each resource group which keeps track of which resource groups have
      been trimmed. This flag is reset whenever a deallocation occurs in the
      resource group, and set whenever a successful FITRIM of that resource
      group has taken place. This helps to reduce repeated discard requests
      for the same block ranges, again improving performance.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
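The per-resource-group flag described in this commit behaves like a small state machine, which the following userspace sketch models. The names `demo_rgrp`, `trimmed`, `demo_free_blocks()` and `demo_fitrim()` are hypothetical, and no real discard requests are issued; the sketch only shows when the flag is set and cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified resource group state. */
struct demo_rgrp {
    bool trimmed;         /* set once the rgrp has been trimmed */
    unsigned free_blocks; /* free blocks in the rgrp */
};

/* A deallocation creates new free space, so the flag is reset. */
static void demo_free_blocks(struct demo_rgrp *rgd, unsigned n)
{
    assert(rgd);
    rgd->free_blocks += n;
    rgd->trimmed = false;
}

/* Sweep all resource groups, skipping those already trimmed.
 * Returns the number of resource groups that were processed. */
static size_t demo_fitrim(struct demo_rgrp *rgds, size_t count)
{
    size_t processed = 0;

    assert(rgds != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (rgds[i].trimmed)
            continue; /* nothing freed since the last trim */
        /* Real code would send discard requests for free extents here. */
        rgds[i].trimmed = true;
        processed++;
    }
    return processed;
}
```

A second sweep with no intervening deallocations processes nothing, which is how the flag avoids repeated discards for the same block ranges.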
  16. 28 Feb, 2012 1 commit
    • GFS2: Read resource groups on mount · a365fbf3
      Committed by Steven Whitehouse
      This makes mount take slightly longer, but at the same time, the first
      write to the filesystem will be faster too. It also means that if there
      is a problem in the resource index, then we can refuse to mount rather
      than having to try and report that when the first write occurs.
      
      In addition, to avoid recursive locking, we have to take account of
      instances when the rindex glock may already be held when we are
      trying to update the rbtree of resource groups.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  17. 11 Jan, 2012 1 commit
  18. 22 Nov, 2011 3 commits