1. 24 Sep, 2012 7 commits
    • GFS2: Fix case where reservation finished at end of rgrp · 5d50d532
      Steven Whitehouse authored
      One corner case which the original patch failed to take into
      account was when there is a reservation which ended such that
      the following block was one beyond the end of the rgrp in
      question. This extra test fixes that case.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Reported-by: Bob Peterson <rpeterso@redhat.com>
      Tested-by: Bob Peterson <rpeterso@redhat.com>
    • GFS2: Use RB_CLEAR_NODE() rather than rb_init_node() · 24d634e8
      Michel Lespinasse authored
      gfs2 calls RB_EMPTY_NODE() to check if nodes are not on an rbtree.
      The corresponding initialization function is RB_CLEAR_NODE().
      rb_init_node() was never clearly defined and is going away.
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Update rgblk_free() to use rbm · 3b1d0b9d
      Steven Whitehouse authored
      Replace open coded version with a call to gfs2_rbm_from_block()
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Update gfs2_get_block_type() to use rbm · 3983903a
      Steven Whitehouse authored
      Use the new gfs2_rbm_from_block() function to replace an open
      coded version of the same code.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Replace rgblk_search with gfs2_rbm_find · 5b924ae2
      Steven Whitehouse authored
      This is part of a series of patches which are introducing the
      gfs2_rbm structure throughout the block allocation code. The
      main aim of this part is to create a search function which can
      deal directly with struct gfs2_rbm. In this case it specifies
      the initial position at which to start the search and also the
      point at which the search terminates.
      
      The net result of this is to clean up the search code and make
      it rather more readable, and the various possible exceptions which
      may occur during the search are partitioned into their own functions.
      
      There are some bug fixes too. We should not be checking the reservations
      while allocating extents - the time for that is when we are searching
      for where to put the extent, not when we've already made that decision.
      
      Also, rgblk_search had two uses, and in only one of those cases did
      it make sense to check for reservations. This is fixed in the new
      gfs2_rbm_find function, which has a cleaner interface.
      
      The reservation checking has been improved by always checking for
      contiguous reservations, and returning the first free block after
      all contiguous reservations. This is done under the spin lock to
      ensure consistency of the tree.
      
      The allocation of extents is now in all cases done by the existing
      allocation code, and if there is an active reservation, that is updated
      after the fact. Again this is done under the spin lock, since it entails
      changing the lookup key for the reservation in question.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
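The reservation check described in 5b924ae2 (skip every contiguous reservation and return the first free block after them) can be sketched roughly as follows. This is an illustration only, not the kernel code: `struct reservation` and `next_unreserved()` are hypothetical names, a sorted array of non-overlapping reservations stands in for the rbtree, and the spin lock is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified reservation: blocks [start, start + len). */
struct reservation {
	uint64_t start;
	uint32_t len;
};

/*
 * Return the first block at or after blk that is not covered by any
 * reservation. rs must be sorted by start and must not overlap.
 * When blk falls inside a reservation it is moved to the end of that
 * reservation, and the loop keeps going so that a directly adjacent
 * reservation is skipped as well.
 */
uint64_t next_unreserved(const struct reservation *rs, size_t n, uint64_t blk)
{
	assert(rs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (blk < rs[i].start)
			break; /* blk lies in a gap before this reservation */
		if (blk < rs[i].start + rs[i].len)
			blk = rs[i].start + rs[i].len;
	}
	return blk;
}
```

A caller would still have to check that the returned block lies inside the resource group, which is the corner case that 5d50d532 addresses.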
    • GFS2: Add structure to contain rgrp, bitmap, offset tuple · 4a993fb1
      Steven Whitehouse authored
      This patch introduces a new structure, gfs2_rbm, which is a
      tuple of a resource group, a bitmap within the resource group
      and an offset within that bitmap. This is designed to make
      manipulating these sets of variables easier. There is also a
      new helper function which converts this representation back
      to a disk block address.
      
      In addition, the rbtree nodes which are used for the reservations
      were not being correctly initialised, which is now fixed. Also,
      the tracing was not passing through the inode where it should
      have been. That is mostly fixed aside from one corner case. This
      needs to be revisited since there can also be a NULL rgrp in
      some cases which results in the device being incorrect in the
      trace.
      
      This is intended to be the first step towards cleaning up some
      of the allocation code, and some further bug fixes.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
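A minimal sketch of the (rgrp, bitmap, offset) tuple from 4a993fb1 and its conversion to a disk block address is shown below. The types are simplified stand-ins rather than the kernel structures, and the field layout is an assumption made for illustration. `rbm_from_block()` is a hypothetical inverse, included only to show the round trip that the later gfs2_rbm_from_block() patches rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBBY 4 /* blocks per bitmap byte, assuming 2 bits of state per block */

struct bitmap {
	uint32_t bi_start; /* byte offset of this bitmap within the rgrp's bitmap data */
	uint32_t bi_len;   /* length of this bitmap in bytes */
};

struct rgrp {
	uint64_t rd_data0;      /* first data block of the resource group */
	uint32_t rd_data;       /* number of data blocks */
	uint32_t rd_length;     /* number of bitmaps */
	struct bitmap *rd_bits; /* array of rd_length bitmaps */
};

struct rbm {
	struct rgrp *rgd;
	struct bitmap *bi;
	uint32_t offset; /* block offset within bi */
};

/* Convert the tuple back to a disk block address. */
uint64_t rbm_to_block(const struct rbm *rbm)
{
	assert(rbm->offset < rbm->bi->bi_len * NBBY);
	return rbm->rgd->rd_data0 + (uint64_t)rbm->bi->bi_start * NBBY + rbm->offset;
}

/* Fill in rbm for block; returns 0 on success, -1 if block is outside rgd. */
int rbm_from_block(struct rbm *rbm, struct rgrp *rgd, uint64_t block)
{
	if (block < rgd->rd_data0 || block >= rgd->rd_data0 + rgd->rd_data)
		return -1;
	uint64_t rel = block - rgd->rd_data0;
	for (uint32_t i = 0; i < rgd->rd_length; i++) {
		struct bitmap *bi = &rgd->rd_bits[i];
		uint64_t first = (uint64_t)bi->bi_start * NBBY;
		uint64_t count = (uint64_t)bi->bi_len * NBBY;
		if (rel >= first && rel < first + count) {
			rbm->rgd = rgd;
			rbm->bi = bi;
			rbm->offset = (uint32_t)(rel - first);
			return 0;
		}
	}
	return -1;
}
```

Keeping the three values in one structure means a search position can be handed between helpers as a single argument instead of three loosely related variables.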
    • GFS2: Remove rs_requested field from reservations · 71f890f7
      Steven Whitehouse authored
      The rs_requested field is left over from the original allocation
      code, however this should have been a parameter passed to the
      various functions from gfs2_inplace_reserve() and not a member of the
      reservation structure as the value is not required after the
      initial allocation.
      
      This also helps simplify the code since we no longer need to set
      rs_requested to zero. The gfs2_inplace_release() function can be
      simplified as well, since the reservation structure will always be
      defined when it is called, and the only remaining task is to unlock
      the rgrp if required. It can now be called unconditionally, resulting
      in a further simplification.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  2. 13 Sep, 2012 1 commit
    • GFS2: Take account of blockages when using reserved blocks · 62e252ee
      Steven Whitehouse authored
      The claim_reserved_blks() function was not taking account of
      the possibility of "blockages" while performing allocation.
      This can be caused by another node allocating something in
      the same extent which has been reserved locally.
      
      This patch tests for this condition and then skips the remainder
      of the reservation in this case. This is a relatively rare event,
      so it should not affect the general performance improvement
      which the block reservations provide.
      
      The claim_reserved_blks() function also appears not to be able
      to deal with reservations which cross bitmap boundaries, but
      that can be dealt with in a future patch since we don't generate
      boundary crossing reservations currently.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Reported-by: David Teigland <teigland@redhat.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
  3. 19 Jul, 2012 1 commit
    • GFS2: Reduce file fragmentation · 8e2e0047
      Bob Peterson authored
      This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
      resulting improved on disk layout greatly speeds up operations in cases
      which would have resulted in interlaced allocation of blocks previously.
      A typical example of this is 10 parallel dd processes, each writing to a
      file in a common directory.
      
      The implementation uses an rbtree of reservations attached to each
      resource group (and each inode).
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  4. 18 Jul, 2012 1 commit
  5. 14 Jun, 2012 1 commit
  6. 08 Jun, 2012 1 commit
    • GFS2: Use lvbs for storing rgrp information with mount option · 90306c41
      Benjamin Marzinski authored
      Instead of reading in the resource groups when gfs2 is checking
      for free space to allocate from, gfs2 can store the necessary information
      in the resource group's lvb.  Also, instead of searching for unlinked
      inodes in every resource group that's checked for free space, gfs2 can
      store the number of unlinked inodes in the lvb, and only check for
      unlinked inodes if it will find some.
      
      The first time a resource group is locked, the lvb must be initialized.
      Since this involves counting the unlinked inodes in the resource group,
      this takes a little extra time.  But after that, if the resource group
      is locked with GL_SKIP, the buffer head won't be read in unless it's
      actually needed.
      
      Enabling the resource group lvbs is done via the rgrplvb mount option.  If
      this option isn't set, the lvbs will still be set and updated, but they won't
      be verified or used by the filesystem.  To safely turn on this option, all of
      the nodes mounting the filesystem must be running code with this patch, and
      the filesystem must have been completely unmounted since they were updated.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  7. 06 Jun, 2012 2 commits
  8. 11 May, 2012 1 commit
    • GFS2: Add rgrp information to block_alloc trace point · 41db1ab9
      Bob Peterson authored
      This is a second attempt at a patch that adds rgrp information to the
      block allocation trace point for GFS2. As suggested, the patch was
      modified to list the rgrp information _after_ the fields that exist today.
      
      Again, the reason for this patch is to allow us to trace and debug
      problems with the block reservations patch, which is still in the works.
      We can debug problems with reservations if we can see what block allocations
      result from the block reservations. It may also be handy in figuring out
      if there are problems in rgrp free space accounting. In other words,
      we can use it to track the rgrp and its free space alongside the allocations
      that are taking place.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  9. 27 Apr, 2012 1 commit
  10. 24 Apr, 2012 5 commits
  11. 05 Apr, 2012 1 commit
  12. 26 Mar, 2012 1 commit
  13. 05 Mar, 2012 2 commits
    • GFS2: make sure rgrps are up to date in func gfs2_blk2rgrpd · 58884c4d
      Bob Peterson authored
      This patch adds a call to gfs2_rindex_update from function gfs2_blk2rgrpd
      and removes calls to it that are made redundant by it. The problem is
      that a gfs2_grow can add rgrps to the rindex, then put those rgrps into
      use, thus rendering the rindex we read in at mount time incomplete.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Eliminate sd_rindex_mutex · 6aad1c3d
      Bob Peterson authored
      Over time, we've slowly eliminated the use of sd_rindex_mutex.
      Up to this point, it was only used in two places: function
      gfs2_ri_total (which totals the file system size by reading
      and parsing the rindex file) and function gfs2_rindex_update
      which updates the rgrps in memory. Both of these functions have
      the rindex glock to protect them, so the mutex is unnecessary.
      Since gfs2_grow writes to the rindex via the meta_fs, the mutex
      is in the wrong order according to the normal rules. This patch
      eliminates the mutex entirely to avoid the problem.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  14. 01 Mar, 2012 1 commit
  15. 29 Feb, 2012 1 commit
    • GFS2: FITRIM ioctl support · 66fc061b
      Steven Whitehouse authored
      The FITRIM ioctl provides an alternative way to send discard requests to
      the underlying device. Using the discard mount option results in every
      freed block generating a discard request to the block device. This can
      be slow, since many block devices can only process discard requests of
      larger sizes, and also such operations can be time consuming.
      
      Rather than using the discard mount option, FITRIM allows an occasional
      sweep of the filesystem, and can optionally avoid sending down discard
      requests for smaller regions.
      
      In GFS2 FITRIM will work at resource group granularity. There is a flag
      for each resource group which keeps track of which resource groups have
      been trimmed. This flag is reset whenever a deallocation occurs in the
      resource group, and set whenever a successful FITRIM of that resource
      group has taken place. This helps to reduce repeated discard requests
      for the same block ranges, again improving performance.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
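The per-resource-group "trimmed" flag described in 66fc061b can be modelled as below. This is a sketch under assumptions, not the GFS2 implementation: `struct rgrp_state`, `rgrp_free_blocks()` and `fitrim_sweep()` are hypothetical names, and the discard itself is assumed to succeed, so the sweep only counts the resource groups it would trim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-rgrp state for the sketch. */
struct rgrp_state {
	bool trimmed;         /* set after a successful trim, reset on deallocation */
	unsigned free_blocks;
};

/* Freeing blocks makes new discardable space, so the flag is reset. */
void rgrp_free_blocks(struct rgrp_state *rgd, unsigned n)
{
	rgd->free_blocks += n;
	rgd->trimmed = false;
}

/*
 * Sweep all resource groups, skipping those already trimmed.
 * Returns how many resource groups were trimmed in this sweep.
 */
size_t fitrim_sweep(struct rgrp_state *rgds, size_t n)
{
	size_t trimmed_now = 0;

	assert(rgds != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (rgds[i].trimmed)
			continue; /* nothing freed since the last trim */
		/* Real code would issue discards here and set the flag only on success. */
		rgds[i].trimmed = true;
		trimmed_now++;
	}
	return trimmed_now;
}
```

Running the sweep twice in a row does no work the second time, which is the reduction in repeated discard requests that the commit message mentions.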
  16. 28 Feb, 2012 1 commit
    • GFS2: Read resource groups on mount · a365fbf3
      Steven Whitehouse authored
      This makes mount take slightly longer, but at the same time, the first
      write to the filesystem will be faster too. It also means that if there
      is a problem in the resource index, then we can refuse to mount rather
      than having to try and report that when the first write occurs.
      
      In addition, to avoid recursive locking, we have to take account of
      instances when the rindex glock may already be held when we are
      trying to update the rbtree of resource groups.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  17. 11 Jan, 2012 1 commit
  18. 22 Nov, 2011 3 commits
  19. 21 Nov, 2011 2 commits
    • GFS2: Fix up "off by one" in the previous patch · 465f0a76
      Steven Whitehouse authored
      The trace point should take extlen and not *ndata as the
      extent length.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: move toward a generic multi-block allocator · 6e87ed0f
      Bob Peterson authored
      This patch is a revision of the one I previously posted.
      I tried to integrate all the suggestions Steve gave.
      The purpose of the patch is to change function gfs2_alloc_block
      (allocate either a dinode block or an extent of data blocks)
      to a more generic gfs2_alloc_blocks function that can
      allocate both a dinode _and_ an extent of data blocks in the
      same call. This will ultimately help us create a multi-block
      reservation scheme to reduce file fragmentation.
      
      This patch moves more toward a generic multi-block allocator that
      takes a pointer to the number of data blocks to allocate, plus whether
      or not to allocate a dinode. In theory, it could be called to allocate
      (1) a single dinode block, (2) a group of one or more data blocks, or
      (3) a dinode plus several data blocks.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  20. 18 Nov, 2011 1 commit
  21. 15 Nov, 2011 2 commits
  22. 21 Oct, 2011 3 commits
    • GFS2: Remove two unused variables · 9ae32429
      Steven Whitehouse authored
      The two variables being initialised in gfs2_inplace_reserve
      to track the file & line number of the caller are never
      used, so we might as well remove them.
      
      If something does go wrong, then a stack trace is probably
      more useful anyway.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fix off-by-one in gfs2_blk2rgrpd · f75bbfb4
      Steven Whitehouse authored
      Bob reported:
      
      I found an off-by-one problem with how I coded this section:
      It should be:
      
      + else if (blk >= cur->rd_data0 + cur->rd_data)
      
      In fact, cur->rd_data0 + cur->rd_data is the start of the next
      rgrp (the next ri_addr), so without the "=" check it can land on
      the wrong rgrp.
      
      In all normal cases, this won't be a problem: you're searching
      for a block _within_ the rgrp, which will pass the test properly.
      Where it gets into trouble is if you search the rgrps for the
      block exactly equal to ri_addr.  I don't think anything in the
      kernel does this, but I found a place in gfs2-utils gfs2_edit
      where it does.  So I definitely need to fix it in libgfs2.  I'd
      like to suggest we fix it in the kernel as well for the sake of
      keeping the functions similar.
      
      So this patch fixes the above mentioned off by one error as well
      as removing the unused parent pointer.
      Reported-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
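The off-by-one fixed in f75bbfb4 is easiest to see in a small lookup routine. The sketch below uses a binary search over a sorted array instead of the kernel's rbtree walk; `struct rgrp_desc` and `blk2rgrp()` are hypothetical names, and only the comparison marked in the comment is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified resource group descriptor for the sketch. */
struct rgrp_desc {
	uint64_t rd_addr;  /* first block of the rgrp (its header) */
	uint64_t rd_data0; /* first data block */
	uint32_t rd_data;  /* number of data blocks */
};

/*
 * Find the rgrp containing blk in an array sorted by rd_addr.
 * rd_data0 + rd_data is the first block of the NEXT rgrp, so the upper
 * bound test must use ">=". With ">" a block equal to the next rgrp's
 * rd_addr would be attributed to the wrong rgrp.
 */
const struct rgrp_desc *blk2rgrp(const struct rgrp_desc *rgds, size_t n,
				 uint64_t blk)
{
	size_t lo = 0, hi = n;

	assert(rgds != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct rgrp_desc *cur = &rgds[mid];

		if (blk < cur->rd_addr)
			hi = mid;
		else if (blk >= cur->rd_data0 + cur->rd_data) /* the fixed test */
			lo = mid + 1;
		else
			return cur;
	}
	return NULL;
}
```

With two adjacent resource groups, looking up the second group's `rd_addr` returns the second group, which is the case gfs2_edit was found to exercise.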
    • GFS2: Correctly set goal block after allocation · ccad4e14
      Steven Whitehouse authored
      The new goal block should be set to the end of the newly
      allocated extent, not the start of it.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>