1. 15 12月, 2015 1 次提交
    • B
      GFS2: Make rgrp reservations part of the gfs2_inode structure · a097dc7e
      Bob Peterson 提交于
      Before this patch, multi-block reservation structures were allocated
      from a special slab. This patch folds the structure into the gfs2_inode
      structure. The disadvantage is that the gfs2_inode needs more memory,
      even when a file is opened read-only. The advantages are: (a) we don't
      need the special slab and the extra time it takes to allocate and
      deallocate from it. (b) we no longer need to worry that the structure
      exists for things like quota management. (c) This also allows us to
      remove the calls to get_write_access and put_write_access since we
      know the structure will exist.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      a097dc7e
  2. 24 11月, 2015 1 次提交
    • B
      GFS2: Extract quota data from reservations structure (revert 5407e242) · b54e9a0b
      Bob Peterson 提交于
      This patch basically reverts the majority of patch 5407e242.
      That patch eliminated the gfs2_qadata structure in favor of just
      using the reservations structure. The problem with doing that is that
      it increases the size of the reservations structure. That is not an
      issue until it comes time to fold the reservations structure into the
      inode in memory so we know it's always there. By separating out the
      quota structure again, we aren't punishing the non-quota users by
      making all the inodes bigger, requiring more slab space. This patch
      creates a new slab area to allocate the quota stuff so it's managed
      a little more sanely.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      b54e9a0b
  3. 17 11月, 2015 1 次提交
  4. 09 11月, 2015 1 次提交
  5. 30 10月, 2015 1 次提交
  6. 04 9月, 2015 2 次提交
  7. 19 6月, 2015 1 次提交
  8. 19 5月, 2015 1 次提交
  9. 06 5月, 2015 1 次提交
    • A
      gfs2: handle NULL rgd in set_rgrp_preferences · 959b6717
      Abhi Das 提交于
      The function set_rgrp_preferences() does not handle the (rarely
      returned) NULL value from gfs2_rgrpd_get_next() and this patch
      fixes that.
      
      The fs image in question is only 150MB in size which allows for
      only 1 rgrp to be created. The in-memory rb tree has only 1 node
      and when gfs2_rgrpd_get_next() is called on this sole rgrp, it
      returns NULL. (Default behavior is to wrap around the rb tree and
      return the first node to give the illusion of a circular linked
      list. In the case of only 1 rgrp, we can't have
      gfs2_rgrpd_get_next() return the same rgrp (first, last, next all
      point to the same rgrp)... that would cause unintended consequences
      and infinite loops.)
      Signed-off-by: NAbhi Das <adas@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      959b6717
  10. 24 4月, 2015 2 次提交
  11. 19 3月, 2015 1 次提交
    • A
      gfs2: allow quota_check and inplace_reserve to return available blocks · 25435e5e
      Abhi Das 提交于
      struct gfs2_alloc_parms is passed to gfs2_quota_check() and
      gfs2_inplace_reserve() with ap->target containing the number of
      blocks being requested for allocation in the current operation.
      
      We add a new field to struct gfs2_alloc_parms called 'allowed'.
      gfs2_quota_check() and gfs2_inplace_reserve() return the max
      blocks allowed by quota and the max blocks allowed by the chosen
      rgrp respectively in 'allowed'.
      
      A new field 'min_target', when non-zero, tells gfs2_quota_check()
      and gfs2_inplace_reserve() to not return -EDQUOT/-ENOSPC when
      there are atleast 'min_target' blocks allowable/available. The
      assumption is that the caller is ok with just 'min_target' blocks
      and will likely proceed with allocating them.
      Signed-off-by: NAbhi Das <adas@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
      25435e5e
  12. 04 11月, 2014 2 次提交
  13. 03 10月, 2014 1 次提交
  14. 19 9月, 2014 1 次提交
    • A
      GFS2: fix bad inode i_goal values during block allocation · 00a158be
      Abhi Das 提交于
      This patch checks if i_goal is either zero or if doesn't exist
      within any rgrp (i.e gfs2_blk2rgrpd() returns NULL). If so, it
      assigns the ip->i_no_addr block as the i_goal.
      
      There are two scenarios where a bad i_goal can result in a
      -EBADSLT error.
      
      1. Attempting to allocate to an existing inode:
      Control reaches gfs2_inplace_reserve() and ip->i_goal is bad.
      We need to fix i_goal here.
      
      2. A new inode is created in a directory whose i_goal is hosed:
      In this case, the parent dir's i_goal is copied onto the new
      inode. Since the new inode is not yet created, the ip->i_no_addr
      field is invalid and so, the fix in gfs2_inplace_reserve() as per
      1) won't work in this scenario. We need to catch and fix it sooner
      in the parent dir itself (gfs2_create_inode()), before it is
      copied to the new inode.
      Signed-off-by: NAbhi Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      00a158be
  15. 18 7月, 2014 1 次提交
  16. 14 5月, 2014 1 次提交
    • B
      GFS2: remove transaction glock · 24972557
      Benjamin Marzinski 提交于
      GFS2 has a transaction glock, which must be grabbed for every
      transaction, whose purpose is to deal with freezing the filesystem.
      Aside from this involving a large amount of locking, it is very easy to
      make the current fsfreeze code hang on unfreezing.
      
      This patch rewrites how gfs2 handles freezing the filesystem. The
      transaction glock is removed. In it's place is a freeze glock, which is
      cached (but not held) in a shared state by every node in the cluster
      when the filesystem is mounted. This lock only needs to be grabbed on
      freezing, and actions which need to be safe from freezing, like
      recovery.
      
      When a node wants to freeze the filesystem, it grabs this glock
      exclusively.  When the freeze glock state changes on the nodes (either
      from shared to unlocked, or shared to exclusive), the filesystem does a
      special log flush.  gfs2_log_flush() does all the work for flushing out
      the and shutting down the incore log, and then it tries to grab the
      freeze glock in a shared state again.  Since the filesystem is stuck in
      gfs2_log_flush, no new transaction can start, and nothing can be written
      to disk. Unfreezing the filesytem simply involes dropping the freeze
      glock, allowing gfs2_log_flush() to grab and then release the shared
      lock, so it is cached for next time.
      
      However, in order for the unfreezing ioctl to occur, gfs2 needs to get a
      shared lock on the filesystem root directory inode to check permissions.
      If that glock has already been grabbed exclusively, fsfreeze will be
      unable to get the shared lock and unfreeze the filesystem.
      
      In order to allow the unfreeze, this patch makes gfs2 grab a shared lock
      on the filesystem root directory during the freeze, and hold it until it
      unfreezes the filesystem.  The functions which need to grab a shared
      lock in order to allow the unfreeze ioctl to be issued now use the lock
      grabbed by the freeze code instead.
      
      The freeze and unfreeze code take care to make sure that this shared
      lock will not be dropped while another process is using it.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      24972557
  17. 07 3月, 2014 2 次提交
  18. 10 2月, 2014 1 次提交
  19. 04 2月, 2014 1 次提交
    • S
      GFS2: Allocate block for xattr at inode alloc time, if required · b2c8b3ea
      Steven Whitehouse 提交于
      This is another step towards improving the allocation of xattr
      blocks at inode allocation time. Here we take advantage of
      Christoph's recent work on ACLs to allocate a block for the
      xattrs early if we know that we will be adding ACLs to the
      inode later on. The advantage of that is that it is much
      more likely that we'll get a contiguous run of two blocks
      where the first is the inode and the second is the xattr block.
      
      We still have to fall back to the original system in case we
      don't get the requested two contiguous blocks, or in case the
      ACLs are too large to fit into the block.
      
      Future patches will move more of the ACL setting code further
      up the gfs2_inode_create() function. Also, I'd like to be
      able to do the same thing with the xattrs from LSMs in
      due course, too. That way we should be able to slowly reduce
      the number of independent transactions, at least in the
      most common cases.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b2c8b3ea
  20. 16 1月, 2014 2 次提交
  21. 03 1月, 2014 5 次提交
    • S
      GFS2: Use range based functions for rgrp sync/invalidation · 7005c3e4
      Steven Whitehouse 提交于
      Each rgrp header is represented as a single extent on disk, so we
      can calculate the position within the address space, since we are
      using address spaces mapped 1:1 to the disk. This means that it
      is possible to use the range based versions of filemap_fdatawrite/wait
      and for invalidating the page cache.
      
      Our eventual intent is to then be able to merge the address spaces
      used for rgrps into a single address space, rather than to have
      one for each glock, saving memory and reducing complexity.
      
      Since during umount, the rgrp structures are disposed of before
      the glocks, we need to store the extent information in the glock
      so that is is available for a final invalidation. This patch uses
      a field which is otherwise unused in rgrp glocks to do that, so
      that we do not have to expand the size of a glock.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7005c3e4
    • S
      GFS2: Remove test which is always true · 7de41d36
      Steven Whitehouse 提交于
      Since gfs2_inplace_reserve() is always called with a valid
      alloc parms structure, there is no need to test for this
      within the function itself - and in any case, after we've
      all ready dereferenced it anyway.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7de41d36
    • B
      GFS2: Implement a "rgrp has no extents longer than X" scheme · 5ea5050c
      Bob Peterson 提交于
      With the preceding patch, we started accepting block reservations
      smaller than the ideal size, which requires a lot more parsing of the
      bitmaps. To reduce the amount of bitmap searching, this patch
      implements a scheme whereby each rgrp keeps track of the point
      at this multi-block reservations will fail.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5ea5050c
    • B
      GFS2: Drop inadequate rgrps from the reservation tree · 1330edbe
      Bob Peterson 提交于
      This is just basically a resend of a patch I posted earlier.
      It didn't change from its original, except in diff offsets, etc:
      
      This patch fixes a bug in the GFS2 block allocation code. The problem
      starts if a process already has a multi-block reservation, but for
      some reason, another process disqualifies it from further allocations.
      For example, the other process might set on the GFS2_RDF_ERROR bit.
      The process holding the reservation jumps to label skip_rgrp, but
      that label comes after the code that removes the reservation from the
      tree. Therefore, the no longer usable reservation is not removed from
      the rgrp's reservations tree; it's lost. Eventually, the lost reservation
      causes the count of reserved blocks to get off, and eventually that
      causes a BUG_ON(rs->rs_rbm.rgd->rd_reserved < rs->rs_free) to trigger.
      This patch moves the call to after label skip_rgrp so that the
      disqualified reservation is properly removed from the tree, thus keeping
      the rgrp rd_reserved count sane.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1330edbe
    • B
      GFS2: If requested is too large, use the largest extent in the rgrp · 5ce13431
      Bob Peterson 提交于
      Here is a second try at a patch I posted earlier, which also implements
      suggestions Steve made:
      
      Before this patch, GFS2 would keep searching through all the rgrps
      until it found one that had a chunk of free blocks big enough to
      satisfy the size hint, which is based on the file write size,
      regardless of whether the chunk was big enough to perform the write.
      However, when doing big writes there may not be a large enough
      chunk of free blocks in any rgrp, due to file system fragmentation.
      The largest chunk may be big enough to satisfy the write request,
      but it may not meet the ideal reservation size from the "size hint".
      The writes would slow to a crawl because every write would search
      every rgrp, then finally give up and default to a single-block write.
      In my case, performance would drop from 425MB/s to 18KB/s, or 24000
      times slower.
      
      This patch basically makes it so that if we can't find a contiguous
      chunk of blocks big enough to satisfy the sizehint, we'll use the
      largest chunk of blocks we found that will still contain the write.
      It does so by keeping track of the largest run of blocks within the
      rgrp.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5ce13431
  22. 16 11月, 2013 1 次提交
  23. 02 10月, 2013 2 次提交
    • S
      GFS2: Speed up starting point selection for block allocation · 9e07f2cb
      Steven Whitehouse 提交于
      When setting the starting point for block allocation, there were calls
      to both gfs2_rbm_to_block() and gfs2_rbm_from_block() in the common case
      of there being an active reservation. The gfs2_rbm_from_block() function
      can be quite slow, and since the two conversions were effectively a
      no-op, it makes sense to avoid them entirely in this case.
      
      There is no functional change here, but the code should be a bit more
      efficient after this patch.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9e07f2cb
    • S
      GFS2: Add allocation parameters structure · 7b9cff46
      Steven Whitehouse 提交于
      This patch adds a structure to contain allocation parameters with
      the intention of future expansion of this structure. The idea is
      that we should be able to add more information about the allocation
      in the future in order to allow the allocator to make a better job
      of placing the requests on-disk.
      
      There is no functional difference from applying this patch.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7b9cff46
  24. 27 9月, 2013 1 次提交
    • S
      GFS2: Clean up reservation removal · af5c2697
      Steven Whitehouse 提交于
      The reservation for an inode should be cleared when it is truncated so
      that we can start again at a different offset for future allocations.
      We could try and do better than that, by resetting the search based on
      where the truncation started from, but this is only a first step.
      
      In addition, there are three callers of gfs2_rs_delete() but only one
      of those should really be testing the value of i_writecount. While
      we get away with that in the other cases currently, I think it would
      be better if we made that test specific to the one case which
      requires it.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      af5c2697
  25. 18 9月, 2013 2 次提交
    • B
      GFS2: new function gfs2_rbm_incr · 149ed7f5
      Bob Peterson 提交于
      Since the previous patch eliminated bi in favor of bii, this follow-on
      patch needed to be adjusted accordingly. Here is the revised version.
      
      This patch adds a new function, gfs2_rbm_incr, which increments
      an rbm structure. This is more efficient than calling gfs2_rbm_to_block,
      incrementing, then calling gfs2_rbm_from_block.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      149ed7f5
    • B
      GFS2: Introduce rbm field bii · e579ed4f
      Bob Peterson 提交于
      This is a respin of the original patch. As Steve pointed out, the
      introduction of field bii makes it easy to eliminate bi itself.
      This revised patch does just that, replacing bi with bii.
      
      This patch adds a new field to the rbm structure, called bii,
      which is an index into the array of bitmaps for an rgrp.
      This replaces *bi which was a pointer to the bitmap.
      This is being done for further optimizations.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e579ed4f
  26. 17 9月, 2013 3 次提交
  27. 20 6月, 2013 1 次提交
    • A
      GFS2: Fix fstrim boundary conditions · 6a98c333
      Abhijith Das 提交于
      This patch correctly distinguishes two boundary conditions:
      
      1. When the given range is entire within the unaccounted space between
         two rgrps, and
      2. The range begins beyond the end of the filesystem
      
      Also fix the unit of the returned value r.len (total trimming) to be in bytes 
      instead of the (incorrect) 512 byte blocks
      
      With this patch, GFS2 passes multiple iterations of all the relevant xfstests
      (251, 260, 288) with different fs block sizes.
      Signed-off-by: NAbhi Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6a98c333