1. 29 Jun 2013, 1 commit
  2. 03 Jun 2013, 1 commit
    • GFS2: Fall back to vmalloc if kmalloc fails for dir hash tables · e8830d88
      Authored by Bob Peterson
      This version has one more correction: the vmalloc calls are replaced
      by __vmalloc calls to preserve the GFP_NOFS flag.
      
      When GFS2's directory management code allocates buffers for a
      directory hash table and cannot get the memory it needs, it
      currently returns an error. Rather than failing, this patch lets
      it fall back to vmalloc'd (virtually contiguous) memory for the
      hash table when a physically contiguous kmalloc allocation cannot
      be satisfied. This should allow directories to keep functioning
      properly even when kernel memory becomes badly fragmented.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      e8830d88
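
      A minimal sketch of the fallback pattern described above, in kernel-style C
      (illustrative only, not the actual GFS2 change; the helper names are made up,
      and the three-argument __vmalloc() form is the one used by kernels of that era):

      #include <linux/slab.h>
      #include <linux/vmalloc.h>
      #include <linux/mm.h>

      /* Try a physically contiguous allocation first; if that fails, fall back
       * to virtually contiguous memory.  __vmalloc() is used rather than
       * vmalloc() so that GFP_NOFS is preserved and the allocation cannot
       * recurse back into the filesystem. */
      static void *hashtbl_alloc(size_t size)
      {
              void *buf = kmalloc(size, GFP_NOFS);

              if (buf == NULL)
                      buf = __vmalloc(size, GFP_NOFS, PAGE_KERNEL);
              return buf;
      }

      /* Free with the matching routine for whichever allocator succeeded. */
      static void hashtbl_free(void *buf)
      {
              if (is_vmalloc_addr(buf))
                      vfree(buf);
              else
                      kfree(buf);
      }
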
  3. 13 Feb 2013, 1 commit
  4. 29 Jan 2013, 1 commit
    • GFS2: Split gfs2_trans_add_bh() into two · 350a9b0a
      Authored by Steven Whitehouse
      Once the functions it calls are taken into account, gfs2_trans_add_bh()
      has little content in common between the data and meta classes. The
      intent here is to split it into two separate functions. Stage one is
      to introduce gfs2_trans_add_data() and gfs2_trans_add_meta() and
      update the callers accordingly.
      
      Later patches will then pull in the content of gfs2_trans_add_bh()
      and its dependent functions in order to clean up the code in this
      area.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      350a9b0a
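
      A rough sketch of the stage-one shape this describes (the 'meta' flag and the
      exact argument list are assumptions for illustration, not the verbatim GFS2
      interface): two per-class entry points are introduced as thin wrappers, callers
      are converted, and later patches can move the class-specific logic into each
      wrapper.

      #include <linux/buffer_head.h>

      struct gfs2_glock;

      /* Existing combined entry point (sketch): the flag selects whether the
       * buffer holds data or metadata. */
      void gfs2_trans_add_bh(struct gfs2_glock *gl, struct buffer_head *bh, int meta);

      /* Stage one: give callers explicit per-class functions. */
      static inline void gfs2_trans_add_data(struct gfs2_glock *gl,
                                             struct buffer_head *bh)
      {
              gfs2_trans_add_bh(gl, bh, 0);   /* data buffer */
      }

      static inline void gfs2_trans_add_meta(struct gfs2_glock *gl,
                                             struct buffer_head *bh)
      {
              gfs2_trans_add_bh(gl, bh, 1);   /* metadata buffer */
      }
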
  5. 13 Nov 2012, 1 commit
  6. 06 Jun 2012, 1 commit
  7. 11 May 2012, 1 commit
    • vfs: make it possible to access the dentry hash/len as one 64-bit entry · 26fe5750
      Authored by Linus Torvalds
      This allows comparing hash and len in one operation on 64-bit
      architectures.  Right now only __d_lookup_rcu() takes advantage of this,
      since that is the case we care most about.
      
      The use of anonymous struct/unions hides the alternate 64-bit approach
      from most users, the exception being a few cases where we initialize a
      'struct qstr' with a static initializer.  Those problematic
      cases now use a new QSTR_INIT() helper macro (but initializing
      just the name pointer with a "{ .name = xyzzy }" initializer remains
      valid, as does just copying another qstr structure).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      26fe5750
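
      A small stand-alone C illustration of the anonymous struct/union trick and a
      QSTR_INIT()-style initializer (a simplified model, not the kernel's actual
      struct qstr or macro):

      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      /* hash and len overlay a single 64-bit hash_len word, so a lookup can
       * compare both with one comparison on 64-bit architectures. */
      struct qstr {
              union {
                      struct {
                              uint32_t hash;
                              uint32_t len;
                      };
                      uint64_t hash_len;
              };
              const char *name;
      };

      /* Designated initializers can still name len/name directly because
       * members of anonymous members count as members of the containing struct. */
      #define QSTR_INIT(n, l) { .len = (l), .name = (n) }

      int main(void)
      {
              struct qstr a = QSTR_INIT("example", 7);
              struct qstr b = QSTR_INIT("example", 7);

              /* One 64-bit comparison covers both hash and length. */
              if (a.hash_len == b.hash_len && memcmp(a.name, b.name, a.len) == 0)
                      printf("match: %.*s\n", (int)a.len, a.name);
              return 0;
      }
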
  8. 05 Apr 2012, 1 commit
  9. 22 Nov 2011, 1 commit
  10. 21 Nov 2011, 1 commit
    • GFS2: move toward a generic multi-block allocator · 6e87ed0f
      Authored by Bob Peterson
      This patch is a revision of the one I previously posted.
      I tried to integrate all the suggestions Steve gave.
      The purpose of the patch is to change function gfs2_alloc_block
      (allocate either a dinode block or an extent of data blocks)
      to a more generic gfs2_alloc_blocks function that can
      allocate both a dinode _and_ an extent of data blocks in the
      same call. This will ultimately help us create a multi-block
      reservation scheme to reduce file fragmentation.
      
      This patch moves more toward a generic multi-block allocator that
      takes a pointer to the number of data blocks to allocate, plus whether
      or not to allocate a dinode. In theory, it could be called to allocate
      (1) a single dinode block, (2) a group of one or more data blocks, or
      (3) a dinode plus several data blocks.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      6e87ed0f
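
      The kind of interface this is moving toward, sketched as a hedged prototype
      (names and parameters are illustrative, not the exact GFS2 prototype): one call
      that can return a dinode block and/or an extent of data blocks, with the caller
      passing in how many data blocks it wants and reading back how many it got.

      #include <linux/types.h>

      struct gfs2_inode;

      /*
       * Illustrative prototype only.  On entry *nblocks is the number of data
       * blocks requested; on return it is the number actually allocated.
       * 'dinode' asks for a dinode block in the same call, so the three cases
       * from the description map onto:
       *   (1) dinode only:       dinode = true,  *nblocks = 0
       *   (2) data extent only:  dinode = false, *nblocks > 0
       *   (3) dinode plus data:  dinode = true,  *nblocks > 0
       * *bn receives the first block number of the allocation.
       */
      int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn,
                            unsigned int *nblocks, bool dinode);
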
  11. 15 Nov 2011, 1 commit
  12. 09 Nov 2011, 1 commit
  13. 08 Nov 2011, 1 commit
  14. 21 Oct 2011, 4 commits
    • GFS2: Use cached rgrp in gfs2_rlist_add() · 70b0c365
      Authored by Steven Whitehouse
      Each block which is deallocated requires a call to gfs2_rlist_add(),
      and each of those calls was calling gfs2_blk2rgrpd() to figure out
      which rgrp the block belonged in. This can be sped up by making use
      of the rgrp cached in the inode. We also reset this cached rgrp in
      case the block has moved to a different rgrp. This should provide a
      big reduction in gfs2_blk2rgrpd() calls during deallocation.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      70b0c365
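
      A self-contained sketch of the caching pattern (the structures and names below
      are illustrative stand-ins, not GFS2's): check whether the block still falls
      inside the rgrp used last time before doing the full lookup, and re-cache the
      result when it does not.

      #include <stdint.h>

      /* Illustrative stand-ins for the real GFS2 structures. */
      struct rgrpd {
              uint64_t rg_first;      /* first block covered by this rgrp */
              uint64_t rg_blocks;     /* number of blocks it covers */
      };

      struct inode_ctx {
              struct rgrpd *cached_rgd;       /* rgrp found by the last lookup */
      };

      static int rgrp_contains_block(const struct rgrpd *rgd, uint64_t block)
      {
              return block >= rgd->rg_first &&
                     block < rgd->rg_first + rgd->rg_blocks;
      }

      /* Stand-in for the expensive full lookup (gfs2_blk2rgrpd() in GFS2). */
      struct rgrpd *full_rgrp_lookup(uint64_t block);

      /* Fast path: reuse the cached rgrp when it still covers the block;
       * otherwise do the full lookup and remember the answer. */
      static struct rgrpd *rgrp_for_block(struct inode_ctx *ip, uint64_t block)
      {
              struct rgrpd *rgd = ip->cached_rgd;

              if (rgd && rgrp_contains_block(rgd, block))
                      return rgd;

              rgd = full_rgrp_lookup(block);
              ip->cached_rgd = rgd;
              return rgd;
      }
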
    • GFS2: Make resource groups "append only" during life of fs · 8339ee54
      Authored by Steven Whitehouse
      Since we have ruled out supporting online filesystem shrink,
      it is possible to make the resource group list append only
      during the life of a super block. This gives several benefits:
      
      Firstly, we only need to read new rindex elements as they are added
      rather than needing to reread the whole rindex file each time one
      element is added.
      
      Secondly, the rindex glock can be held for much shorter periods of
      time, and is completely removed from the fast path for allocations.
      The lock is taken in shared mode only when updating the resource
      groups when the first allocation occurs, and after a grow has
      taken place.
      
      Thirdly, this results in a reduction in code size, and everything
      gets a lot simpler to understand in this area.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      8339ee54
    • GFS2: Use ->dirty_inode() · ab9bbda0
      Authored by Steven Whitehouse
      The aim of this patch is to use the newly enhanced ->dirty_inode()
      super block operation to deal with atime updates, rather than
      piggybacking that code into ->write_inode() as is currently
      done.
      
      The net result is a simplification of the code in various places
      and a reduction of the number of gfs2_dinode_out() calls since
      this is now implied by ->dirty_inode().
      
      Some of the mark_inode_dirty() calls have been moved under glocks
      so that ->dirty_inode() can then avoid taking locks of its own
      when we already hold suitable ones.
      
      One consequence is that generic_write_end() now correctly deals
      with file size updates, so that we do not need a separate check
      for that afterwards. This also, indirectly, means that fdatasync
      should work correctly on GFS2 - the current code always syncs the
      metadata whether it needs to or not.
      
      Has survived testing with postmark (with and without atime) and
      also fsx.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      ab9bbda0
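
      A skeleton of the hook involved, hedged (the function body is a placeholder,
      not GFS2's implementation): ->dirty_inode() is the super_operations callback
      the VFS invokes when an inode is dirtied, e.g. by an atime update, so the
      dinode encoding can happen there instead of in ->write_inode().

      #include <linux/fs.h>

      /* Placeholder showing where the work moves to.  In the real patch this
       * is where the in-core inode is encoded into the on-disk dinode buffer
       * (the gfs2_dinode_out() step), taking the glock only when the caller
       * does not already hold a suitable one. */
      static void example_dirty_inode(struct inode *inode, int flags)
      {
              /* ... encode inode into its dinode buffer here ... */
      }

      static const struct super_operations example_sops = {
              .dirty_inode    = example_dirty_inode,
              /* other operations omitted */
      };
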
    • GFS2: Clean up dir hash table reading · 4c28d338
      Authored by Steven Whitehouse
      Since there is now only a single caller to gfs2_dir_read_data()
      and it has a number of constant arguments, we can factor
      those out. Also some tests relating to the inode size were
      being done twice.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      4c28d338
  15. 15 Jul 2011, 1 commit
    • GFS2: Cache dir hash table in a contiguous buffer · 17d539f0
      Authored by Steven Whitehouse
      This patch adds a cache for the hash table to the directory code
      in order to help simplify the way in which the hash table is
      accessed. This is intended to be a first step towards introducing
      some performance improvements in the directory code.
      
      There are two follow-ups that I'm hoping to see fairly shortly. One
      is to simplify the hash table reading code now that we always read the
      complete hash table, whether we want one entry or all of them. The
      other is to introduce readahead on the heads of the hash chains
      which are referred to from the table.
      
      The hash table is a maximum of 128k in size, so it is not worth trying
      to read it in small chunks.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      17d539f0
  16. 09 May 2011, 2 commits
    • GFS2: When adding a new dir entry, inc link count if it is a subdir · 3d6ecb7d
      Authored by Steven Whitehouse
      This adds an increment of the link count when we add a new directory
      entry, if that entry is itself a directory. This means that we no
      longer need separate code to perform this operation.
      
      Now that both adding and removing directory entries automatically
      update the parent directory's link count if required, that makes
      the code shorter and simpler than before.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      3d6ecb7d
    • GFS2: Make gfs2_dir_del update link count when required · 855d23ce
      Authored by Steven Whitehouse
      When we remove an entry from a directory, we can save ourselves
      some trouble if we know the type of the entry in question, since
      if it is itself a directory, we can update the link count of the
      parent at the same time as removing the directory entry.
      
      In addition, this patch merges the rmdir and unlink code, which
      was almost identical anyway. This eliminates the calls to remove
      the . and .. directory entries on each rmdir (not needed, since the
      directory will be deallocated anyway), which were the only thing
      preventing passing the dentry to gfs2_dir_del(). Passing the dentry
      rather than just the name allows us to figure out the type of the
      entry being removed, and thus adjust the link count when required.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      855d23ce
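
      A hedged sketch of the idea shared by this commit and the previous one (the
      helper name is invented; this is not the actual GFS2 code): because the dentry
      rather than just the name is passed down, the directory code can see whether
      the entry being added or removed is itself a directory and adjust the parent's
      link count in the same place.

      #include <linux/fs.h>
      #include <linux/dcache.h>

      /* Illustrative helper: when deleting an entry that is itself a directory,
       * drop the parent's link count (the victim held a ".." reference to it).
       * The mirror image, inc_nlink(dir), happens when a subdirectory entry is
       * added. */
      static void example_dir_del_adjust_nlink(struct inode *dir,
                                               struct dentry *dentry)
      {
              struct inode *victim = dentry->d_inode;

              if (victim && S_ISDIR(victim->i_mode))
                      drop_nlink(dir);
              mark_inode_dirty(dir);
      }
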
  17. 20 Apr 2011, 4 commits
  18. 18 Apr 2011, 1 commit
    • GFS2: filesystem hang caused by incorrect lock order · 44ad37d6
      Authored by Bob Peterson
      This patch fixes a deadlock in GFS2 where two processes are trying
      to reclaim an unlinked dinode:
      One process holds the inode glock and calls gfs2_lookup_by_inum,
      trying to look up the inode, which it cannot do because the inode
      is marked I_FREEING.  The other has set I_FREEING from the VFS and
      is at the beginning of gfs2_delete_inode, waiting for the glock,
      which is held by the first.  The solution is to add a new non_block
      parameter to the gfs2_iget function that causes it to return -ENOENT
      if the inode is being freed.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      44ad37d6
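
      A hedged sketch of the non-blocking lookup (simplified; the structure and test
      function below are placeholders rather than the literal patch): when non_block
      is set, an inode caught in I_FREEING is treated as not found, so the lookup can
      return -ENOENT instead of waiting on an inode that the other process is freeing
      while it in turn waits for the glock.

      #include <linux/fs.h>

      struct lookup_args {
              unsigned long ino;      /* inode number being looked up */
              int non_block;          /* do not wait on inodes being freed */
      };

      /* Illustrative iget5_locked()-style test callback.  With non_block set,
       * an inode in I_FREEING/I_WILL_FREE is reported as "no match", so the
       * caller sees a lookup failure and can return -ENOENT rather than
       * blocking until the other task finishes freeing the inode. */
      static int example_iget_test(struct inode *inode, void *opaque)
      {
              struct lookup_args *args = opaque;

              if (inode->i_ino != args->ino)
                      return 0;
              if (args->non_block &&
                  (inode->i_state & (I_FREEING | I_WILL_FREE)))
                      return 0;
              return 1;
      }
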
  19. 20 Sep 2010, 2 commits
  20. 29 Jul 2010, 2 commits
  21. 15 Jul 2010, 1 commit
    • GFS2: rename causes kernel Oops · 728a756b
      Authored by Bob Peterson
      This patch fixes a kernel Oops in the GFS2 rename code.
      
      The problem was in the way the gfs2 directory code was trying
      to re-use sentinel directory entries.
      
      In the failing case, gfs2's rename function was renaming a
      file to another name that had the same non-trivial length.
      The file being renamed happened to be the first directory
      entry on the leaf block.
      
      First, the rename code (gfs2_rename in ops_inode.c) found the
      original directory entry and decided it could do its job by
      simply replacing the directory entry with another.  Therefore
      it determined correctly that no block allocations were needed.
      
      Next, the rename code deleted the old directory entry prior to
      replacing it with the new name.  Therefore, the soon-to-be
      replaced directory entry was temporarily made into a directory
      entry "sentinel" or a place holder at the start of a leaf block.
      
      Lastly, it went to re-add the replacement directory entry in
      that leaf block.  However, when gfs2_dirent_find_space was
      looking for space in the leaf block, it used the wrong value
      for the sentinel.  That threw off its calculations, so later
      it decided it couldn't really re-use the sentinel and therefore
      had to allocate a new leaf block.  But because it had previously
      decided to re-use the directory entry, it had not taken the time
      to set up a new block allocation for the inode.  Therefore, the
      inode's i_alloc pointer was still NULL and the code crashed trying
      to dereference it.
      
      In the case of sentinel directory entries, the entire dirent is
      reused, not just the "free space" portion of it, and therefore
      the function gfs2_dirent_find_space should use the value 0
      rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.
      
      Fixing this calculation enables the reproducer programs to work
      properly.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      728a756b
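
      A reduced, self-contained model of the calculation being fixed (hedged; this is
      not the literal GFS2 code or on-disk layout): when measuring the reusable space
      in a dirent, a sentinel's whole record is reusable, so the "in use" portion to
      subtract is 0 rather than the size of an empty dirent.

      /* Toy model of the fix.  totlen is the dirent's total record length and
       * namelen its name length; DIRENT_SIZE() is a stand-in for the real
       * GFS2_DIRENT_SIZE() record-size calculation. */
      #define DIRENT_SIZE(namelen)    (((8 + (namelen)) + 7) & ~7u)

      static unsigned int dirent_free_space(unsigned int totlen,
                                            unsigned int namelen,
                                            int is_sentinel)
      {
              /* A sentinel's entire record can be reused, so nothing is
               * "in use"; the bug was using DIRENT_SIZE(0) here instead of 0,
               * which made the free space look smaller than it really was. */
              unsigned int used = is_sentinel ? 0 : DIRENT_SIZE(namelen);

              return totlen - used;
      }
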
  22. 14 Apr 2010, 1 commit
    • GFS2: glock livelock · 1a0eae88
      Authored by Bob Peterson
      This patch fixes a couple of GFS2 problems with the reclaiming of
      unlinked dinodes.  First, there were a couple of livelocks where
      everything would come to a halt waiting for a glock that was
      seemingly held by a process that no longer existed.  In fact, the
      process did exist, it just had the wrong pid number in the holder
      information.  Second, there was a lock ordering problem between
      inode locking and glock locking.  Third, glock/inode contention
      could sometimes cause inodes to be improperly marked invalid by
      iget_failed.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      1a0eae88
  23. 03 Dec 2009, 1 commit
  24. 20 May 2009, 1 commit
    • GFS2: Improve resource group error handling · 09010978
      Authored by Steven Whitehouse
      This patch improves the error handling in the case where we
      discover that the summary information in the resource group
      doesn't match the bitmap information while in the process of
      allocating blocks. Originally this resulted in a kernel bug,
      but this patch changes that so that we return -EIO and print
      some messages explaining what went wrong, and how to fix it.
      
      We also remember locally not to try and allocate from the
      same rgrp again, so that a subsequent allocation in a
      different rgrp should succeed.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      09010978
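
      A hedged sketch of the error-handling change (structure and names are
      placeholders, not the GFS2 code): when the rgrp's summary free-block count
      disagrees with the count derived from its bitmaps, report the inconsistency,
      remember locally that this rgrp is bad, and return -EIO so the caller can try
      a different rgrp.

      #include <errno.h>

      /* Illustrative stand-in for the rgrp state involved. */
      struct rgrp_stub {
              unsigned int free_summary;      /* free blocks per the summary data */
              unsigned int free_bitmap;       /* free blocks recounted from bitmaps */
              int rg_error;                   /* set once an inconsistency is seen */
      };

      static int check_rgrp_consistency(struct rgrp_stub *rgd)
      {
              if (rgd->free_summary == rgd->free_bitmap)
                      return 0;

              /* Previously this condition hit a kernel bug; instead, remember
               * that this rgrp is unusable (so it is not retried) and fail the
               * allocation with -EIO.  In-kernel this is also where the messages
               * explaining what went wrong and how to fix it would be printed. */
              rgd->rg_error = 1;
              return -EIO;
      }
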
  25. 24 Mar 2009, 1 commit
    • GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Authored by Steven Whitehouse
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplification of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, which is still a separate module; and (b) we
      retain the ability to build GFS2 as a standalone single-node
      filesystem without requiring the DLM.
      
      This patch needs a lot of testing, hence I have kept it in since I
      restarted my -git tree after the last merge window. That way, it gets
      the maximum exposure before it is merged. This is (modulo a few minor
      bug fixes) the same patch that I've been posting on and off over the
      last three months, and it has passed a number of different tests so far.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      f057f6cd
  26. 05 Jan 2009, 3 commits
  27. 10 Apr 2008, 1 commit
  28. 31 Mar 2008, 2 commits
    • [GFS2] possible null pointer dereference fixup · 182fe5ab
      Authored by Cyrill Gorcunov
      gfs2_alloc_get may fail, so we have to check its return value to
      prevent a NULL pointer dereference.
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      182fe5ab
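
      The shape of the fix as a hedged sketch (the calling function is invented; only
      the NULL check is the point):

      #include <linux/errno.h>

      struct gfs2_inode;
      struct gfs2_alloc;

      struct gfs2_alloc *gfs2_alloc_get(struct gfs2_inode *ip);

      /* Illustrative caller: gfs2_alloc_get() can return NULL on allocation
       * failure, so its result must be checked before use. */
      static int example_caller(struct gfs2_inode *ip)
      {
              struct gfs2_alloc *al = gfs2_alloc_get(ip);

              if (!al)
                      return -ENOMEM;

              /* ... use al, then release it on the normal exit path ... */
              return 0;
      }
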
    • [GFS2] Allow bmap to allocate extents · 9b8c81d1
      Authored by Steven Whitehouse
      We've supported mapping of extents when no block allocation is required
      for some time. This patch extends that to mapping of extents when an
      allocation has been requested. In that case we try to allocate as many
      blocks as are requested, but we might return fewer in case there is
      something preventing us from returning the complete amount (e.g. an
      already allocated block is in the way).
      
      Currently the only code path which can actually request multiple data
      blocks in a single bmap call is the page_mkwrite path and even then it
      only happens if there are multiple blocks per page. What this patch does
      do however, is merge the allocation requests for metadata (growing the
      metadata tree in either height or depth) with the allocation of the data
      blocks in the case that both are needed. This results in lower overheads
      even in the single block allocation case.
      
      The one thing which we can't handle here at the moment is unstuffing. I
      would like to be able to do that, but the problem which arises is that
      in order to unstuff one has to get a locked page from the page cache
      which results in locking problems in the (usual) case that the caller is
      holding the page lock on the page it wishes to map. So that case will
      have to be addressed in future patches.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      9b8c81d1
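
      A self-contained toy model of the contract described (not the GFS2
      implementation): the caller asks for up to 'requested' new blocks in one
      mapping call, and the allocator returns an extent that may be shorter if
      something, such as an already-allocated block, is in the way.

      #include <stddef.h>

      /* blocks[] marks which logical blocks already exist.  Starting at lblock,
       * "allocate" up to requested new blocks, stopping early at the first block
       * that is already in place, and return how many blocks the extent covers. */
      static size_t allocate_extent(unsigned char *blocks, size_t nr_blocks,
                                    size_t lblock, size_t requested)
      {
              size_t got = 0;

              while (got < requested &&
                     lblock + got < nr_blocks &&
                     !blocks[lblock + got]) {
                      blocks[lblock + got] = 1;
                      got++;
              }
              return got;     /* may be fewer than requested */
      }
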