1. 12 5月, 2010 1 次提交
  2. 14 4月, 2010 1 次提交
    • B
      GFS2: glock livelock · 1a0eae88
      Bob Peterson 提交于
      This patch fixes a couple gfs2 problems with the reclaiming of
      unlinked dinodes.  First, there were a couple of livelocks where
      everything would come to a halt waiting for a glock that was
      seemingly held by a process that no longer existed.  In fact, the
      process did exist, it just had the wrong pid number in the holder
      information.  Second, there was a lock ordering problem between
      inode locking and glock locking.  Third, glock/inode contention
      could sometimes cause inodes to be improperly marked invalid by
      iget_failed.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      1a0eae88
  3. 01 2月, 2010 3 次提交
  4. 03 12月, 2009 1 次提交
  5. 21 9月, 2009 1 次提交
  6. 14 9月, 2009 2 次提交
  7. 09 9月, 2009 1 次提交
    • S
      GFS2: Be extra careful about deallocating inodes · acf7e244
      Steven Whitehouse 提交于
      There is a potential race in the inode deallocation code if two
      nodes try to deallocate the same inode at the same time. Most of
      the issue is solved by the iopen locking. There is still a small
      window which is not covered by the iopen lock. This patches fixes
      that and also makes the deallocation code more robust in the face of
      any errors in the rgrp bitmaps, or erroneous iopen callbacks from
      other nodes.
      
      This does introduce one extra disk read, but that is generally not
      an issue since its the same block that must be written to later
      in the deallocation process. The total disk accesses therefore stay
      the same,
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      acf7e244
  8. 27 8月, 2009 1 次提交
    • S
      GFS2: Remove no_formal_ino generating code · 8d8291ae
      Steven Whitehouse 提交于
      The inum structure used throughout GFS2 has two fields. One
      no_addr is the disk block number of the inode in question and
      is used everywhere as the inode number. The other, no_formal_ino,
      is used only as the generation number for NFS.
      
      Historically the no_formal_ino field was set using a complicated
      system of one global and one per-node file containing inode numbers
      in order to ensure that each no_formal_ino was unique. Also this
      code made no provision for what would happen when eventually the
      (64 bit) numbers ran out. Now I know that is pretty unlikely to
      happen given the large space of numbers, but it is possible
      nevertheless.
      
      The only guarantee required for no_formal_ino is that, for any
      single inode, the same number doesn't get reused too quickly.
      
      We already have a generation number which is kept in the inode
      and initialised from a counter in the resource group (almost
      no overhead, since we have to touch the resource group anyway
      in order to allocate an inode in the first place). Aside from
      ensuring that we never use the value 0 in the no_formal_ino
      field, we can use that counter directly.
      
      As a result of that change, we lose about 200 lines of code and
      also gain about 10 creates/sec on the postmark benchmark (on
      my test machine).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      8d8291ae
  9. 17 8月, 2009 2 次提交
  10. 30 7月, 2009 2 次提交
  11. 12 6月, 2009 1 次提交
    • S
      GFS2: Add tracepoints · 63997775
      Steven Whitehouse 提交于
      This patch adds the ability to trace various aspects of the GFS2
      filesystem. The trace points are divided into three groups,
      glocks, logging and bmap. These points have been chosen because
      they allow inspection of the major internal functions of GFS2
      and they are also generic enough that they are unlikely to need
      any major changes as the filesystem evolves.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      63997775
  12. 23 5月, 2009 1 次提交
  13. 22 5月, 2009 1 次提交
    • S
      GFS2: Clean up some file names · b1e71b06
      Steven Whitehouse 提交于
      This patch renames the ops_*.c files which have no counterpart
      without the ops_ prefix in order to shorten the name and make
      it more readable. In addition, ops_address.h (which was very
      small) is moved into inode.h and inode.h is cleaned up by
      adding extern where required.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b1e71b06
  14. 21 5月, 2009 2 次提交
    • S
      GFS2: Be more aggressive in reclaiming unlinked inodes · 1ce97e56
      Steven Whitehouse 提交于
      This patch increases the frequency with which gfs2 looks
      for unlinked, but still allocated inodes. Its the equivalent
      operation to ext3's orphan list, but done with bitmaps in
      the resource groups.
      
      This also fixes a bug where a field in the rgrp was too small.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1ce97e56
    • S
      GFS2: Add a rgrp bitmap full flag · 60a0b8f9
      Steven Whitehouse 提交于
      During block allocation, it is useful to know if sections of disk
      are full on a finer grained basis than a single resource group.
      This can make a performance difference when resource groups have
      larger numbers of bitmap blocks, since we no longer have to search
      them all block by block in each individual bitmap.
      
      The full flag is set on a per-bitmap basis when it has been
      searched and found to have no free space. It is then skipped in
      subsequent searches until the flag is reset. The resetting
      occurs if we have to drop the glock on the resource group for any
      reason, or if we deallocate some blocks within that resource
      group and thus free up some space.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      60a0b8f9
  15. 20 5月, 2009 1 次提交
    • S
      GFS2: Improve resource group error handling · 09010978
      Steven Whitehouse 提交于
      This patch improves the error handling in the case where we
      discover that the summary information in the resource group
      doesn't match the bitmap information while in the process of
      allocating blocks. Originally this resulted in a kernel bug,
      but this patch changes that so that we return -EIO and print
      some messages explaining what went wrong, and how to fix it.
      
      We also remember locally not to try and allocate from the
      same rgrp again, so that a subsequent allocation in a
      different rgrp should succeed.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      09010978
  16. 23 4月, 2009 2 次提交
    • S
      GFS2: Ensure that the inode goal block settings are updated · d9ba7615
      Steven Whitehouse 提交于
      GFS2 has a goal block associated with each inode indicating the
      search start position for future block allocations (in fact there
      are two, but thats for backward compatibility with GFS1 as they
      are set to identical locations in GFS2).
      
      In some circumstances, depending on the ordering of updates to
      the inode it was possible for the goal block settings to not
      be updated on disk. This patch ensures that the goal block will
      always get updated, thus reducing the potential for searching
      the same (already allocated) blocks again when looking for free
      space during block allocation.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d9ba7615
    • S
      GFS2: Fix bug in block allocation · d8bd504a
      Steven Whitehouse 提交于
      The new bitfit algorithm was counting from the wrong end of
      64 bit words in the bitfield. This fixes it by using __ffs64
      instead of fls64
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d8bd504a
  17. 24 3月, 2009 6 次提交
    • H
      GFS2: fix sparse warning: Should it be static? · 02ab1721
      Hannes Eder 提交于
      Impact: Make symbol static.
      
      Fix this sparse warning:
        fs/gfs2/rgrp.c:188:5: warning: symbol 'gfs2_bitfit' was not declared. Should it be static?
      Signed-off-by: NHannes Eder <hannes@hanneseder.net>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      02ab1721
    • H
      GFS2: fix sparse warnings: constant is so big it is ... · 075ac448
      Hannes Eder 提交于
      Fix this sparse warnings:
        fs/gfs2/rgrp.c:156:23: warning: constant 0xffffffffffffffff is so big it is unsigned long long
        fs/gfs2/rgrp.c:157:23: warning: constant 0xaaaaaaaaaaaaaaaa is so big it is unsigned long long
        fs/gfs2/rgrp.c:158:23: warning: constant 0x5555555555555555 is so big it is long long
        fs/gfs2/rgrp.c:194:20: warning: constant 0x5555555555555555 is so big it is long long
        fs/gfs2/rgrp.c:204:44: warning: constant 0x5555555555555555 is so big it is long long
      Signed-off-by: NHannes Eder <hannes@hanneseder.net>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      075ac448
    • S
      GFS2: Fix alignment issue and tidy gfs2_bitfit · 223b2b88
      Steven Whitehouse 提交于
      An alignment issue with the existing bitfit algorithm was reported
      on IA64. This patch attempts to fix that, and also to tidy up the
      code a bit. There is now more documentation about how this works
      and it has survived a number of different tests.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      223b2b88
    • S
      GFS2: Add a "demote a glock" interface to sysfs · 64d576ba
      Steven Whitehouse 提交于
      This adds a sysfs file called demote_rq to GFS2's
      per filesystem directory. Its possible to use this
      file to demote arbitrary glocks in exactly the same
      way as if a request had come in from a remote node.
      
      This is intended for testing issues relating to caching
      of data under glocks. Despite that, the interface is
      generic enough to send requests to any type of glock,
      but be careful as its not always safe to send an
      arbitrary message to an arbitrary glock. For that reason
      and to prevent DoS, this interface is restricted to root
      only.
      
      The messages look like this:
      
      <type>:<glocknumber> <mode>
      
      Example:
      
      echo -n "2:13324 EX" >/sys/fs/gfs2/unity:myfs/demote_rq
      
      Which means "please demote inode glock (type 2) number 13324 so that
      I can get an EX (exclusive) lock". The lock modes are those which
      would normally be sent by a remote node in its callback so if you
      want to unlock a glock, you use EX, to demote to shared, use SH or PR
      (depending on whether you like GFS2 or DLM lock modes better!).
      
      If the glock doesn't exist, you'll get -ENOENT returned. If the
      arguments don't make sense, you'll get -EINVAL returned.
      
      The plan is that this interface will be used in combination with
      the blktrace patch which I recently posted for comments although
      it is, of course, still useful in its own right.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      64d576ba
    • S
      GFS2: Support generation of discard requests · f15ab561
      Steven Whitehouse 提交于
      This patch allows GFS2 to generate discard requests for blocks which are
      no longer useful to the filesystem (i.e. those which have been freed as
      the result of an unlink operation). The requests are generated at the
      time which those blocks become available for reuse in the filesystem.
      
      In order to use this new feature, you have to specify the "discard"
      mount option. The code coalesces adjacent blocks into a single extent
      when generating the discard requests, thus generating the minimum
      number.
      
      If an error occurs when the request has been sent to the block device,
      then it will print a message and turn off the requests for that
      filesystem. If the problem is temporary, then you can use remount to
      turn the option back on again. There is also a nodiscard mount option
      so that you can use remount to turn discard requests off, if required.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f15ab561
    • S
      GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Steven Whitehouse 提交于
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplifcation of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem with out requiring the DLM.
      
      This patch needs a lot of testing, hence my keeping it I restarted
      my -git tree after the last merge window. That way, this has the maximum
      exposure before its merged. This is (modulo a few minor bug fixes) the
      same patch that I've been posting on and off the the last three months
      and its passed a number of different tests so far.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f057f6cd
  18. 05 1月, 2009 4 次提交
  19. 10 7月, 2008 1 次提交
    • S
      [GFS2] Replace rgrp "recent list" with mru list · 9cabcdbd
      Steven Whitehouse 提交于
      This patch removes the "recent list" which is used during allocation
      and replaces it with the (already existing) mru list used during
      deletion. The "recent list" was not a true mru list leading to a number
      of inefficiencies including a "next" function which made scanning the
      list an order N^2 operation wrt to the number of list elements.
      
      This should increase allocation performance with large numbers of rgrps.
      Its also a useful preparation and cleanup before some further changes
      which are planned in this area.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9cabcdbd
  20. 24 6月, 2008 1 次提交
  21. 12 5月, 2008 1 次提交
  22. 31 3月, 2008 4 次提交
    • B
      [GFS2] Faster gfs2_bitfit algorithm · 1f466a47
      Bob Peterson 提交于
      This version of the gfs2_bitfit algorithm includes the latest
      suggestions from Steve Whitehouse.  It is typically eight to
      ten times faster than the version we're using today.  If there
      is a lot of metadata mixed in (lots of small files) the
      algorithm is often 15 times faster, and given the right
      conditions, I've seen peaks of 20 times faster.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1f466a47
    • S
      [GFS2] Allow bmap to allocate extents · 9b8c81d1
      Steven Whitehouse 提交于
      We've supported mapping of extents when no block allocation is required
      for some time. This patch extends that to mapping of extents when an
      allocation has been requested. In that case we try to allocate as many
      blocks as are requested, but we might return fewer in case there is
      something preventing us from returning the complete amount (e.g. an
      already allocated block is in the way).
      
      Currently the only code path which can actually request multiple data
      blocks in a single bmap call is the page_mkwrite path and even then it
      only happens if there are multiple blocks per page. What this patch does
      do however, is merge the allocation requests for metadata (growing the
      metadata tree in either height or depth) with the allocation of the data
      blocks in the case that both are needed. This results in lower overheads
      even in the single block allocation case.
      
      The one thing which we can't handle here at the moment is unstuffing. I
      would like to be able to do that, but the problem which arises is that
      in order to unstuff one has to get a locked page from the page cache
      which results in locking problems in the (usual) case that the caller is
      holding the page lock on the page it wishes to map. So that case will
      have to be addressed in future patches.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9b8c81d1
    • S
      [GFS2] Add extent allocation to block allocator · b45e41d7
      Steven Whitehouse 提交于
      Rather than having to allocate a single block at a time, this patch
      allows the block allocator to allocate an extent. Since there is
      no difference (so far as the block allocator is concerned) between
      data blocks and indirect blocks, it is posible to allocate a single
      extent and for the caller to unrevoke just the blocks required
      for indirect blocks.
      
      Currently the only bit of GFS2 to make use of this feature is the
      build height function. The intention is that gfs2_block_map will
      be changed to make use of this feature in future patches.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b45e41d7
    • S
      [GFS2] Merge gfs2_alloc_meta and gfs2_alloc_data · 1639431a
      Steven Whitehouse 提交于
      Thanks to the preceeding patches, the only difference between
      these two functions is their name. We can thus merge them
      and call the new function gfs2_alloc_block to reflect the
      fact that it can allocate either kind of block.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1639431a