1. 31 3月, 2008 8 次提交
    • S
      [GFS2] Allow bmap to allocate extents · 9b8c81d1
      Steven Whitehouse 提交于
      We've supported mapping of extents when no block allocation is required
      for some time. This patch extends that to mapping of extents when an
      allocation has been requested. In that case we try to allocate as many
      blocks as are requested, but we might return fewer in case there is
      something preventing us from returning the complete amount (e.g. an
      already allocated block is in the way).
      
      Currently the only code path which can actually request multiple data
      blocks in a single bmap call is the page_mkwrite path and even then it
      only happens if there are multiple blocks per page. What this patch does
      do however, is merge the allocation requests for metadata (growing the
      metadata tree in either height or depth) with the allocation of the data
      blocks in the case that both are needed. This results in lower overheads
      even in the single block allocation case.
      
      The one thing which we can't handle here at the moment is unstuffing. I
      would like to be able to do that, but the problem which arises is that
      in order to unstuff one has to get a locked page from the page cache
      which results in locking problems in the (usual) case that the caller is
      holding the page lock on the page it wishes to map. So that case will
      have to be addressed in future patches.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9b8c81d1
    • M
      [GFS2] be*_add_cpu conversion · bb16b342
      Marcin Slusarz 提交于
      replace all:
      big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) +
      					expression_in_cpu_byteorder);
      with:
      	beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);
      generated with semantic patch
      Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      bb16b342
    • S
      [GFS2] Eliminate (almost) duplicate field from gfs2_inode · 77658aad
      Steven Whitehouse 提交于
      The blocks counter is almost a duplicate of the i_blocks
      field in the VFS inode. The only difference is that i_blocks
      can be only 32bits long for 32bit arch without large single file
      support. Since GFS2 doesn't handle the non-large single file
      case (for 32 bit anyway) this adds a new config dependency on
      64BIT || LSF. This has always been the case, however we've never
      explicitly said so before.
      
      Even if we do add support for the non-LSF case, we will still
      not require this field to be duplicated since we will not be
      able to access oversized files anyway.
      
      So the net result of all this is that we shave 8 bytes from a gfs2_inode
      and get our config deps correct.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      77658aad
    • S
      [GFS2] Add extent allocation to block allocator · b45e41d7
      Steven Whitehouse 提交于
      Rather than having to allocate a single block at a time, this patch
      allows the block allocator to allocate an extent. Since there is
      no difference (so far as the block allocator is concerned) between
      data blocks and indirect blocks, it is posible to allocate a single
      extent and for the caller to unrevoke just the blocks required
      for indirect blocks.
      
      Currently the only bit of GFS2 to make use of this feature is the
      build height function. The intention is that gfs2_block_map will
      be changed to make use of this feature in future patches.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b45e41d7
    • S
      [GFS2] Merge gfs2_alloc_meta and gfs2_alloc_data · 1639431a
      Steven Whitehouse 提交于
      Thanks to the preceeding patches, the only difference between
      these two functions is their name. We can thus merge them
      and call the new function gfs2_alloc_block to reflect the
      fact that it can allocate either kind of block.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1639431a
    • S
      [GFS2] Update gfs2_trans_add_unrevoke to accept extents · 5731be53
      Steven Whitehouse 提交于
      By adding an extra argument to gfs2_trans_add_unrevoke we can now
      specify an extent length of blocks to unrevoke. This means that
      we only need to make one pass through the list for each extent
      rather than each block. Currently the only extent length which
      is used is 1, but that will change in the future.
      
      Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta
      since its the only difference between this and gfs2_alloc_data
      which is left. This will allow a future patch to merge these
      two functions into one (i.e. one call to allocate both data
      and metadata in a single extent in the future).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5731be53
    • S
      [GFS2] Shrink & rename di_depth · 9a004508
      Steven Whitehouse 提交于
      This patch forms a pair with the previous patch which shrunk
      di_height. Like that patch di_depth is renamed i_depth and moved
      into struct gfs2_inode directly. Also the field goes from 16 bits
      to 8 bits since it is also limited to a max value which is rather
      small (17 in this case). In addition we also now validate the field
      against this maximum value when its read in.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9a004508
    • B
      [GFS2] Get rid of unneeded parameter in gfs2_rlist_alloc · fe6c991c
      Bob Peterson 提交于
      This patch removed the unnecessary parameter from function
      gfs2_rlist_alloc.  The parameter was always passed in as 0.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      fe6c991c
  2. 08 2月, 2008 1 次提交
  3. 25 1月, 2008 1 次提交
    • S
      [GFS2] Reduce inode size by moving i_alloc out of line · 6dbd8224
      Steven Whitehouse 提交于
      It is possible to reduce the size of GFS2 inodes by taking the i_alloc
      structure out of the gfs2_inode. This patch allocates the i_alloc
      structure whenever its needed, and frees it afterward. This decreases
      the amount of low memory we use at the expense of requiring a memory
      allocation for each page or partial page that we write. A quick test
      with postmark shows that the overhead is not measurable and I also note
      that OCFS2 use the same approach.
      
      In the future I'd like to solve the problem by shrinking down the size
      of the members of the i_alloc structure, but for now, this reduces the
      immediate problem of using too much low-memory on x86 and doesn't add
      too much overhead.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6dbd8224
  4. 10 10月, 2007 2 次提交
    • B
      [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed · 7a9f53b3
      Benjamin Marzinski 提交于
      There is a possible deadlock between two processes on the same node, where one
      process is deleting an inode, and another process is looking for allocated but
      unused inodes to delete in order to create more space.
      
      process A does an iput() on inode X, and it's i_count drops to 0. This causes
      iput_final() to be called, which puts an inode into state I_FREEING at
      generic_delete_inode(). There no point between when iput_final() is called, and
      when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
      set, no other process on that node can successfully look up that inode until
      the delete finishes.
      
      process B locks the the resource group for the same inode in get_local_rgrp(),
      which is called by gfs2_inplace_reserve_i()
      
      process A tries to lock the resource group for the inode in
      gfs2_dinode_dealloc(), but it's already locked by process B
      
      process B waits in find_inode for the inode to have the I_FREEING state cleared.
      
      Deadlock.
      
      This patch solves the problem by adding an alternative to gfs2_iget(),
      gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
      state.o The alternate test function is just like the original one, except that
      it fails if the inode is being freed, and sets a skipped flag. The alternate
      set function is just like the original, except that it fails if the skipped
      flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
      gfs2_iget().
      Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7a9f53b3
    • S
      [GFS2] Add a missing gfs2_trans_add_bh() · 382e6e25
      Steven Whitehouse 提交于
      This was missing from the dir_split_leaf() function although in
      most cases its not a problem due to other functions having
      already previously called gfs2_trans_add_bh. This makes certain
      that it is correct.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Wendy Cheng <wcheng@redhat.com>
      382e6e25
  5. 09 7月, 2007 4 次提交
    • W
      [GFS2] Obtaining no_formal_ino from directory entry · bb9bcf06
      Wendy Cheng 提交于
      GFS2 lookup code doesn't ask for inode shared glock. This implies during
      in-memory inode creation for existing file, GFS2 will not disk-read in
      the inode contents. This leaves no_formal_ino un-initialized during
      lookup time. The un-initialized no_formal_ino is subsequently encoded
      into file handle. Clients will get ESTALE error whenever it tries to
      access these files.
      Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      bb9bcf06
    • S
      [GFS2] Add nanosecond timestamp feature · 4bd91ba1
      Steven Whitehouse 提交于
      This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
      to the way that the on-disk format works, older filesystems will just
      appear to have this field set to zero. When mounted by an older version
      of GFS2, the filesystem will simply ignore the extra fields so that
      it will again appear to have whole second resolution, so that its
      trivially backward compatible.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      4bd91ba1
    • S
      [GFS2] Fix sign problem in quota/statfs and cleanup _host structures · bb8d8a6f
      Steven Whitehouse 提交于
      This patch fixes some sign issues which were accidentally introduced
      into the quota & statfs code during the endianess annotation process.
      Also included is a general clean up which moves all of the _host
      structures out of gfs2_ondisk.h (where they should not have been to
      start with) and into the places where they are actually used (often only
      one place). Also those _host structures which are not required any more
      are removed entirely (which is the eventual plan for all of them).
      
      The conversion routines from ondisk.c are also moved into the places
      where they are actually used, which for almost every one, was just one
      single place, so all those are now static functions. This also cleans up
      the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.
      
      The net result is a reduction of about 100 lines of code, many functions
      now marked static plus the bug fixes as mentioned above. For good
      measure I ran the code through sparse after making these changes to
      check that there are no warnings generated.
      
      This fixes Red Hat bz #239686
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      bb8d8a6f
    • S
      [GFS2] Clean up inode number handling · dbb7cae2
      Steven Whitehouse 提交于
      This patch cleans up the inode number handling code. The main difference
      is that instead of looking up the inodes using a struct gfs2_inum_host
      we now use just the no_addr member of this structure. The tests relating
      to no_formal_ino can then be done by the calling code. This has
      advantages in that we want to do different things in different code
      paths if the no_formal_ino doesn't match. In the NFS patch we want to
      return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
      no_formal_ino doesn't match and thus we can withdraw in this case.
      
      In order to later fix bz #201012, we need to be able to look up an inode
      without knowing no_formal_ino, as the only information that is known to
      us is the on-disk location of the inode in question.
      
      This patch will also help us to fix bz #236099 at a later date by
      cleaning up a lot of the code in that area.
      
      There are no user visible changes as a result of this patch and there
      are no changes to the on-disk format either.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      dbb7cae2
  6. 01 5月, 2007 2 次提交
  7. 15 2月, 2007 1 次提交
    • T
      [PATCH] remove many unneeded #includes of sched.h · cd354f1a
      Tim Schmielau 提交于
      After Al Viro (finally) succeeded in removing the sched.h #include in module.h
      recently, it makes sense again to remove other superfluous sched.h includes.
      There are quite a lot of files which include it but don't actually need
      anything defined in there.  Presumably these includes were once needed for
      macros that used to live in sched.h, but moved to other header files in the
      course of cleaning it up.
      
      To ease the pain, this time I did not fiddle with any header files and only
      removed #includes from .c-files, which tend to cause less trouble.
      
      Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
      arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
      allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
      configs in arch/arm/configs on arm.  I also checked that no new warnings were
      introduced by the patch (actually, some warnings are removed that were emitted
      by unnecessarily included header files).
      Signed-off-by: NTim Schmielau <tim@physik3.uni-rostock.de>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cd354f1a
  8. 06 2月, 2007 2 次提交
    • E
      [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 · ddfe0627
      Eric Sandeen 提交于
      I was looking something else up and came across this...
      
      I don't honestly have a good reason to change it other than to make it
      like every other Linux filesystem in this regard.  ;-)  It doesn't
      functionally change anything, but makes some lines shorter. :)
      
      I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but
      not in timespec_t format, and only stores second resolutions?  Seems like
      you're halfway to sub-second resolutions already.
      
      I suppose if that gets implemented then all of the below should
      instead be CURRENT_TIME not CURRENT_TIME_SEC.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ddfe0627
    • S
      [GFS2] Clean up/speed up readdir · 3699e3a4
      Steven Whitehouse 提交于
      This removes the extra filldir callback which gfs2 was using to
      enclose an attempt at readahead for inodes during readdir. The
      code was too complicated and also hurts performance badly in the
      case that the getdents64/readdir call isn't being followed by
      stat() and it wasn't even getting it right all the time when it
      was.
      
      As a result, on my test box an "ls" of a directory containing 250000
      files fell from about 7mins (freshly mounted, so nothing cached) to
      between about 15 to 25 seconds. When the directory content was cached,
      the time taken fell from about 3mins to about 4 or 5 seconds.
      
      Interestingly in the cached case, running "ls -l" once reduced the time
      taken for subsequent runs of "ls" to about 6 secs even without this
      patch. Now it turns out that there was a special case of glocks being
      used for prefetching the metadata, but because of the timeouts for these
      locks (set to 10 secs) the metadata was being timed out before it was
      being used and this the prefetch code was constantly trying to prefetch
      the same data over and over.
      
      Calling "ls -l" meant that the inodes were brought into memory and once
      the inodes are cached, the glocks are not disposed of until the inodes
      are pushed out of the cache, thus extending the lifetime of the glocks,
      and thus bringing down the time for subsequent runs of "ls"
      considerably.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      3699e3a4
  9. 30 11月, 2006 7 次提交
  10. 20 10月, 2006 4 次提交
  11. 25 9月, 2006 1 次提交
  12. 22 9月, 2006 1 次提交
    • S
      [GFS2] Tidy up meta_io code · 7276b3b0
      Steven Whitehouse 提交于
      Fix a bug in the directory reading code, where we might have dereferenced
      a NULL pointer in case of OOM. Updated the directory code to use the new
      & improved version of gfs2_meta_ra() which now returns the first block
      that was being read. Previously it was releasing it requiring following
      code to grab the block again at each point it was called.
      
      Also turned off readahead on directory lookups since we are reading a
      hash table, and therefore reading the entries in order is very
      unlikely. Readahead is still used for all other calls to the
      directory reading function (e.g. when growing the hash table).
      
      Removed the DIO_START constant. Everywhere this was used, it was
      used to unconditionally start i/o aside from a couple of places, so
      I've removed it and made the couple of exceptions to this rule into
      separate functions.
      
      Also hunted through the other DIO flags and removed them as arguments
      from functions which were always called with the same combination of
      arguments.
      
      Updated gfs2_meta_indirect_buffer to be a bit more efficient and
      hopefully also be a bit easier to read.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7276b3b0
  13. 19 9月, 2006 1 次提交
  14. 07 9月, 2006 1 次提交
  15. 05 9月, 2006 3 次提交
  16. 01 9月, 2006 1 次提交
    • S
      [GFS2] Update copyright, tidy up incore.h · e9fc2aa0
      Steven Whitehouse 提交于
      As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this
      updates the copyright message to say "version" in full rather than
      "v.2". Also incore.h has been updated to remove forward structure
      declarations which are not required.
      
      The gfs2_quota_lvb structure has now had endianess annotations added
      to it. Also quota.c has been updated so that we now store the
      lvb data locally in endian independant format to avoid needing
      a structure in host endianess too. As a result the endianess
      conversions are done as required at various points and thus the
      conversion routines in lvb.[ch] are no longer required. I've
      moved the one remaining constant in lvb.h thats used into lm.h
      and removed the unused lvb.[ch].
      
      I have not changed the HIF_ constants. That is left to a later patch
      which I hope will unify the gh_flags and gh_iflags fields of the
      struct gfs2_holder.
      
      Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e9fc2aa0