1. 31 3月, 2008 3 次提交
  2. 25 1月, 2008 5 次提交
    • S
      [GFS2] Reduce inode size by moving i_alloc out of line · 6dbd8224
      Steven Whitehouse 提交于
      It is possible to reduce the size of GFS2 inodes by taking the i_alloc
      structure out of the gfs2_inode. This patch allocates the i_alloc
      structure whenever its needed, and frees it afterward. This decreases
      the amount of low memory we use at the expense of requiring a memory
      allocation for each page or partial page that we write. A quick test
      with postmark shows that the overhead is not measurable and I also note
      that OCFS2 use the same approach.
      
      In the future I'd like to solve the problem by shrinking down the size
      of the members of the i_alloc structure, but for now, this reduces the
      immediate problem of using too much low-memory on x86 and doesn't add
      too much overhead.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6dbd8224
    • B
      b3513fca
    • B
      [GFS2] Run through full bitmaps quicker in gfs2_bitfit · 5fdc2eeb
      Bob Peterson 提交于
      I eliminated the passing of an unused parameter into gfs2_bitfit called rgd.
      
      This also changes the gfs2_bitfit code that searches for free (or used) blocks.
      Before, the code was trying to check for bytes that indicated 4 blocks in
      the undesired state.  The problem is, it was spending more time trying to
      do this than it actually was saving.  This version only optimizes the case
      where we're looking for free blocks, and it checks a machine word at a time.
      So on 32-bit machines, it will check 32-bits (16 blocks) and on 64-bit
      machines, it will check 64-bits (32 blocks) at a time.  The compiler
      optimizes that quite well and we save some time, especially when running
      through full bitmaps (like the bitmaps allocated for the journals).
      
      There's probably a more elegant or optimized way to do this, but I haven't
      thought of it yet.  I'm open to suggestions.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5fdc2eeb
    • A
      [GFS2] patch to check for recursive lock requests in gfs2_rename code path · 292c8c14
      Abhijith Das 提交于
      A certain scenario in the rename code path triggers a kernel BUG()
      because it accidentally does recursive locking The first lock is
      requested to unlink an already existing inode (replacing a file) and the
      second lock is requested when the destination directory needs to alloc
      some space. It is rare that these two
      events happen during the same rename call, and even more rare that these
      two instances try to lock the same rgrp. It is, however, possible.
      https://bugzilla.redhat.com/show_bug.cgi?id=404711Signed-off-by: NAbhijith Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      292c8c14
    • S
      [GFS2] Clean up internal read function · 51ff87bd
      Steven Whitehouse 提交于
      As requested by Christoph, this patch cleans up GFS2's internal
      read function so that it no longer uses the do_generic_mapping_read
      function. This function is obsolete and GFS2 is the last user of it.
      
      As a side effect the internal read code gets smaller and easier
      to read and gfs2_readpage is split into two. One function has the locking
      and the other function has the rest of the logic.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      51ff87bd
  3. 10 10月, 2007 3 次提交
    • B
      [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed · 7a9f53b3
      Benjamin Marzinski 提交于
      There is a possible deadlock between two processes on the same node, where one
      process is deleting an inode, and another process is looking for allocated but
      unused inodes to delete in order to create more space.
      
      process A does an iput() on inode X, and it's i_count drops to 0. This causes
      iput_final() to be called, which puts an inode into state I_FREEING at
      generic_delete_inode(). There no point between when iput_final() is called, and
      when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
      set, no other process on that node can successfully look up that inode until
      the delete finishes.
      
      process B locks the the resource group for the same inode in get_local_rgrp(),
      which is called by gfs2_inplace_reserve_i()
      
      process A tries to lock the resource group for the inode in
      gfs2_dinode_dealloc(), but it's already locked by process B
      
      process B waits in find_inode for the inode to have the I_FREEING state cleared.
      
      Deadlock.
      
      This patch solves the problem by adding an alternative to gfs2_iget(),
      gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
      state.o The alternate test function is just like the original one, except that
      it fails if the inode is being freed, and sets a skipped flag. The alternate
      set function is just like the original, except that it fails if the skipped
      flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
      gfs2_iget().
      Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7a9f53b3
    • B
      [GFS2] invalid metadata block - REVISED · 5f3eae75
      Bob Peterson 提交于
      This is for bugzilla bug #248176: GFS2: invalid metadata block
      
      Patches 1 thru 3 were accepted upstream, but there were problems
      with 4 and 5.  Those issues have been resolved and now the recovery
      tests are passing without errors.  This code has gone through
      41 * 3 successful gfs2 recovery tests before it hit an
      unrelated (openais) problem.
      
      This is a complete rewrite of patch 4 for bug #248176.
      
      Part of the problem was that inodes were being recycled
      before their buffers were flushed to the journal logs.
      Another problem was that the clone bitmaps were being
      searched for deleted inodes to recycle, but only the
      "real" bitmaps should be searched for that purpose.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5f3eae75
    • B
      [GFS2] Prevent infinite loop in try_rgrp_unlink() · 6760bdcd
      Bob Peterson 提交于
      This is patch three of five for bug #248176.
      
      The try_rgrp_unlink code in rgrp.c had an infinite loop.  This was
      caused because the bitmap function rgblk_search can return a block
      less than the "goal" block, in which case it was looping.  The fix is
      to make it always march forward as needed.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6760bdcd
  4. 14 8月, 2007 2 次提交
  5. 09 7月, 2007 9 次提交
    • W
      [GFS2] Remove i_mode passing from NFS File Handle · 35dcc52e
      Wendy Cheng 提交于
      GFS2 has been passing i_mode within NFS File Handle. Other than the
      wrong assumption that there is always room for this extra 16 bit value,
      the current gfs2_get_dentry doesn't really need the i_mode to work
      correctly. Note that GFS2 NFS code does go thru the same lookup code
      path as direct file access route (where the mode is obtained from name
      lookup) but gfs2_get_dentry() is coded for different purpose. It is not
      used during lookup time. It is part of the file access procedure call.
      When the call is invoked, if on-disk inode is not in-memory, it has to
      be read-in. This makes i_mode passing a useless overhead.
      Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      35dcc52e
    • W
      [GFS2] Obtaining no_formal_ino from directory entry · bb9bcf06
      Wendy Cheng 提交于
      GFS2 lookup code doesn't ask for inode shared glock. This implies during
      in-memory inode creation for existing file, GFS2 will not disk-read in
      the inode contents. This leaves no_formal_ino un-initialized during
      lookup time. The un-initialized no_formal_ino is subsequently encoded
      into file handle. Clients will get ESTALE error whenever it tries to
      access these files.
      Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      bb9bcf06
    • S
      [GFS2] Remove bogus '\0' in rgrp.c · c4201214
      Steven Whitehouse 提交于
      Not sure how it slipped in, but we don't want it anyway.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      c4201214
    • S
      [GFS2] Recovery for lost unlinked inodes · c8cdf479
      Steven Whitehouse 提交于
      Under certain circumstances its possible (though rather unlikely) that
      inodes which were unlinked by one node while still open on another might
      get "lost" in the sense that they don't get deallocated if the node
      which held the inode open crashed before it was unlinked.
      
      This patch adds the recovery code which allows automatic deallocation of
      the inode if its found during block allocation (the sensible time to
      look for such inodes since we are scanning the rgrp's bitmaps anyway at
      this time, so it adds no overhead to do this).
      
      Since the inode will have had its i_nlink set to zero, all we need to
      trigger recovery is a lookup and an iput(), and the normal deallocation
      code takes care of the rest.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      c8cdf479
    • S
      [GFS2] Fix sign problem in quota/statfs and cleanup _host structures · bb8d8a6f
      Steven Whitehouse 提交于
      This patch fixes some sign issues which were accidentally introduced
      into the quota & statfs code during the endianess annotation process.
      Also included is a general clean up which moves all of the _host
      structures out of gfs2_ondisk.h (where they should not have been to
      start with) and into the places where they are actually used (often only
      one place). Also those _host structures which are not required any more
      are removed entirely (which is the eventual plan for all of them).
      
      The conversion routines from ondisk.c are also moved into the places
      where they are actually used, which for almost every one, was just one
      single place, so all those are now static functions. This also cleans up
      the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.
      
      The net result is a reduction of about 100 lines of code, many functions
      now marked static plus the bug fixes as mentioned above. For good
      measure I ran the code through sparse after making these changes to
      check that there are no warnings generated.
      
      This fixes Red Hat bz #239686
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      bb8d8a6f
    • S
      [GFS2] Clean up inode number handling · dbb7cae2
      Steven Whitehouse 提交于
      This patch cleans up the inode number handling code. The main difference
      is that instead of looking up the inodes using a struct gfs2_inum_host
      we now use just the no_addr member of this structure. The tests relating
      to no_formal_ino can then be done by the calling code. This has
      advantages in that we want to do different things in different code
      paths if the no_formal_ino doesn't match. In the NFS patch we want to
      return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
      no_formal_ino doesn't match and thus we can withdraw in this case.
      
      In order to later fix bz #201012, we need to be able to look up an inode
      without knowing no_formal_ino, as the only information that is known to
      us is the on-disk location of the inode in question.
      
      This patch will also help us to fix bz #236099 at a later date by
      cleaning up a lot of the code in that area.
      
      There are no user visible changes as a result of this patch and there
      are no changes to the on-disk format either.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      dbb7cae2
    • R
      [GFS2] Addendum patch 2 for gfs2_grow · cd81a4ba
      Robert Peterson 提交于
      This addendum patch 2 corrects three things:
      
      1. It fixes a stupid mistake in the previous addendum that broke gfs2.
         Ref: https://www.redhat.com/archives/cluster-devel/2007-May/msg00162.html
      2. It fixes a problem that Dave Teigland pointed out regarding the
         external declarations in ops_address.h being in the wrong place.
      3. It recasts a couple more %llu printks to (unsigned long long)
         as requested by Steve Whitehouse.
      
      I would have loved to put this all in one revised patch, but there was
      a rush to get some patches for RHEL5.	Therefore, the previous patches
      were applied to the git tree "as is" and therefore, I'm posting another
      addendum.  Sorry.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      cd81a4ba
    • R
      [GFS2] Kernel changes to support new gfs2_grow command (part 2) · 6c53267f
      Robert Peterson 提交于
      To avoid code redundancy, I separated out the operational "guts" into
      a new function called read_rindex_entry.  Then I made two functions:
      the closer-to-original gfs2_ri_update (without the special condition
      checks) and gfs2_ri_update_special that's designed with that condition
      in mind.  (I don't like the name, but if you have a suggestion, I'm
      all ears).
      
      Oh, and there's an added benefit:  we don't need all the ugly gotos
      anymore.  ;)
      
      This patch has been tested with gfs2_fsck_hellfire (which runs for
      three and a half hours, btw).
      Signed-off-By: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6c53267f
    • R
      [GFS2] kernel changes to support new gfs2_grow command · 7ae8fa84
      Robert Peterson 提交于
      This is another revision of my gfs2 kernel patch that allows
      gfs2_grow to function properly.
      
      Steve Whitehouse expressed some concerns about the previous
      patch and I restructured it based on his comments.
      The previous patch was doing the statfs_change at file close time,
      under its own transaction.  The current patch does the statfs_change
      inside the gfs2_commit_write function, which keeps it under the
      umbrella of the inode transaction.
      
      I can't call ri_update to re-read the rindex file during the
      transaction because the transaction may have outstanding unwritten
      buffers attached to the rgrps that would be otherwise blown away.
      So instead, I created a new function, gfs2_ri_total, that will
      re-read the rindex file just to total the file system space
      for the sake of the statfs_change.  The ri_update will happen
      later, when gfs2 realizes the version number has changed, as it
      happened before my patch.
      
      Since the statfs_change is happening at write_commit time and there
      may be multiple writes to the rindex file for one grow operation.
      So one consequence of this restructuring is that instead of getting
      one kernel message to indicate the change, you may see several.
      For example, before when you did a gfs2_grow, you'd get a single
      message like:
      
      GFS2: File system extended by 247876 blocks (968MB)
      
      Now you get something like:
      
      GFS2: File system extended by 207896 blocks (812MB)
      GFS2: File system extended by 39980 blocks (156MB)
      
      This version has also been successfully run against the hours-long
      "gfs2_fsck_hellfire" test that does several gfs2_grow and gfs2_fsck
      while interjecting file system damage.  It does this repeatedly
      under a variety Resource Group conditions.
      Signed-off-By: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7ae8fa84
  6. 01 5月, 2007 2 次提交
    • S
      [GFS2] Fix bz 234168 (ignoring rgrp flags) · a43a4906
      Steven Whitehouse 提交于
      Ths following patch makes GFS2 use the rgrp flags properly. Although
      there are also separate flags for both data and metadata as well, I've
      not implemented these as there seems little use for them. On the
      otherhand, the "noalloc" flag is generally useful for future changes we
      might which to make, so this ensures that we interpret it correctly.
      
      In addition I fixed the comment above the function which was incorrect.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      a43a4906
    • B
      [GFS2] flush the log if a transaction can't allocate space · 172e045a
      Benjamin Marzinski 提交于
      This is a fix for bz #208514. When GFS2 frees up space, the freed blocks
      aren't available for reuse until the resource group is successfully written
      to the ondisk journal. So in rare cases, GFS2 operations will fail, saying
      that the filesystem is out of space, when in reality, you are just waiting for
      a log flush. For instance, on a 1Gig filesystem, if I continually write 10 Mb
      to a file, and then truncate it, after a hundred interations, the write will
      fail with -ENOSPC, even though the filesystem is just 1% full.
      
      The attached patch calls a log flush in these cases.  I tested this patch
      fairly heavily to check if there were any locking issues that I missed, and
      it seems to work just fine. Also, this patch only does the log flush if
      get_local_rgrp makes a complete loop of resource groups without skipping
      any do to locking issues. The code would be slightly simpler if it just always
      did the log flush after the first failed pass, and you could only ever have
      to go through the loop twice, instead of up to three times. However, I guessed
      that failing to find a rg simply do to locking issues would be common enough
      to skip the log flush in that case, but I'm not certain that this is the right
      way to go. Either way, I don't suppose this code will be hit all that often.
      Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      172e045a
  7. 15 2月, 2007 1 次提交
    • T
      [PATCH] remove many unneeded #includes of sched.h · cd354f1a
      Tim Schmielau 提交于
      After Al Viro (finally) succeeded in removing the sched.h #include in module.h
      recently, it makes sense again to remove other superfluous sched.h includes.
      There are quite a lot of files which include it but don't actually need
      anything defined in there.  Presumably these includes were once needed for
      macros that used to live in sched.h, but moved to other header files in the
      course of cleaning it up.
      
      To ease the pain, this time I did not fiddle with any header files and only
      removed #includes from .c-files, which tend to cause less trouble.
      
      Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
      arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
      allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
      configs in arch/arm/configs on arm.  I also checked that no new warnings were
      introduced by the patch (actually, some warnings are removed that were emitted
      by unnecessarily included header files).
      Signed-off-by: NTim Schmielau <tim@physik3.uni-rostock.de>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cd354f1a
  8. 30 11月, 2006 2 次提交
  9. 25 9月, 2006 1 次提交
  10. 22 9月, 2006 1 次提交
    • S
      [GFS2] Tidy up meta_io code · 7276b3b0
      Steven Whitehouse 提交于
      Fix a bug in the directory reading code, where we might have dereferenced
      a NULL pointer in case of OOM. Updated the directory code to use the new
      & improved version of gfs2_meta_ra() which now returns the first block
      that was being read. Previously it was releasing it requiring following
      code to grab the block again at each point it was called.
      
      Also turned off readahead on directory lookups since we are reading a
      hash table, and therefore reading the entries in order is very
      unlikely. Readahead is still used for all other calls to the
      directory reading function (e.g. when growing the hash table).
      
      Removed the DIO_START constant. Everywhere this was used, it was
      used to unconditionally start i/o aside from a couple of places, so
      I've removed it and made the couple of exceptions to this rule into
      separate functions.
      
      Also hunted through the other DIO flags and removed them as arguments
      from functions which were always called with the same combination of
      arguments.
      
      Updated gfs2_meta_indirect_buffer to be a bit more efficient and
      hopefully also be a bit easier to read.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7276b3b0
  11. 19 9月, 2006 1 次提交
  12. 06 9月, 2006 1 次提交
  13. 05 9月, 2006 2 次提交
  14. 01 9月, 2006 1 次提交
    • S
      [GFS2] Update copyright, tidy up incore.h · e9fc2aa0
      Steven Whitehouse 提交于
      As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this
      updates the copyright message to say "version" in full rather than
      "v.2". Also incore.h has been updated to remove forward structure
      declarations which are not required.
      
      The gfs2_quota_lvb structure has now had endianess annotations added
      to it. Also quota.c has been updated so that we now store the
      lvb data locally in endian independant format to avoid needing
      a structure in host endianess too. As a result the endianess
      conversions are done as required at various points and thus the
      conversion routines in lvb.[ch] are no longer required. I've
      moved the one remaining constant in lvb.h thats used into lm.h
      and removed the unused lvb.[ch].
      
      I have not changed the HIF_ constants. That is left to a later patch
      which I hope will unify the gh_flags and gh_iflags fields of the
      struct gfs2_holder.
      
      Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e9fc2aa0
  15. 23 8月, 2006 1 次提交
  16. 28 7月, 2006 1 次提交
  17. 11 7月, 2006 1 次提交
    • S
      [GFS2] Add generation number · 4340fe62
      Steven Whitehouse 提交于
      This adds a generation number for the eventual use of NFS to the
      ondisk inode. Its backward compatible with the current code since
      it doesn't really matter what the generation number is to start with,
      and indeed since its set to zero, due to it being taken from padding
      in both the inode and rgrp header, it should be fine.
      
      The eventual plan is to use this rather than no_formal_ino in the
      NFS filehandles. At that point no_formal_ino will be unused.
      
      At the same time we also add a releasepages call back to the
      "normal" address space for gfs2 inodes. Also I've removed a
      one-linrer function thats not required any more.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      4340fe62
  18. 22 6月, 2006 1 次提交
  19. 19 6月, 2006 1 次提交
  20. 15 6月, 2006 1 次提交
    • S
      [GFS2] Fix unlinked file handling · feaa7bba
      Steven Whitehouse 提交于
      This patch fixes the way we have been dealing with unlinked,
      but still open files. It removes all limits (other than memory
      for inodes, as per every other filesystem) on numbers of these
      which we can support on GFS2. It also means that (like other
      fs) its the responsibility of the last process to close the file
      to deallocate the storage, rather than the person who did the
      unlinking. Note that with GFS2, those two events might take place
      on different nodes.
      
      Also there are a number of other changes:
      
       o We use the Linux inode subsystem as it was intended to be
      used, wrt allocating GFS2 inodes
       o The Linux inode cache is now the point which we use for
      local enforcement of only holding one copy of the inode in
      core at once (previous to this we used the glock layer).
       o We no longer use the unlinked "special" file. We just ignore it
      completely. This makes unlinking more efficient.
       o We now use the 4th block allocation state. The previously unused
      state is used to track unlinked but still open inodes.
       o gfs2_inoded is no longer needed
       o Several fields are now no longer needed (and removed) from the in
      core struct gfs2_inode
       o Several fields are no longer needed (and removed) from the in core
      superblock
      
      There are a number of future possible optimisations and clean ups
      which have been made possible by this patch.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      feaa7bba