1. 24 Mar, 2009 2 commits
    • GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Steven Whitehouse committed
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplification of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem without requiring the DLM.
      
      This patch needs a lot of testing, hence my keeping it in my -git tree,
      which I restarted after the last merge window. That way, this has the
      maximum exposure before it's merged. This is (modulo a few minor bug
      fixes) the same patch that I've been posting on and off for the last
      three months, and it has passed a number of different tests so far.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fix remount argument parsing · 6f04c1c7
      Steven Whitehouse committed
      The following patch fixes an issue relating to remount and argument
      parsing. After this fix is applied, remount becomes atomic in that
      it either succeeds changing the mount to the new state, or it fails
      and leaves it in the old state. Previously it was possible for the
      parsing of options to fail partway through and for the fs to be left
      in a state where some of the new arguments had been applied, but some
      had not.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
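The all-or-nothing remount behaviour described above can be sketched in user-space C. This is only an illustration of the general technique (parse into a scratch copy, then commit in one step), not the GFS2 code itself; `struct mount_args`, `parse_one`, `remount_apply` and the option strings are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical mount arguments; the fields are illustrative only. */
struct mount_args {
    bool quota_on;
    bool posix_acl;
};

/* Parse one option into *args. Returns 0 on success, -1 if unknown. */
static int parse_one(struct mount_args *args, const char *opt)
{
    if (strcmp(opt, "quota=on") == 0)  { args->quota_on = true;   return 0; }
    if (strcmp(opt, "quota=off") == 0) { args->quota_on = false;  return 0; }
    if (strcmp(opt, "acl") == 0)       { args->posix_acl = true;  return 0; }
    if (strcmp(opt, "noacl") == 0)     { args->posix_acl = false; return 0; }
    return -1;
}

/*
 * Apply all options or none: work on a scratch copy and overwrite the
 * live arguments only after every option has been accepted.
 */
int remount_apply(struct mount_args *live, const char *const *opts, size_t n)
{
    struct mount_args scratch;
    size_t i;

    assert(live != NULL);
    scratch = *live;
    for (i = 0; i < n; i++) {
        if (parse_one(&scratch, opts[i]) != 0)
            return -1;          /* live state is untouched */
    }
    *live = scratch;            /* commit the new state in one step */
    return 0;
}
```

With this shape, a failure on the second of two options leaves the first one unapplied as well, which is the property the commit message calls atomic.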
  2. 10 Jan, 2009 1 commit
    • filesystem freeze: add error handling of write_super_lockfs/unlockfs · c4be0c1d
      Takashi Sato committed
      Currently, ext3 in mainline Linux doesn't have the freeze feature which
      suspends write requests.  So, we cannot take a backup which keeps the
      filesystem's consistency with the storage device's features (snapshot and
      replication) while it is mounted.
      
      In many cases, a commercial filesystem (e.g. VxFS) has the freeze feature,
      and it is used to get a consistent backup.
      
      If Linux's standard filesystem ext3 has the freeze feature, we can do it
      without a commercial filesystem.
      
      So I have implemented the ioctls of the freeze feature.
      I think we can take a consistent backup with the following steps.
      1. Freeze the filesystem with the freeze ioctl.
      2. Separate the replication volume or create the snapshot
         with the storage device's feature.
      3. Unfreeze the filesystem with the unfreeze ioctl.
      4. Take the backup from the separated replication volume
         or the snapshot.
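The four steps above could look roughly like this from user space. The message does not name the ioctls; the sketch assumes the `FIFREEZE` and `FITHAW` requests that mainline Linux exposes in `<linux/fs.h>`, and the helper names (`demo_freeze`, `demo_thaw`, `demo_snapshot_while_frozen`) and the snapshot callback are hypothetical. Freezing normally needs administrator privileges.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW (assumed to be the ioctls meant) */

/* Step 1: suspend writes. Returns 0 on success, -1 with errno set. */
int demo_freeze(int fd)
{
    return ioctl(fd, FIFREEZE, 0);
}

/* Step 3: resume writes. Returns 0 on success, -1 with errno set. */
int demo_thaw(int fd)
{
    return ioctl(fd, FITHAW, 0);
}

/*
 * Steps 1 to 3 in order. take_snapshot stands in for the storage
 * device's own snapshot or replication-split command (step 2).
 * The backup itself (step 4) is taken later from the snapshot.
 */
int demo_snapshot_while_frozen(const char *mountpoint,
                               int (*take_snapshot)(void))
{
    int fd, ret, saved;

    assert(mountpoint != NULL && take_snapshot != NULL);
    fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -1;

    ret = demo_freeze(fd);
    if (ret == 0) {
        ret = take_snapshot();
        if (demo_thaw(fd) != 0)     /* always try to thaw again */
            ret = -1;
    }
    saved = errno;
    close(fd);
    errno = saved;
    return ret;
}
```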
      
      This patch:
      
      VFS:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that they can return an error.
      Renamed the super block operations write_super_lockfs and unlockfs
      to freeze_fs and unfreeze_fs to avoid confusion.
      
      ext3, ext4, xfs, gfs2, jfs:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that write_super_lockfs returns an error if needed,
      and unlockfs always returns 0.
      
      reiserfs:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that they always return 0 (success) to keep the current behavior.
      Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
      Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
      Cc: <xfs-masters@oss.sgi.com>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
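To illustrate why the return type matters, here is a small user-space sketch of an operations table whose freeze hook returns `int`, so the caller can refuse to mark the filesystem frozen when the hook fails. The operation names `freeze_fs` and `unfreeze_fs` come from the message above; the `demo_` structures, the `journal_broken` field and the caller are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_super_block;

/* Hooks return int so that a failure can be reported to the caller. */
struct demo_super_ops {
    int (*freeze_fs)(struct demo_super_block *sb);
    int (*unfreeze_fs)(struct demo_super_block *sb);
};

struct demo_super_block {
    const struct demo_super_ops *ops;
    int frozen;
    int journal_broken;     /* hypothetical failure condition */
};

/* A filesystem whose freeze can fail, as described for ext3 and friends. */
static int demo_freeze_fs(struct demo_super_block *sb)
{
    return sb->journal_broken ? -EIO : 0;
}

/* Unfreeze always succeeds in this sketch. */
static int demo_unfreeze_fs(struct demo_super_block *sb)
{
    (void)sb;
    return 0;
}

const struct demo_super_ops demo_ops = {
    .freeze_fs = demo_freeze_fs,
    .unfreeze_fs = demo_unfreeze_fs,
};

/* Caller side: propagate the error and leave the state unchanged. */
int demo_freeze_super(struct demo_super_block *sb)
{
    int err;

    assert(sb != NULL && sb->ops != NULL);
    if (sb->ops->freeze_fs != NULL) {
        err = sb->ops->freeze_fs(sb);
        if (err != 0)
            return err;     /* not frozen; the caller sees the reason */
    }
    sb->frozen = 1;
    return 0;
}
```

With a `void` hook the caller would have had no way to learn that the freeze failed, and could have snapshotted an inconsistent filesystem.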
  3. 05 Jan, 2009 9 commits
    • GFS2: Fix use-after-free bug on umount (try #2) · 88a19ad0
      Steven Whitehouse committed
      This should solve the issue with the previous attempt at fixing this.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • Revert "GFS2: Fix use-after-free bug on umount" · fefc03bf
      Steven Whitehouse committed
      This reverts commit 78802499912f1ba31ce83a94c55b5a980f250a43.
      
      The original patch is causing problems with the order of
      operations at umount in relation to jdata files. I need to fix
      this a different way.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fix use-after-free bug on umount · 3af165ac
      Steven Whitehouse committed
      There was a use-after-free with the GFS2 super block during
      umount. This patch moves almost all of the umount code from
      ->put_super into ->kill_sb, the only bit that cannot be moved
      being the glock hash clearing which has to remain as ->put_super
      due to umount ordering requirements. As a result it's now obvious
      that the kfree is the final operation, whereas before it was
      hidden in ->put_super.
      
      Also gfs2_jindex_free is then only referenced from a single file
      so that's moved and marked static too.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Move four functions from super.c · 2bfb6449
      Steven Whitehouse committed
      The functions which are being moved can all be marked
      static in their new locations, since they only have
      a single caller each. Their new locations are more
      logical than before and some of the functions are
      small enough that the compiler might well inline them.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Kill two daemons with one patch · 97cc1025
      Steven Whitehouse committed
      This patch removes the two daemons, gfs2_scand and gfs2_glockd
      and replaces them with a shrinker which is called from the VM.
      
      The net result is that GFS2 responds better when there is memory
      pressure, since it shrinks the glock cache at the same rate
      as the VFS shrinks the dcache and icache. There are no longer
      any time-based criteria for shrinking glocks; they are kept
      until such time as the VM asks for more memory, and then we
      demote just as many glocks as required.
      
      There are potential future changes to this code, including the
      possibility of sorting the glocks which are to be written back
      into inode number order, to get a better I/O ordering. It would
      be very useful to have an elevator based workqueue implementation
      for this, as that would automatically deal with the read I/O cases
      at the same time.
      
      This patch is my answer to Andrew Morton's remark, made during
      the initial review of GFS2, asking why GFS2 needs so many kernel
      threads, the answer being that it doesn't :-) This patch is a
      net loss of about 200 lines of code.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
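The "demote just as many as required" idea can be shown with a small user-space model. It is only a sketch of an on-demand shrinker over a least-recently-used list; the `demo_` types and `demo_cache_shrink` are hypothetical and do not reflect the kernel shrinker API or the real glock structures.

```c
#include <assert.h>
#include <stddef.h>

/* One cached lock; entries are linked oldest first. */
struct demo_lock_entry {
    struct demo_lock_entry *next;
    int in_use;     /* still held, so it must not be dropped */
    int cached;     /* cleared once the entry has been demoted */
};

struct demo_lock_cache {
    struct demo_lock_entry *lru_head;
    size_t nr_cached;
};

/*
 * Called when memory is tight: drop at most nr_to_free idle entries,
 * starting with the oldest, and report how many remain cached.
 * Nothing is dropped on a timer; entries stay until asked for.
 */
size_t demo_cache_shrink(struct demo_lock_cache *c, size_t nr_to_free)
{
    struct demo_lock_entry **pp;

    assert(c != NULL);
    pp = &c->lru_head;
    while (*pp != NULL && nr_to_free > 0) {
        struct demo_lock_entry *e = *pp;

        if (e->in_use) {        /* skip busy entries, keep them listed */
            pp = &e->next;
            continue;
        }
        *pp = e->next;          /* unlink the idle entry */
        e->next = NULL;
        e->cached = 0;
        c->nr_cached--;
        nr_to_free--;
    }
    return c->nr_cached;
}
```

Calling the function with a request of zero drops nothing, which mirrors the point that idle glocks are kept until the VM actually asks for memory.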
    • GFS2: Banish struct gfs2_dinode_host · 383f01fb
      Steven Whitehouse committed
      The final field in gfs2_dinode_host was the i_flags field. That's
      renamed to i_diskflags in order to avoid confusion with the existing
      inode flags, and moved into the inode proper at a suitable location
      to avoid creating a "hole".
      
      At that point struct gfs2_dinode_host is no longer needed and as
      promised (quite some time ago!) it can now be removed completely.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Move di_eattr into "proper" inode · 3767ac21
      Steven Whitehouse committed
      This moves the di_eattr field out of gfs2_inode_host and
      into the inode proper.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fix up jdata writepage/delete_inode · 1bb7322f
      Steven Whitehouse committed
      There is a bug in writepage and delete_inode which allows jdata files to
      invalidate pages from the address space without being in a transaction at
      the time. This causes problems in case the pages are in the journal. This
      patch fixes that case and prevents the resulting oops.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Rationalise header files · b2760583
      Steven Whitehouse committed
      Move the contents of some headers which contained very
      little into more sensible places, and remove the original
      header files. This should make it easier to find things.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  4. 18 Sep, 2008 1 commit
    • GFS2: high time to take some time over atime · 719ee344
      Steven Whitehouse committed
      Until now, we've used the same scheme as GFS1 for atime. This has failed
      since atime is a per vfsmnt flag, not a per fs flag and as such the
      "noatime" flag was not getting passed down to the filesystems. This
      patch removes all the "special casing" around atime updates and we
      simply use the VFS's atime code.
      
      The net result is that GFS2 will now support all the same atime-related
      mount options as any other filesystem on a per-vfsmnt basis. We do lose
      the "lazy atime" updates, but we gain "relatime". We could add lazy
      atime to the VFS at a later date, if there is a requirement for that
      variant still - I suspect relatime will be enough.
      
      Also we lose about 100 lines of code after this patch has been applied,
      and I have a suspicion that it will speed things up a bit, even when
      atime is "on". So it seems like a nice clean up as well.
      
      From a user perspective, everything stays the same except the loss of
      the per-fs atime quantum tweakable (ought to be per-vfsmnt at the very
      least, and to be honest I don't think anybody ever used it) and that a
      number of options which were ignored before now work correctly.
      
      Please let me know if you've got any comments. I'm pushing this out
      early so that you can all see what my plans are.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  5. 13 Aug, 2008 1 commit
    • GFS2: Fix metafs mounts · 9b8df98f
      Steven Whitehouse committed
      This patch is intended to fix the issues reported in bz #457798. Instead
      of having the metafs as a separate filesystem, it becomes a second root
      of gfs2. As a result it will appear as type gfs2 in /proc/mounts, but it
      is still possible (for backwards compatibility purposes) to mount it as
      type gfs2meta. A new mount flag "meta" is introduced so that it's possible
      to tell the two cases apart in /proc/mounts.
      
      As a result it becomes possible to mount type gfs2 with -o meta and
      get the same result as mounting type gfs2meta. So it is possible to
      mount just the metafs on its own. Currently if you do this, it's then
      impossible to mount the "normal" root of the gfs2 filesystem without
      first unmounting the metafs root. I'm not sure if that's a feature or
      a bug :-)
      
      Either way, this is a great improvement on the previous scheme and I've
      verified that it works ok with bind mounts on both the "normal" root
      and the metafs root in various combinations.
      
      There were also a bunch of functions in super.c which didn't belong there,
      so this moves them into ops_fstype.c where they can be static. Hopefully
      the mount/umount sequence is now more obvious as a result.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Alexander Viro <aviro@redhat.com>
  6. 27 Jun, 2008 2 commits
    • [GFS2] Remove remote lock dropping code · 1bdad606
      Steven Whitehouse committed
      There are several reasons why this is undesirable:
      
       1. It never happens during normal operation anyway
       2. If it does happen it causes performance to be very, very poor
       3. It isn't likely to solve the original problem (memory shortage
          on remote DLM node) it was supposed to solve
       4. It uses a bunch of arbitrary constants which are unlikely to be
          correct for any particular situation and for which the tuning seems
          to be a black art.
       5. In an N node cluster, only 1/N of the dropped locks will actually
          contribute to solving the problem on average.
      
      So all in all we are better off without it. This also makes merging
      the lock_dlm module into GFS2 a bit easier.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] kernel panic mounting volume · 9171f5a9
      Bob Peterson committed
      This patch fixes Red Hat bugzilla bug 450156.
      
      This started with a not-too-improbable mount failure because the
      locking protocol was never set back to its proper "lock_dlm" after the
      system was rebooted in the middle of a gfs2_fsck.  That left a
      (purposely) invalid locking protocol in the superblock, which caused an
      error when the file system was mounted the next time.
      
      When there's an error mounting, vfs calls DQUOT_OFF, which calls
      vfs_quota_off which calls gfs2_sync_fs.  Next, gfs2_sync_fs calls
      gfs2_log_flush passing s_fs_info.  But due to the error, s_fs_info
      had been previously set to NULL, and so we have the kernel oops.
      
      My solution in this patch is to test for the NULL value before passing
      it.  I tested this patch and it fixes the problem.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
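The fix described above amounts to a guard in the sync path. The following user-space sketch shows that shape only; `s_fs_info` is the field named in the message, while the `demo_` types and functions are hypothetical stand-ins for the kernel structures and for gfs2_sync_fs and gfs2_log_flush.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-filesystem private data. */
struct demo_sbd {
    int log_flushes;
};

/* Hypothetical super block; s_fs_info is NULL after a failed mount. */
struct demo_super_block {
    void *s_fs_info;
};

/* Stand-in for the log flush; it dereferences its argument. */
static void demo_log_flush(struct demo_sbd *sdp)
{
    assert(sdp != NULL);
    sdp->log_flushes++;
}

/* Stand-in for the sync operation, with the NULL test added. */
int demo_sync_fs(struct demo_super_block *sb)
{
    struct demo_sbd *sdp = sb->s_fs_info;

    if (sdp == NULL)        /* mount failed earlier: nothing to flush */
        return 0;
    demo_log_flush(sdp);
    return 0;
}
```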
  7. 12 May, 2008 1 commit
  8. 31 Mar, 2008 1 commit
    • [GFS2] Remove lm.[ch] and distribute content · da755fdb
      Steven Whitehouse committed
      The functions in lm.c were just wrappers which were mostly
      only used in one other file. By moving the functions to
      the files where they are being used, they can be marked
      static and also this will usually result in them being inlined
      since they are often only used from one point in the code.
      
      A couple of really trivial functions have been inlined by hand
      into the function which called them as it makes the code clearer
      to do that.
      
      We also gain from one fewer function call in the glock lock and
      unlock paths.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  9. 25 Jan, 2008 1 commit
  10. 10 Oct, 2007 2 commits
    • [GFS2] Clean up journaled data writing · 16615be1
      Steven Whitehouse committed
      This patch cleans up the code for writing journaled data into the log.
      It also removes the need to allocate a small "tag" structure for each
      block written into the log. Instead we just keep count of the outstanding
      I/O so that we can be sure that it's all been written at the correct time.
      Another result of this patch is that a number of ll_rw_block() calls
      have become submit_bh() calls, closing some races at the same time.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
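Replacing a per-block "tag" allocation with a count of outstanding I/O can be modelled with a single atomic counter. This is a user-space sketch using C11 atomics under that assumption; the `demo_log` structure and its functions are hypothetical and are not the GFS2 log code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One counter per log instead of one allocated tag per block. */
struct demo_log {
    atomic_int io_outstanding;
};

void demo_log_init(struct demo_log *log)
{
    atomic_init(&log->io_outstanding, 0);
}

/* Called when a block is submitted for writing. */
void demo_log_submit(struct demo_log *log)
{
    atomic_fetch_add(&log->io_outstanding, 1);
}

/*
 * Called from the I/O completion path. Returns true when this was the
 * last outstanding write, i.e. a waiter may now be woken.
 */
bool demo_log_end_io(struct demo_log *log)
{
    int before = atomic_fetch_sub(&log->io_outstanding, 1);

    assert(before > 0);     /* completion without a matching submit */
    return before == 1;
}

/* True once everything submitted so far has been written. */
bool demo_log_idle(struct demo_log *log)
{
    return atomic_load(&log->io_outstanding) == 0;
}
```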
    • [GFS2] Reduce number of gfs2_scand processes to one · 8fbbfd21
      Steven Whitehouse committed
      We only need a single gfs2_scand process rather than the one
      per filesystem which we had previously. As a result the parameter
      determining the frequency of gfs2_scand runs becomes a module
      parameter rather than a mount parameter as it was before.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  11. 09 Jul, 2007 1 commit
    • [GFS2] Fix deallocation issues · d93cfa98
      Abhijith Das committed
      There were two issues during deallocation of unlinked inodes. The
      first was relating to the use of a "try" lock which in the case of
      the inode lock wasn't trying hard enough to deallocate in all
      circumstances (now changed to a normal glock) and in the case of
      the iopen lock didn't wait for the demotion of the shared lock before
      attempting to get the exclusive lock, and thereby sometimes (timing dependent)
      not completing the deallocation when it should have done.
      
      The second issue related to the lack of a way to invalidate dcache entries
      on remote nodes (now fixed by this patch) which meant that unlinks were
      taking a long time to return disk space to the fs. By adding some code to
      invalidate the dcache entries across the cluster for unlinked inodes, that
      is now fixed.
      
      This patch was written jointly by Abhijith Das and Steven Whitehouse.
      Signed-off-by: Abhijith Das <adas@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  12. 01 May, 2007 1 commit
    • [GFS2] Fix bz 224480 and cleanup glock demotion code · 3b8249f6
      Steven Whitehouse committed
      This patch prevents the printing of a warning message in cases where
      the fs is functioning normally by handing off responsibility for
      unlinked, but still open inodes, to another node for eventual deallocation.
      Also, there is now an improved system for ensuring that such requests
      to other nodes do not get lost. The callback on the iopen lock is
      only ever called when i_nlink == 0 and when a node is unable to deallocate
      it due to it still being in use on another node. When a node receives
      the callback therefore, it knows that i_nlink must be zero, so we mark
      it as such (in gfs2_drop_inode) in order that it will then attempt
      deallocation of the inode itself.
      
      As an additional benefit, queuing a demote request no longer requires
      a memory allocation. This simplifies the code for dealing with gfs2_holders
      as it removes one special case.
      
      There are two new fields in struct gfs2_glock. gl_demote_state is the
      state which the remote node has requested and gl_demote_time is the
      time when the request came in. Both fields are only valid when the
      GLF_DEMOTE flag is set in gl_flags.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
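The rule that the two new fields are only meaningful while the flag is set can be sketched as follows. The names `gl_flags`, `gl_demote_state`, `gl_demote_time` and `GLF_DEMOTE` are from the message; the field types, the bit position and the `demo_` helpers are assumptions for illustration. Because the request lives inside the lock structure, queuing it needs no memory allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_GLF_DEMOTE (1UL << 0)  /* bit position is an assumption */

/* Cut-down stand-in for struct gfs2_glock. */
struct demo_glock {
    unsigned long gl_flags;
    unsigned int gl_demote_state;   /* state the remote node asked for */
    unsigned long gl_demote_time;   /* when the request came in */
};

/* Record a demote request in place, without allocating anything. */
void demo_request_demote(struct demo_glock *gl, unsigned int state,
                         unsigned long now)
{
    assert(gl != NULL);
    gl->gl_demote_state = state;
    gl->gl_demote_time = now;
    gl->gl_flags |= DEMO_GLF_DEMOTE;
}

/* Read the request; the fields are only trusted when the flag is set. */
bool demo_pending_demote(const struct demo_glock *gl, unsigned int *state,
                         unsigned long *since)
{
    if ((gl->gl_flags & DEMO_GLF_DEMOTE) == 0)
        return false;
    *state = gl->gl_demote_state;
    *since = gl->gl_demote_time;
    return true;
}

/* Clear the flag once the demotion has been carried out. */
void demo_demote_done(struct demo_glock *gl)
{
    gl->gl_flags &= ~DEMO_GLF_DEMOTE;
}
```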
  13. 13 Feb, 2007 1 commit
  14. 06 Feb, 2007 3 commits
    • [GFS2] Remove the "greedy" function from glock.[ch] · e5dab552
      Steven Whitehouse committed
      The "greedy" code was an attempt to retain glocks for a minimum length
      of time when they relate to mmap()ed files. The current implementation
      of this feature is not, however, ideal in that it requires allocating
      memory in order to do this and it's overly complicated.
      
      It also misses the mark by ignoring the other I/O operations which are
      just as likely to suffer from the same problem. So the plan is to remove
      this now and then add the functionality back as part of the glock state
      machine at a later date (and thus take into account all the possible
      users of this feature).
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix ordering of page disposal vs. glock_dq · 49686f71
      Steven Whitehouse committed
      In case of unlinked files with dirty pages GFS2 wasn't clearing
      the pages in quite the right order. This patch clears the pages
      earlier (before the glock_dq) to avoid the situation that the
      release of the glock results in attempting to write back data that
      has already been deallocated.
      
      This fixes Red Hat bugzilla: #220117
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] don't try to lockfs after shutdown · c3780511
      David Teigland committed
      If an fs has already been shut down, a lockfs callback should do nothing.
      An fs that's been shut down can't acquire locks or do anything with
      respect to the cluster.
      
      Also, remove FIXME comment in withdraw function.  The missing bits of the
      withdraw procedure are now all done by user space.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  15. 30 Nov, 2006 3 commits
    • [GFS2] Fix journal flush problem · b004157a
      Steven Whitehouse committed
      This fixes a bug which resulted in poor performance due to flushing
      the journal too often. The code path in question was via the inode_go_sync()
      function in glops.c. The solution is not to flush the journal immediately
      when inodes are ejected from memory, but batch up the work for glockd to
      deal with later on. This means that glocks may now live on beyond the end of
      the lifetime of their inodes (but not very much longer in the normal case).
      
      Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in
      calculation of the number of free journal blocks.
      
      The gfs2_logd process has been altered to be more responsive to the journal
      filling up. We now wake it up when the number of uncommitted journal blocks
      has reached the threshold level rather than trying to flush directly at the
      end of each transaction. This again means doing fewer, but larger, log
      flushes in general.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Shrink gfs2_inode (3) - di_mode · b60623c2
      Steven Whitehouse committed
      This removes the duplicate di_mode field in favour of using the
      inode->i_mode field. This saves 4 bytes.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] split and annotate gfs2_statfs_change · bd209cc0
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  16. 06 Nov, 2006 1 commit
    • [GFS2] Fix incorrect fs sync behaviour. · 4a221953
      Steven Whitehouse committed
      This adds a sync_fs superblock operation for GFS2 and removes
      the journal flush from write_super in favour of sync_fs where it
      ought to be. This is more or less identical to the way in which ext3
      does this.
      
      This bug was pointed out by Russell Cattelan <cattelan@redhat.com>
      
      Cc: Russell Cattelan <cattelan@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  17. 28 Sep, 2006 1 commit
    • [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs) · bba9dfd8
      Theodore Ts'o committed
      The following patches reduce the size of the VFS inode structure by 28 bytes
      on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
      in the inode size on a UP kernel that is configured in a production mode
      (i.e., with no spinlock or other debugging functions enabled; if you want to
      save memory taken up by in-core inodes, the first thing you should do is
      disable the debugging options; they are responsible for a huge amount of bloat
      in the VFS inode structure).
      
      This patch:
      
      The filesystem or device-specific pointer in the inode is inside a union,
      which is pretty pointless given that all 30+ users of this field have been
      using the void pointer.  Get rid of the union and rename it to i_private, with
      a comment to explain who is allowed to use the void pointer.  This is just a
      cleanup, but it allows us to reuse the union 'u' for something where
      the union will actually be used.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
  18. 25 Sep, 2006 1 commit
  19. 19 Sep, 2006 1 commit
  20. 10 Sep, 2006 1 commit
  21. 05 Sep, 2006 2 commits
  22. 01 Sep, 2006 1 commit
    • [GFS2] Update copyright, tidy up incore.h · e9fc2aa0
      Steven Whitehouse committed
      As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this
      updates the copyright message to say "version" in full rather than
      "v.2". Also incore.h has been updated to remove forward structure
      declarations which are not required.
      
      The gfs2_quota_lvb structure has now had endianness annotations added
      to it. Also quota.c has been updated so that we now store the
      lvb data locally in endian-independent format to avoid needing
      a structure in host endianness too. As a result the endianness
      conversions are done as required at various points and thus the
      conversion routines in lvb.[ch] are no longer required. I've
      moved the one remaining constant in lvb.h that's used into lm.h
      and removed the unused lvb.[ch].
      
      I have not changed the HIF_ constants. That is left to a later patch
      which I hope will unify the gh_flags and gh_iflags fields of the
      struct gfs2_holder.
      
      Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
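Keeping a value only in a fixed byte order and converting at the point of use can be sketched like this. The example assumes big-endian storage and uses explicit shifts so that it behaves the same on any host; `demo_quota_lvb`, `qb_value` and the helper names are hypothetical and are not the layout of gfs2_quota_lvb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The value is stored only as bytes in a fixed (big-endian) order. */
struct demo_quota_lvb {
    uint8_t qb_value[8];
};

/* Convert from host order when writing into the structure. */
void demo_lvb_store(struct demo_quota_lvb *lvb, uint64_t v)
{
    int i;

    assert(lvb != NULL);
    for (i = 7; i >= 0; i--) {
        lvb->qb_value[i] = (uint8_t)(v & 0xff);  /* lowest byte goes last */
        v >>= 8;
    }
}

/* Convert back to host order only where the value is needed. */
uint64_t demo_lvb_load(const struct demo_quota_lvb *lvb)
{
    uint64_t v = 0;
    int i;

    assert(lvb != NULL);
    for (i = 0; i < 8; i++)
        v = (v << 8) | lvb->qb_value[i];
    return v;
}
```

Since there is a single stored representation, no second host-endian copy of the structure and no bulk conversion routines are needed, which matches the reasoning in the message.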
  23. 26 Aug, 2006 1 commit
  24. 19 Aug, 2006 1 commit
    • [GFS2] Fix leak of gfs2_bufdata · 15d00c0b
      Steven Whitehouse committed
      This fixes a memory leak of struct gfs2_bufdata and also some
      problems in the ordered write handling code. It needs a bit
      more testing, but I believe that the reference counting of
      ordered write buffers should now be correct.
      
      This is aimed at fixing Red Hat bugzilla: #201028 and #201082
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>