1. 01 3月, 2010 2 次提交
    • S
      GFS2: Remove loopy umount code · c1184f8a
      Steven Whitehouse 提交于
      As a consequence of the previous patch, we can now remove the
      loop which used to be required due to the circular dependency
      between the inodes and glocks. Instead we can just invalidate
      the inodes, and then clear up any glocks which are left.
      
      Also we no longer need the rwsem since there is no longer any
      danger of the inode invalidation calling back into the glock
      code (and from there back into the inode code).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      c1184f8a
    • S
      GFS2: Metadata address space clean up · 009d8518
      Steven Whitehouse 提交于
      Since the start of GFS2, an "extra" inode has been used to store
      the metadata belonging to each inode. The only reason for using
      this inode was to have an extra address space, the other fields
      were unused. This means that the memory usage was rather inefficient.
      
      The reason for keeping each inode's metadata in a separate address
      space is that when glocks are requested on remote nodes, we need to
      be able to efficiently locate the data and metadata which relating
      to that glock (inode) in order to sync or sync and invalidate it
      (depending on the remotely requested lock mode).
      
      This patch adds a new type of glock, which has in addition to
      its normal fields, has an address space. This applies to all
      inode and rgrp glocks (but to no other glock types which remain
      as before). As a result, we no longer need to have the second
      inode.
      
      This results in three major improvements:
       1. A saving of approx 25% of memory used in caching inodes
       2. A removal of the circular dependency between inodes and glocks
       3. No confusion between "normal" and "metadata" inodes in super.c
      
      Although the first of these is the more immediately apparent, the
      second is just as important as it now enables a number of clean
      ups at umount time. Those will be the subject of future patches.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      009d8518
  2. 03 2月, 2010 2 次提交
  3. 03 12月, 2009 6 次提交
  4. 09 9月, 2009 1 次提交
    • S
      GFS2: Be extra careful about deallocating inodes · acf7e244
      Steven Whitehouse 提交于
      There is a potential race in the inode deallocation code if two
      nodes try to deallocate the same inode at the same time. Most of
      the issue is solved by the iopen locking. There is still a small
      window which is not covered by the iopen lock. This patches fixes
      that and also makes the deallocation code more robust in the face of
      any errors in the rgrp bitmaps, or erroneous iopen callbacks from
      other nodes.
      
      This does introduce one extra disk read, but that is generally not
      an issue since its the same block that must be written to later
      in the deallocation process. The total disk accesses therefore stay
      the same,
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      acf7e244
  5. 27 8月, 2009 2 次提交
    • S
      GFS2: Remove no_formal_ino generating code · 8d8291ae
      Steven Whitehouse 提交于
      The inum structure used throughout GFS2 has two fields. One
      no_addr is the disk block number of the inode in question and
      is used everywhere as the inode number. The other, no_formal_ino,
      is used only as the generation number for NFS.
      
      Historically the no_formal_ino field was set using a complicated
      system of one global and one per-node file containing inode numbers
      in order to ensure that each no_formal_ino was unique. Also this
      code made no provision for what would happen when eventually the
      (64 bit) numbers ran out. Now I know that is pretty unlikely to
      happen given the large space of numbers, but it is possible
      nevertheless.
      
      The only guarantee required for no_formal_ino is that, for any
      single inode, the same number doesn't get reused too quickly.
      
      We already have a generation number which is kept in the inode
      and initialised from a counter in the resource group (almost
      no overhead, since we have to touch the resource group anyway
      in order to allocate an inode in the first place). Aside from
      ensuring that we never use the value 0 in the no_formal_ino
      field, we can use that counter directly.
      
      As a result of that change, we lose about 200 lines of code and
      also gain about 10 creates/sec on the postmark benchmark (on
      my test machine).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      8d8291ae
    • S
      GFS2: Rename eattr.[ch] as xattr.[ch] · 307cf6e6
      Steven Whitehouse 提交于
      Use the more conventional name for the extended attribute
      support code. Update all the places which care.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      307cf6e6
  6. 24 8月, 2009 1 次提交
    • B
      GFS2: Add "-o errors=panic|withdraw" mount options · d34843d0
      Bob Peterson 提交于
      This patch adds "-o errors=panic" and "-o errors=withdraw" to the
      gfs2 mount options.  The "errors=withdraw" option is today's
      current behaviour, meaning to withdraw from the file system if a
      non-serious gfs2 error occurs.  The new "errors=panic" option
      tells gfs2 to force a kernel panic if a non-serious gfs2 file
      system error occurs.  This may be useful, for example, where
      fabric-level fencing is used that has no way to reboot (such as
      fence_scsi).
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d34843d0
  7. 17 8月, 2009 1 次提交
    • S
      GFS2: Add online uevent to GFS2 · 8633ecfa
      Steven Whitehouse 提交于
      We already have an offline uevent (used when a withdraw occurs)
      but no online uevent. This adds an online uevent so that userspace
      will be able to detect a successful mount by means other than
      not receiving a remove event after the add & recovery (change)
      uevents.
      
      It has also been added to the remount path as well - we can't use
      a change uevent there as older GFS2 userspace acts on change uevents
      according to the state that it thinks the fs is in, so we can't
      easily add any new ones.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      8633ecfa
  8. 30 7月, 2009 2 次提交
    • B
      GFS2: remove dcache entries for remote deleted inodes · b94a170e
      Benjamin Marzinski 提交于
      When a file is deleted from a gfs2 filesystem on one node, a dcache
      entry for it may still exist on other nodes in the cluster. If this
      happens, gfs2 will be unable to free this file on disk. Because of this,
      it's possible to have a gfs2 filesystem with no files on it and no free
      space. With this patch, when a node receives a callback notifying it
      that the file is being deleted on another node, it schedules a new
      workqueue thread to remove the file's dcache entry.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b94a170e
    • B
      GFS2: keep statfs info in sync on grows · 1946f70a
      Benjamin Marzinski 提交于
      GFS2 wasn't syncing its statfs info on grows.  This causes a problem
      when you grow the filesystem on multiple nodes.  GFS2 would calculate
      the new space based on the resource groups (which are always current),
      and then assume that the filesystem had grown the from the existing
      statfs size.  If you grew the filesystem on two different nodes in a
      short time, the second node wouldn't see the statfs size change from the
      first node, and would assume that it was grown by a larger amount than
      it was.  When all these changes were synced out, the total fileystem
      size would be incorrect (the first grow would be counted twice).
      
      This patch syncs makes GFS2 read in the statfs changes from disk before
      a grow, and write them out after the grow, while the master statfs inode
      is locked.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1946f70a
  9. 12 6月, 2009 3 次提交
  10. 22 5月, 2009 1 次提交
  11. 24 3月, 2009 2 次提交
    • S
      GFS2: Fix freeze issue · df3647b2
      Steven Whitehouse 提交于
      This removes some old code that was causing issues during
      filesystem freeze.
      Reported-by: NAndrew Price <andy@andrewprice.me.uk>
      Tested-by: NAndrew Price <andy@andrewprice.me.uk>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      df3647b2
    • S
      GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Steven Whitehouse 提交于
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplifcation of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem with out requiring the DLM.
      
      This patch needs a lot of testing, hence my keeping it I restarted
      my -git tree after the last merge window. That way, this has the maximum
      exposure before its merged. This is (modulo a few minor bug fixes) the
      same patch that I've been posting on and off the the last three months
      and its passed a number of different tests so far.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f057f6cd
  12. 05 1月, 2009 7 次提交
  13. 13 8月, 2008 1 次提交
    • S
      GFS2: Fix metafs mounts · 9b8df98f
      Steven Whitehouse 提交于
      This patch is intended to fix the issues reported in bz #457798. Instead
      of having the metafs as a separate filesystem, it becomes a second root
      of gfs2. As a result it will appear as type gfs2 in /proc/mounts, but it
      is still possible (for backwards compatibility purposes) to mount it as
      type gfs2meta. A new mount flag "meta" is introduced so that its possible
      to tell the two cases apart in /proc/mounts.
      
      As a result it becomes possible to mount type gfs2 with -o meta and
      get the same result as mounting type gfs2meta. So it is possible to
      mount just the metafs on its own. Currently if you do this, its then
      impossible to mount the "normal" root of the gfs2 filesystem without
      first unmounting the metafs root. I'm not sure if thats a feature or
      a bug :-)
      
      Either way, this is a great improvement on the previous scheme and I've
      verified that it works ok with bind mounts on both the "normal" root
      and the metafs root in various combinations.
      
      There were also a bunch of functions in super.c which didn't belong there,
      so this moves them into ops_fstype.c where they can be static. Hopefully
      the mount/umount sequence is now more obvious as a result.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Alexander Viro <aviro@redhat.com>
      9b8df98f
  14. 27 7月, 2008 1 次提交
  15. 10 7月, 2008 1 次提交
  16. 27 6月, 2008 1 次提交
    • S
      [GFS2] Clean up the glock core · 6802e340
      Steven Whitehouse 提交于
      This patch implements a number of cleanups to the core of the
      GFS2 glock code. As a result a lot of code is removed. It looks
      like a really big change, but actually a large part of this patch
      is either removing or moving existing code.
      
      There are some new bits too though, such as the new run_queue()
      function which is considerably streamlined. Highlights of this
      patch include:
      
       o Fixes a cluster coherency bug during SH -> EX lock conversions
       o Removes the "glmutex" code in favour of a single bit lock
       o Removes the ->go_xmote_bh() for inodes since it was duplicating
         ->go_lock()
       o We now only use the ->lm_lock() function for both locks and
         unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED)
       o The fast path is considerably shortly, giving performance gains
         especially with lock_nolock
       o The glock_workqueue is now used for all the callbacks from the DLM
         which allows us to simplify the lock_dlm module (see following patch)
       o The way is now open to make further changes such as eliminating the two
         threads (gfs2_glockd and gfs2_scand) in favour of a more efficient
         scheme.
      
      This patch has undergone extensive testing with various test suites
      so it should be pretty stable by now.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      6802e340
  17. 10 4月, 2008 1 次提交
  18. 31 3月, 2008 1 次提交
    • S
      [GFS2] Streamline indirect pointer tree height calculation · ecc30c79
      Steven Whitehouse 提交于
      This patch improves the calculation of the tree height in order to reduce
      the number of operations which are carried out on each call to gfs2_block_map.
      In the common case, we now make a single comparison, rather than calculating
      the required tree height from scratch each time. Also in the case that the
      tree does need some extra height, we start from the current height rather from
      zero when we work out what the new height ought to be.
      
      In addition the di_height field is moved into the inode proper and reduced
      in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ecc30c79
  19. 25 1月, 2008 4 次提交