1. 05 May, 2010 1 commit
    • GFS2: Various gfs2_logd improvements · 5e687eac
      Committed by Benjamin Marzinski
      This patch contains various tweaks to how log flushes and active item writeback
      work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reserve() now waits
      for gfs2_logd to do the log flushing.  Multiple functions were rewritten to
      remove the need to call gfs2_log_lock(). Instead of using one test to see if
      gfs2_logd had work to do, there are now separate tests to check if there
      are too many buffers in the incore log or if there are too many items on the
      active items list.
      
      This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
      some minor changes.  Since gfs2_ail1_start always submits all the active items,
      it no longer needs to keep track of the first ai submitted, so this has been
      removed. In gfs2_log_reserve(), the order of the calls to
      prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
      been switched.  If it called wake_up() first, there was a small window for a race,
      where logd could run and return before gfs2_log_reserve() was ready to get woken
      up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
      would be left waiting for gfs2_logd to eventually run because it timed out.
      Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
      out and flushes the log, can now be set on mount with ar_commit.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
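The two separate tests described in this commit can be pictured roughly as follows. This is a minimal sketch, not the kernel code: the struct, its field names, and the threshold fields are all hypothetical stand-ins for whatever state the real tests read.

```c
#include <stdbool.h>

/* Hypothetical snapshot of log state; all names here are illustrative. */
struct log_state {
    unsigned int incore_buffers;  /* buffers currently in the incore log */
    unsigned int active_items;    /* items on the active items list */
    unsigned int buffer_thresh;   /* assumed limit for incore buffers */
    unsigned int active_thresh;   /* assumed limit for active items */
};

/* Test 1: are there too many buffers in the incore log? */
bool journal_flush_needed(const struct log_state *s)
{
    return s->incore_buffers >= s->buffer_thresh;
}

/* Test 2: are there too many items on the active items list? */
bool ail_flush_needed(const struct log_state *s)
{
    return s->active_items >= s->active_thresh;
}

/* The daemon has work to do if either test fires. */
bool logd_has_work(const struct log_state *s)
{
    return journal_flush_needed(s) || ail_flush_needed(s);
}
```

Keeping the two conditions apart lets the daemon decide whether a log flush, active item writeback, or both are needed, instead of reacting to one combined signal.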
  2. 14 Apr, 2010 1 commit
    • GFS2: glock livelock · 1a0eae88
      Committed by Bob Peterson
      This patch fixes a couple of gfs2 problems with the reclaiming of
      unlinked dinodes.  First, there were a couple of livelocks where
      everything would come to a halt waiting for a glock that was
      seemingly held by a process that no longer existed.  In fact, the
      process did exist; it just had the wrong pid number in the holder
      information.  Second, there was a lock ordering problem between
      inode locking and glock locking.  Third, glock/inode contention
      could sometimes cause inodes to be improperly marked invalid by
      iget_failed.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  3. 29 Mar, 2010 2 commits
  4. 12 Mar, 2010 1 commit
  5. 11 Mar, 2010 1 commit
    • GFS2: Allow the number of committed revokes to temporarily be negative · 2e95e3f6
      Committed by Benjamin Marzinski
      GFS2 tracks the number of revokes and unrevokes that are part of committed
      transactions via sd_log_commited_revoke. It is possible for one process to add
      revokes during its transaction, while another process unrevokes them during its
      transaction. If the second process finishes its transaction first,
      sd_log_commited_revoke will be decremented by the number of unrevokes that the
      second process did, without first being incremented by the number of revokes
      the first process did. This is fine, since all started transactions must be
      completed before the journal can be flushed.  However, sd_log_commited_revoke
      is an unsigned integer, and log_refund() causes an assertion failure if it
      would go negative at the end of a transaction.  This patch makes
      sd_log_commited_revoke a signed integer and allows it to go negative.
      __gfs2_log_flush() still checks that it matches the actual number of revokes.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
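The out-of-order commit scenario in this message can be modelled in a few lines. This is only an illustrative sketch: the struct and function names are hypothetical, and the counter is tracked both ways side by side purely to contrast unsigned and signed behaviour.

```c
#include <limits.h>

/* Hypothetical model of the committed-revoke counter, kept in both
 * an unsigned and a signed form for comparison. */
struct revoke_counts {
    unsigned int as_unsigned;
    int as_signed;
};

/* Would an unsigned counter dip below zero when this transaction commits? */
int would_go_negative(unsigned int current, unsigned int revokes,
                      unsigned int unrevokes)
{
    return unrevokes > current + revokes;
}

/* A transaction commits: add its revokes, subtract its unrevokes. */
void commit_transaction(struct revoke_counts *c, unsigned int revokes,
                        unsigned int unrevokes)
{
    c->as_unsigned += revokes;
    c->as_unsigned -= unrevokes;  /* wraps modulo UINT_MAX + 1 */
    c->as_signed += (int)revokes - (int)unrevokes;
}
```

If a transaction that unrevokes three blocks commits before the transaction that revoked them, the signed counter reads -3 for a moment and returns to 0 once the first transaction commits, whereas a check like `would_go_negative()` on the unsigned form would have fired in between.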
  6. 09 Mar, 2010 1 commit
  7. 08 Mar, 2010 2 commits
  8. 06 Mar, 2010 1 commit
  9. 05 Mar, 2010 1 commit
    • quota: move code from sync_quota_sb into vfs_quota_sync · 5fb324ad
      Committed by Christoph Hellwig
      Currently sync_quota_sb does a lot of sync and truncate action that only
      applies to "VFS" style quotas and is actively harmful for the sync
      performance in XFS.  Move it into vfs_quota_sync and add a wait parameter
      to ->quota_sync to tell if we need it or not.
      
      My audit of the GFS2 code says it's also not needed given the way GFS2
      implements quotas, but I'd be happy if this can get a detailed review.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
  10. 04 Mar, 2010 1 commit
  11. 01 Mar, 2010 4 commits
    • GFS2: print glock numbers in hex · 4818972e
      Committed by Bob Peterson
      This patch changes glock numbers from printing in decimal to hex.
      Since DLM prints corresponding resource IDs in hex, it makes debugging
      easier.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
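As a small illustration of the change, the same glock number can be formatted in decimal and in hex. The "type/number" layout and the function names below are assumptions made for this sketch; the exact output format used by GFS2 is not given in the commit message.

```c
#include <stdio.h>

/* Hypothetical "type/number" formatting with the number in decimal. */
int format_glock_dec(char *buf, size_t len, unsigned int type,
                     unsigned long long number)
{
    return snprintf(buf, len, "%u/%llu", type, number);
}

/* The same identifier with the number in hex, so it can be compared
 * by eye against a resource ID that is also printed in hex. */
int format_glock_hex(char *buf, size_t len, unsigned int type,
                     unsigned long long number)
{
    return snprintf(buf, len, "%u/%llx", type, number);
}
```

Both functions return the number of characters `snprintf()` would have written, so a caller can detect truncation by comparing the result with the buffer length.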
    • GFS2: ordered writes are backwards · e5884636
      Committed by Dave Chinner
      When we queue data buffers for ordered write, the buffers are added
      to the head of the ordered write list. When the log needs to push
      these buffers to disk, it also walks the list from the head. The
      result is that the ordered buffers are submitted to disk in
      reverse order.
      
      For large writes, this means that whenever the log flushes, large
      streams of reverse sequential order buffers are pushed down into the
      block layers. The elevators don't handle this particularly well, so
      IO rates tend to be significantly lower than if the IO was issued in
      ascending block order.
      
      Queue new ordered buffers to the tail of the ordered buffer list to
      ensure that IO is dispatched in the order it was submitted. This
      should significantly improve large sequential write speeds. On a
      disk capable of 85MB/s, speeds increase from 50MB/s to 65MB/s for
      noop and from 38MB/s to 50MB/s for cfq.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
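The ordering effect described in this commit is easy to reproduce with a small circular doubly linked list. The sketch below is a userspace stand-in written for illustration; the type and function names are hypothetical and are not the kernel's list API.

```c
#include <stddef.h>

/* Minimal circular doubly linked list; "block" stands for a disk block. */
struct node {
    struct node *prev, *next;
    unsigned long block;
};

void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void insert_between(struct node *n, struct node *prev,
                           struct node *next)
{
    n->prev = prev;
    n->next = next;
    prev->next = n;
    next->prev = n;
}

/* Old behaviour: the newest buffer goes to the head of the list. */
void add_head(struct node *n, struct node *head)
{
    insert_between(n, head, head->next);
}

/* Fixed behaviour: the newest buffer goes to the tail of the list. */
void add_tail(struct node *n, struct node *head)
{
    insert_between(n, head->prev, head);
}

/* Walk from the head, as the log does when pushing buffers to disk.
 * Copies up to max block numbers into out[] and returns the count. */
size_t walk(const struct node *head, unsigned long *out, size_t max)
{
    size_t i = 0;
    for (const struct node *p = head->next; p != head && i < max; p = p->next)
        out[i++] = p->block;
    return i;
}
```

Queuing blocks 100, 101, 102 with `add_head()` and then walking from the head yields 102, 101, 100. With `add_tail()` the walk yields them in submission order.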
    • GFS2: Remove loopy umount code · c1184f8a
      Committed by Steven Whitehouse
      As a consequence of the previous patch, we can now remove the
      loop which used to be required due to the circular dependency
      between the inodes and glocks. Instead we can just invalidate
      the inodes, and then clear up any glocks which are left.
      
      Also we no longer need the rwsem since there is no longer any
      danger of the inode invalidation calling back into the glock
      code (and from there back into the inode code).
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Metadata address space clean up · 009d8518
      Committed by Steven Whitehouse
      Since the start of GFS2, an "extra" inode has been used to store
      the metadata belonging to each inode. The only reason for using
      this inode was to have an extra address space, the other fields
      were unused. This means that the memory usage was rather inefficient.
      
      The reason for keeping each inode's metadata in a separate address
      space is that when glocks are requested on remote nodes, we need to
      be able to efficiently locate the data and metadata which relate
      to that glock (inode) in order to sync or sync and invalidate it
      (depending on the remotely requested lock mode).
      
      This patch adds a new type of glock which, in addition to
      its normal fields, has an address space. This applies to all
      inode and rgrp glocks (but to no other glock types, which remain
      as before). As a result, we no longer need to have the second
      inode.
      
      This results in three major improvements:
       1. A saving of approx 25% of memory used in caching inodes
       2. A removal of the circular dependency between inodes and glocks
       3. No confusion between "normal" and "metadata" inodes in super.c
      
      Although the first of these is the most immediately apparent, the
      second is just as important as it now enables a number of clean
      ups at umount time. Those will be the subject of future patches.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  12. 12 Feb, 2010 2 commits
  13. 09 Feb, 2010 1 commit
  14. 03 Feb, 2010 2 commits
  15. 01 Feb, 2010 3 commits
  16. 12 Jan, 2010 1 commit
  17. 11 Jan, 2010 1 commit
  18. 08 Jan, 2010 3 commits
  19. 18 Dec, 2009 2 commits
  20. 17 Dec, 2009 1 commit
    • sanitize xattr handler prototypes · 431547b3
      Committed by Christoph Hellwig
      Add a flags argument to struct xattr_handler and pass it to all xattr
      handler methods.  This allows using the same methods for multiple
      handlers, e.g. for the ACL methods which perform exactly the same action
      for the access and default ACLs, just using a different underlying
      attribute.  With a little more groundwork it'll also allow sharing the
      methods for the regular user/trusted/secure handlers in extN, ocfs2 and
      jffs2 like it's already done for xfs in this patch.
      
      Also change the inode argument to the handlers to a dentry to allow
      using the handlers mechanism for filesystems that require it later,
      e.g. cifs.
      
      [with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: James Morris <jmorris@namei.org>
      Acked-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
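The idea of sharing one method between several handlers by way of a flags field can be sketched as below. Everything here is a simplified, hypothetical model: the struct is not the kernel's `struct xattr_handler`, and the method signature is invented for the example. Only the two attribute names are the standard POSIX ACL xattr names.

```c
#include <stddef.h>

/* Hypothetical flag values identifying which ACL a handler serves. */
enum { SKETCH_ACL_ACCESS = 1, SKETCH_ACL_DEFAULT = 2 };

/* Stand-in for per-file ACL storage. */
struct file_acls {
    const char *access_acl;
    const char *default_acl;
};

/* Simplified handler: the flags value is passed back to the method. */
struct sketch_handler {
    const char *prefix;
    int flags;
    const char *(*get)(const struct file_acls *acls, int handler_flags);
};

/* One method serves both handlers; only the underlying attribute differs. */
const char *acl_get(const struct file_acls *acls, int handler_flags)
{
    switch (handler_flags) {
    case SKETCH_ACL_ACCESS:
        return acls->access_acl;
    case SKETCH_ACL_DEFAULT:
        return acls->default_acl;
    default:
        return NULL;
    }
}

const struct sketch_handler access_handler = {
    "system.posix_acl_access", SKETCH_ACL_ACCESS, acl_get
};

const struct sketch_handler default_handler = {
    "system.posix_acl_default", SKETCH_ACL_DEFAULT, acl_get
};

/* Dispatch: hand the handler's own flags to its method. */
const char *handler_get(const struct sketch_handler *h,
                        const struct file_acls *acls)
{
    return h->get(acls, h->flags);
}
```

Without the flags argument, each handler would need its own near-identical method just to name a different attribute.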
  21. 16 Dec, 2009 2 commits
  22. 03 Dec, 2009 6 commits
    • GFS2: Fix glock refcount issues · 26bb7505
      Committed by Steven Whitehouse
      This patch fixes some ref counting issues. Firstly, by moving
      the point at which we drop the ref count after a dlm lock
      operation has completed, we ensure that we never call
      gfs2_glock_hold() on a lock with a zero ref count.
      
      Secondly, by using atomic_dec_and_lock() in gfs2_glock_put(),
      we ensure that at no time will a glock with zero ref count
      appear on the lru_list. That means that we can remove the
      check for this in our shrinker (which was racy).
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
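The decrement-and-lock pattern mentioned here can be approximated in userspace with C11 atomics and a pthread mutex. This is a sketch of the general technique under those assumptions, not the kernel's `atomic_dec_and_lock()` implementation; the object type, `lru_lock`, and the `on_lru` flag are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object that may sit on an LRU list. */
struct obj {
    atomic_int ref;
    bool on_lru;  /* protected by lru_lock */
};

pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

/* Decrement *cnt. Returns true, with the lock held, only if the
 * count reached zero; otherwise returns false without the lock. */
bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
    int old = atomic_load(cnt);

    /* Fast path: while the result stays above zero, no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(cnt, &old, old - 1))
            return false;
    }

    /* Slow path: the count may hit zero, so decide under the lock. */
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(cnt, 1) == 1)
        return true;
    pthread_mutex_unlock(lock);
    return false;
}

/* Drop a reference; the final put removes the object from the LRU
 * while still holding the lock taken by dec_and_lock(). */
void obj_put(struct obj *o)
{
    if (dec_and_lock(&o->ref, &lru_lock)) {
        o->on_lru = false;
        pthread_mutex_unlock(&lru_lock);
    }
}
```

Because the transition to zero only ever happens with the lock held, anything that scans the list under the same lock never observes an object that is both on the list and at a zero count.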
    • writeback: remove unused nonblocking and congestion checks (gfs2) · c29cd900
      Committed by Wu Fengguang
      No one is calling wb_writeback and write_cache_pages with
      wbc.nonblocking=1 any more. And lumpy pageout will want to do
      nonblocking writeback without the congestion wait.
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: drop rindex glock to refresh rindex list · 9ae3c6de
      Committed by Benjamin Marzinski
      When a gfs2 filesystem is grown, it needs to rebuild the rindex list to be able
      to use the new space.  gfs2 does this when the rindex is marked not uptodate,
      which happens when the rindex glock is dropped.  However, on a single node
      setup, there is never any reason to drop the rindex glock, so gfs2 never
      invalidates the rindex. This patch makes gfs2 automatically drop the
      rindex glock after the filesystem grows, so it can refresh the rindex list.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Tag all metadata with jid · 0ab7d13f
      Committed by Steven Whitehouse
      There are two spare fields in the header common to all GFS2
      metadata. One is just the right size to fit a journal id
      in it, and this patch updates the journal code so that each
      time a metadata block is modified, we tag it with the journal
      id of the node which is performing the modification.
      
      The reason for this is that it should make it much easier to
      debug issues which arise if we can tell which node was the
      last to modify a particular metadata block.
      
      Since the field is updated before the block is written into
      the journal, each journal should only contain metadata which
      is tagged with its own journal id. The one exception to this
      is the journal header block, which might have a different node's
      id in it, if that journal was recovered by another node in the
      cluster.
      
      Thus each journal will contain a record of which nodes recovered
      it, via the journal header.
      
      The other field in the metadata header could potentially be
      used to hold information about what kind of operation was
      performed, but for the time being we just zero it on each
      transaction so that if we use it for that in future, we'll
      know that the information (where it exists) is reliable.
      
      I did consider using the other field to hold the journal
      sequence number, however since in GFS2's journaling we write
      the modified data into the journal and not the original
      data, this gives no information as to what action caused the
      modification, so I think we can probably come up with a better
      use for those 64 bits in the future.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
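A rough picture of the tagging step, assuming a header with one 32-bit spare field and one 64-bit spare field as the message implies. The struct layout, field names, and function name are hypothetical, and on-disk byte order is ignored in this sketch.

```c
#include <stdint.h>

/* Hypothetical in-memory view of the common metadata header.
 * Field names and ordering are assumptions for this sketch. */
struct meta_header {
    uint32_t magic;
    uint32_t type;
    uint64_t spare;   /* the other spare field: zeroed for now */
    uint32_t format;
    uint32_t jid;     /* journal id of the last node to modify the block */
};

/* Called when a metadata block is modified, before it is written
 * into the journal: record who modified it and clear the spare field. */
void tag_metadata(struct meta_header *mh, uint32_t jid)
{
    mh->spare = 0;
    mh->jid = jid;
}
```

Zeroing the spare field on every modification means that any future non-zero value in it can be trusted to have been written deliberately.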
    • GFS2: Locking order fix in gfs2_check_blk_state · 2c776349
      Committed by Steven Whitehouse
      In some cases we already have the rindex lock when
      we enter this function.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Remove dirent_first() function · 1579343a
      Committed by Steven Whitehouse
      This function only had one caller left, and that caller only
      called it for leaf blocks, hence one branch of the "if" was
      never taken. In addition the call to get_leaf had already
      verified the metadata type, so the function can be reduced
      to a single line of code in its caller.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>