1. 25 Feb, 2014 1 commit
    • S
      GFS2: Move log buffer lists into transaction · d69a3c65
      Steven Whitehouse committed
      Over time, we hope to be able to improve the concurrency available
      in the log code. This is one small step towards that, by moving
      the buffer lists from the super block, and into the transaction
      structure, so that each transaction builds its own buffer lists.
      
      At transaction commit time, the buffer lists are merged into
      the currently accumulating transaction. That transaction then
      is passed into the before and after commit functions at journal
      flush time. Thus there should be no change in overall behaviour
      yet.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      d69a3c65
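The merge step described in d69a3c65 can be pictured with a small user-space sketch. It is only an illustration under stated assumptions: `struct buf`, `struct trans`, `trans_add_buf()` and `trans_merge()` are hypothetical names, and a plain singly linked list stands in for the kernel's list handling. It is not the GFS2 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a journaled buffer tag. */
struct buf {
    int id;
    struct buf *next;
};

/* Hypothetical transaction that owns its own buffer list. */
struct trans {
    struct buf *head;
    struct buf *tail;
    unsigned int nr;   /* number of buffers on the list */
};

/* Append one buffer to the transaction's private list. */
static void trans_add_buf(struct trans *tr, struct buf *b)
{
    b->next = NULL;
    if (tr->tail)
        tr->tail->next = b;
    else
        tr->head = b;
    tr->tail = b;
    tr->nr++;
}

/*
 * At commit time, splice every buffer from tr onto the end of the
 * accumulating transaction acc, leaving tr empty.
 */
static void trans_merge(struct trans *acc, struct trans *tr)
{
    assert(acc != tr);
    if (!tr->head)
        return;
    if (acc->tail)
        acc->tail->next = tr->head;
    else
        acc->head = tr->head;
    acc->tail = tr->tail;
    acc->nr += tr->nr;
    tr->head = NULL;
    tr->tail = NULL;
    tr->nr = 0;
}
```

Because the splice only touches the head and tail pointers, the cost of a merge in this sketch does not depend on how many buffers the committing transaction collected.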
  2. 21 Feb, 2014 1 commit
  3. 20 Jun, 2013 1 commit
  4. 19 Jun, 2013 1 commit
    • B
      GFS2: aggressively issue revokes in gfs2_log_flush · 5d054964
      Benjamin Marzinski committed
      This patch looks at all the outstanding blocks in all the transactions
      on the log, and moves the completed ones to the ail2 list.  Then it
      issues revokes for these blocks.  This will hopefully speed things up
      in situations where there is a lot of contention for glocks, especially
      if they are acquired serially.
      
      revoke_lo_before_commit will issue at most one log block full of these
      preemptive revokes. The amount of reserved log space that
      gfs2_log_reserve() ignores has been incremented to allow for this extra
      block.
      
      This patch also consolidates the common revoke instructions into one
      function, gfs2_add_revoke().
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      5d054964
  5. 29 Apr, 2013 1 commit
  6. 08 Apr, 2013 1 commit
    • B
      GFS2: replace gfs2_ail structure with gfs2_trans · 16ca9412
      Benjamin Marzinski committed
      In order to allow transactions and log flushes to happen at the same
      time, gfs2 needs to move the transaction accounting and active items
      list code into the gfs2_trans structure.  As a first step toward this,
      this patch removes the gfs2_ail structure, and handles the active items
      list in the gfs2_trans structure.  This keeps gfs2 from allocating an ail
      structure on log flushes, and gives us a structure that can later be used
      to store the transaction accounting outside of the gfs2 superblock
      structure.
      
      With this patch, at the end of a transaction, gfs2 will add the
      gfs2_trans structure to the superblock if there is not one already.
      This structure now has the active items fields that were previously in
      gfs2_ail.  This is not necessary in the case where the transaction was
      simply used to add revokes, since these are never written outside of the
      journal, and thus, don't need an active items list.
      
      Also, in order to make sure that the transaction structure is not
      removed while it's still in use by gfs2_trans_end, unlocking the
      sd_log_flush_lock has to happen slightly later in ending the
      transaction.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      16ca9412
  7. 29 Jan, 2013 5 commits
    • S
      GFS2: Use ->writepages for ordered writes · 45138990
      Steven Whitehouse committed
      Instead of using a list of buffers to write ahead of the journal
      flush, this now uses a list of inodes and calls ->writepages
      via filemap_fdatawrite() in order to achieve the same thing. For
      most use cases this results in a shorter ordered write list,
      as well as much larger i/os being issued.
      
      The ordered write list is sorted by inode number before writing
      in order to retain the disk block ordering between inodes as
      per the previous code.
      
      The previous ordered write code used to conflict in its assumptions
      about how to write out the disk blocks with mpage_writepages()
      so that with this updated version we can also use mpage_writepages()
      for GFS2's ordered write, writepages implementation. So we will
      also send larger i/os from writeback too.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      45138990
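The sort-before-write step in 45138990 can be sketched in user space as follows. This is a minimal illustration, assuming a flat array and the C library's `qsort()` in place of whatever list sorting the kernel code uses; `struct ordered_inode`, `cmp_inode_number()` and `sort_by_inode_number()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry on the ordered write list: one per inode. */
struct ordered_inode {
    uint64_t no_addr;   /* inode number, used as the sort key */
};

/* Compare two entries by inode number without risking overflow. */
static int cmp_inode_number(const void *a, const void *b)
{
    const struct ordered_inode *ia = a;
    const struct ordered_inode *ib = b;

    if (ia->no_addr < ib->no_addr)
        return -1;
    if (ia->no_addr > ib->no_addr)
        return 1;
    return 0;
}

/* Sort the list so inodes are written out in ascending inode-number order. */
static void sort_by_inode_number(struct ordered_inode *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n > 1)
        qsort(v, n, sizeof(*v), cmp_inode_number);
}
```

The comparison uses explicit `<` and `>` tests rather than subtraction, since the difference of two 64-bit inode numbers does not fit in the `int` that `qsort()` expects back.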
    • S
      GFS2: Merge gfs2_attach_bufdata() into trans.c · c76c4d96
      Steven Whitehouse committed
      The locking in gfs2_attach_bufdata() was type specific (data/meta)
      which made the function rather confusing. This patch moves the core
      of gfs2_attach_bufdata() into trans.c renaming it gfs2_alloc_bufdata()
      and moving the locking into gfs2_trans_add_data()/gfs2_trans_add_meta().
      
      As a result all of the locking related to adding data and metadata to
      the journal is now in these two functions. This should help to clarify
      what is going on, and give us some opportunities to simplify in
      some cases.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      c76c4d96
    • S
      GFS2: Copy gfs2_trans_add_bh into new data/meta functions · 767f433f
      Steven Whitehouse committed
      This patch copies the body of gfs2_trans_add_bh into the two newly
      added gfs2_trans_add_data and gfs2_trans_add_meta functions. We can
      then move the .lo_add functions from lops.c into trans.c and call
      them directly.
      
      As a result of this, we no longer need to use the .lo_add functions
      at all, so that is removed from the log operations structure.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      767f433f
    • S
      GFS2: Split gfs2_trans_add_bh() into two · 350a9b0a
      Steven Whitehouse committed
      There is little common content in gfs2_trans_add_bh() between the data
      and meta classes by the time that the functions which it calls are
      taken into account. The intent here is to split this into two
      separate functions. Stage one is to introduce gfs2_trans_add_data()
      and gfs2_trans_add_meta() and update the callers accordingly.
      
      Later patches will then pull in the content of gfs2_trans_add_bh()
      and its dependent functions in order to clean up the code in this
      area.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      350a9b0a
    • S
      GFS2: Merge revoke adding functions · 75f2b879
      Steven Whitehouse committed
      This moves the lo_add function for revokes into trans.c, removing
      a function call and making the code easier to read.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      75f2b879
  8. 07 Nov, 2012 1 commit
    • B
      GFS2: Test bufdata with buffer locked and gfs2_log_lock held · 96e5d1d3
      Benjamin Marzinski committed
      In gfs2_trans_add_bh(), gfs2 was testing if there was a bd attached to the
      buffer without having the gfs2_log_lock held. It was then assuming it would
      stay attached for the rest of the function. However, without either the log
      lock being held or the buffer locked, __gfs2_ail_flush() could detach bd at any
      time.  This patch moves the locking before the test.  If there isn't a bd
      already attached, gfs2 can safely allocate one and attach it before locking.
      There is no way that the newly allocated bd could be on the ail list,
      and thus no way for __gfs2_ail_flush() to detach it.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      96e5d1d3
  9. 31 Jul, 2012 1 commit
  10. 02 May, 2012 1 commit
  11. 24 Apr, 2012 1 commit
    • S
      GFS2: Remove bd_list_tr · c50b91c4
      Steven Whitehouse committed
      This is another clean up in the logging code. This per-transaction
      list was largely unused. Its main function was to ensure that the
      number of buffers in a transaction was correct, however that counter
      was only used to check the number of buffers in the bd_list_tr, plus
      an assert at the end of each transaction. With the assert now changed
      to use the calculated buffer counts, we can remove both bd_list_tr and
      its associated counter.
      
      This should make the code easier to understand as well as shrinking
      a couple of structures.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      c50b91c4
  12. 21 Oct, 2011 1 commit
    • B
      GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme · 7c9ca621
      Bob Peterson committed
      Here is an update of Bob's original rbtree patch which, in addition, also
      resolves the rather strange ref counting that was being done relating to
      the bitmap blocks.
      
      Originally we had a dual system for journaling resource groups. The metadata
      blocks were journaled and also the rgrp itself was added to a list. The reason
      for adding the rgrp to the list in the journal was so that the "repolish
      clones" code could be run to update the free space, and potentially send any
      discard requests when the log was flushed. This was done by comparing the
      "cloned" bitmap with what had been written back on disk during the transaction
      commit.
      
      Due to this, there was a requirement to hang on to the rgrps' bitmap buffers
      until the journal had been flushed. For that reason, there was a rather
      complicated set up in the ->go_lock ->go_unlock functions for rgrps involving
      both a mutex and a spinlock (the ->sd_rindex_spin) to maintain a reference
      count on the buffers.
      
      However, the journal maintains a reference count on the buffers anyway, since
      they are being journaled as metadata buffers. So by moving the code which deals
      with the post-journal accounting for bitmap blocks to the metadata journaling
      code, we can entirely dispense with the rather strange buffer ref counting
      scheme and also the requirement to journal the rgrps.
      
      The net result of all this is that the ->sd_rindex_spin is left to do exactly
      one job, and that is to look after the rbtree of rgrps.
      
      This patch is designed to be a stepping stone towards using RCU for the rbtree
      of resource groups, however the reduction in the number of uses of the
      ->sd_rindex_spin is likely to have benefits for multi-threaded workloads,
      anyway.
      
      The patch retains ->go_lock and ->go_unlock for rgrps, however these may also
      be removed in future in favour of calling the functions directly where required
      in the code. That will allow locking of resource groups without needing to
      actually read them in - something that could be useful in speeding up statfs.
      
      In the meantime though it is valid to dereference ->bi_bh only when the rgrp
      is locked. This is basically the same rule as before, modulo the references not
      being valid until the following journal flush.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Cc: Benjamin Marzinski <bmarzins@redhat.com>
      7c9ca621
  13. 05 May, 2010 1 commit
    • B
      GFS2: Various gfs2_logd improvements · 5e687eac
      Benjamin Marzinski committed
      This patch contains various tweaks to how log flushes and active item writeback
      work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reserve now waits
      for gfs2_logd to do the log flushing.  Multiple functions were rewritten to
      remove the need to call gfs2_log_lock(). Instead of using one test to see if
      gfs2_logd had work to do, there are now separate tests to check if there
      are too many buffers in the incore log or if there are too many items on the
      active items list.
      
      This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
      some minor changes.  Since gfs2_ail1_start always submits all the active items,
      it no longer needs to keep track of the first ai submitted, so this has been
      removed. In gfs2_log_reserve(), the order of the calls to
      prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
      been switched.  If it called wake_up first there was a small window for a race,
      where logd could run and return before gfs2_log_reserve was ready to get woken
      up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
      would be left waiting for gfs2_logd to eventually run because it timed out.
      Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
      out, and flushes the log, can now be set on mount with ar_commit.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      5e687eac
  14. 13 May, 2009 1 commit
  15. 24 Mar, 2009 2 commits
    • S
      GFS2: Fix deadlock on journal flush · d8348de0
      Steven Whitehouse committed
      This patch fixes a deadlock when the journal is flushed and there
      are dirty inodes other than the one which caused the journal flush.
      Originally the journal flushing code was trying to obtain the
      transaction glock while running the flush code for an inode glock.
      We no longer require the transaction glock at this point in time
      since we know that any attempt to get the transaction glock from
      another node will result in a journal flush. So if we are flushing
      the journal, we can be sure that the transaction lock is still
      cached from when the transaction was started.
      
      By inlining a version of gfs2_trans_begin() (minus the bit which
      gets the transaction glock) we can avoid the deadlock problems
      caused if there is a demote request queued up on the transaction
      glock.
      
      In addition I've also moved the umount rwsem so that it covers
      the glock workqueue, since all demotions are done by this
      workqueue now. That fixes a bug on umount which I came across
      while fixing the original problem.
      Reported-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      d8348de0
    • S
      GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Steven Whitehouse committed
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplification of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem without requiring the DLM.
      
      This patch needs a lot of testing, hence my keeping it I restarted
      my -git tree after the last merge window. That way, this has the maximum
      exposure before it's merged. This is (modulo a few minor bug fixes) the
      same patch that I've been posting on and off for the last three months
      and it's passed a number of different tests so far.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      f057f6cd
  16. 31 Mar, 2008 1 commit
    • S
      [GFS2] Update gfs2_trans_add_unrevoke to accept extents · 5731be53
      Steven Whitehouse committed
      By adding an extra argument to gfs2_trans_add_unrevoke we can now
      specify an extent length of blocks to unrevoke. This means that
      we only need to make one pass through the list for each extent
      rather than each block. Currently the only extent length which
      is used is 1, but that will change in the future.
      
      Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta
      since it's the only difference between this and gfs2_alloc_data
      which is left. This will allow a future patch to merge these
      two functions into one (i.e. one call to allocate both data
      and metadata in a single extent in the future).
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      5731be53
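The "one pass per extent" idea in 5731be53 can be sketched like this. It is a user-space illustration only: `struct revoke` and `unrevoke_extent()` are hypothetical names, the list is a plain singly linked list, and the early exit assumes each block number appears on the list at most once. The real function's signature and locking are not shown in the commit message and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending-revoke record for one disk block. */
struct revoke {
    uint64_t blkno;
    struct revoke *next;
};

/*
 * Walk the revoke list once and unlink every entry whose block number
 * falls inside the extent [blkno, blkno + len). Returns how many entries
 * were unlinked. Unlinked nodes are not freed; the caller still owns them.
 */
static unsigned int unrevoke_extent(struct revoke **head, uint64_t blkno,
                                    unsigned int len)
{
    unsigned int found = 0;
    struct revoke **pp = head;

    assert(head != NULL);
    /* Assumes at most one revoke per block, so we can stop after len hits. */
    while (*pp && found < len) {
        struct revoke *r = *pp;

        if (r->blkno >= blkno && r->blkno < blkno + len) {
            *pp = r->next;      /* unlink, keep pp where it is */
            r->next = NULL;
            found++;
        } else {
            pp = &r->next;      /* advance to the next link */
        }
    }
    return found;
}
```

With a per-block interface, an extent of N blocks would need N walks of the list; here a single walk covers the whole range.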
  17. 25 Jan, 2008 1 commit
    • S
      [GFS2] Don't add glocks to the journal · 2bcd610d
      Steven Whitehouse committed
      The only reason for adding glocks to the journal was to keep track
      of which locks required a log flush prior to release. We add a
      flag to the glock to allow this check to be made in a simpler way.
      
      This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
      and means that we can avoid extra work during the journal flush.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      2bcd610d
  18. 10 Oct, 2007 3 commits
    • S
      [GFS2] Clean up gfs2_trans_add_revoke() · 1ad38c43
      Steven Whitehouse committed
      The following alters gfs2_trans_add_revoke() to take a struct
      gfs2_bufdata as an argument. This eliminates the memory allocation which
      was previously required by making use of the already existing struct
      gfs2_bufdata. It makes some sanity checks to ensure that the
      gfs2_bufdata has been removed from all the lists before it's recycled as
      a revoke structure. This saves one memory allocation and one free per
      revoke structure.
      
      Also as a result, and to simplify the locking, since there is no longer
      any blocking code in gfs2_trans_add_revoke() we must hold the log lock
      whenever this function is called. This reduces the number of times we
      take and unlock the log lock.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      1ad38c43
    • S
      [GFS2] Use slab operations for all gfs2_bufdata allocations · 0820ab51
      Steven Whitehouse committed
      The old revoke structure was allocated using kmalloc/kfree but
      there is a slab cache for gfs2_bufdata, so we should use that
      now that the structures have been converted.
      
      This is part two of the patch series to merge the revoke
      and gfs2_bufdata structures.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      0820ab51
    • S
      [GFS2] Replace revoke structure with bufdata structure · 82e86087
      Steven Whitehouse committed
      Both the revoke structure and the bufdata structure are quite similar.
      They are basically small tags which are put on lists. In addition to
      which the revoke structure is always allocated when there is a bufdata
      structure which is (or can be) freed. As such it should be possible to
      reduce the number of frees and allocations by using the same structure
      for both purposes.
      
      This patch is the first step along that path. It replaces existing uses
      of the revoke structure with the bufdata structure.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      82e86087
  19. 19 Sep, 2006 1 commit
  20. 05 Sep, 2006 1 commit
  21. 01 Sep, 2006 1 commit
    • S
      [GFS2] Update copyright, tidy up incore.h · e9fc2aa0
      Steven Whitehouse committed
      As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this
      updates the copyright message to say "version" in full rather than
      "v.2". Also incore.h has been updated to remove forward structure
      declarations which are not required.
      
      The gfs2_quota_lvb structure has now had endianness annotations added
      to it. Also quota.c has been updated so that we now store the
      lvb data locally in endian independent format to avoid needing
      a structure in host endianness too. As a result the endianness
      conversions are done as required at various points and thus the
      conversion routines in lvb.[ch] are no longer required. I've
      moved the one remaining constant in lvb.h that's used into lm.h
      and removed the unused lvb.[ch].
      
      I have not changed the HIF_ constants. That is left to a later patch
      which I hope will unify the gh_flags and gh_iflags fields of the
      struct gfs2_holder.
      
      Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      e9fc2aa0
  22. 19 May, 2006 2 commits
  23. 27 Apr, 2006 1 commit
  24. 12 Apr, 2006 1 commit
    • S
      [GFS2] Update journal accounting code. · f4154ea0
      Steven Whitehouse committed
      A small update to the journaling code to change the way that
      the "extra" blocks are accounted for in the journal. These are
      used at a rate of one per 503 metadata blocks or one per 251
      journaled data blocks (or just one if the total number of journaled
      blocks in the transaction is smaller). Since we are using them at
      two different rates the old method of accounting for them no longer
      works and we count them up as required.
      
      Since the "per transaction" accounting can't handle this (there is no
      fixed number of header blocks per transaction) we have to account for
      it in the general journal code. We now require that each transaction
      reserves more blocks than it actually needs to take account of the
      possible extra blocks.
      
      Also a final fix to dir.c to ensure that all ref counts are handled
      correctly.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      f4154ea0
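The two rates quoted in f4154ea0 (one extra block per 503 metadata blocks, one per 251 journaled data blocks) lend themselves to a quick arithmetic sketch. The helper below is hypothetical and takes one conservative reading of the message, rounding each class up separately and summing the results; the actual reservation logic in the journal code may differ.

```c
#include <assert.h>
#include <limits.h>

/* Round-up integer division, the same shape as the kernel's DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Rates quoted in the commit message. */
#define META_PER_EXTRA  503u
#define JDATA_PER_EXTRA 251u

/*
 * Hypothetical helper: estimate how many "extra" journal blocks a
 * transaction with nr_meta metadata blocks and nr_jdata journaled data
 * blocks would use, rounding up per class.
 */
static unsigned int extra_log_blocks(unsigned int nr_meta,
                                     unsigned int nr_jdata)
{
    /* Guard the additions inside DIV_ROUND_UP against wrap-around. */
    assert(nr_meta <= UINT_MAX - META_PER_EXTRA);
    assert(nr_jdata <= UINT_MAX - JDATA_PER_EXTRA);

    return DIV_ROUND_UP(nr_meta, META_PER_EXTRA) +
           DIV_ROUND_UP(nr_jdata, JDATA_PER_EXTRA);
}
```

Because the two classes consume extra blocks at different rates, no single per-transaction constant covers both, which is the reason the message gives for moving the accounting into the general journal code.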
  25. 07 Apr, 2006 1 commit
    • S
      [GFS2] Fix a ref count bug and other clean ups · b09e593d
      Steven Whitehouse committed
      This fixes a ref count bug that sometimes showed up at umount time
      (causing it to hang) but is otherwise mostly harmless. At the same
      time there are some clean ups including making the log operations
      structures const, moving a memory allocation so that it's not done
      in the fast path of checking to see if there is an outstanding
      transaction related to a particular glock.
      
      Removes the sd_log_wrap variable which was updated, but never actually
      used anywhere. Updates the gfs2 ioctl() to run without the kernel lock
      (which it never needed anyway). Removes the "invalidate inodes" loop
      from GFS2's put_super routine. This is done in kill super anyway so
      we don't need to do it here. The loop was also bogus in that if there
      are any inodes "stuck" at this point it's a bug and we need to know
      about it rather than hide it by hanging forever.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      b09e593d
  26. 31 Mar, 2006 1 commit
  27. 30 Mar, 2006 1 commit
    • S
      [GFS2] Update debugging code · d0dc80db
      Steven Whitehouse committed
      Update the debugging code in trans.c and at the same time improve
      the debugging code for gfs2_holders. The new code should be pretty
      fast during the normal case and provide just as much information
      in case of errors (or more).
      
      One small function from glock.c has moved to glock.h as a static inline so
      that its return address won't get in the way of the debugging.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      d0dc80db
  28. 29 Mar, 2006 1 commit
    • S
      [GFS2] Update locking in log.c · 484adff8
      Steven Whitehouse committed
      Replace the lock_for_trans()/lock_for_flush() functions with an rwsem.
      In fact the sd_log_flush_lock becomes an rwsem (the write part of it)
      and is extended slightly to cover everything that the lock_for_flush()
      used to cover. The read part of the lock replaces lock_for_trans().
      
      This corrects the races in the original code and reduces the code size.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      484adff8
  29. 02 Mar, 2006 2 commits
  30. 28 Feb, 2006 1 commit
    • S
      [GFS2] Macros removal in gfs2.h · 5c676f6d
      Steven Whitehouse committed
      As suggested by Pekka Enberg <penberg@cs.helsinki.fi>.
      
      The DIV_RU macro is renamed DIV_ROUND_UP and moved to kernel.h.
      The other macros are gone from gfs2.h, as (although not requested
      by Pekka Enberg) are a number of included header files which are now
      included individually. The inode number comparison function is
      now an inline function.
      
      The DT2IF and IF2DT may be addressed in a future patch.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      5c676f6d
  31. 21 Feb, 2006 1 commit