1. 08 Apr, 2013 1 commit
    • GFS2: replace gfs2_ail structure with gfs2_trans · 16ca9412
      Committed by Benjamin Marzinski
      In order to allow transactions and log flushes to happen at the same
      time, gfs2 needs to move the transaction accounting and active items
      list code into the gfs2_trans structure.  As a first step toward this,
      this patch removes the gfs2_ail structure, and handles the active items
      list in the gfs2_trans structure.  This keeps gfs2 from allocating an ail
      structure on log flushes, and gives us a structure that can later be used
      to store the transaction accounting outside of the gfs2 superblock
      structure.
      
      With this patch, at the end of a transaction, gfs2 will add the
      gfs2_trans structure to the superblock if there is not one already.
      This structure now has the active items fields that were previously in
      gfs2_ail.  This is not necessary in the case where the transaction was
      simply used to add revokes, since these are never written outside of the
      journal, and thus, don't need an active items list.
      
      Also, in order to make sure that the transaction structure is not
      removed while it's still in use by gfs2_trans_end, unlocking the
      sd_log_flush_lock has to happen slightly later in ending the
      transaction.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  2. 29 Jan, 2013 1 commit
    • GFS2: Merge gfs2_attach_bufdata() into trans.c · c76c4d96
      Committed by Steven Whitehouse
      The locking in gfs2_attach_bufdata() was type specific (data/meta)
      which made the function rather confusing. This patch moves the core
      of gfs2_attach_bufdata() into trans.c, renaming it gfs2_alloc_bufdata()
      and moving the locking into gfs2_trans_add_data()/gfs2_trans_add_meta().
      
      As a result all of the locking related to adding data and metadata to
      the journal is now in these two functions. This should help to clarify
      what is going on, and give us some opportunities to simplify in
      some cases.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  3. 04 Aug, 2012 1 commit
  4. 28 Jun, 2012 1 commit
    • GFS2: Fixing double brelse'ing bh allocated in gfs2_meta_read when EIO occurs · 44b8db13
      Committed by Masatake YAMATO
      This patch fixes a buffer_head double free in the following code path:
      
      gfs2_block_map
      => gfs2_meta_inode_buffer
       => gfs2_meta_indirect_buffer
        => gfs2_meta_read
      => release_metapath
      
      gfs2_block_map calls gfs2_meta_inode_buffer with &mp.mp_bh[0]
      as an argument. mp.mp_bh is zero-filled at the beginning
      of gfs2_block_map.
      
      If gfs2_meta_inode_buffer returns a non-zero value, gfs2_block_map
      calls release_metapath to free the buffers chained to mp.mp_bh.
      release_metapath checks each slot mp.mp_bh[i] and frees it
      (with brelse) unless the slot is NULL.
      
      The slot &mp.mp_bh[0] passed to gfs2_meta_inode_buffer is filled in by
      gfs2_meta_read, which stores a buffer allocated with gfs2_getbuf there
      even if EIO occurs. When EIO occurs, the allocated buffer is brelse'ed,
      but the now-stale pointer to it is still passed back to the caller
      via the argument bhp.
      
      The caller, gfs2_meta_indirect_buffer, also passes the stale pointer
      to its own caller along with EIO. Finally gfs2_block_map gets both EIO
      and &mp.mp_bh[0] filled with the stale pointer, and release_metapath
      calls brelse again on that pointer.
      Signed-off-by: Masatake YAMATO <yamato@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
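The double release described in commit 44b8db13 can be modelled in userspace. The sketch below is not the kernel code: `struct buf`, `buf_release()`, `meta_read()` and `release_slots()` are hypothetical stand-ins for `struct buffer_head`, `brelse()`, `gfs2_meta_read()` and `release_metapath()`. It assumes that the remedy is to reset the out-pointer to NULL on the error path, which the message implies but does not spell out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head: only a reference count. */
struct buf {
    int refcount;
};

/* Stand-in for brelse(): drop one reference; NULL is ignored. */
static void buf_release(struct buf *b)
{
    if (b == NULL)
        return;
    /* A second release of the same reference would trip this check. */
    assert(b->refcount > 0);
    b->refcount--;
}

/*
 * Stand-in for gfs2_meta_read(): takes a reference on `backing` and stores
 * it in *bhp. On a simulated I/O error the reference is dropped again and
 * *bhp is reset, so the caller never sees a stale pointer.
 */
static int meta_read(struct buf *backing, int simulate_eio, struct buf **bhp)
{
    backing->refcount++;
    *bhp = backing;
    if (simulate_eio) {
        buf_release(backing);
        *bhp = NULL; /* assumed fix: without this the caller releases it again */
        return -EIO;
    }
    return 0;
}

/* Stand-in for release_metapath(): release every non-NULL slot. */
static void release_slots(struct buf **slots, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        buf_release(slots[i]);
        slots[i] = NULL;
    }
}
```

With the NULL reset in place, a caller can run its usual cleanup over all slots after an error without touching the already-released buffer.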
  5. 11 May, 2012 1 commit
  6. 02 May, 2012 1 commit
  7. 24 Apr, 2012 1 commit
    • GFS2: Remove bd_list_tr · c50b91c4
      Committed by Steven Whitehouse
      This is another clean up in the logging code. This per-transaction
      list was largely unused. Its main function was to ensure that the
      number of buffers in a transaction was correct, however that counter
      was only used to check the number of buffers in the bd_list_tr, plus
      an assert at the end of each transaction. With the assert now changed
      to use the calculated buffer counts, we can remove both bd_list_tr and
      its associated counter.
      
      This should make the code easier to understand as well as shrinking
      a couple of structures.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  8. 08 Nov, 2011 1 commit
    • GFS2: Fix up REQ flags · 20ed0535
      Committed by Steven Whitehouse
      Christoph has split up REQ_PRIO from REQ_META. That means that
      we can drop REQ_PRIO from places where it is not needed. I'm
      not at all sure that the combination WRITE_FLUSH_FUA | REQ_PRIO
      makes any kind of sense, anyway.
      
      In addition, I've added REQ_META to one place in the code where
      it was missing. REQ_PRIO has been left for read/writes triggered
      by glock acquisition and writeback only. We can adjust it again
      if required, but these are the most important points from a
      performance perspective.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
  9. 23 Aug, 2011 1 commit
  10. 20 Apr, 2011 1 commit
    • GFS2: Improve tracing support (adds two flags) · 627c10b7
      Committed by Steven Whitehouse
      This adds support for two new flags. One keeps track of whether
      the glock is on the LRU list or not. The other isn't really a
      flag as such, but an indication of whether the glock has an
      attached object or not. This indication is reported without
      any locking, which is ok since we do not dereference the object
      pointer but merely report whether it is NULL or not.
      
      Also, this fixes one place where a tracepoint was missing, which
      was at the point we remove deallocated blocks from the journal.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  11. 14 Mar, 2011 1 commit
  12. 10 Mar, 2011 2 commits
    • block: kill off REQ_UNPLUG · 721a9602
      Committed by Jens Axboe
      With the plugging now being explicitly controlled by the
      submitter, callers need not pass down unplugging hints
      to the block layer. If they want to unplug, it's because they
      manually plugged on their own - in which case, they should just
      unplug at will.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: remove per-queue plugging · 7eaceacc
      Committed by Jens Axboe
      Code has been converted over to the new explicit on-stack plugging,
      and delay users have been converted to use the new API for that.
      So let's kill off the old plugging along with aops->sync_page().
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  13. 27 Oct, 2010 1 commit
    • writeback: remove nonblocking/encountered_congestion references · 1b430bee
      Committed by Wu Fengguang
      This removes more dead code that was somehow missed by commit 0d99519e
      (writeback: remove unused nonblocking and congestion checks).  There is
      no behavior change except for the removal of two entries from one of the
      ext4 tracing interfaces.
      
      The nonblocking checks in ->writepages are no longer used because the
      flusher now prefers to block on get_request_wait() rather than skip inodes
      on IO congestion.  The latter will lead to more seeky IO.
      
      The nonblocking checks in ->writepage are no longer used because they are
      redundant with the WB_SYNC_NONE check.
      
      We no longer set ->nonblocking in VM page out and page migration, because
      a) it's effectively redundant with WB_SYNC_NONE in current code
      b) its old semantic of "Don't get stuck on request queues" is misbehavior:
         that would skip some dirty inodes on congestion and page out others, which
         is unfair in terms of LRU age.
      
      Inspired by Christoph Hellwig. Thanks!
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Sage Weil <sage@newdream.net>
      Cc: Steve French <sfrench@samba.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 08 Aug, 2010 1 commit
    • block: unify flags for struct bio and struct request · 7b6d91da
      Committed by Christoph Hellwig
      Remove the current bio flags and reuse the request flags for the bio, too.
      This makes it easier to trace the type of I/O from the filesystem
      down to the block driver.  There were two flags in the bio that were
      missing in the requests:  BIO_RW_UNPLUG and BIO_RW_AHEAD.  Also I've
      renamed two request flags that had a superfluous RW in them.
      
      Note that the flags are in bio.h despite having the REQ_ name - as
      blkdev.h includes bio.h that is the only way to go for now.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  15. 12 May, 2010 1 commit
  16. 05 May, 2010 1 commit
    • GFS2: Various gfs2_logd improvements · 5e687eac
      Committed by Benjamin Marzinski
      This patch contains various tweaks to how log flushes and active item writeback
      work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reserve now waits
      for gfs2_logd to do the log flushing.  Multiple functions were rewritten to
      remove the need to call gfs2_log_lock(). Instead of using one test to see if
      gfs2_logd had work to do, there are now separate tests to check if there
      are too many buffers in the incore log or if there are too many items on the
      active items list.
      
      This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
      some minor changes.  Since gfs2_ail1_start always submits all the active items,
      it no longer needs to keep track of the first ai submitted, so this has been
      removed. In gfs2_log_reserve(), the order of the calls to
      prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
      been switched.  If it called wake_up first there was a small window for a race,
      where logd could run and return before gfs2_log_reserve was ready to get woken
      up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
      would be left waiting for gfs2_logd to eventually run because it timed out.
      Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
      out, and flushes the log, can now be set on mount with ar_commit.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
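The ordering issue between prepare_to_wait_exclusive() and wake_up() mentioned in commit 5e687eac is a lost-wakeup race. The following is a deliberately single-threaded model of it, not kernel code: `struct demo_waitq` and all `demo_*` functions are hypothetical, and the worker is assumed to run to completion the moment it is kicked, which is the worst case the message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-slot wait queue: at most one waiter, no real sleeping. */
struct demo_waitq {
    bool waiter_queued; /* set once the waiter has put itself on the queue */
    bool waiter_woken;  /* set when a wake-up actually reached the waiter */
};

/* Analog of prepare_to_wait_exclusive(): put the caller on the queue. */
static void demo_prepare_to_wait(struct demo_waitq *q)
{
    q->waiter_queued = true;
    q->waiter_woken = false;
}

/* Analog of wake_up(): only a waiter that is already queued can be woken. */
static void demo_wake_up(struct demo_waitq *q)
{
    if (q->waiter_queued) {
        q->waiter_queued = false;
        q->waiter_woken = true;
    }
}

/* The worker (logd in the commit) finishes its flush and signals waiters. */
static void demo_worker_run_and_signal(struct demo_waitq *q)
{
    demo_wake_up(q);
}

/*
 * Model of the reserving side. Returns true if the waiter was woken.
 * register_first == true:  queue ourselves, then kick the worker.
 * register_first == false: kick the worker first; it may finish and signal
 *                          before we are queued, so the wake-up is lost.
 */
static bool demo_reserve(bool register_first)
{
    struct demo_waitq q = { false, false };

    if (register_first) {
        demo_prepare_to_wait(&q);
        demo_worker_run_and_signal(&q);
    } else {
        demo_worker_run_and_signal(&q);
        demo_prepare_to_wait(&q);
    }
    /* A waiter is never both still queued and already woken. */
    assert(!(q.waiter_queued && q.waiter_woken));
    return q.waiter_woken;
}
```

In this model `demo_reserve(true)` reports a delivered wake-up and `demo_reserve(false)` does not, which corresponds to the waiter being left until the worker's next timed run.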
  17. 01 Mar, 2010 1 commit
    • GFS2: Metadata address space clean up · 009d8518
      Committed by Steven Whitehouse
      Since the start of GFS2, an "extra" inode has been used to store
      the metadata belonging to each inode. The only reason for using
      this inode was to have an extra address space, the other fields
      were unused. This means that the memory usage was rather inefficient.
      
      The reason for keeping each inode's metadata in a separate address
      space is that when glocks are requested on remote nodes, we need to
      be able to efficiently locate the data and metadata relating
      to that glock (inode) in order to sync or sync and invalidate it
      (depending on the remotely requested lock mode).
      
      This patch adds a new type of glock which, in addition to
      its normal fields, has an address space. This applies to all
      inode and rgrp glocks (but to no other glock types which remain
      as before). As a result, we no longer need to have the second
      inode.
      
      This results in three major improvements:
       1. A saving of approx 25% of memory used in caching inodes
       2. A removal of the circular dependency between inodes and glocks
       3. No confusion between "normal" and "metadata" inodes in super.c
      
      Although the first of these is the most immediately apparent, the
      second is just as important as it now enables a number of clean
      ups at umount time. Those will be the subject of future patches.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  18. 11 Jan, 2010 1 commit
  19. 22 May, 2009 1 commit
    • GFS2: Clean up some file names · b1e71b06
      Committed by Steven Whitehouse
      This patch renames the ops_*.c files which have no counterpart
      without the ops_ prefix in order to shorten the name and make
      it more readable. In addition, ops_address.h (which was very
      small) is moved into inode.h and inode.h is cleaned up by
      adding extern where required.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  20. 11 May, 2009 2 commits
  21. 24 Mar, 2009 2 commits
    • GFS2: Clean up of glops.c · 6bac243f
      Committed by Steven Whitehouse
      This cleans up a number of bits of code mostly based in glops.c.
      A couple of simple functions have been merged into the callers
      to make it more obvious what is going on, and the mysterious raising
      of i_writecount around the truncate_inode_pages() call has been
      removed. The meta_go_* operations have been renamed rgrp_go_*
      since that is the only lock type that they are used with.
      
      The unused argument of gfs2_read_sb has been removed. Also
      a bug has been fixed where a check for the rindex inode was
      in the wrong callback. More comments are added, and the
      debugging code is improved too.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Committed by Steven Whitehouse
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplification of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem without requiring the DLM.
      
      This patch needs a lot of testing, hence my keeping it until I restarted
      my -git tree after the last merge window. That way, this has the maximum
      exposure before it's merged. This is (modulo a few minor bug fixes) the
      same patch that I've been posting on and off for the last three months
      and it has passed a number of different tests so far.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  22. 27 Jun, 2008 1 commit
    • [GFS2] Clean up the glock core · 6802e340
      Committed by Steven Whitehouse
      This patch implements a number of cleanups to the core of the
      GFS2 glock code. As a result a lot of code is removed. It looks
      like a really big change, but actually a large part of this patch
      is either removing or moving existing code.
      
      There are some new bits too though, such as the new run_queue()
      function which is considerably streamlined. Highlights of this
      patch include:
      
       o Fixes a cluster coherency bug during SH -> EX lock conversions
       o Removes the "glmutex" code in favour of a single bit lock
       o Removes the ->go_xmote_bh() for inodes since it was duplicating
         ->go_lock()
       o We now only use the ->lm_lock() function for both locks and
         unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED)
       o The fast path is considerably shorter, giving performance gains
         especially with lock_nolock
       o The glock_workqueue is now used for all the callbacks from the DLM
         which allows us to simplify the lock_dlm module (see following patch)
       o The way is now open to make further changes such as eliminating the two
         threads (gfs2_glockd and gfs2_scand) in favour of a more efficient
         scheme.
      
      This patch has undergone extensive testing with various test suites
      so it should be pretty stable by now.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
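Commit 6802e340 mentions replacing the "glmutex" code with a single bit lock. As a rough userspace illustration of that general idea, and not of the actual glock implementation, the sketch below uses C11 atomics on a flags word. `struct demo_glock`, `DEMO_LOCK_BIT`, `demo_trylock()` and `demo_unlock()` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock bit inside a shared flags word. */
#define DEMO_LOCK_BIT 0x1u

/* Hypothetical object whose state is guarded by one bit of `flags`. */
struct demo_glock {
    atomic_uint flags;
};

/*
 * Try to take the bit lock. Atomically sets the bit and inspects the
 * previous value: if the bit was clear, the caller now owns the lock.
 */
static bool demo_trylock(struct demo_glock *gl)
{
    unsigned int old = atomic_fetch_or_explicit(&gl->flags, DEMO_LOCK_BIT,
                                                memory_order_acquire);
    return (old & DEMO_LOCK_BIT) == 0;
}

/* Release the bit lock; the bit must currently be set. */
static void demo_unlock(struct demo_glock *gl)
{
    unsigned int old = atomic_fetch_and_explicit(&gl->flags, ~DEMO_LOCK_BIT,
                                                 memory_order_release);
    assert(old & DEMO_LOCK_BIT);
    (void)old; /* silence unused warning when assertions are compiled out */
}
```

The attraction of this pattern is that the lock costs one bit in a word the object already carries, rather than a separate mutex structure with its own state.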
  23. 12 May, 2008 1 commit
  24. 25 Jan, 2008 4 commits
  25. 10 Oct, 2007 4 commits
    • [GFS2] Data corruption fix · de986e85
      Committed by Wendy Cheng
      * GFS2 has been using the i_cache array to store its indirect meta blocks.
      Its flush routine doesn't correctly clean up all the entries. The
      problem shows up when multiple nodes do simultaneous writes to the
      same file. Upon glock exclusive lock transfer, if the file is a large
      sparse file whose indirect meta blocks span multiple array entries
      with "zero" entries in between, the flush routine stops flushing
      prematurely, which leaves old (stale) entries around.
      This leads to several nasty issues, including data corruption.
      * Fix the gfs2_get_block_noalloc check to correctly return EIO upon
      an unmapped buffer.
      Signed-off-by: Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
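The first bullet of commit de986e85 describes a flush loop that stops at the first empty slot of a sparse cache array. A minimal sketch of the corrected loop shape follows, assuming a fixed-size pointer array; `struct demo_buf`, `DEMO_CACHE_SLOTS` and `demo_flush_cache()` are hypothetical and only model the i_cache behaviour described in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of cache slots per inode. */
#define DEMO_CACHE_SLOTS 8

/* Hypothetical cached buffer: only a reference count. */
struct demo_buf {
    int refcount;
};

/*
 * Release every cached buffer and clear its slot. Empty slots are skipped
 * rather than treated as the end of the array, so entries that follow a
 * gap (as in a sparse file) are flushed too. Returns the number released.
 */
static unsigned int demo_flush_cache(struct demo_buf **cache, size_t n)
{
    unsigned int released = 0;

    for (size_t i = 0; i < n; i++) {
        if (cache[i] == NULL)
            continue; /* a `break` here would leave later entries stale */
        assert(cache[i]->refcount > 0);
        cache[i]->refcount--;
        cache[i] = NULL;
        released++;
    }
    return released;
}
```

With entries in slots 0 and 5 only, a loop that breaks on the first NULL would release one buffer and leave slot 5 populated; the version above releases both.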
    • [GFS2] Clean up journaled data writing · 16615be1
      Committed by Steven Whitehouse
      This patch cleans up the code for writing journaled data into the log.
      It also removes the need to allocate a small "tag" structure for each
      block written into the log. Instead we just keep count of the outstanding
      I/O so that we can be sure that it's all been written at the correct time.
      Another result of this patch is that a number of ll_rw_block() calls
      have become submit_bh() calls, closing some races at the same time.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Clean up gfs2_trans_add_revoke() · 1ad38c43
      Committed by Steven Whitehouse
      The following alters gfs2_trans_add_revoke() to take a struct
      gfs2_bufdata as an argument. This eliminates the memory allocation which
      was previously required by making use of the already existing struct
      gfs2_bufdata. It makes some sanity checks to ensure that the
      gfs2_bufdata has been removed from all the lists before it's recycled as
      a revoke structure. This saves one memory allocation and one free per
      revoke structure.
      
      Also as a result, and to simplify the locking, since there is no longer
      any blocking code in gfs2_trans_add_revoke() we must hold the log lock
      whenever this function is called. This reduces the number of times we
      take and unlock the log lock.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Move pin/unpin into lops.c, clean up locking · 9b9107a5
      Committed by Steven Whitehouse
      gfs2_pin and gfs2_unpin are only used in lops.c, despite being
      defined in meta_io.c, so this patch moves them into lops.c and
      makes them static. At the same time, it's possible to clean up
      the locking in the buf and databuf _lo_add() functions so that
      we only need to grab the spinlock once. Also we have to move
      lock_buffer() around the _lo_add() functions since we can't
      do that in gfs2_pin() any more since we hold the spinlock
      for the duration of that function.
      
      As a result, the code shrinks by 12 lines and we do far fewer
      operations when adding buffers to the log. It also makes the
      code somewhat easier to read & understand.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  26. 09 Jul, 2007 1 commit
    • [GFS2] assertion failure after writing to journaled file, umount · 2332c443
      Committed by Robert Peterson
      This patch passes all my nasty tests that were causing the code to
      fail under one circumstance or another.  Here is a complete summary
      of all changes from today's git tree, in order of appearance:
      
      1. There are now separate variables for metadata buffer accounting.
      2. Variable sd_log_num_hdrs is no longer needed, since the header
         accounting is taken care of by the reserve/refund sequence.
      3. Fixed a tiny grammatical problem in a comment.
      4. Added a new function "calc_reserved" to calculate the reserved
         log space.  This isn't entirely necessary, but it has two benefits:
         First, it simplifies the gfs2_log_refund function greatly.
         Second, it allows for easier debugging because I could sprinkle the
         code with calls to this function to make sure the accounting is
         proper (by adding asserts and printks) at strategic points in the code.
      5. In log_pull_tail there apparently was a kludge to fix up the
         accounting based on a "pull" parameter.  The buffer accounting is
         now done properly, so the kludge was removed.
      6. File sync operations were making a call to gfs2_log_flush that
         writes another journal header.  Since that header was unplanned
         for (reserved) by the reserve/refund sequence, the free space had
         to be decremented so that when log_pull_tail gets called, the free
         space is adjusted properly.  (Did I hear you call that a kludge?
         Well, maybe, but a lot more justifiable than the one I removed).
      7. In the gfs2_log_shutdown code, it optionally syncs the log by
         specifying the PULL parameter to log_write_header.  I'm not sure
         this is necessary anymore.  It just seems to me there could be
         cases where shutdown is called while there are outstanding log
         buffers.
      8. In the (data)buf_lo_before_commit functions, I changed some offset
         values from being calculated on the fly to being constants.  That
         simplified some code and we might as well let the compiler do the
         calculation once rather than redoing those cycles at run time.
      9. This version has my rewritten databuf_lo_add function.
         This version is much more like its predecessor, buf_lo_add, which
         makes it easier to understand.  Again, this might not be necessary,
         but it seems as if this one works as well as the previous one,
         maybe even better, so I decided to leave it in.
      10. In databuf_lo_before_commit, a previous data corruption problem
         was caused by going off the end of the buffer.  The proper solution
         is to have the proper limit in place, rather than stopping earlier.
         (Thus my previous attempt to fix it is wrong).
         If you don't wrap the buffer, you're stopping too early and that
         causes more log buffer accounting problems.
      11. In lops.h there are two new (previously mentioned) constants for
         figuring out the data offset for the journal buffers.
      12. There are also two new functions, buf_limit and databuf_limit to
         calculate how many entries will fit in the buffer.
      13. In function gfs2_meta_wipe, it needs to distinguish between pinned
         metadata buffers and journaled data buffers for proper journal buffer
         accounting.  It can't use the JDATA gfs2_inode flag because it's
         sometimes passed the "real" inode and sometimes the "metadata
         inode" and the inode flags will be random bits in a metadata
         gfs2_inode.  It needs to base its decision on which was passed in.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
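Items 8, 11 and 12 of commit 2332c443 describe compile-time offset constants and two helpers, buf_limit and databuf_limit, that compute how many entries fit in a journal buffer. The sketch below shows the general shape of such a calculation and is not the lops.h code. `DEMO_DESC_SIZE`, the 8-byte and 16-byte entry sizes, and all `DEMO_`/`demo_` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a. */
#define DEMO_ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Hypothetical size of the descriptor header at the start of each block. */
#define DEMO_DESC_SIZE 72u

/* Assumed entry sizes: one 64-bit value per metadata entry, two per data entry. */
#define DEMO_META_ENTRY_SIZE (sizeof(uint64_t))
#define DEMO_DATA_ENTRY_SIZE (2 * sizeof(uint64_t))

/* Offsets are constant expressions, so the compiler evaluates them once. */
#define DEMO_BUF_OFFSET     DEMO_ALIGN_UP(DEMO_DESC_SIZE, DEMO_META_ENTRY_SIZE)
#define DEMO_DATABUF_OFFSET DEMO_ALIGN_UP(DEMO_DESC_SIZE, DEMO_DATA_ENTRY_SIZE)

/* Number of metadata entries that fit after the header in one block. */
static unsigned int demo_buf_limit(unsigned int block_size)
{
    assert(block_size > DEMO_BUF_OFFSET);
    return (unsigned int)((block_size - DEMO_BUF_OFFSET) / DEMO_META_ENTRY_SIZE);
}

/* Number of data entries that fit after the header in one block. */
static unsigned int demo_databuf_limit(unsigned int block_size)
{
    assert(block_size > DEMO_DATABUF_OFFSET);
    return (unsigned int)((block_size - DEMO_DATABUF_OFFSET) / DEMO_DATA_ENTRY_SIZE);
}
```

Under these assumed sizes a 4096-byte block holds 503 metadata entries or 251 data entries. Filling a buffer up to exactly this limit, rather than stopping early, matches the point made in item 10.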
  27. 12 Feb, 2007 1 commit
  28. 30 Nov, 2006 3 commits
    • [GFS2] Reduce number of arguments to meta_io.c:getbuf() · cb4c0313
      Committed by Steven Whitehouse
      Since the superblock and the address_space are determined by the
      glock, we might as well just pass that as the argument since all
      the callers already have that available.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Move gfs2_meta_syncfs() into log.c · a25311c8
      Committed by Steven Whitehouse
      By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start()
      can be made static.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix journal flush problem · b004157a
      Committed by Steven Whitehouse
      This fixes a bug which resulted in poor performance due to flushing
      the journal too often. The code path in question was via the inode_go_sync()
      function in glops.c. The solution is not to flush the journal immediately
      when inodes are ejected from memory, but batch up the work for glockd to
      deal with later on. This means that glocks may now live on beyond the end of
      the lifetime of their inodes (but not very much longer in the normal case).
      
      Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in
      calculation of the number of free journal blocks.
      
      The gfs2_logd process has been altered to be more responsive to the journal
      filling up. We now wake it up when the number of uncommitted journal blocks
      has reached the threshold level rather than trying to flush directly at the
      end of each transaction. This again means doing fewer, but larger, log
      flushes in general.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  29. 03 Oct, 2006 1 commit