1. 10 Oct, 2007 3 commits
  2. 09 Jul, 2007 5 commits
    • S
      [GFS2] Small fixes to logging code · a0a24741
      Steven Whitehouse committed
      This reverts part of an earlier patch which tried to reclaim
      gfs2_bufdata structures too early and resulted in a "use after free"
      case (this bit from me). Also a change to not write out log headers
      unless we really need to (in the case of flushing nothing we don't need
      a header) from Bob.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • R
      [GFS2] assertion failure after writing to journaled file, umount · 2332c443
      Robert Peterson committed
      This patch passes all my nasty tests that were causing the code to
      fail under one circumstance or another.  Here is a complete summary
      of all changes from today's git tree, in order of appearance:
      
      1. There are now separate variables for metadata buffer accounting.
      2. Variable sd_log_num_hdrs is no longer needed, since the header
         accounting is taken care of by the reserve/refund sequence.
      3. Fixed a tiny grammatical problem in a comment.
      4. Added a new function "calc_reserved" to calculate the reserved
         log space.  This isn't entirely necessary, but it has two benefits:
         First, it simplifies the gfs2_log_refund function greatly.
         Second, it allows for easier debugging because I could sprinkle the
         code with calls to this function to make sure the accounting is
         proper (by adding asserts and printks) at strategic points in the code.
      5. In log_pull_tail there apparently was a kludge to fix up the
         accounting based on a "pull" parameter.  The buffer accounting is
         now done properly, so the kludge was removed.
      6. File sync operations were making a call to gfs2_log_flush that
         writes another journal header.  Since that header was unplanned
         for (reserved) by the reserve/refund sequence, the free space had
         to be decremented so that when log_pull_tail gets called, the free
         space is adjusted properly.  (Did I hear you call that a kludge?
         Well, maybe, but a lot more justifiable than the one I removed).
      7. In the gfs2_log_shutdown code, it optionally syncs the log by
         specifying the PULL parameter to log_write_header.  I'm not sure
         this is necessary anymore.  It just seems to me there could be
         cases where shutdown is called while there are outstanding log
         buffers.
      8. In the (data)buf_lo_before_commit functions, I changed some offset
         values from being calculated on the fly to being constants.  That
         simplified some code and we might as well let the compiler do the
         calculation once rather than redoing those cycles at run time.
      9. This version has my rewritten databuf_lo_add function.
         This version is much more like its predecessor, buf_lo_add, which
         makes it easier to understand.  Again, this might not be necessary,
         but it seems as if this one works as well as the previous one,
         maybe even better, so I decided to leave it in.
      10. In databuf_lo_before_commit, a previous data corruption problem
         was caused by going off the end of the buffer.  The proper solution
         is to have the proper limit in place, rather than stopping earlier.
         (Thus my previous attempt to fix it is wrong).
         If you don't wrap the buffer, you're stopping too early and that
         causes more log buffer accounting problems.
      11. In lops.h there are two new (previously mentioned) constants for
         figuring out the data offset for the journal buffers.
      12. There are also two new functions, buf_limit and databuf_limit to
         calculate how many entries will fit in the buffer.
      13. In function gfs2_meta_wipe, it needs to distinguish between pinned
         metadata buffers and journaled data buffers for proper journal buffer
         accounting.  It can't use the JDATA gfs2_inode flag because it's
         sometimes passed the "real" inode and sometimes the "metadata
         inode" and the inode flags will be random bits in a metadata
         gfs2_inode.  It needs to base its decision on which was passed in.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • R
      [GFS2] Journaled file write/unstuff bug · 8fb68595
      Robert Peterson committed
      This patch is for bugzilla bug 283162, which uncovered a number of
      bugs pertaining to writing to files that have the journaled bit on.
      These bugs happen most often when writing to the meta_fs because
      the files are always journaled.  So operations like gfs2_grow were
      particularly vulnerable, although many of the problems could be
      recreated with normal files after setting the journaled bit on.
      The problems fixed are:
      
      -GFS2 wasn't ever writing unstuffed journaled data blocks to their
       in-place location on disk. Now it does.
      
      -If you unmounted too quickly after doing IO to a journaled file,
       GFS2 was crashing because you would discard a buffer whose bufdata
       was still on the active items list.  GFS2 now deals with this
       gracefully.
      
      -GFS2 was losing track of the bufdata for journaled data blocks,
       and it wasn't getting freed, causing an error when you tried to
       unmount the module.  GFS2 now frees all the bufdata structures.
      
      -There was a memory corruption occurring because GFS2 wrote
       twice as many log entries for journaled buffers.
      
      -It was occasionally trying to write journal headers in buffers
       that weren't currently mapped.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • B
      [GFS2] fix jdata issues · ddf4b426
      Benjamin Marzinski committed
      This is a patch for the first three issues of RHBZ #238162
      
      The first issue is that when you allocate a new page for a file, it will not
      start off uptodate. This makes sense, since you haven't written anything to that
      part of the file yet.  Unfortunately, gfs2_pin() checks to make sure that the
      buffers are uptodate.  The solution to this is to mark the buffers uptodate in
      gfs2_commit_write(), after they have been zeroed out and have the data written
      into them.  I'm pretty confident with this fix, although it's not completely
      obvious that there is no problem with marking the buffers uptodate here.
      
      The second issue is simply that you can try to pin a data buffer that is already
      on the incore log, and thus, already pinned. This patch checks to see if this
      buffer is already on the log, and exits databuf_lo_add() if it is, just like
      buf_lo_add() does.
      
      The third issue is that gfs2_log_flush() doesn't do its block accounting
      correctly.  Both metadata and journaled data are logged, but gfs2_log_flush()
      only compares the number of metadata blocks with the number of blocks to commit
      to the ondisk journal.  This patch also counts the journaled data blocks.
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • S
      [GFS2] Make the log reserved blocks depend on block size · 89918647
      Steven Whitehouse committed
      The number of blocks which we reserve in the log at the start of each
      transaction needs to depend upon the block size since the overhead is
      related to the number of "pointers" which can be fitted into a single
      block.
      
      This relates to Red Hat bz #240435
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  3. 30 Nov, 2006 3 commits
    • R
      [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning · aed3255f
      Ryusuke Konishi committed
      Fix a printk format warning in fs/gfs2/log.c:
      fs/gfs2/log.c:322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t'
      Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
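The warning above arises because the width of sector_t depends on the build configuration, while '%llu' always expects unsigned long long. A minimal userspace sketch of the usual remedy, an explicit cast at the call site, follows. The typedef, the function name format_sector and the message text are illustrative assumptions, not the code from the patch, and snprintf stands in for printk.

```c
#include <stdio.h>

/* Assumption for illustration: in the kernel the width of sector_t is
 * configuration dependent, which is what triggers the warning. */
typedef unsigned long sector_t;

/* Hypothetical helper: format a block number with %llu. The explicit cast
 * makes the argument type match the format on every configuration. */
int format_sector(char *buf, size_t len, sector_t blk)
{
    return snprintf(buf, len, "block %llu", (unsigned long long)blk);
}
```

The cast costs nothing at run time where the two types already have the same width and widens the value where they do not.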
    • S
      [GFS2] Move gfs2_meta_syncfs() into log.c · a25311c8
      Steven Whitehouse committed
      By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start()
      can be made static.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • S
      [GFS2] Fix journal flush problem · b004157a
      Steven Whitehouse committed
      This fixes a bug which resulted in poor performance due to flushing
      the journal too often. The code path in question was via the inode_go_sync()
      function in glops.c. The solution is not to flush the journal immediately
      when inodes are ejected from memory, but batch up the work for glockd to
      deal with later on. This means that glocks may now live on beyond the end of
      the lifetime of their inodes (but not very much longer in the normal case).
      
      Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in
      calculation of the number of free journal blocks.
      
      The gfs2_logd process has been altered to be more responsive to the journal
      filling up. We now wake it up when the number of uncommitted journal blocks
      has reached the threshold level rather than trying to flush directly at the
      end of each transaction. This again means doing fewer, but larger, log
      flushes in general.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  4. 20 Oct, 2006 1 commit
  5. 13 Oct, 2006 1 commit
  6. 03 Oct, 2006 1 commit
  7. 19 Sep, 2006 3 commits
    • S
      [GFS2] Use list_for_each_entry_safe_reverse in gfs2_ail1_start() · 74669416
      Steven Whitehouse committed
      This is an attempt to fix Red Hat bz 204364. I don't hit it all
      the time, but with these changes, running postmark which used to
      trigger it on a regular basis no longer appears to. So I'm not
      saying that it's 100% certain that it's fixed, but it does look
      promising at the moment.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
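The commit title names list_for_each_entry_safe_reverse, the kernel list iterator that walks a list from tail to head while tolerating removal of the current entry. The sketch below reproduces that pattern in plain userspace C, assuming a minimal circular list with a sentinel head. The struct node type and the helper names are illustrative and are not the kernel's list.h API or the gfs2_ail1_start() code.

```c
#include <stddef.h>

/* Minimal circular doubly linked list; head is a sentinel node. */
struct node {
    struct node *prev, *next;
    int value;
};

static inline void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static inline void list_add_tail(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static inline void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Walk from tail to head and unlink every node with an even value.
 * tmp always holds the predecessor of pos, saved before the loop body
 * runs, so unlinking pos cannot break the traversal. */
static inline int remove_even_reverse(struct node *head)
{
    int removed = 0;
    struct node *pos, *tmp;

    for (pos = head->prev, tmp = pos->prev; pos != head;
         pos = tmp, tmp = pos->prev) {
        if (pos->value % 2 == 0) {
            list_del(pos);
            removed++;
        }
    }
    return removed;
}
```

Saving the predecessor only protects against removal of the current entry; if other entries can disappear concurrently, additional locking is still needed.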
    • F
      [GFS2] Export lm_interface to kernel headers · 7d308590
      Fabio Massimo Di Nitto committed
      
      lm_interface.h has a few out-of-tree clients such as GFS1
      and userland tools.
      
      Right now, these clients keep a copy of the file in their build tree
      that can go out of sync.
      
      Move lm_interface.h to include/linux, export it to userland and
      clean up fs/gfs2 to use the new location.
      Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • S
      [GFS2] Map multiple blocks at once where possible · 7a6bbacb
      Steven Whitehouse committed
      This is a tidy up of the GFS2 bmap code. The main change is that the
      bh is passed to gfs2_block_map allowing the flags to be set directly
      rather than having to repeat that code several times in ops_address.c.
      
      At the same time, the extent mapping code from gfs2_extent_map has
      been moved into gfs2_block_map. This allows all calls to gfs2_block_map
      to map extents in the case that no allocation is taking place. As a
      result reads and non-allocating writes should be faster. A quick test
      with postmark appears to support this.
      
      There is a limit on the number of blocks mapped in a single bmap
      call in that it will only ever map blocks which are pointed to
      from a single pointer block. So in other words, it will never try
      to do additional i/o in order to satisfy read-ahead. The maximum
      number of blocks is thus somewhat less than 512 (the GFS2 4k block
      size minus the header divided by sizeof(u64)). I've further limited
      the mapping of "normal" blocks to 32 blocks (to avoid extra work)
      since readpages() will currently read a maximum of 32 blocks ahead (128k).
      
      Some further work will probably be needed to set a suitable value
      for DIO as well, but for now that's left at the maximum 512 (see
      ops_address.c:gfs2_get_block_direct).
      
      There is probably a lot more that can be done to improve bmap for GFS2,
      but this is a good first step.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
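The mapping limit described in the message above can be worked through as a small calculation. The sketch below is illustrative only: the helper names are hypothetical, the header size is a parameter because the message does not state it, and the direct I/O branch simply allows everything left in the pointer block.

```c
#include <stddef.h>
#include <stdint.h>

/* Cap for "normal" mappings, taken from the commit message (32 blocks). */
#define NORMAL_MAP_LIMIT 32u

/* Number of u64 block pointers that fit in one pointer block.
 * header_size is an assumed parameter; the message only says "the header". */
static inline size_t ptrs_per_block(size_t block_size, size_t header_size)
{
    return (block_size - header_size) / sizeof(uint64_t);
}

/* Clamp a requested extent to what a single bmap call may return:
 * never beyond the pointers left in the current pointer block, and
 * never beyond NORMAL_MAP_LIMIT unless the caller is doing direct I/O. */
static inline size_t clamp_extent(size_t wanted, size_t ptrs_left, int direct_io)
{
    size_t limit = ptrs_left;

    if (!direct_io && limit > NORMAL_MAP_LIMIT)
        limit = NORMAL_MAP_LIMIT;
    return wanted < limit ? wanted : limit;
}
```

With a 4096-byte block and, for example, a 24-byte header, ptrs_per_block gives 509, which is the "somewhat less than 512" figure the message refers to.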
  8. 13 Sep, 2006 1 commit
  9. 07 Sep, 2006 1 commit
  10. 06 Sep, 2006 1 commit
  11. 05 Sep, 2006 1 commit
  12. 01 Sep, 2006 1 commit
    • S
      [GFS2] Update copyright, tidy up incore.h · e9fc2aa0
      Steven Whitehouse committed
      As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this
      updates the copyright message to say "version" in full rather than
      "v.2". Also incore.h has been updated to remove forward structure
      declarations which are not required.
      
      The gfs2_quota_lvb structure has now had endianness annotations added
      to it. Also quota.c has been updated so that we now store the
      lvb data locally in endian-independent format to avoid needing
      a structure in host endianness too. As a result the endianness
      conversions are done as required at various points and thus the
      conversion routines in lvb.[ch] are no longer required. I've
      moved the one remaining constant in lvb.h that's used into lm.h
      and removed the unused lvb.[ch].
      
      I have not changed the HIF_ constants. That is left to a later patch
      which I hope will unify the gh_flags and gh_iflags fields of the
      struct gfs2_holder.
      
      Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
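Keeping data in one fixed byte order and converting only at the point of use, as the message describes for the quota lvb data, can be sketched portably as below. The sketch assumes big-endian as the fixed order and uses hypothetical helper names; the kernel has its own conversion helpers and annotated types for this.

```c
#include <stdint.h>

/* Store v into out[] most significant byte first, regardless of host order. */
static inline void put_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read a value back from its fixed-order byte representation. */
static inline uint64_t get_be64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}
```

Because only the fixed-order bytes are stored, no second host-order copy of the structure is needed; each reader converts when it actually uses a field.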
  13. 25 Aug, 2006 1 commit
    • B
      [GFS2] Fix journal off-by-one error · 5dc39fe6
      Benjamin Marzinski committed
      log_refund() incorrectly assumed that if a transaction had been touched, it
      always committed buffers to the incore log. Thus, when you got around to
      flushing the log, you would need one more block than you committed, to account
      for the header. So it automatically set reserved to 1, which had the effect of
      making sdp->sd_log_blks_reserved one greater when you got to gfs2_log_flush().
      However, if you don't actually commit anything to the incore log between
      flushes, you don't need the header, because you aren't writing anything out.
      With this patch, log_refund() only increments reserved to account for the
      header if something has been committed since the last flush.
      Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
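The off-by-one described above reduces to one rule: the extra header block is only needed when something was committed since the last flush. A minimal sketch of that rule follows; the function name is hypothetical and the real log_refund() tracks considerably more state.

```c
/* Blocks that the next flush will need, given how many blocks were
 * committed to the incore log since the previous flush.
 * Nothing committed: nothing is written, so no header is needed.
 * Otherwise: the committed blocks plus one block for the header. */
static inline unsigned blocks_needed_at_flush(unsigned committed)
{
    if (committed == 0)
        return 0;
    return committed + 1;
}
```

The behaviour being fixed corresponds to returning committed + 1 unconditionally, which over-reserves by one block whenever a transaction was touched but committed nothing.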
  14. 05 Aug, 2006 1 commit
    • S
      [GFS2] Fix lock ordering bug in page fault path · 59a1cc6b
      Steven Whitehouse committed
      Mmapped files were able to trigger a lock ordering bug. Private
      maps do not need to take the glock so early on. Shared maps do,
      unfortunately; however, we can get around that by adding a flag
      into the flags for the struct gfs2_file. This only works because
      we are taking an exclusive lock at this point, so we know that
      nobody else can be racing with us.
      
      Fixes Red Hat bugzilla: #201196
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  15. 17 Jul, 2006 1 commit
  16. 06 Jul, 2006 1 commit
  17. 15 Jun, 2006 1 commit
    • S
      [GFS2] Fix unlinked file handling · feaa7bba
      Steven Whitehouse committed
      This patch fixes the way we have been dealing with unlinked,
      but still open files. It removes all limits (other than memory
      for inodes, as per every other filesystem) on numbers of these
      which we can support on GFS2. It also means that (like other
      fs) it's the responsibility of the last process to close the file
      to deallocate the storage, rather than the person who did the
      unlinking. Note that with GFS2, those two events might take place
      on different nodes.
      
      Also there are a number of other changes:
      
       o We use the Linux inode subsystem as it was intended to be
      used, wrt allocating GFS2 inodes
       o The Linux inode cache is now the point which we use for
      local enforcement of only holding one copy of the inode in
      core at once (previous to this we used the glock layer).
       o We no longer use the unlinked "special" file. We just ignore it
      completely. This makes unlinking more efficient.
       o We now use the 4th block allocation state. The previously unused
      state is used to track unlinked but still open inodes.
       o gfs2_inoded is no longer needed
       o Several fields are now no longer needed (and removed) from the in
      core struct gfs2_inode
       o Several fields are no longer needed (and removed) from the in core
      superblock
      
      There are a number of future possible optimisations and clean ups
      which have been made possible by this patch.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  18. 19 May, 2006 2 commits
  19. 06 May, 2006 1 commit
    • S
      [GFS2] Readpages support · fd88de56
      Steven Whitehouse committed
      This adds readpages support (and also corrects a small bug in
      the readpage error path at the same time). Hopefully this will
      improve performance by allowing GFS to submit larger lumps of
      I/O at a time.
      
      In order to simplify the setting of BH_Boundary, it currently gets
      set when we hit the end of an indirect pointer block. There is
      always a boundary at this point with the current allocation code.
      It doesn't get all the boundaries right though, so there is still
      room for improvement in this.
      
      See comments in fs/gfs2/ops_address.c for further information about
      readpages with GFS2.
      
      Signed-off-by: Steven Whitehouse
  20. 22 Apr, 2006 1 commit
  21. 21 Apr, 2006 1 commit
  22. 12 Apr, 2006 1 commit
    • S
      [GFS2] Update journal accounting code. · f4154ea0
      Steven Whitehouse committed
      A small update to the journaling code to change the way that
      the "extra" blocks are accounted for in the journal. These are
      used at a rate of one per 503 metadata blocks or one per 251
      journaled data blocks (or just one if the total number of journaled
      blocks in the transaction is smaller). Since we are using them at
      two different rates the old method of accounting for them no longer
      works and we count them up as required.
      
      Since the "per transaction" accounting can't handle this (there is no
      fixed number of header blocks per transaction) we have to account for
      it in the general journal code. We now require that each transaction
      reserves more blocks than it actually needs to take account of the
      possible extra blocks.
      
      Also a final fix to dir.c to ensure that all ref counts are handled
      correctly.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
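The two rates quoted in the message (one extra block per 503 metadata blocks, one per 251 journaled data blocks) lend themselves to a short worked calculation. The sketch below counts each kind separately and rounds up, which is one reading of the description rather than the accounting code from the patch; the constant and function names are hypothetical.

```c
#include <stddef.h>

#define META_PER_EXTRA  503u  /* metadata blocks covered by one extra block */
#define JDATA_PER_EXTRA 251u  /* journaled data blocks covered by one extra block */

/* Integer division that rounds up. */
static inline size_t round_up_div(size_t n, size_t d)
{
    return (n + d - 1) / d;
}

/* Extra log blocks a transaction would need for nmeta metadata blocks and
 * njdata journaled data blocks, counting each kind at its own rate. */
static inline size_t extra_log_blocks(size_t nmeta, size_t njdata)
{
    return round_up_div(nmeta, META_PER_EXTRA) +
           round_up_div(njdata, JDATA_PER_EXTRA);
}
```

Under this reading 503 metadata blocks need one extra block and 504 need two, which shows why a single fixed per-transaction figure cannot cover both rates.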
  23. 08 Apr, 2006 1 commit
  24. 07 Apr, 2006 1 commit
    • S
      [GFS2] Fix a ref count bug and other clean ups · b09e593d
      Steven Whitehouse committed
      This fixes a ref count bug that sometimes showed up at umount time
      (causing it to hang) but is otherwise mostly harmless. At the same
      time there are some clean ups including making the log operations
      structures const, moving a memory allocation so that it's not done
      in the fast path of checking to see if there is an outstanding
      transaction related to a particular glock.
      
      Removes the sd_log_wrap variable which was updated, but never actually
      used anywhere. Updates the gfs2 ioctl() to run without the kernel lock
      (which it never needed anyway). Removes the "invalidate inodes" loop
      from GFS2's put_super routine. This is done in kill super anyway so
      we don't need to do it here. The loop was also bogus in that if there
      are any inodes "stuck" at this point it's a bug and we need to know
      about it rather than hide it by hanging forever.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  25. 31 Mar, 2006 1 commit
  26. 29 Mar, 2006 2 commits
    • S
      [GFS2] Update locking in log.c · 484adff8
      Steven Whitehouse committed
      Replace the lock_for_trans()/lock_for_flush() functions with an rwsem.
      In fact the sd_log_flush_lock becomes an rwsem (the write part of it)
      and is extended slightly to cover everything that the lock_for_flush()
      used to cover. The read part of the lock replaces lock_for_trans().
      
      This corrects the races in the original code and reduces the code size.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • S
      [GFS2] Further updates to dir and logging code · 71b86f56
      Steven Whitehouse committed
      This reduces the size of the directory code by about 3k and gets
      readdir() to use the functions which were introduced in the previous
      directory code update.
      
      Two memory allocations are merged into one. Eliminates zeroing of some
      buffers which were never used before they were initialised by
      other data.
      
      There is still scope for further improvement in the directory code.
      
      On the logging side, a hand created mutex has been replaced by a
      standard Linux mutex in the log allocation code.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  27. 02 Mar, 2006 1 commit
  28. 28 Feb, 2006 1 commit
    • S
      [GFS2] Macros removal in gfs2.h · 5c676f6d
      Steven Whitehouse committed
      As suggested by Pekka Enberg <penberg@cs.helsinki.fi>.
      
      The DIV_RU macro is renamed DIV_ROUND_UP and moved to kernel.h.
      The other macros are gone from gfs2.h, as (although not requested
      by Pekka Enberg) are a number of included header files, which are now
      included individually. The inode number comparison function is
      now an inline function.
      
      The DT2IF and IF2DT may be addressed in a future patch.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
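For readers unfamiliar with the macro mentioned above, DIV_ROUND_UP performs integer division that rounds up. The sketch below shows the usual formulation together with a hypothetical helper, blocks_for_bytes, added only to illustrate a typical use; it is not part of the patch.

```c
#include <stddef.h>

/* Round-up integer division in its usual formulation.
 * Note that both arguments are evaluated more than once or combined
 * arithmetically, so they should be free of side effects. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of blocks needed to hold nbytes. */
static inline size_t blocks_for_bytes(size_t nbytes, size_t block_size)
{
    return DIV_ROUND_UP(nbytes, block_size);
}
```

For example, 4097 bytes need two 4096-byte blocks, whereas plain integer division would give one.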