1. 09 7月, 2007 22 次提交
    • A
      [GFS2] Fix deallocation issues · d93cfa98
      Abhijith Das 提交于
      There were two issues during deallocation of unlinked inodes. The
      first was relating to the use of a "try" lock which in the case of
      the inode lock wasn't trying hard enough to deallocate in all
      circumstances (now changed to a normal glock) and in the case of
      the iopen lock didn't wait for the demotion of the shared lock before
      attempting to get the exclusive lock, and thereby sometimes (timing dependent)
      not completing the deallocation when it should have done.
      
      The second issue related to the lack of a way to invalidate dcache entries
      on remote nodes (now fixed by this patch) which meant that unlinks were
      taking a long time to return disk space to the fs. By adding some code to
      invalidate the dcache entries across the cluster for unlinked inodes, that
      is now fixed.
      
      This patch was written jointly by Abhijith Das and Steven Whitehouse.
      Signed-off-by: NAbhijith Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d93cfa98
    • D
      [GFS2] return conflicts for GETLK · a7a2ff8a
      David Teigland 提交于
      We weren't returning the correct result when GETLK found a conflict,
      which is indicated by userspace passing back a 1.
      
      Signed-off-by: Abhijith Das <adas redhat com>
      Signed-off-by: David Teigland <teigland redhat com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      a7a2ff8a
    • D
      [GFS2] set plock owner in GETLK info · d88101d4
      David Teigland 提交于
      Set the owner field in the plock info sent to userspace for GETLK.
      Without this, gfs_controld won't correctly see when the GETLK from a
      process matches one of the process's existing locks.
      Signed-off-by: NDavid Teigland <teigland@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d88101d4
    • A
      [GFS2] gfs2_lookupi() uninitialised var fix · 037bcbb7
      akpm@linux-foundation.org 提交于
      fs/gfs2/inode.c: In function 'gfs2_lookupi':
      fs/gfs2/inode.c:392: warning: 'error' may be used uninitialized in this function
      
      Looks like a real bug to me.
      
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      037bcbb7
    • S
      [GFS2] Recovery for lost unlinked inodes · c8cdf479
      Steven Whitehouse 提交于
      Under certain circumstances its possible (though rather unlikely) that
      inodes which were unlinked by one node while still open on another might
      get "lost" in the sense that they don't get deallocated if the node
      which held the inode open crashed before it was unlinked.
      
      This patch adds the recovery code which allows automatic deallocation of
      the inode if its found during block allocation (the sensible time to
      look for such inodes since we are scanning the rgrp's bitmaps anyway at
      this time, so it adds no overhead to do this).
      
      Since the inode will have had its i_nlink set to zero, all we need to
      trigger recovery is a lookup and an iput(), and the normal deallocation
      code takes care of the rest.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      c8cdf479
    • R
      [GFS2] Can't mount GFS2 file system on AoE device · b35997d4
      Robert Peterson 提交于
      This patch fixes bug 243131: Can't mount GFS2 file system on AoE device.
      When using AoE devices with lock_nolock, there is no locking table, so
      gfs2 (and gfs1) uses the superblock s_id.  This turns out to be the device
      name in some cases.  In the case of AoE, the device contains a slash,
      (e.g. "etherd/e1.1p2") which is an invalid character when we try to
      register the table in sysfs.  This patch replaces the "/" with underscore.
      Rather than add a new variable to the stack, I'm just reusing a (char *)
      variable that's no longer used: table.
      
      This code has been tested on the failing system using a RHEL5 patch.
      The upstream code was tested by using gfs2_tool sb to interject a "/"
      into the table name of a clustered gfs2 file system.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b35997d4
    • S
      [GFS2] Fix bug in error path of inode · e1cc8603
      Steven Whitehouse 提交于
      This fixes a bug in the ordering of operations in the error path of
      createi. Its not valid to do an iput() when holding the inode's glock
      since the iput() will (in this case) result in delete_inode() being
      called which needs to grab the lock itself. This was causing the
      recursive lock checking code to trigger.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e1cc8603
    • S
      [GFS2] Fix typo in rename of directories · ffed8ab3
      Steven Whitehouse 提交于
      A typo caused us to pass a NULL pointer when renaming directories. It
      was accidentally introduced in: [GFS2] Clean up inode number handling
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ffed8ab3
    • P
      [DLM] variable allocation · 44f487a5
      Patrick Caulfield 提交于
      Add a new flag, DLM_LSFL_FS, to be used when a file system creates a lockspace.
      This flag causes the dlm to use GFP_NOFS for allocations instead of GFP_KERNEL.
      (This updated version of the patch uses gfp_t for ls_allocation.)
      Signed-Off-By: NPatrick Caulfield <pcaulfie@redhat.com>
      Signed-Off-By: NDavid Teigland <teigland@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      44f487a5
    • S
      [GFS2] Add nanosecond timestamp feature · 4bd91ba1
      Steven Whitehouse 提交于
      This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
      to the way that the on-disk format works, older filesystems will just
      appear to have this field set to zero. When mounted by an older version
      of GFS2, the filesystem will simply ignore the extra fields so that
      it will again appear to have whole second resolution, so that its
      trivially backward compatible.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      4bd91ba1
    • S
      [GFS2] Fix sign problem in quota/statfs and cleanup _host structures · bb8d8a6f
      Steven Whitehouse 提交于
      This patch fixes some sign issues which were accidentally introduced
      into the quota & statfs code during the endianess annotation process.
      Also included is a general clean up which moves all of the _host
      structures out of gfs2_ondisk.h (where they should not have been to
      start with) and into the places where they are actually used (often only
      one place). Also those _host structures which are not required any more
      are removed entirely (which is the eventual plan for all of them).
      
      The conversion routines from ondisk.c are also moved into the places
      where they are actually used, which for almost every one, was just one
      single place, so all those are now static functions. This also cleans up
      the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.
      
      The net result is a reduction of about 100 lines of code, many functions
      now marked static plus the bug fixes as mentioned above. For good
      measure I ran the code through sparse after making these changes to
      check that there are no warnings generated.
      
      This fixes Red Hat bz #239686
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      bb8d8a6f
    • B
      [GFS2] fix jdata issues · ddf4b426
      Benjamin Marzinski 提交于
      This is a patch for the first three issues of RHBZ #238162
      
      The first issue is that when you allocate a new page for a file, it will not
      start off uptodate. This makes sense, since you haven't written anything to that
      part of the file yet.  Unfortunately, gfs2_pin() checks to make sure that the
      buffers are uptodate.  The solution to this is to mark the buffers uptodate in
      gfs2_commit_write(), after they have been zeroed out and have the data written
      into them.  I'm pretty confident with this fix, although it's not completely
      obvious that there is no problem with marking the buffers uptodate here.
      
      The second issue is simply that you can try to pin a data buffer that is already
      on the incore log, and thus, already pinned. This patch checks to see if this
      buffer is already on the log, and exits databuf_lo_add() if it is, just like
      buf_lo_add() does.
      
      The third issue is that gfs2_log_flush() doesn't do it's block accounting
      correctly.  Both metadata and journaled data are logged, but gfs2_log_flush()
      only compares the number of metadata blocks with the number of blocks to commit
      to the ondisk journal.  This patch also counts the journaled data blocks.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ddf4b426
    • S
      [GFS2] Make the log reserved blocks depend on block size · 89918647
      Steven Whitehouse 提交于
      The number of blocks which we reserve in the log at the start of each
      transaction needs to depends upon the block size since the overhead is
      related to the number of "pointers" which can be fitted into a single
      block.
      
      This relates to Red Hat bz #240435
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      89918647
    • A
      [GFS2] Quotas non-functional - fix another bug · 1990e917
      Abhijith Das 提交于
      This patch fixes a bug where gfs2 was writing update quota usage
      information to the wrong location in the quota file.
      Signed-off-by: NAbhijith Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1990e917
    • A
      [GFS2] Quotas non-functional - fix bug · 2a87ab08
      Abhijith Das 提交于
      This patch fixes an error in the quota code where a 'struct
      gfs2_quota_lvb*' was being passed to gfs2_adjust_quota() instead of a
      'struct gfs2_quota_data*'. Also moved 'struct gfs2_quota_lvb' from
      fs/gfs2/incore.h to include/linux/gfs2_ondisk.h as per Steve's suggestion.
      Signed-off-by: NAbhijith Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      2a87ab08
    • S
      [GFS2] Clean up inode number handling · dbb7cae2
      Steven Whitehouse 提交于
      This patch cleans up the inode number handling code. The main difference
      is that instead of looking up the inodes using a struct gfs2_inum_host
      we now use just the no_addr member of this structure. The tests relating
      to no_formal_ino can then be done by the calling code. This has
      advantages in that we want to do different things in different code
      paths if the no_formal_ino doesn't match. In the NFS patch we want to
      return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
      no_formal_ino doesn't match and thus we can withdraw in this case.
      
      In order to later fix bz #201012, we need to be able to look up an inode
      without knowing no_formal_ino, as the only information that is known to
      us is the on-disk location of the inode in question.
      
      This patch will also help us to fix bz #236099 at a later date by
      cleaning up a lot of the code in that area.
      
      There are no user visible changes as a result of this patch and there
      are no changes to the on-disk format either.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      dbb7cae2
    • S
      [GFS2] Reduce size of struct gdlm_lock · 41d7db0a
      Steven Whitehouse 提交于
      This patch removes the completion (which is rather large) from struct
      gdlm_lock in favour of using the wait_on_bit() functions. We don't need
      to add any extra fields to the structure to do this, so we save 32 bytes
      (on x86_64) per structure. This adds up to quite a lot when we may
      potentially have millions of these lock structures,
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Acked-by: NDavid Teigland <teigland@redhat.com>
      41d7db0a
    • R
      [GFS2] Addendum patch 2 for gfs2_grow · cd81a4ba
      Robert Peterson 提交于
      This addendum patch 2 corrects three things:
      
      1. It fixes a stupid mistake in the previous addendum that broke gfs2.
         Ref: https://www.redhat.com/archives/cluster-devel/2007-May/msg00162.html
      2. It fixes a problem that Dave Teigland pointed out regarding the
         external declarations in ops_address.h being in the wrong place.
      3. It recasts a couple more %llu printks to (unsigned long long)
         as requested by Steve Whitehouse.
      
      I would have loved to put this all in one revised patch, but there was
      a rush to get some patches for RHEL5.	Therefore, the previous patches
      were applied to the git tree "as is" and therefore, I'm posting another
      addendum.  Sorry.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      cd81a4ba
    • N
      [GFS2] use zero_user_page · 0507ecf5
      Nate Diller 提交于
      Use zero_user_page() instead of open-coding it.
      Signed-off-by: NNate Diller <nate.diller@gmail.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      0507ecf5
    • R
      [GFS2] Kernel changes to support new gfs2_grow command (part 2) · 6c53267f
      Robert Peterson 提交于
      To avoid code redundancy, I separated out the operational "guts" into
      a new function called read_rindex_entry.  Then I made two functions:
      the closer-to-original gfs2_ri_update (without the special condition
      checks) and gfs2_ri_update_special that's designed with that condition
      in mind.  (I don't like the name, but if you have a suggestion, I'm
      all ears).
      
      Oh, and there's an added benefit:  we don't need all the ugly gotos
      anymore.  ;)
      
      This patch has been tested with gfs2_fsck_hellfire (which runs for
      three and a half hours, btw).
      Signed-off-By: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6c53267f
    • R
      [GFS2] kernel changes to support new gfs2_grow command · 7ae8fa84
      Robert Peterson 提交于
      This is another revision of my gfs2 kernel patch that allows
      gfs2_grow to function properly.
      
      Steve Whitehouse expressed some concerns about the previous
      patch and I restructured it based on his comments.
      The previous patch was doing the statfs_change at file close time,
      under its own transaction.  The current patch does the statfs_change
      inside the gfs2_commit_write function, which keeps it under the
      umbrella of the inode transaction.
      
      I can't call ri_update to re-read the rindex file during the
      transaction because the transaction may have outstanding unwritten
      buffers attached to the rgrps that would be otherwise blown away.
      So instead, I created a new function, gfs2_ri_total, that will
      re-read the rindex file just to total the file system space
      for the sake of the statfs_change.  The ri_update will happen
      later, when gfs2 realizes the version number has changed, as it
      happened before my patch.
      
      Since the statfs_change is happening at write_commit time and there
      may be multiple writes to the rindex file for one grow operation.
      So one consequence of this restructuring is that instead of getting
      one kernel message to indicate the change, you may see several.
      For example, before when you did a gfs2_grow, you'd get a single
      message like:
      
      GFS2: File system extended by 247876 blocks (968MB)
      
      Now you get something like:
      
      GFS2: File system extended by 207896 blocks (812MB)
      GFS2: File system extended by 39980 blocks (156MB)
      
      This version has also been successfully run against the hours-long
      "gfs2_fsck_hellfire" test that does several gfs2_grow and gfs2_fsck
      while interjecting file system damage.  It does this repeatedly
      under a variety Resource Group conditions.
      Signed-off-By: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7ae8fa84
    • B
      [GFS2] flush the glock completely in inode_go_sync · b524fe64
      Benjamin Marzinski 提交于
      Fix for bz #231910
      When filemap_fdatawrite() is called on the inode mapping in data=ordered mode,
      it will add the glock to the log. In inode_go_sync(), if you do the
      gfs2_log_flush() before this, after the filemap_fdatawrite() call, the glock
      and its associated data buffers will be on the log again. This means you can
      demote a lock from exclusive, without having it flushed from the log. The
      attached patch simply moves the gfs2_log_flush up to after the
      filemap_fdatawrite() call.
      
      Originally, I tried moving the gfs2_log_flush to after gfs2_meta_sync(), but
      that caused me to trip the following assert.
      
      GFS2: fsid=cypher-36:test.0: fatal: assertion "!buffer_busy(bh)" failed
      GFS2: fsid=cypher-36:test.0:   function = gfs2_ail_empty_gl, file = fs/gfs2/glops.c, line = 61
      
      It appears that gfs2_log_flush() puts some of the glocks buffers in the busy
      state and the filemap_fdatawrite() call is necessary to flush them. This makes
      me worry slightly that a related problem could happen because of moving the
      gfs2_log_flush() after the initial filemap_fdatawrite(), but I assume that
      gfs2_ail_empty_gl() would catch that case as well.
      Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b524fe64
  2. 22 5月, 2007 1 次提交
    • A
      Detach sched.h from mm.h · e8edc6e0
      Alexey Dobriyan 提交于
      First thing mm.h does is including sched.h solely for can_do_mlock() inline
      function which has "current" dereference inside. By dealing with can_do_mlock()
      mm.h can be detached from sched.h which is good. See below, why.
      
      This patch
      a) removes unconditional inclusion of sched.h from mm.h
      b) makes can_do_mlock() normal function in mm/mlock.c
      c) exports can_do_mlock() to not break compilation
      d) adds sched.h inclusions back to files that were getting it indirectly.
      e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
         getting them indirectly
      
      Net result is:
      a) mm.h users would get less code to open, read, preprocess, parse, ... if
         they don't need sched.h
      b) sched.h stops being dependency for significant number of files:
         on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
         after patch it's only 3744 (-8.3%).
      
      Cross-compile tested on
      
      	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
      	alpha alpha-up
      	arm
      	i386 i386-up i386-defconfig i386-allnoconfig
      	ia64 ia64-up
      	m68k
      	mips
      	parisc parisc-up
      	powerpc powerpc-up
      	s390 s390-up
      	sparc sparc-up
      	sparc64 sparc64-up
      	um-x86_64
      	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
      
      as well as my two usual configs.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e8edc6e0
  3. 17 5月, 2007 1 次提交
    • C
      Remove SLAB_CTOR_CONSTRUCTOR · a35afb83
      Christoph Lameter 提交于
      SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a35afb83
  4. 09 5月, 2007 2 次提交
  5. 08 5月, 2007 1 次提交
    • C
      slab allocators: Remove SLAB_DEBUG_INITIAL flag · 50953fe9
      Christoph Lameter 提交于
      I have never seen a use of SLAB_DEBUG_INITIAL.  It is only supported by
      SLAB.
      
      I think its purpose was to have a callback after an object has been freed
      to verify that the state is the constructor state again?  The callback is
      performed before each freeing of an object.
      
      I would think that it is much easier to check the object state manually
      before the free.  That also places the check near the code object
      manipulation of the object.
      
      Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
      compiled with SLAB debugging on.  If there would be code in a constructor
      handling SLAB_DEBUG_INITIAL then it would have to be conditional on
      SLAB_DEBUG otherwise it would just be dead code.  But there is no such code
      in the kernel.  I think SLUB_DEBUG_INITIAL is too problematic to make real
      use of, difficult to understand and there are easier ways to accomplish the
      same effect (i.e.  add debug code before kfree).
      
      There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
      clear in fs inode caches.  Remove the pointless checks (they would even be
      pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.
      
      This is the last slab flag that SLUB did not support.  Remove the check for
      unimplemented flags from SLUB.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      50953fe9
  6. 07 5月, 2007 2 次提交
  7. 03 5月, 2007 1 次提交
  8. 01 5月, 2007 10 次提交