1. 26 Jan, 2008 9 commits
    • ocfs2: Silence false lockdep warnings · 5fa0613e
      Committed by Jan Kara
      Create separate lockdep lock classes for system file's i_mutexes. They are
      used to guard allocations and similar things and thus rank differently
      than i_mutex of a regular file or directory.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • [PATCH 2/2] ocfs2: cluster aware flock() · 53fc622b
      Committed by Mark Fasheh
      Hook up ocfs2_flock(), using the new flock lock type in dlmglue.c. A new
      mount option, "localflocks", is added so that users can revert to the old
      functionality as needed.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: Local alloc window size changeable via mount option · 2fbe8d1e
      Committed by Sunil Mushran
      Local alloc is a performance optimization in ocfs2 in which a node
      takes a window of bits from the global bitmap and then uses that for
      all small local allocations. This window size is fixed to 8MB currently.
      This patch allows users to specify the window size in MB including
      disabling it by passing in 0. If the number specified is too large,
      the fs will use the default value of 8MB.
      
      mount -o localalloc=X /dev/sdX /mntpoint
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
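The selection rule described in the localalloc commit message can be sketched in a few lines of C. This is only an illustration, not the kernel code: `pick_local_alloc_mb` and its `max_mb` parameter are hypothetical names, and the commit message does not say how the upper bound is derived.

```c
/* Default window size in MB, as stated in the commit message. */
#define DEFAULT_LOCAL_ALLOC_MB 8u

/*
 * Hypothetical helper: choose the effective local alloc window in MB.
 * max_mb stands in for whatever upper bound the filesystem computes;
 * the commit message does not describe how that bound is derived.
 */
static unsigned int pick_local_alloc_mb(unsigned int requested_mb,
                                        unsigned int max_mb)
{
    if (requested_mb == 0)
        return 0;                       /* 0 disables local alloc */
    if (requested_mb > max_mb)
        return DEFAULT_LOCAL_ALLOC_MB;  /* too large: use the default */
    return requested_mb;
}
```

Under these assumptions, a request of 16 with a bound of 256 keeps 16, while a request of 1024 falls back to 8.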
    • ocfs2: Support commit= mount option · d147b3d6
      Committed by Mark Fasheh
      Mostly taken from ext3. This allows the user to set the jbd commit interval,
      in seconds. The default of 5 seconds stays the same, but now users can
      easily increase the commit interval. Typically, this would be increased in
      order to benefit performance at the expense of data-safety.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
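A user-space sketch of how a `commit=` token might be turned into an interval in seconds is shown below. The function name `parse_commit_option` is hypothetical, and treating `commit=0` as "use the default" is an assumption borrowed from ext3 behaviour rather than something the commit message states.

```c
#include <stdlib.h>
#include <string.h>

/* Default commit interval in seconds, as stated in the commit message. */
#define DEFAULT_COMMIT_SECS 5

/*
 * Hypothetical parser for a single option token such as "commit=30".
 * Returns the interval in seconds, or -1 if the token is not a
 * well-formed commit= option.
 */
static long parse_commit_option(const char *opt)
{
    static const char prefix[] = "commit=";
    size_t plen = sizeof(prefix) - 1;
    char *end;
    long secs;

    if (strncmp(opt, prefix, plen) != 0)
        return -1;                      /* some other option */
    if (opt[plen] == '\0')
        return -1;                      /* "commit=" with no value */

    secs = strtol(opt + plen, &end, 10);
    if (*end != '\0' || secs < 0)
        return -1;                      /* trailing junk or negative */

    /* Assumption (ext3-style): 0 selects the default interval. */
    return secs == 0 ? DEFAULT_COMMIT_SECS : secs;
}
```

A larger interval batches more work into each journal commit, which is the performance versus data-safety trade-off the message mentions.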
    • ocfs2: Initalize bitmap_cpg of ocfs2_super to be the maximum. · e9d578a8
      Committed by Tao Ma
      This value is initialized from global_bitmap->id2.i_chain.cl_cpg. If there
      is only 1 group, it will be equal to the total clusters in the volume. For
      online resize, it would then have to change on all the nodes in the
      cluster, which isn't easy, and there is no corresponding lock for it.
      
      bitmap_cpg is only used in 2 areas:
      1. Checking whether the suballoc is too large for us to allocate from the
         global bitmap, so it is little used. Now that the suballoc size is 2048,
         this situation is rarely met and the check is almost useless.
      2. Calculating which group a cluster belongs to. We use it during truncate
         to figure out which cluster group an extent belongs to. We should be OK
         if we increase it, though, as the cluster group calculated shouldn't
         change and we only ever have a small bitmap_cpg on file systems with a
         single cluster group.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
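The argument in point 2 of the bitmap_cpg commit can be checked with a small worked example. The sketch below assumes the group index is a plain integer division of the cluster number by the clusters-per-group value; `cluster_to_group` is a hypothetical name, and the figure 32256 used in the usage note is an arbitrary stand-in for "the maximum", not a value taken from the commit.

```c
#include <stdint.h>

/*
 * Index of the cluster group that holds a given cluster, assuming
 * groups are laid out back to back with cpg clusters each.
 */
static uint32_t cluster_to_group(uint32_t cluster, uint32_t cpg)
{
    return cluster / cpg;
}
```

On a volume with a single group of 1000 clusters, every cluster maps to group 0 with `cpg = 1000`, and it still maps to group 0 if `cpg` is raised to a larger value such as 32256. The result only changes once a cluster number reaches the larger `cpg`, which cannot happen while the volume has one group.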
    • ocfs2: Rename ocfs2_meta_[un]lock · e63aecb6
      Committed by Mark Fasheh
      Call this the "inode_lock" now, since it covers both data and meta data.
      This patch makes no functional changes.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: Remove data locks · c934a92d
      Committed by Mark Fasheh
      The meta lock now covers both meta data and data, so this just removes the
      now-redundant data lock.
      
      Combining locks saves us a round of lock mastery per inode and gives us one
      less lock to ping between nodes during read/write.
      
      We don't lose much - since meta locks were always held before a data lock
      (and at the same level) ordered writeout mode (the default) ensured that
      flushing for the meta data lock also pushed out data anyways.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: Remove mount/unmount votes · 34d024f8
      Committed by Mark Fasheh
      The node maps that are set/unset by these votes are no longer relevant, thus
      we can remove the mount and umount votes. Since those are the last two
      remaining votes, we can also remove the entire vote infrastructure.
      
      The vote thread has been renamed to the downconvert thread, and the small
      amount of functionality related to managing it has been moved into
      fs/ocfs2/dlmglue.c. All references to votes have been removed or updated.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: Remove fs dependency on ocfs2_heartbeat module · 6f7b056e
      Committed by Mark Fasheh
      Now that the dlm exposes domain information to us, we don't need generic
      node up / node down callbacks. And since the DLM is only telling us when a
      node goes down unexpectedly, we no longer need to optimize away node down
      callbacks via the umount map.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
  2. 28 Nov, 2007 1 commit
  3. 17 Oct, 2007 1 commit
  4. 13 Oct, 2007 3 commits
  5. 12 Sep, 2007 1 commit
    • [PATCH] ocfs2: fix mount option parsing · c0123ade
      Committed by Tiger Yang
      For some mount option types, ocfs2_parse_options() will try to access
      sb->s_fs_info to get at the ocfs2 private superblock. Unfortunately, that
      hasn't been allocated yet and will cause a kernel crash.
      
      Fix this by storing options in a struct which can then get pushed into the
      ocfs2_super once it's been allocated later. If we need more options which
      store to the ocfs2_super in the future, we can just add fields to this struct.
      Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
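The two-step pattern described in the mount option parsing fix (parse into a standalone struct first, then copy into the superblock once it exists) can be sketched in user space as follows. All names here (`parsed_options`, `fs_super`, `parse_options`, `fill_super`) are hypothetical stand-ins, and the single `atime_quantum` field with a default of 60 is only an illustrative choice of option.

```c
#include <stdio.h>
#include <stdlib.h>

/* Options are collected here before any superblock exists. */
struct parsed_options {
    unsigned int atime_quantum;
};

/* Stand-in for the private superblock allocated later during mount. */
struct fs_super {
    unsigned int atime_quantum;
};

/* Step 1: parse into the plain struct; no superblock access needed. */
static int parse_options(const char *data, struct parsed_options *opts)
{
    opts->atime_quantum = 60;           /* illustrative default */
    if (data == NULL || *data == '\0')
        return 0;
    if (sscanf(data, "atime_quantum=%u", &opts->atime_quantum) != 1)
        return -1;                      /* unrecognized option */
    return 0;
}

/* Step 2: allocate the superblock, then push the parsed values in. */
static struct fs_super *fill_super(const char *data)
{
    struct parsed_options opts;
    struct fs_super *sb;

    if (parse_options(data, &opts) != 0)
        return NULL;

    sb = calloc(1, sizeof(*sb));
    if (sb == NULL)
        return NULL;

    sb->atime_quantum = opts.atime_quantum;
    return sb;
}
```

Because the parser never touches the superblock, it is safe to call before allocation, and supporting a new option only means adding a field to the intermediate struct and one more copy in step 2.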
  6. 10 Aug, 2007 3 commits
  7. 20 Jul, 2007 1 commit
    • mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Committed by Paul Mundt
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  8. 11 Jul, 2007 2 commits
  9. 17 May, 2007 1 commit
    • Remove SLAB_CTOR_CONSTRUCTOR · a35afb83
      Committed by Christoph Lameter
      SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven French <sfrench@us.ibm.com>
      Cc: Michael Halcrow <mhalcrow@us.ibm.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@ucw.cz>
      Cc: David Chinner <dgc@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 08 May, 2007 1 commit
    • slab allocators: Remove SLAB_DEBUG_INITIAL flag · 50953fe9
      Committed by Christoph Lameter
      I have never seen a use of SLAB_DEBUG_INITIAL.  It is only supported by
      SLAB.
      
      I think its purpose was to have a callback after an object has been freed
      to verify that the state is the constructor state again?  The callback is
      performed before each freeing of an object.
      
      I would think that it is much easier to check the object state manually
      before the free.  That also places the check near the code that
      manipulates the object.
      
      Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
      compiled with SLAB debugging on.  If there would be code in a constructor
      handling SLAB_DEBUG_INITIAL then it would have to be conditional on
      SLAB_DEBUG otherwise it would just be dead code.  But there is no such code
      in the kernel.  I think SLAB_DEBUG_INITIAL is too problematic to make real
      use of, difficult to understand and there are easier ways to accomplish the
      same effect (i.e.  add debug code before kfree).
      
      There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
      clear in fs inode caches.  Remove the pointless checks (they would even be
      pointless without removal of SLAB_DEBUG_INITIAL) from the fs constructors.
      
      This is the last slab flag that SLUB did not support.  Remove the check for
      unimplemented flags from SLUB.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 03 May, 2007 1 commit
  12. 27 Apr, 2007 3 commits
    • ocfs2: Cache extent records · 83418978
      Committed by Mark Fasheh
      The extent map code was ripped out earlier because of an inability to deal
      with holes. This patch adds back a simpler caching scheme requiring far less
      code.
      
      Our old extent map caching was designed back when meta data block caching in
      Ocfs2 didn't work very well, resulting in many disk reads. These days our
      metadata caching is much better, resulting in no un-necessary disk reads. As
      a result, extent caching doesn't have to be as fancy, nor does it have to
      cache as many extents. Keeping the last 3 extents seen should be sufficient
      to give us a small performance boost on some streaming workloads.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
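The "keep the last 3 extents seen" idea from the extent caching commit might look roughly like the sketch below. It is a simplified illustration, not the ocfs2 implementation: the struct layout, the field names (`cpos` for logical cluster offset, `start` for physical start cluster) and both helper functions are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHED_EXTENTS 3

/* One cached mapping from a logical cluster range to physical clusters. */
struct extent_rec {
    uint32_t cpos;      /* first logical cluster covered */
    uint32_t clusters;  /* length of the extent in clusters */
    uint64_t start;     /* first physical cluster */
};

struct extent_cache {
    unsigned int count;                     /* valid entries in recs[] */
    struct extent_rec recs[CACHED_EXTENTS]; /* most recent first */
};

/* Remember an extent, dropping the oldest entry when the cache is full. */
static void cache_insert(struct extent_cache *c, const struct extent_rec *rec)
{
    unsigned int keep = c->count < CACHED_EXTENTS ? c->count
                                                  : CACHED_EXTENTS - 1;

    /* Shift the surviving entries down by one slot. */
    memmove(&c->recs[1], &c->recs[0], keep * sizeof(c->recs[0]));
    c->recs[0] = *rec;
    c->count = keep + 1;
}

/* Translate a logical cluster to a physical one if a cached extent covers it. */
static bool cache_lookup(const struct extent_cache *c, uint32_t cpos,
                         uint64_t *phys)
{
    for (unsigned int i = 0; i < c->count; i++) {
        const struct extent_rec *r = &c->recs[i];

        if (cpos >= r->cpos && cpos - r->cpos < r->clusters) {
            *phys = r->start + (cpos - r->cpos);
            return true;
        }
    }
    return false;   /* miss: the caller falls back to a full tree lookup */
}
```

A streaming read tends to hit the most recently inserted extent repeatedly, which is why such a tiny cache can still help on that kind of workload.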
    • ocfs2: temporarily remove extent map caching · 363041a5
      Committed by Mark Fasheh
      The code in extent_map.c is not prepared to deal with a subtree being
      rotated between lookups. This can happen when filling holes in sparse files.
      Instead of a lengthy patch to update the code (which would likely lose the
      benefit of caching subtree roots), we remove most of the algorithms and
      implement a simple path based lookup. A less ambitious extent caching scheme
      will be added in a later patch.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: Remove delete inode vote · 50008630
      Committed by Tiger Yang
      Ocfs2 currently does cluster-wide node messaging to check the open state of
      an inode during delete. This patch removes that mechanism in favor of an
      inode cluster lock which is taken at shared read when an inode is first read
      and dropped in clear_inode(). This allows a deleting node to test the
      liveness of an inode by attempting to take an exclusive lock.
      Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
      Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
      50008630
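The liveness test described in the delete inode commit relies on a basic property of shared and exclusive locks: an exclusive request can only succeed when nobody holds the lock shared. The toy model below shows just that property in a single process. It is not a cluster lock and not the ocfs2 code; `open_lock` and its helpers are hypothetical, and the shared holders stand for the other nodes that still have the inode in memory.

```c
#include <stdbool.h>

/* Toy in-memory stand-in for a per-inode cluster lock. */
struct open_lock {
    unsigned int shared_holders;  /* nodes that still have the inode read in */
    bool exclusive;               /* set once a deleter owns the lock */
};

/* Taken when an inode is first read. */
static bool take_shared(struct open_lock *l)
{
    if (l->exclusive)
        return false;
    l->shared_holders++;
    return true;
}

/* Dropped when the inode is cleared from memory. */
static void drop_shared(struct open_lock *l)
{
    if (l->shared_holders > 0)
        l->shared_holders--;
}

/* Non-blocking attempt by the deleting node; failure means "still in use". */
static bool try_exclusive(struct open_lock *l)
{
    if (l->exclusive || l->shared_holders > 0)
        return false;
    l->exclusive = true;
    return true;
}
```

In this model the deleting node learns whether the inode is still open elsewhere from the outcome of one lock attempt, without sending a message to every node.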
  13. 13 Feb, 2007 1 commit
  14. 14 Dec, 2006 1 commit
  15. 08 Dec, 2006 4 commits
  16. 02 Dec, 2006 4 commits
  17. 22 Nov, 2006 1 commit
  18. 12 Oct, 2006 1 commit
  19. 25 Sep, 2006 1 commit
    • ocfs2: Remove i_generation from inode lock names · 24c19ef4
      Committed by Mark Fasheh
      OCFS2 puts inode meta data in the "lock value block" provided by the DLM.
      Typically, i_generation is encoded in the lock name so that a deleted inode
      and a new one in the same block don't share the same lvb.
      
      Unfortunately, that scheme means that the read in ocfs2_read_locked_inode()
      is potentially thrown away as soon as the meta data lock is taken - we
      cannot encode the lock name without first knowing i_generation, which
      requires a disk read.
      
      This patch encodes i_generation in the inode meta data lvb, and removes the
      value from the inode meta data lock name. This way, the read can be covered
      by a lock, and at the same time we can distinguish between an up to date and
      a stale LVB.
      
      This will help cold-cache stat(2) performance in particular.
      
      Since this patch changes the protocol version, we take the opportunity to do
      a minor re-organization of two of the LVB fields.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
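The scheme in the i_generation commit can be illustrated with a short sketch: the lock name depends only on the block number, so it can be built before any disk read, and the generation stored in the LVB is compared afterwards to decide whether the cached values belong to the current inode. The struct layout, the `"M%016llx"` name format and both function names are made up for this example and are not taken from the ocfs2 source.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative subset of what a lock value block might carry. */
struct inode_lvb {
    uint32_t generation;  /* i_generation of the inode that filled the LVB */
    uint64_t size;        /* example of cached inode metadata */
};

/*
 * Build a lock name from the block number alone (hypothetical format).
 * No i_generation is needed, so no disk read is required first.
 */
static void build_lock_name(char *buf, size_t len, uint64_t blkno)
{
    snprintf(buf, len, "M%016llx", (unsigned long long)blkno);
}

/*
 * After the lock is held and the inode has been read, decide whether
 * the LVB describes this inode or a deleted one that used the same block.
 */
static bool lvb_is_current(const struct inode_lvb *lvb,
                           uint32_t inode_generation)
{
    return lvb->generation == inode_generation;
}
```

A stale LVB is simply ignored and refreshed from the on-disk inode, while a matching one lets a cold-cache stat(2) use the cached values.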