1. 14 Oct, 2008 6 commits
    • ocfs2: Make ocfs2_extent_tree the first-class representation of a tree. · f99b9b7c
      Joel Becker committed
      We now have three different kinds of extent trees in ocfs2: inode data
      (dinode), extended attributes (xattr_tree), and extended attribute
      values (xattr_value).  There is a nice abstraction for them,
      ocfs2_extent_tree, but it is hidden in alloc.c.  All the calling
      functions have to pick amongst a varied API and pass in type bits and
      often extraneous pointers.
      
      A better way is to make ocfs2_extent_tree a first-class object.
      Every caller converts its object to an ocfs2_extent_tree via the
      ocfs2_get_*_extent_tree() calls, then uses the ocfs2_extent_tree for all
      tree calls to alloc.c.
      
      This simplifies a lot of callers and improves readability.  It also
      provides an easy way to add additional extent tree types, as they only
      need to be defined in alloc.c with an ocfs2_get_<new>_extent_tree()
      function.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
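The "first-class object" idea in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in alloc.c: every `demo_*` name, the struct layouts, and the single `update_clusters` operation are assumptions made for the example. It only shows the shape of the pattern: one small constructor per tree type, after which generic code needs nothing but the tree object.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for two kinds of tree owner. */
struct demo_dinode      { uint32_t i_clusters; };
struct demo_xattr_value { uint32_t xr_clusters; };

struct demo_extent_tree;

/* Per-type operations: the only code that differs between tree kinds. */
struct demo_et_ops {
    void (*update_clusters)(struct demo_extent_tree *et, uint32_t added);
};

/* The first-class tree object that every generic call takes. */
struct demo_extent_tree {
    const struct demo_et_ops *ops;
    void *object;               /* the owner: a dinode or an xattr value */
};

static void dinode_update_clusters(struct demo_extent_tree *et, uint32_t added)
{
    ((struct demo_dinode *)et->object)->i_clusters += added;
}

static void xattr_value_update_clusters(struct demo_extent_tree *et, uint32_t added)
{
    ((struct demo_xattr_value *)et->object)->xr_clusters += added;
}

static const struct demo_et_ops dinode_ops = { dinode_update_clusters };
static const struct demo_et_ops xattr_value_ops = { xattr_value_update_clusters };

/* One small constructor per tree type, in the spirit of ocfs2_get_*_extent_tree(). */
void demo_get_dinode_extent_tree(struct demo_extent_tree *et, struct demo_dinode *di)
{
    et->ops = &dinode_ops;
    et->object = di;
}

void demo_get_xattr_value_extent_tree(struct demo_extent_tree *et,
                                      struct demo_xattr_value *xv)
{
    et->ops = &xattr_value_ops;
    et->object = xv;
}

/* Generic tree code: no type bits and no extra pointers, only the tree. */
void demo_tree_add_clusters(struct demo_extent_tree *et, uint32_t added)
{
    assert(et->ops && et->ops->update_clusters);
    et->ops->update_clusters(et, added);
}
```

Under these assumptions, adding a new tree type means adding one ops table and one constructor; the generic function stays untouched.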
    • ocfs2: Add extended attribute support · cf1d6c76
      Tiger Yang committed
      This patch implements storing extended attributes either in the inode or in a
      single external block. We only store EAs in the inode when blocksize > 512 or
      the inode block has free space for them. When an EA's value is larger than 80
      bytes, we store the value in a b-tree outside the inode or block.
      Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Add extent tree operation for xattr value btrees · f56654c4
      Tao Ma committed
      Add some thin wrappers around ocfs2_insert_extent() for each of the 3
      different btree types: ocfs2_dinode_insert_extent(),
      ocfs2_xattr_value_insert_extent() and ocfs2_xattr_tree_insert_extent(). The
      last is for the xattr index btree, which will be used in a followup patch.
      
      All the old callers in file.c etc. will call ocfs2_dinode_insert_extent(),
      while the other two handle the xattr cases. Initialization of the extent
      tree is handled by these functions.
      
      When an xattr value is too large to store inline, we allocate clusters
      for it, and ocfs2_extent_list and ocfs2_extent_rec are used here as well. In
      order to re-use the b-tree operation code, a new parameter named "private"
      is added to ocfs2_extent_tree to indicate the root ocfs2_extent_list. The
      reason is that we can no longer deduce the root from the buffer_head: it
      may be in an inode, in an ocfs2_xattr_block or, even worse, anywhere in an
      ocfs2_xattr_bucket.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Abstract ocfs2_extent_tree in b-tree operations. · e7d4cb6b
      Tao Ma committed
      The old extent tree operations assumed that the ocfs2_extent_list in
      ocfs2_dinode is the tree root. As xattrs will also use an
      ocfs2_extent_list to store large values for an xattr entry, we refactor
      the tree operations so that the xattr code can use them directly.
      
      The refactoring includes 4 steps:
      1. Abstract set/get of last_eb_blk and update_clusters since they may
         be stored in different locations for dinode and xattr.
      2. Add a new structure named ocfs2_extent_tree to indicate the
         extent tree the operation will work on.
      3. Remove all uses of fe_bh and di; use root_bh and root_el from the
         extent tree instead. Every fe_bh is now replaced with
         et->root_bh, and el with root_el accordingly.
      4. Make ocfs2_lock_allocators generic. Until now it could only be used
         for file extend allocation, but the whole function is useful when we
         want to store large EAs.
      
      Note: This patch doesn't touch ocfs2_commit_truncate() since it is not used
      for anything other than truncating inode data btrees.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Use ocfs2_extent_list instead of ocfs2_dinode. · 811f933d
      Tao Ma committed
      ocfs2_extend_meta_needed(), ocfs2_calc_extend_credits() and
      ocfs2_reserve_new_metadata() are all useful for extent tree operations. But
      they are all limited to an inode btree because they use a struct
      ocfs2_dinode parameter. Change their parameter to struct ocfs2_extent_list
      (the part of an ocfs2_dinode they actually use) so that the xattr btree code
      can use these functions.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
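The parameter change described in the commit above can be illustrated with a small sketch. The struct layouts and `demo_*` names are hypothetical, and the "depth + 2" rule is only a placeholder estimate, not a statement of what ocfs2_extend_meta_needed() computes. The point is that a helper taking just the extent list works for any structure that embeds one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified extent list: only the field the helper reads. */
struct demo_extent_list { uint16_t l_tree_depth; };

/* Two different owners that each embed an extent list. */
struct demo_dinode           { uint64_t i_size;      struct demo_extent_list i_list; };
struct demo_xattr_value_root { uint32_t xr_clusters; struct demo_extent_list xr_list; };

/*
 * Placeholder estimate of the metadata blocks an extend might need:
 * one per tree level plus room for a new root. Because it takes the
 * extent list rather than a dinode, any tree root can use it.
 */
int demo_extend_meta_needed(const struct demo_extent_list *root_el)
{
    return root_el->l_tree_depth + 2;
}
```

A caller holding a dinode passes `&di.i_list`; a caller holding an xattr value root passes `&xv.xr_list`. Neither needs to pretend to be an inode.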
    • ocfs2: throttle back local alloc when low on disk space · 9c7af40b
      Mark Fasheh committed
      Ocfs2's local allocator disables itself for the duration of a mount point
      when it has trouble allocating a large enough area from the primary bitmap.
      That can cause performance problems, especially for disks which were only
      temporarily full or fragmented. This patch allows for the allocator to
      shrink its window first, before being disabled. Later, it can also be
      re-enabled so that any performance drop is minimized.
      
      To do this, we allow the value of osb->local_alloc_bits to be shrunk when
      needed. The default value is recorded in a mostly read-only variable so that
      we can re-initialize when required.
      
      Locking had to be updated so that we could protect changes to
      local_alloc_bits. Mostly this involves protecting various local alloc values
      with the osb spinlock. A new state is also added, OCFS2_LA_THROTTLED, which
      is used when the local allocator has shrunk, but is not disabled. If the
      available space dips below 1 megabyte, the local alloc file is disabled. In
      either case, local alloc is re-enabled 30 seconds after the event, or when
      an appropriate number of bits is seen in the primary bitmap.
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
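The throttling behaviour described in the commit above can be sketched as a small state machine. Everything here is an assumption made for illustration: the `demo_*` names, the choice to halve the window on each failure, and the use of plain integer seconds for time. Locking (the osb spinlock in the commit) and the "enough bits seen in the primary bitmap" trigger are left out.

```c
#include <assert.h>

enum demo_la_state { DEMO_LA_ENABLED, DEMO_LA_THROTTLED, DEMO_LA_DISABLED };

struct demo_local_alloc {
    enum demo_la_state state;
    unsigned int default_bits;  /* mostly read-only: used to re-initialize */
    unsigned int bits;          /* current window size in bits (clusters) */
    unsigned int min_bits;      /* number of bits equivalent to 1 megabyte */
    long event_time;            /* when we last throttled or disabled */
};

/* Called when the primary bitmap could not supply a full window. */
void demo_la_alloc_failed(struct demo_local_alloc *la, long now)
{
    unsigned int shrunk = la->bits / 2;     /* halving is an assumption */

    if (shrunk < la->min_bits) {
        la->state = DEMO_LA_DISABLED;       /* below 1 MB: give up for now */
    } else {
        la->bits = shrunk;
        la->state = DEMO_LA_THROTTLED;      /* smaller window, still in use */
    }
    la->event_time = now;
}

/* Called periodically: restore the default window 30 seconds after the event. */
void demo_la_tick(struct demo_local_alloc *la, long now)
{
    if (la->state != DEMO_LA_ENABLED && now - la->event_time >= 30) {
        la->bits = la->default_bits;
        la->state = DEMO_LA_ENABLED;
    }
}
```

With a default window of 2048 bits and a 256-bit floor, three failures leave the allocator throttled at 256 bits, a fourth disables it, and a tick 30 seconds later restores the full window.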
  2. 18 Apr, 2008 1 commit
  3. 26 Jan, 2008 1 commit
  4. 21 Sep, 2007 1 commit
    • ocfs2: Allow smaller allocations during large writes · 415cb800
      Mark Fasheh committed
      The ocfs2 write code loops through a page much like the block code, except
      that ocfs2 allocation units can be any size, including larger than page
      size. Typically it's equal to or larger than page size - most kernels run 4k
      pages, which is also the minimum ocfs2 allocation (cluster) size.
      
      Some changes made during 2.6.23 altered the way writes to pages are
      handled, and inadvertently broke support for > 4k page size. Instead of just
      writing one cluster at a time, we now handle the whole page in one pass.
      
      This means that multiple (small) separate allocations might happen in the
      same pass. The allocation code, however, typically optimizes by getting the
      maximum which was reserved. This triggered a BUG_ON in the extend code where
      it'd ask for a single bit (for one part of a > 4k page) and get back more
      than it asked for.
      
      Fix this by providing a variant of the high level allocation function which
      allows the caller to specify a maximum. The traditional function remains and
      just calls the new one with a maximum determined from the initial
      reservation.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
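The fix described in the commit above, a variant that takes a caller-supplied maximum plus the traditional function layered on top of it, could look roughly like this. The `demo_*` names and the bit-counting allocation context are hypothetical simplifications; real allocation walks bitmaps rather than subtracting counters.

```c
#include <assert.h>

/* Hypothetical allocation context: a reservation and how much of it is used. */
struct demo_alloc_ctx {
    unsigned int bits_wanted;   /* size of the initial reservation */
    unsigned int bits_given;    /* bits handed out so far */
};

/*
 * New variant: grant at most max_bits and at least min_bits.
 * Returns the number of bits granted, or 0 if the request cannot be met.
 */
unsigned int demo_claim_clusters_max(struct demo_alloc_ctx *ac,
                                     unsigned int min_bits,
                                     unsigned int max_bits)
{
    unsigned int left = ac->bits_wanted - ac->bits_given;
    unsigned int grant = left < max_bits ? left : max_bits;

    if (grant == 0 || grant < min_bits)
        return 0;
    ac->bits_given += grant;
    return grant;
}

/* Traditional entry point: the maximum is whatever remains of the reservation. */
unsigned int demo_claim_clusters(struct demo_alloc_ctx *ac, unsigned int min_bits)
{
    return demo_claim_clusters_max(ac, min_bits,
                                   ac->bits_wanted - ac->bits_given);
}
```

A write path that needs one cluster for one part of a large page would call the capped variant with a maximum of 1, so it can never receive more than it asked for.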
  5. 11 Jul, 2007 2 commits
  6. 02 Dec, 2006 2 commits
  7. 08 Aug, 2006 1 commit
    • ocfs2: allocation hints · 883d4cae
      Mark Fasheh committed
      Record the most recently used allocation group on the allocation context, so
      that subsequent allocations can attempt to optimize for contiguousness.
      Local alloc especially should benefit from this as the current chain search
      tends to let it spew across the disk.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
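The hint mechanism described in the commit above can be sketched as follows. The `demo_*` names, the fixed group count, and the wrap-around search order are assumptions for illustration; the commit only says that the most recently used group is recorded on the allocation context so later allocations can try to stay contiguous.

```c
#include <assert.h>

#define DEMO_NGROUPS 4

/* Hypothetical allocation context carrying the hint. */
struct demo_hint_ctx {
    int last_group;             /* most recently used group, or -1 for none */
};

/*
 * Allocate one bit, starting the search at the hinted group so that
 * consecutive allocations tend to land in the same group.
 * Returns the group used, or -1 if every group is full.
 */
int demo_alloc_bit(struct demo_hint_ctx *ac, unsigned int free_bits[DEMO_NGROUPS])
{
    int start = ac->last_group >= 0 ? ac->last_group : 0;

    for (int i = 0; i < DEMO_NGROUPS; i++) {
        int g = (start + i) % DEMO_NGROUPS;

        if (free_bits[g] > 0) {
            free_bits[g]--;
            ac->last_group = g;     /* remember the hint for next time */
            return g;
        }
    }
    return -1;
}
```

Without the hint, each call would restart from the first group and scatter allocations whenever earlier groups free up; with it, the search resumes where the last allocation succeeded.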
  8. 04 Jan, 2006 1 commit