1. 27 Jul, 2010 1 commit
    • direct-io: move aio_complete into ->end_io · 552ef802
      Christoph Hellwig committed
      Filesystems with unwritten extent support must not complete an AIO request
      until the transaction to convert the extent has been committed.  That means
      the aio_complete call needs to be moved into the ->end_io callback so
      that the filesystem can control exactly when to call it.
      
      This makes a bit of a mess out of dio_complete and makes the ->end_io
      callback prototype even more complicated.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
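The idea of letting the filesystem decide when a request is reported complete can be sketched in plain C. This is a toy model only: `struct io_req`, `dio_finish()`, `deferring_end_io()` and `conversion_committed()` are hypothetical names invented for illustration and do not reflect the kernel's actual direct-io structures or the real ->end_io prototype.

```c
#include <stdbool.h>

/* Hypothetical request: tracks completion and a pending extent conversion. */
struct io_req {
    bool completed;      /* set once the request is reported complete */
    bool needs_convert;  /* an unwritten extent still has to be converted */
};

static void req_complete(struct io_req *req)
{
    req->completed = true;
}

/* Callback type: the filesystem, not the generic code, calls req_complete(). */
typedef void (*end_io_fn)(struct io_req *req);

/* Filesystem without unwritten extents: complete immediately. */
static void simple_end_io(struct io_req *req)
{
    req_complete(req);
}

/* Filesystem with unwritten extents: defer completion while a conversion is pending. */
static void deferring_end_io(struct io_req *req)
{
    if (!req->needs_convert)
        req_complete(req);
    /* otherwise completion is reported later by conversion_committed() */
}

/* Called once the (hypothetical) conversion transaction has been committed. */
static void conversion_committed(struct io_req *req)
{
    req->needs_convert = false;
    req_complete(req);
}

/* Generic I/O code: hands control to the callback instead of completing itself. */
static void dio_finish(struct io_req *req, end_io_fn end_io)
{
    end_io(req);
}
```

In this sketch the generic path never touches the `completed` flag directly, which is the property the commit message asks for.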
  2. 28 May, 2010 2 commits
  3. 25 May, 2010 1 commit
  4. 24 May, 2010 4 commits
  5. 22 May, 2010 10 commits
  6. 19 May, 2010 10 commits
    • ocfs2: Silence a gcc warning. · 18d3a98f
      Joel Becker committed
      ocfs2_block_group_claim_bits() is never called with min_bits=0, but we
      shouldn't leave status undefined if it ever is.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Don't retry xattr set in case value extension fails. · 5f5261ac
      Tao Ma committed
      In a normal xattr set, the set sequence is inode, then xattr block,
      and finally xattr bucket if we hit ENOSPC. But there is a corner case.
      Suppose we set an xattr whose value will be stored in a cluster, and
      there is no xattr block yet. We reserve 1 xattr block and 1 cluster
      for setting it. Now if value extension fails (for example, the volume
      is almost full and we can't allocate the cluster because of the check
      in ocfs2_test_bg_bit_allocatable), ENOSPC is returned. We then try to
      create a bucket (this time there is a chance that the reserved cluster
      will be used), and when we try value extension again, a kernel bug is
      triggered. We did hit this; see the bug below.
      http://oss.oracle.com/bugzilla/show_bug.cgi?id=1251
      
      This patch avoids the problem by adding a set_abort flag to
      ocfs2_xattr_set_ctxt. When ENOSPC happens in value extension, we
      check whether it is caused by a real ENOSPC or just by the inode or
      xattr block being full. In the first case, we set set_abort so that
      we don't try any further. It is safe to exit directly here since it
      really is ENOSPC.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
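The fallback chain and the abort flag described in this commit can be modelled with a small, self-contained C sketch. Only the name `set_abort` comes from the commit message; `struct set_ctxt`, `try_set()` and `xattr_set()` are hypothetical stand-ins, and the "volume full" condition is reduced to a single boolean rather than a real allocator check.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical context: which places have room, plus the abort flag. */
struct set_ctxt {
    bool inode_has_room;
    bool block_has_room;
    bool volume_full;   /* value extension cannot allocate: a "real" ENOSPC */
    bool set_abort;     /* set when retrying elsewhere cannot help */
};

/* Try one location. -ENOSPC means "this place is full" or "the volume is full". */
static int try_set(struct set_ctxt *ctxt, bool has_room)
{
    if (ctxt->volume_full) {
        ctxt->set_abort = true;   /* real ENOSPC: stop the fallback chain */
        return -ENOSPC;
    }
    return has_room ? 0 : -ENOSPC;
}

/* Fallback chain: inode, then xattr block, then bucket. */
static int xattr_set(struct set_ctxt *ctxt)
{
    int ret = try_set(ctxt, ctxt->inode_has_room);

    if (ret == -ENOSPC && !ctxt->set_abort)
        ret = try_set(ctxt, ctxt->block_has_room);
    if (ret == -ENOSPC && !ctxt->set_abort)
        ret = try_set(ctxt, true);  /* bucket: assumed to have room in this toy */
    return ret;
}
```

When only the inode and block are full, the chain falls through to the bucket and succeeds; when the volume itself is full, the first failure sets `set_abort` and no further location is tried.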
    • ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break · d9ef7522
      Wengang Wang committed
      Currently we process a dirty lockres with the lockres->spinlock taken. During
      that processing, we may need to take dlm->ast_lock. This breaks the lock
      ordering of dlm->ast_lock (taken first) and lockres->spinlock (taken second).
      
      This patch fixes the problem.
      Since we can't release lockres->spinlock, we have to take dlm->ast_lock
      just before taking lockres->spinlock and release it after lockres->spinlock
      is released. We also use __dlm_queue_bast()/__dlm_queue_ast(), the nolock
      versions, in dlm_shuffle_lists(). There are not many locks on a lockres, so
      there is no performance harm.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
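The ordering rule above (outer lock first, inner lock second, and a "nolock" helper while both are held) can be illustrated with a toy model. The sketch below uses flag-only locks with no real concurrency, and every name in it (`toy_lock`, `queue_ast_nolock()`, `process_dirty_resource()`) is hypothetical; it is not the dlm code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only records whether it is held (no real concurrency). */
struct toy_lock {
    bool held;
};

static struct toy_lock ast_lock;  /* outer lock: must be taken first  */
static struct toy_lock res_lock;  /* inner lock: must be taken second */
static int queued_asts;

static void toy_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* "nolock" variant: the caller must already hold ast_lock. */
static void queue_ast_nolock(void)
{
    assert(ast_lock.held);
    queued_asts++;
}

/* Locking variant: takes ast_lock itself, so it must not run under res_lock. */
static void queue_ast(void)
{
    assert(!res_lock.held);  /* taking ast_lock here would invert the order */
    toy_acquire(&ast_lock);
    queue_ast_nolock();
    toy_release(&ast_lock);
}

/* Process one resource: outer lock, then inner lock, released in reverse order. */
static void process_dirty_resource(void)
{
    toy_acquire(&ast_lock);
    toy_acquire(&res_lock);
    queue_ast_nolock();      /* safe: ast_lock is already held */
    toy_release(&res_lock);
    toy_release(&ast_lock);
}
```

Calling `queue_ast()` from inside `process_dirty_resource()` would trip the assertion, which mirrors why the commit switches to the nolock helpers.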
    • ocfs2: Reset xattr value size after xa_cleanup_value_truncate(). · d5a7df06
      Tao Ma committed
      In ocfs2_prepare_xattr_entry, if we fail to grow an existing value,
      xa_cleanup_value_truncate() will leave the old entry in place.  Thus, we
      reset its value size.  However, if we were allocating a new value, we
      must not reset the value size or we will BUG().  This resolves
      oss.oracle.com bug 1247.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • fs/ocfs2/dlm: Use kstrdup · 316ce2ba
      Julia Lawall committed
      Use kstrdup when the goal of an allocation is to copy a string into the
      allocated region.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression from,to;
      expression flag,E1,E2;
      statement S;
      @@
      
      -  to = kmalloc(strlen(from) + 1,flag);
      +  to = kstrdup(from, flag);
         ... when != \(from = E1 \| to = E1 \)
         if (to==NULL || ...) S
         ... when != \(from = E2 \| to = E2 \)
      -  strcpy(to, from);
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
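To make the transformation in the semantic patch concrete, here is a minimal userspace sketch of the before and after shapes. kstrdup() is a kernel API, so `dup_string()` below is a hypothetical stand-in built on malloc(); it illustrates the pattern only and is not the kernel implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for kstrdup(): allocate and copy in one call. */
static char *dup_string(const char *from)
{
    size_t len = strlen(from) + 1;  /* include the terminating NUL */
    char *to = malloc(len);

    if (to)
        memcpy(to, from, len);
    return to;
}

/* Before: open-coded allocate, check, copy. */
static char *copy_name_before(const char *from)
{
    char *to = malloc(strlen(from) + 1);

    if (to == NULL)
        return NULL;
    strcpy(to, from);
    return to;
}

/* After: one helper call replaces both the allocation and the strcpy(). */
static char *copy_name_after(const char *from)
{
    return dup_string(from);
}
```

Both functions return an independent heap copy of the input, so callers see the same behaviour; the second form simply has fewer places to get the length wrong.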
    • fs/ocfs2/dlm: Drop memory allocation cast · 3914ed0c
      Julia Lawall committed
      Drop cast on the result of kmalloc and similar functions.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      type T;
      @@
      
      - (T *)
        (\(kmalloc\|kzalloc\|kcalloc\|kmem_cache_alloc\|kmem_cache_zalloc\|
         kmem_cache_alloc_node\|kmalloc_node\|kzalloc_node\)(...))
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
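The effect of this semantic patch can be shown with a short userspace sketch, using calloc() in place of the kernel allocators. `struct lock_entry` and both function names are hypothetical. In C, `void *` converts implicitly to any object pointer type, so the cast adds nothing.

```c
#include <stdlib.h>

/* Hypothetical structure, used only to have something to allocate. */
struct lock_entry {
    int id;
    int level;
};

/* Before: the explicit cast on the allocator's result is redundant in C. */
static struct lock_entry *entry_alloc_before(void)
{
    return (struct lock_entry *)calloc(1, sizeof(struct lock_entry));
}

/* After: no cast; the void * result converts implicitly. */
static struct lock_entry *entry_alloc_after(void)
{
    return calloc(1, sizeof(struct lock_entry));
}
```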
    • Ocfs2: Optimize punching-hole code. · c1631d4a
      Tristan Ye committed
      This patch simplifies the logic of handling existing holes and
      skipping extent blocks and removes some confusing comments.
      
      The patch survived the fill_verify_holes testcase in ocfs2-test.
      It also passed my manual sanity check and stress tests with enormous
      extent records.
      
      Currently, punching a hole in a file with an extent tree depth of 3 or more
      is a performance disaster.  It can take several hours, though we may not
      hit this in real life, since it requires a huge number of extents.

      One simple way to improve the performance is quite straightforward.
      Following the logic of truncate, we can punch the hole from hole_end to
      hole_start, which significantly reduces the overhead of btree operations
      such as tree rotation and moving.

      The following is the test result when punching a hole from byte 0 to the
      end of a 1G file.  The file consists of 256k extent records, each covering
      4k of data (just one cluster; the cluster size is 4k):
      
      ===========================================================================
       * Original punching-hole mechanism:
      ===========================================================================
      
         I waited 1 hour for it to complete; unfortunately, it was still running.

      ===========================================================================
       * Patched punching-hole mechanism:
      ===========================================================================
      
         real 0m2.518s
         user 0m0.000s
         sys  0m2.445s
      
      That means we've gained up to a 1000-fold performance improvement in this
      case, whee! It's fairly cool, and it looks like the performance gain will
      grow as the number of extent records grows.

      The patch is based on my 2 earlier patches, which optimized the truncate
      code and fixed hole punching to handle CoW.
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
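Why removing from the end is so much cheaper can be seen with a deliberately simplified analogy. The sketch below uses a flat array as a stand-in for a run of extent records; it is not the ocfs2 b-tree, and both function names are hypothetical. It only counts how many records have to be shifted under each removal order.

```c
#include <stddef.h>
#include <string.h>

/* Remove records one at a time from the front; return how many were shifted. */
static size_t remove_from_front(int *recs, size_t n)
{
    size_t moved = 0;

    while (n > 0) {
        /* every remaining record slides down by one slot */
        memmove(recs, recs + 1, (n - 1) * sizeof(*recs));
        moved += n - 1;
        n--;
    }
    return moved;
}

/* Remove records one at a time from the back; nothing needs to move. */
static size_t remove_from_back(int *recs, size_t n)
{
    size_t moved = 0;

    (void)recs;   /* contents are untouched; only the count shrinks */
    while (n > 0)
        n--;      /* dropping the last record shifts nothing */
    return moved;
}
```

For n records, front-first removal shifts n(n-1)/2 records in total while back-first removal shifts none. The real gain reported in the commit comes from avoided tree rotations, which this toy does not model.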
    • Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public. · ee149a7c
      Tristan Ye committed
      The original reason for pulling ocfs2_find_cpos_for_left_leaf() out of
      alloc.c was to benefit the hole-punching optimization patch.  However, it
      can also be used in the future by other functions that want to do the
      same job.
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing. · e8aec068
      Tristan Ye committed
      Based on the previous patch optimizing truncate, the bugfix for
      refcount trees when punching holes is fairly easy and straightforward,
      since most of the work needed for refcounting has already been done
      in ocfs2_remove_btree_range().

      This patch performs CoW for refcounted extents when a hole is being punched
      whose start or end offset is in the middle of a cluster, which means
      that partial zeroing of the cluster will be performed soon.
      
      The patch has been tested and fixes the following bug:

      http://oss.oracle.com/bugzilla/show_bug.cgi?id=1216
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead. · 78f94673
      Tristan Ye committed
      Truncate is just a special case of punching holes (from the new i_size to
      the end), so we can take advantage of the existing
      ocfs2_remove_btree_range() to reduce the complexity and redundancy in
      alloc.c.  The goal here is to make truncate more generic and
      straightforward.

      Several functions only used by ocfs2_commit_truncate() will simply be
      removed.
      
      ocfs2_remove_btree_range() was originally used by the hole punching
      code, which didn't take refcount trees into account (definitely a bug).
      We therefore need to change that func a bit to handle refcount trees.
      It must take the refcount lock, calculate and reserve blocks for
      refcount tree changes, and decrease refcounts at the end.  We replace 
      ocfs2_lock_allocators() here by adding a new func
      ocfs2_reserve_blocks_for_rec_trunc() which accepts some extra blocks to
      reserve.  This will not hurt any other code using
      ocfs2_remove_btree_range() (such as dir truncate and hole punching).
      
      I merged the following steps into one patch since they logically do
      one thing, though I know it makes the patch a bit large to review.
      
      1). Remove redundant code used by ocfs2_commit_truncate(), since we're
          moving to ocfs2_remove_btree_range anyway.
      
      2). Add a new func ocfs2_reserve_blocks_for_rec_trunc() for purpose of
          accepting some extra blocks to reserve.
      
      3). Change ocfs2_prepare_refcount_change_for_del() a bit to fit our
          needs.  It's safe to do this since it's only being called by
          truncate.
      
      4). Change ocfs2_remove_btree_range() a bit to take refcount case into
          account.
      
      5). Finally, we change ocfs2_commit_truncate() to call
          ocfs2_remove_btree_range() in a proper way.
      
      The patch has passed basic sanity checks; stress tests with a heavier
      workload are still to come.
      
      Based on this patch, fixing the punching holes bug will be fairly easy.
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  7. 11 May, 2010 2 commits
    • ocfs2: Block signals for mkdir/link/symlink/O_CREAT. · 547ba7c8
      Joel Becker committed
      Once file or link creation gets going, it can't be interrupted by a
      signal.  These operations are not idempotent.
      
      This blocks signals in ocfs2_mknod(), ocfs2_link(), and ocfs2_symlink()
      once we start actually changing things.  ocfs2_mknod() covers mknod(),
      creat(), mkdir(), and open(O_CREAT).
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Wrap signal blocking in void functions. · e4b963f1
      Joel Becker committed
      ocfs2 sometimes needs to block signals around dlm operations, but it
      currently does it with sigprocmask().  Even worse, it's checking the
      error code of sigprocmask().  The in-kernel sigprocmask() can only error
      if you get the SIG_* argument wrong.  We don't.
      
      Wrap the sigprocmask() calls with ocfs2_[un]block_signals().  These
      functions are void, but they will BUG() if somehow sigprocmask() returns
      an error.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
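A userspace analogue of these wrappers might look like the sketch below. It uses the POSIX sigprocmask(), not the in-kernel function, and `block_signals()` / `unblock_signals()` are hypothetical names rather than the ocfs2 helpers. abort() stands in for BUG() on the "cannot happen" error path.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>

/* Block all signals and save the previous mask; abort on the "impossible" error. */
static void block_signals(sigset_t *oldset)
{
    sigset_t blocked;

    sigfillset(&blocked);
    if (sigprocmask(SIG_BLOCK, &blocked, oldset) != 0)
        abort();
}

/* Restore the mask saved by block_signals(). */
static void unblock_signals(const sigset_t *oldset)
{
    if (sigprocmask(SIG_SETMASK, oldset, NULL) != 0)
        abort();
}
```

Because both wrappers return void, callers no longer carry error handling for a failure that can only come from passing a bad `how` argument.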
  8. 06 May, 2010 10 commits