1. 20 Nov, 2014 1 commit
  2. 04 Nov, 2014 1 commit
  3. 04 Apr, 2014 1 commit
    • ocfs2: revert iput deferring code in ocfs2_drop_dentry_lock · 8ed6b237
      Goldwyn Rodrigues committed
      The following patches are reverted in this patch because they
      caused a performance regression in remote unlink() calls.
      
        ea455f8a - ocfs2: Push out dropping of dentry lock to ocfs2_wq
        f7b1aa69 - ocfs2: Fix deadlock on umount
        5fd13189 - ocfs2: Don't oops in ocfs2_kill_sb on a failed mount
      
      Previous patches in this series removed the possible deadlocks from
      the downconvert thread, so the above patches shouldn't be needed anymore.
      
      The regression occurs because these patches delay the iput() when
      dentry locks are dropped.  This also delays the unlocking of the open lockres.
      The open lock resource is required to test whether the inode can be wiped
      from disk.  When the deleting node does not get the open lock, it
      marks the inode as an orphan (even though it is not in use by another
      node/process) and causes a journal checkpoint.  This delays operations
      following the inode eviction.  This also moves the inode to the orphan
      directory, which causes further I/O and a lot of unnecessary orphans.
      
      The following script can be used to generate the load causing issues:
      
        declare -a create
        declare -a remove
        declare -a iterations=(1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384)
        unique="`mktemp -u XXXXX`"
        script="/tmp/idontknow-${unique}.sh"
        cat <<EOF > "${script}"
        for n in {1..8}; do mkdir -p test/dir\${n}
          eval touch test/dir\${n}/foo{1.."\$1"}
        done
        EOF
        chmod 700 "${script}"
      
        function fcreate ()
        {
          exec 2>&1 /usr/bin/time --format=%E "${script}" "$1"
        }
      
        function fremove ()
        {
          exec 2>&1 /usr/bin/time --format=%E ssh node2 "cd `pwd`; rm -Rf test*"
        }
      
        function fcp ()
        {
          exec 2>&1 /usr/bin/time --format=%E ssh node3 "cd `pwd`; cp -R test test.new"
        }
      
        echo -------------------------------------------------
        echo "| # files | create #s | copy #s | remove #s |"
        echo -------------------------------------------------
        for ((x=0; x < ${#iterations[*]} ; x++)) do
          create[$x]="`fcreate ${iterations[$x]}`"
          copy[$x]="`fcp ${iterations[$x]}`"
          remove[$x]="`fremove`"
          printf "| %8d | %9s | %9s | %9s |\n" ${iterations[$x]} ${create[$x]} ${copy[$x]} ${remove[$x]}
        done
        rm "${script}"
        echo "------------------------"
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 30 Sep, 2013 1 commit
  5. 28 Feb, 2013 1 commit
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin committed
      I'm not sure why, but the hlist for each entry iterators were conceived
      differently from the list ones, which look like this:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
         were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
         properly, so those had to be fixed up manually.
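
To make the iterator change concrete, here is a minimal userspace sketch of the new three-argument form. It is not the kernel's linux/list.h: the `hlist_node` below is simplified (no `pprev` pointer), and `struct item`, `hlist_add_head` as written here, and `sum_items` are illustrative names assumed for this example. It relies on the GNU C `__typeof__` extension, as the kernel macros do.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's hlist types (assumption: no pprev). */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Map a node pointer back to its containing struct, or NULL at list end. */
#define hlist_entry_safe(ptr, type, member) \
	((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* New-style iterator: no separate 'struct hlist_node *' cursor argument. */
#define hlist_for_each_entry(pos, head, member)                                \
	for (pos = hlist_entry_safe((head)->first, __typeof__(*(pos)), member); \
	     pos;                                                               \
	     pos = hlist_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

/* Hypothetical payload type used only for this illustration. */
struct item {
	int value;
	struct hlist_node link;
};

/* Push a node at the front of the list. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	n->next = h->first;
	h->first = n;
}

/* Sum all values; the loop now reads exactly like list_for_each_entry(). */
static int sum_items(struct hlist_head *head)
{
	struct item *it;
	int total = 0;

	hlist_for_each_entry(it, head, link)
		total += it->value;
	return total;
}
```

With the old four-argument form, `sum_items` would also have needed a `struct hlist_node *` local variable passed as the second argument; that is the declaration the semantic patch deletes.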
      
      The semantic patch, which is mostly the work of Peter Senna Tschudin, is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 14 Jul, 2012 3 commits
  7. 10 Mar, 2011 1 commit
  8. 23 Feb, 2011 1 commit
  9. 07 Mar, 2011 1 commit
    • ocfs2: Remove EXIT from masklog. · c1e8d35e
      Tao Ma committed
      mlog_exit is used to record the exit status of a function.
      But because it is added in so many functions, if we enable it,
      the system logs fill up quickly and cause too much I/O.
      So in practice no one can enable it on a production system, or even
      for a test.
      
      This patch removes it or changes it as follows:
      1. If all the error paths already use mlog_errno, it is just removed.
         Otherwise, it is replaced by mlog_errno.
      2. If it is used to print some return value, it is replaced with
         mlog(0,...).
      mlog_exit_ptr is changed to mlog(0,...).
      All those mlog(0,...) calls will be replaced with trace events later.
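
As a rough illustration of rule 1, the sketch below shows what a converted function might look like. The `mlog_errno` here is a mock written for this example (the real macro in the ocfs2 masklog code does more), and `read_block`, `load_inode`, and the `error_logs` counter are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Counts mock mlog_errno() calls so the sketch can be checked. */
static int error_logs;

/* Mock of mlog_errno(); an assumption for illustration, not the kernel macro. */
#define mlog_errno(st)                                                   \
	do {                                                              \
		error_logs++;                                             \
		fprintf(stderr, "%s:%d status = %d\n", __func__, __LINE__, (st)); \
	} while (0)

/* Hypothetical helper that can fail with a negative status. */
static int read_block(int should_fail)
{
	return should_fail ? -5 : 0;
}

/*
 * After the cleanup: the error path already calls mlog_errno(), so the
 * trailing mlog_exit(status) that used to precede the return is gone.
 */
static int load_inode(int should_fail)
{
	int status = read_block(should_fail);

	if (status < 0)
		mlog_errno(status);
	return status;
}
```

On the success path nothing is logged at all, which is the point of the change: only failures reach the log.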
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
  10. 21 Feb, 2011 1 commit
    • ocfs2: Remove ENTRY from masklog. · ef6b689b
      Tao Ma committed
      ENTRY is used to record the entry of a function.
      But because it is added in so many functions, if we enable it,
      the system logs fill up quickly and cause too much I/O.
      So in practice no one can enable it on a production system, or even
      for a test.
      
      So for mlog_entry_void, we just remove it.
      For mlog_entry(...), we replace it with mlog(0,...), and these calls
      will be replaced by trace events later.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
  11. 07 Jan, 2011 6 commits
  12. 19 Nov, 2010 1 commit
    • Ocfs2: Stop tracking a negative dentry after dentry_iput(). · 1989a80a
      Tristan Ye committed
      I suddenly hit this problem during the 2.6.37-rc1 regression test. It was
      introduced by commit '5e98d492' (Track negative entries v3). The
      following scenario reproduces the issue easily:
      
      Node A			Node B
      ================	============
      $touch 	testfile
      			$ls testfile
      $rm -rf testfile
      $touch 	testfile
      			$ls testfile
      			ls: cannot access testfile: No such file or directory
      
      This patch stops tracking a dentry that was made negative by an inode deletion,
      so as to force revalidation on the next lookup, in case we touch the inode
      again on the same node.
      
      It doesn't hurt the performance of repeated lookups for nonexistent files,
      though it regresses a bit on the first try after a file deletion.
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  13. 11 Sep, 2010 1 commit
    • Track negative entries v3 · 5e98d492
      Goldwyn Rodrigues committed
      Track negative dentries by recording the generation number of the parent
      directory in d_fsdata. The generation number for the parent directory is
      recorded in the inode_info, which increments every time the lock on the
      directory is dropped.
      
      If the generation numbers of the parent directory and the negative dentry
      match, there is no need to revalidate; otherwise a revalidate
      is forced. This improves performance in situations where nodes look for
      the same non-existent file multiple times.
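
The generation check described above can be sketched in a few lines of plain C. This is a toy model under stated assumptions, not the ocfs2 code: `struct dir_info` stands in for the directory's inode_info, `seen_gen` plays the role of the value kept in d_fsdata, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parent directory's inode info. */
struct dir_info {
	unsigned long lock_gen; /* bumped each time the directory lock is dropped */
};

/* Hypothetical stand-in for a negative dentry; seen_gen models d_fsdata. */
struct neg_dentry {
	unsigned long seen_gen;
};

/* Called when the cluster lock on the directory is dropped. */
static void dir_lock_dropped(struct dir_info *dir)
{
	dir->lock_gen++;
}

/* Remember the directory generation at the time the lookup came back empty. */
static void neg_dentry_record(struct neg_dentry *d, const struct dir_info *dir)
{
	d->seen_gen = dir->lock_gen;
}

/* A mismatch means the directory may have changed, so revalidate. */
static bool neg_dentry_needs_revalidate(const struct neg_dentry *d,
					const struct dir_info *dir)
{
	return d->seen_gen != dir->lock_gen;
}
```

Repeated lookups of the same missing name stay cheap as long as the directory lock has not been dropped in between; the first lookup after a drop pays for a full revalidate.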
      
      Thanks Mark for explaining the DLM sequence.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  14. 28 Aug, 2009 1 commit
  15. 22 Jul, 2009 1 commit
    • ocfs2: Fix deadlock on umount · f7b1aa69
      Jan Kara committed
      In commit ea455f8a, we moved the dentry lock
      put process into ocfs2_wq. This causes problems during umount because ocfs2_wq
      can drop references to inodes while they are being invalidated by
      invalidate_inodes() causing all sorts of nasty things (invalidate_inodes()
      ending in an infinite loop, "Busy inodes after umount" messages etc.).
      
      We fix the problem by stopping ocfs2_wq from doing any further releasing of
      inode references on the superblock being unmounted, waiting until it finishes
      the current round of releasing, and finally cleaning up all the references in
      dentry_lock_list from ocfs2_put_super().
      
      The issue was tracked down by Tao Ma <tao.ma@oracle.com>.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  16. 24 Apr, 2009 1 commit
  17. 28 Mar, 2009 1 commit
  18. 03 Feb, 2009 1 commit
  19. 26 Jan, 2008 1 commit
    • ocfs2: Remove mount/unmount votes · 34d024f8
      Mark Fasheh committed
      The node maps that are set/unset by these votes are no longer relevant, thus
      we can remove the mount and umount votes. Since those are the last two
      remaining votes, we can also remove the entire vote infrastructure.
      
      The vote thread has been renamed to the downconvert thread, and the small
      amount of functionality related to managing it has been moved into
      fs/ocfs2/dlmglue.c. All references to votes have been removed or updated.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
  20. 28 Nov, 2007 1 commit
    • ocfs2: Remove bug statement in ocfs2_dentry_iput() · bccb9dad
      Mark Fasheh committed
      The existing bug statement didn't take into account unhashed dentries which
      might not have a cluster lock on them. This could happen if a node exporting
      the file system via NFS is rebooted, re-exported to nfs clients and then
      unmounted. It's fine in this case to not have a dentry cluster lock.
      
      Just remove the bug statement and replace it with an error print, which
      does the proper checks. Though we want to know if something has happened
      which might have prevented a cluster lock from being created, it's
      definitely not necessary to panic the machine for this.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
  21. 07 Nov, 2007 1 commit
    • ocfs2: Re-order iput in ocfs2_drop_dentry_lock · 9f70968a
      Mark Fasheh committed
      Do this to avoid a theoretical (I haven't seen this in practice) race where
      the downconvert thread might drop the dentry lock, allowing a remote unlink
      to proceed before dropping the inode locks. This could bounce access to the
      orphan dir between nodes.
      
      There doesn't seem to be a need to do the same in ocfs2_dentry_iput() as
      that's never called for the last ref drop from the downconvert thread.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
  22. 20 Oct, 2007 1 commit
  23. 25 Sep, 2006 4 commits
    • ocfs2: Remove special casing for inode creation in ocfs2_dentry_attach_lock() · 0027dd5b
      Mark Fasheh committed
      We can't use LKM_LOCAL for new dentry locks because an unlink and subsequent
      re-create of a name/inode pair may result in the lock still being mastered
      somewhere in the cluster.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: manually d_move() during ocfs2_rename() · 1ba9da2f
      Mark Fasheh committed
      Make use of FS_RENAME_DOES_D_MOVE to avoid a race condition that can occur
      during ->rename() if we d_move() outside of the parent directory cluster
      locks, and another node discovers the new name (created during the rename)
      and unlinks it. d_move() will unconditionally rehash a dentry - which will
      leave stale data in the system.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: Add dentry tracking API · 80c05846
      Mark Fasheh committed
      Replace the dentry vote mechanism with a cluster lock which covers a set
      of dentries. This allows us to force d_delete() only on nodes which actually
      care about an unlink.
      
      Every node that does a ->lookup() gets a read-only lock on the dentry until
      an unlink, during which the unlinking node will request an exclusive lock,
      forcing the other nodes that care about that dentry to d_delete() it. The
      effect is that we retain a very lightweight ->d_revalidate(), and at the
      same time get to make large improvements to the average case performance of
      the ocfs2 unlink and rename operations.
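
The locking scheme in the paragraph above can be modelled with a toy bitmask, one bit per cluster node. This is only a sketch of the idea under simplifying assumptions: it ignores the DLM entirely, and `struct dentry_lock`, `node_lookup`, and `node_unlink` are hypothetical names, not functions from fs/ocfs2/dcache.c.

```c
#include <assert.h>

/* Toy model: one bit per cluster node that holds the read-only dentry lock. */
struct dentry_lock {
	unsigned int readers;
};

/* A ->lookup() on 'node' takes the read-only lock. */
static void node_lookup(struct dentry_lock *lock, unsigned int node)
{
	lock->readers |= 1u << node;
}

/*
 * An unlink on 'node' takes the lock exclusively. Returns the bitmask of
 * other nodes that held the read-only lock and therefore must d_delete()
 * their dentry; nodes that never looked the name up are left alone.
 */
static unsigned int node_unlink(struct dentry_lock *lock, unsigned int node)
{
	unsigned int must_delete = lock->readers & ~(1u << node);

	lock->readers = 0;
	return must_delete;
}
```

If nodes 0 and 2 have looked the name up and node 0 unlinks it, only node 2 is asked to drop its dentry; under the older vote mechanism every node would have been involved.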
      
      This patch adds the higher level API and the dentry manipulation code.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
    • ocfs2: Add new cluster lock type · d680efe9
      Mark Fasheh committed
      Replace the dentry vote mechanism with a cluster lock which covers a set
      of dentries. This allows us to force d_delete() only on nodes which actually
      care about an unlink.
      
      Every node that does a ->lookup() gets a read-only lock on the dentry until
      an unlink, during which the unlinking node will request an exclusive lock,
      forcing the other nodes that care about that dentry to d_delete() it. The
      effect is that we retain a very lightweight ->d_revalidate(), and at the
      same time get to make large improvements to the average case performance of
      the ocfs2 unlink and rename operations.
      
      This patch adds the cluster lock type which OCFS2 can attach to
      dentries.  A small number of fs/ocfs2/dcache.c functions are stubbed
      out so that this change can compile.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
  24. 25 Mar, 2006 1 commit
  25. 04 Jan, 2006 1 commit