1. 21 1月, 2015 1 次提交
  2. 19 12月, 2014 1 次提交
    • J
      ocfs2: reflink: fix slow unlink for refcounted file · f62f12b3
      Junxiao Bi 提交于
      When running ocfs2 test suite multiple nodes reflink stress test, for a
      4 nodes cluster, every unlink() for refcounted file needs about 700s.
      
      The slow unlink is caused by the contention of refcount tree lock since
      all nodes are unlink files using the same refcount tree.  When the
      unlinking file have many extents(over 1600 in our test), most of the
      extents has refcounted flag set.  In ocfs2_commit_truncate(), it will
      execute the following call trace for every extents.  This means it needs
      get and released refcount tree lock about 1600 times.  And when several
      nodes are do this at the same time, the performance will be very low.
      
        ocfs2_remove_btree_range()
        --  ocfs2_lock_refcount_tree()
        ----  ocfs2_refcount_lock()
        ------  __ocfs2_cluster_lock()
      
      ocfs2_refcount_lock() is costly, move it to ocfs2_commit_truncate() to
      do lock/unlock once can improve a lot performance.
      Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com>
      Cc: Wengang <wen.gang.wang@oracle.com>
      Reviewed-by: NMark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f62f12b3
  3. 11 12月, 2014 1 次提交
  4. 10 10月, 2014 1 次提交
    • J
      ocfs2: fix deadlock due to wrong locking order · f775da2f
      Junxiao Bi 提交于
      For commit ocfs2 journal, ocfs2 journal thread will acquire the mutex
      osb->journal->j_trans_barrier and wake up jbd2 commit thread, then it
      will wait until jbd2 commit thread done. In order journal mode, jbd2
      needs flushing dirty data pages first, and this needs get page lock.
      So osb->journal->j_trans_barrier should be got before page lock.
      
      But ocfs2_write_zero_page() and ocfs2_write_begin_inline() obey this
      locking order, and this will cause deadlock and hung the whole cluster.
      
      One deadlock catched is the following:
      
      PID: 13449  TASK: ffff8802e2f08180  CPU: 31  COMMAND: "oracle"
       #0 [ffff8802ee3f79b0] __schedule at ffffffff8150a524
       #1 [ffff8802ee3f7a58] schedule at ffffffff8150acbf
       #2 [ffff8802ee3f7a68] rwsem_down_failed_common at ffffffff8150cb85
       #3 [ffff8802ee3f7ad8] rwsem_down_read_failed at ffffffff8150cc55
       #4 [ffff8802ee3f7ae8] call_rwsem_down_read_failed at ffffffff812617a4
       #5 [ffff8802ee3f7b50] ocfs2_start_trans at ffffffffa0498919 [ocfs2]
       #6 [ffff8802ee3f7ba0] ocfs2_zero_start_ordered_transaction at ffffffffa048b2b8 [ocfs2]
       #7 [ffff8802ee3f7bf0] ocfs2_write_zero_page at ffffffffa048e9bd [ocfs2]
       #8 [ffff8802ee3f7c80] ocfs2_zero_extend_range at ffffffffa048ec83 [ocfs2]
       #9 [ffff8802ee3f7ce0] ocfs2_zero_extend at ffffffffa048edfd [ocfs2]
       #10 [ffff8802ee3f7d50] ocfs2_extend_file at ffffffffa049079e [ocfs2]
       #11 [ffff8802ee3f7da0] ocfs2_setattr at ffffffffa04910ed [ocfs2]
       #12 [ffff8802ee3f7e70] notify_change at ffffffff81187d29
       #13 [ffff8802ee3f7ee0] do_truncate at ffffffff8116bbc1
       #14 [ffff8802ee3f7f50] sys_ftruncate at ffffffff8116bcbd
       #15 [ffff8802ee3f7f80] system_call_fastpath at ffffffff81515142
          RIP: 00007f8de750c6f7  RSP: 00007fffe786e478  RFLAGS: 00000206
          RAX: 000000000000004d  RBX: ffffffff81515142  RCX: 0000000000000000
          RDX: 0000000000000200  RSI: 0000000000028400  RDI: 000000000000000d
          RBP: 00007fffe786e040   R8: 0000000000000000   R9: 000000000000000d
          R10: 0000000000000000  R11: 0000000000000206  R12: 000000000000000d
          R13: 00007fffe786e710  R14: 00007f8de70f8340  R15: 0000000000028400
          ORIG_RAX: 000000000000004d  CS: 0033  SS: 002b
      
      crash64> bt
      PID: 7610   TASK: ffff88100fd56140  CPU: 1   COMMAND: "ocfs2cmt"
       #0 [ffff88100f4d1c50] __schedule at ffffffff8150a524
       #1 [ffff88100f4d1cf8] schedule at ffffffff8150acbf
       #2 [ffff88100f4d1d08] jbd2_log_wait_commit at ffffffffa01274fd [jbd2]
       #3 [ffff88100f4d1d98] jbd2_journal_flush at ffffffffa01280b4 [jbd2]
       #4 [ffff88100f4d1dd8] ocfs2_commit_cache at ffffffffa0499b14 [ocfs2]
       #5 [ffff88100f4d1e38] ocfs2_commit_thread at ffffffffa0499d38 [ocfs2]
       #6 [ffff88100f4d1ee8] kthread at ffffffff81090db6
       #7 [ffff88100f4d1f48] kernel_thread_helper at ffffffff81516284
      
      crash64> bt
      PID: 7609   TASK: ffff88100f2d4480  CPU: 0   COMMAND: "jbd2/dm-20-86"
       #0 [ffff88100def3920] __schedule at ffffffff8150a524
       #1 [ffff88100def39c8] schedule at ffffffff8150acbf
       #2 [ffff88100def39d8] io_schedule at ffffffff8150ad6c
       #3 [ffff88100def39f8] sleep_on_page at ffffffff8111069e
       #4 [ffff88100def3a08] __wait_on_bit_lock at ffffffff8150b30a
       #5 [ffff88100def3a58] __lock_page at ffffffff81110687
       #6 [ffff88100def3ab8] write_cache_pages at ffffffff8111b752
       #7 [ffff88100def3be8] generic_writepages at ffffffff8111b901
       #8 [ffff88100def3c48] journal_submit_data_buffers at ffffffffa0120f67 [jbd2]
       #9 [ffff88100def3cf8] jbd2_journal_commit_transaction at ffffffffa0121372[jbd2]
       #10 [ffff88100def3e68] kjournald2 at ffffffffa0127a86 [jbd2]
       #11 [ffff88100def3ee8] kthread at ffffffff81090db6
       #12 [ffff88100def3f48] kernel_thread_helper at ffffffff81516284
      Signed-off-by: NJunxiao Bi <junxiao.bi@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Alex Chen <alex.chen@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f775da2f
  5. 01 10月, 2014 1 次提交
    • J
      ocfs2: Back out change to use OCFS2_MAXQUOTAS in ocfs2_setattr() · c53f755d
      Jan Kara 提交于
      ocfs2_setattr() actually needs to really use MAXQUOTAS and not
      OCFS2_MAXQUOTAS since it will pass the array over to VFS. Currently
      this isn't a problem since MAXQUOTAS == OCFS2_MAXQUOTAS but it would
      be once we introduce project quotas.
      
      CC: Mark Fasheh <mfasheh@suse.com>
      CC: Joel Becker <jlbec@evilplan.org>
      CC: ocfs2-devel@oss.oracle.com
      Signed-off-by: NJan Kara <jack@suse.cz>
      c53f755d
  6. 17 9月, 2014 1 次提交
    • J
      ocfs2: Don't use MAXQUOTAS value · 52362810
      Jan Kara 提交于
      MAXQUOTAS value defines maximum number of quota types VFS supports.
      This isn't necessarily the number of types ocfs2 supports and with
      addition of project quotas these two numbers stop matching. So make
      ocfs2 use its private definition.
      
      CC: Mark Fasheh <mfasheh@suse.com>
      CC: Joel Becker <jlbec@evilplan.org>
      CC: ocfs2-devel@oss.oracle.com
      Signed-off-by: NJan Kara <jack@suse.cz>
      52362810
  7. 12 6月, 2014 1 次提交
  8. 05 6月, 2014 1 次提交
  9. 07 5月, 2014 6 次提交
  10. 04 4月, 2014 4 次提交
  11. 02 4月, 2014 3 次提交
  12. 10 3月, 2014 1 次提交
  13. 11 2月, 2014 3 次提交
  14. 26 1月, 2014 1 次提交
  15. 22 1月, 2014 1 次提交
  16. 13 11月, 2013 2 次提交
  17. 12 9月, 2013 2 次提交
    • Y
      ocfs2: free path in ocfs2_remove_inode_range() · 7aebff18
      Younger Liu 提交于
      In ocfs2_remove_inode_range(), there is a memory leak.  The variable path
      has allocated memory with ocfs2_new_path_from_et(), but it is not free.
      Signed-off-by: NYounger Liu <younger.liu@huawei.com>
      Reviewed-by: NJie Liu <jeff.liu@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7aebff18
    • Y
      ocfs2: lighten up allocate transaction · 2b1e55c3
      Younger Liu 提交于
      The issue scenario is as following:
      
      When fallocating a very large disk space for a small file,
      __ocfs2_extend_allocation attempts to get a very large transaction.  For
      some journal sizes, there may be not enough room for this transaction,
      and the fallocate will fail.
      
      The patch below extends & restarts the transaction as necessary while
      allocating space, and should work with even the smallest journal.  This
      patch refers ext4 resize.
      
      Test:
      # mkfs.ocfs2 -b 4K -C 32K -T datafiles /dev/sdc
      ...(jounral size is 32M)
      # mount.ocfs2 /dev/sdc /mnt/ocfs2/
      # touch /mnt/ocfs2/1.log
      # fallocate -o 0 -l 400G /mnt/ocfs2/1.log
      fallocate: /mnt/ocfs2/1.log: fallocate failed: Cannot allocate memory
      # tail -f /var/log/messages
      [ 7372.278591] JBD: fallocate wants too many credits (2051 > 2048)
      [ 7372.278597] (fallocate,6438,0):__ocfs2_extend_allocation:709 ERROR: status = -12
      [ 7372.278603] (fallocate,6438,0):ocfs2_allocate_unwritten_extents:1504 ERROR: status = -12
      [ 7372.278607] (fallocate,6438,0):__ocfs2_change_file_space:1955 ERROR: status = -12
      ^C
      With this patch, the test works well.
      Signed-off-by: NYounger Liu <younger.liu@huawei.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2b1e55c3
  18. 14 8月, 2013 1 次提交
  19. 30 7月, 2013 1 次提交
    • K
      aio: Kill aio_rw_vect_retry() · 73a7075e
      Kent Overstreet 提交于
      This code doesn't serve any purpose anymore, since the aio retry
      infrastructure has been removed.
      
      This change should be safe because aio_read/write are also used for
      synchronous IO, and called from do_sync_read()/do_sync_write() - and
      there's no looping done in the sync case (the read and write syscalls).
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
      73a7075e
  20. 03 7月, 2013 1 次提交
    • J
      vfs: export lseek_execute() to modules · 46a1c2c7
      Jie Liu 提交于
      For those file systems(btrfs/ext4/ocfs2/tmpfs) that support
      SEEK_DATA/SEEK_HOLE functions, we end up handling the similar
      matter in lseek_execute() to update the current file offset
      to the desired offset if it is valid, ceph also does the
      simliar things at ceph_llseek().
      
      To reduce the duplications, this patch make lseek_execute()
      public accessible so that we can call it directly from the
      underlying file systems.
      
      Thanks Dave Chinner for this suggestion.
      
      [AV: call it vfs_setpos(), don't bring the removed 'inode' argument back]
      
      v2->v1:
      - Add kernel-doc comments for lseek_execute()
      - Call lseek_execute() in ceph->llseek()
      Signed-off-by: NJie Liu <jeff.liu@oracle.com>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: Ted Tso <tytso@mit.edu>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Sage Weil <sage@inktank.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      46a1c2c7
  21. 29 6月, 2013 1 次提交
  22. 25 5月, 2013 1 次提交
  23. 10 4月, 2013 2 次提交
  24. 26 2月, 2013 1 次提交
  25. 23 2月, 2013 1 次提交