1. 05 9月, 2015 2 次提交
  2. 24 7月, 2015 1 次提交
  3. 25 6月, 2015 1 次提交
  4. 16 4月, 2015 1 次提交
  5. 15 4月, 2015 1 次提交
  6. 11 2月, 2015 1 次提交
  7. 07 8月, 2014 1 次提交
  8. 24 6月, 2014 1 次提交
    • W
      ocfs2: refcount: take rw_lock in ocfs2_reflink · 8a8ad1c2
      Wengang Wang 提交于
      This patch tries to fix this crash:
      
       #5 [ffff88003c1cd690] do_invalid_op at ffffffff810166d5
       #6 [ffff88003c1cd730] invalid_op at ffffffff8159b2de
          [exception RIP: ocfs2_direct_IO_get_blocks+359]
          RIP: ffffffffa05dfa27  RSP: ffff88003c1cd7e8  RFLAGS: 00010202
          RAX: 0000000000000000  RBX: ffff88003c1cdaa8  RCX: 0000000000000000
          RDX: 000000000000000c  RSI: ffff880027a95000  RDI: ffff88003c79b540
          RBP: ffff88003c1cd858   R8: 0000000000000000   R9: ffffffff815f6ba0
          R10: 00000000000001c9  R11: 00000000000001c9  R12: ffff88002d271500
          R13: 0000000000000001  R14: 0000000000000000  R15: 0000000000001000
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #7 [ffff88003c1cd860] do_direct_IO at ffffffff811cd31b
       #8 [ffff88003c1cd950] direct_IO_iovec at ffffffff811cde9c
       #9 [ffff88003c1cd9b0] do_blockdev_direct_IO at ffffffff811ce764
      #10 [ffff88003c1cdb80] __blockdev_direct_IO at ffffffff811ce7cc
      #11 [ffff88003c1cdbb0] ocfs2_direct_IO at ffffffffa05df756 [ocfs2]
      #12 [ffff88003c1cdbe0] generic_file_direct_write_iter at ffffffff8112f935
      #13 [ffff88003c1cdc40] ocfs2_file_write_iter at ffffffffa0600ccc [ocfs2]
      #14 [ffff88003c1cdd50] do_aio_write at ffffffff8119126c
      #15 [ffff88003c1cddc0] aio_rw_vect_retry at ffffffff811d9bb4
      #16 [ffff88003c1cddf0] aio_run_iocb at ffffffff811db880
      #17 [ffff88003c1cde30] io_submit_one at ffffffff811dc238
      #18 [ffff88003c1cde80] do_io_submit at ffffffff811dc437
      #19 [ffff88003c1cdf70] sys_io_submit at ffffffff811dc530
      #20 [ffff88003c1cdf80] system_call_fastpath at ffffffff8159a159
      
      It crashes at
              BUG_ON(create && (ext_flags & OCFS2_EXT_REFCOUNTED));
      in ocfs2_direct_IO_get_blocks.
      
      ocfs2_direct_IO_get_blocks is expecting the OCFS2_EXT_REFCOUNTED be removed in
      ocfs2_prepare_inode_for_write() if it was there. But no cluster lock is taken
      during the time before (or inside) ocfs2_prepare_inode_for_write() and after
      ocfs2_direct_IO_get_blocks().
      
      It can happen in this case:
      
      Node A(which crashes)				Node B
      ------------------------                 ---------------------------
      ocfs2_file_aio_write
        ocfs2_prepare_inode_for_write
          ocfs2_inode_lock
          ...
          ocfs2_inode_unlock
        #no refcount found
      ....					ocfs2_reflink
                                                ocfs2_inode_lock
                                                ...
                                                ocfs2_inode_unlock
                                                #now, refcount flag set on extent
      
                                              ...
                                              flush change to disk
      
      ocfs2_direct_IO_get_blocks
        ocfs2_get_clusters
          #extent map miss
          #buffer_head miss
          read extents from disk
        found refcount flag on extent
        crash..
      
      Fix:
      Take rw_lock in ocfs2_reflink path
      Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: NMark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a8ad1c2
  9. 05 6月, 2014 1 次提交
  10. 26 1月, 2014 1 次提交
  11. 13 11月, 2013 3 次提交
  12. 12 9月, 2013 1 次提交
  13. 14 8月, 2013 1 次提交
  14. 01 8月, 2013 1 次提交
  15. 23 2月, 2013 1 次提交
  16. 13 2月, 2013 1 次提交
  17. 30 7月, 2012 2 次提交
  18. 14 4月, 2012 3 次提交
  19. 20 7月, 2011 1 次提交
  20. 25 5月, 2011 1 次提交
  21. 10 4月, 2011 1 次提交
  22. 14 3月, 2011 1 次提交
  23. 22 2月, 2011 1 次提交
  24. 20 2月, 2011 1 次提交
  25. 02 2月, 2011 1 次提交
    • E
      fs/vfs/security: pass last path component to LSM on inode creation · 2a7dba39
      Eric Paris 提交于
      SELinux would like to implement a new labeling behavior of newly created
      inodes.  We currently label new inodes based on the parent and the creating
      process.  This new behavior would also take into account the name of the
      new object when deciding the new label.  This is not the (supposed) full path,
      just the last component of the path.
      
      This is very useful because creating /etc/shadow is different than creating
      /etc/passwd but the kernel hooks are unable to differentiate these
      operations.  We currently require that userspace realize it is doing some
      difficult operation like that and than userspace jumps through SELinux hoops
      to get things set up correctly.  This patch does not implement new
      behavior, that is obviously contained in a seperate SELinux patch, but it
      does pass the needed name down to the correct LSM hook.  If no such name
      exists it is fine to pass NULL.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      2a7dba39
  26. 11 9月, 2010 1 次提交
    • T
      ocfs2: Fix lockdep warning in reflink. · 07eaac94
      Tao Ma 提交于
      This patch change mutex_lock to a new subclass and
      add a new inode lock subclass for the target inode
      which caused this lockdep warning.
      
      =============================================
      [ INFO: possible recursive locking detected ]
      2.6.35+ #5
      ---------------------------------------------
      reflink/11086 is trying to acquire lock:
       (Meta){+++++.}, at: [<ffffffffa06f9d65>] ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
      
      but task is already holding lock:
       (Meta){+++++.}, at: [<ffffffffa06f9aa0>] ocfs2_reflink_ioctl+0x5d3/0x1229 [ocfs2]
      
      other info that might help us debug this:
      6 locks held by reflink/11086:
       #0:  (&sb->s_type->i_mutex_key#15/1){+.+.+.}, at: [<ffffffff820e09ec>] lookup_create+0x26/0x97
       #1:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa06f99a0>] ocfs2_reflink_ioctl+0x4d3/0x1229 [ocfs2]
       #2:  (Meta){+++++.}, at: [<ffffffffa06f9aa0>] ocfs2_reflink_ioctl+0x5d3/0x1229 [ocfs2]
       #3:  (&oi->ip_xattr_sem){+.+.+.}, at: [<ffffffffa06f9b58>] ocfs2_reflink_ioctl+0x68b/0x1229 [ocfs2]
       #4:  (&oi->ip_alloc_sem){+.+.+.}, at: [<ffffffffa06f9b67>] ocfs2_reflink_ioctl+0x69a/0x1229 [ocfs2]
       #5:  (&sb->s_type->i_mutex_key#15/2){+.+...}, at: [<ffffffffa06f9d4f>] ocfs2_reflink_ioctl+0x882/0x1229 [ocfs2]
      
      stack backtrace:
      Pid: 11086, comm: reflink Not tainted 2.6.35+ #5
      Call Trace:
       [<ffffffff82063dd9>] validate_chain+0x56e/0xd68
       [<ffffffff82062275>] ? mark_held_locks+0x49/0x69
       [<ffffffff82064d6d>] __lock_acquire+0x79a/0x7f1
       [<ffffffff82065a81>] lock_acquire+0xc6/0xed
       [<ffffffffa06f9d65>] ? ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffffa06c9ade>] __ocfs2_cluster_lock+0x975/0xa0d [ocfs2]
       [<ffffffffa06f9d65>] ? ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffffa06e107b>] ? ocfs2_wait_for_recovery+0x15/0x8a [ocfs2]
       [<ffffffffa06cb6ea>] ocfs2_inode_lock_full_nested+0x1ac/0xdc5 [ocfs2]
       [<ffffffffa06f9d65>] ? ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffff820623a0>] ? trace_hardirqs_on_caller+0x10b/0x12f
       [<ffffffff82060193>] ? debug_mutex_free_waiter+0x4f/0x53
       [<ffffffffa06f9d65>] ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffffa06ce24a>] ? ocfs2_file_lock_res_init+0x66/0x78 [ocfs2]
       [<ffffffff820bb2d2>] ? might_fault+0x40/0x8d
       [<ffffffffa06df9f6>] ocfs2_ioctl+0x61a/0x656 [ocfs2]
       [<ffffffff820ee5d3>] ? mntput_no_expire+0x1d/0xb0
       [<ffffffff820e07b3>] ? path_put+0x2c/0x31
       [<ffffffff820e53ac>] vfs_ioctl+0x2a/0x9d
       [<ffffffff820e5903>] do_vfs_ioctl+0x45d/0x4ae
       [<ffffffff8233a7f6>] ? _raw_spin_unlock+0x26/0x2a
       [<ffffffff8200299c>] ? sysret_check+0x27/0x62
       [<ffffffff820e59ab>] sys_ioctl+0x57/0x7a
       [<ffffffff8200296b>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NTao Ma <tao.ma@oracle.com>
      Signed-off-by: NJoel Becker <joel.becker@oracle.com>
      07eaac94
  27. 08 9月, 2010 1 次提交
  28. 12 8月, 2010 3 次提交
  29. 08 8月, 2010 1 次提交
  30. 16 7月, 2010 1 次提交
  31. 09 7月, 2010 1 次提交
    • J
      ocfs2: Zero the tail cluster when extending past i_size. · 5693486b
      Joel Becker 提交于
      ocfs2's allocation unit is the cluster.  This can be larger than a block
      or even a memory page.  This means that a file may have many blocks in
      its last extent that are beyond the block containing i_size.  There also
      may be more unwritten extents after that.
      
      When ocfs2 grows a file, it zeros the entire cluster in order to ensure
      future i_size growth will see cleared blocks.  Unfortunately,
      block_write_full_page() drops the pages past i_size.  This means that
      ocfs2 is actually leaking garbage data into the tail end of that last
      cluster.  This is a bug.
      
      We adjust ocfs2_write_begin_nolock() and ocfs2_extend_file() to detect
      when a write or truncate is past i_size.  They will use
      ocfs2_zero_extend() to ensure the data is properly zeroed.
      
      Older versions of ocfs2_zero_extend() simply zeroed every block between
      i_size and the zeroing position.  This presumes three things:
      
      1) There is allocation for all of these blocks.
      2) The extents are not unwritten.
      3) The extents are not refcounted.
      
      (1) and (2) hold true for non-sparse filesystems, which used to be the
      only users of ocfs2_zero_extend().  (3) is another bug.
      
      Since we're now using ocfs2_zero_extend() for sparse filesystems as
      well, we teach ocfs2_zero_extend() to check every extent between
      i_size and the zeroing position.  If the extent is unwritten, it is
      ignored.  If it is refcounted, it is CoWed.  Then it is zeroed.
      Signed-off-by: NJoel Becker <joel.becker@oracle.com>
      Cc: stable@kernel.org
      5693486b
  32. 19 5月, 2010 1 次提交
    • T
      Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead. · 78f94673
      Tristan Ye 提交于
      Truncate is just a special case of punching holes(from new i_size to
      end), we therefore could take advantage of the existing
      ocfs2_remove_btree_range() to reduce the comlexity and redundancy in
      alloc.c.  The goal here is to make truncate more generic and
      straightforward.
      
      Several functions only used by ocfs2_commit_truncate() will smiply be
      removed.
      
      ocfs2_remove_btree_range() was originally used by the hole punching
      code, which didn't take refcount trees into account (definitely a bug).
      We therefore need to change that func a bit to handle refcount trees.
      It must take the refcount lock, calculate and reserve blocks for
      refcount tree changes, and decrease refcounts at the end.  We replace 
      ocfs2_lock_allocators() here by adding a new func
      ocfs2_reserve_blocks_for_rec_trunc() which accepts some extra blocks to
      reserve.  This will not hurt any other code using
      ocfs2_remove_btree_range() (such as dir truncate and hole punching).
      
      I merged the following steps into one patch since they may be
      logically doing one thing, though I know it looks a little bit fat
      to review.
      
      1). Remove redundant code used by ocfs2_commit_truncate(), since we're
          moving to ocfs2_remove_btree_range anyway.
      
      2). Add a new func ocfs2_reserve_blocks_for_rec_trunc() for purpose of
          accepting some extra blocks to reserve.
      
      3). Change ocfs2_prepare_refcount_change_for_del() a bit to fit our
          needs.  It's safe to do this since it's only being called by
          truncate.
      
      4). Change ocfs2_remove_btree_range() a bit to take refcount case into
          account.
      
      5). Finally, we change ocfs2_commit_truncate() to call
          ocfs2_remove_btree_range() in a proper way.
      
      The patch has been tested normally for sanity check, stress tests
      with heavier workload will be expected.
      
      Based on this patch, fixing the punching holes bug will be fairly easy.
      Signed-off-by: NTristan Ye <tristan.ye@oracle.com>
      Acked-by: NMark Fasheh <mfasheh@suse.com>
      Signed-off-by: NJoel Becker <joel.becker@oracle.com>
      78f94673