1. 14 8月, 2011 3 次提交
    • T
      ext4: fix nomblk_io_submit option so it correctly converts uninit blocks · 9dd75f1f
      Theodore Ts'o 提交于
      Bug discovered by Jan Kara:
      
      Finally, commit 1449032b returned back
      the old IO submission code but apparently it forgot to return the old
      handling of uninitialized buffers so we unconditionnaly call
      block_write_full_page() without specifying end_io function. So AFAICS
      we never convert unwritten extents to written in some cases. For
      example when I mount the fs as: mount -t ext4 -o
      nomblk_io_submit,dioread_nolock /dev/ubdb /mnt and do
              int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0600);
              char buf[1024];
              memset(buf, 'a', sizeof(buf));
              fallocate(fd, 0, 0, 16384);
              write(fd, buf, sizeof(buf));
      
      I get a file full of zeros (after remounting the filesystem so that
      pagecache is dropped) instead of seeing the first KB contain 'a's.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      9dd75f1f
    • T
      ext4: Resolve the hang of direct i/o read in handling EXT4_IO_END_UNWRITTEN. · 32c80b32
      Tao Ma 提交于
      EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten
      should be done simultaneously since ext4_end_io_nolock always clear
      the flag and decrease the counter in the same time.
      
      We don't increase i_aiodio_unwritten when setting
      EXT4_IO_END_UNWRITTEN so it will go nagative and causes some process
      to wait forever.
      
      Part of the patch came from Eric in his e-mail, but it doesn't fix the
      problem met by Michael actually.
      
      http://marc.info/?l=linux-ext4&m=131316851417460&w=2
      
      Reported-and-Tested-by: Michael Tokarev<mjt@tls.msk.ru>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      32c80b32
    • J
      ext4: call ext4_ioend_wait and ext4_flush_completed_IO in ext4_evict_inode · 2581fdc8
      Jiaying Zhang 提交于
      Flush inode's i_completed_io_list before calling ext4_io_wait to
      prevent the following deadlock scenario: A page fault happens while
      some process is writing inode A. During page fault,
      shrink_icache_memory is called that in turn evicts another inode
      B. Inode B has some pending io_end work so it calls ext4_ioend_wait()
      that waits for inode B's i_ioend_count to become zero. However, inode
      B's ioend work was queued behind some of inode A's ioend work on the
      same cpu's ext4-dio-unwritten workqueue. As the ext4-dio-unwritten
      thread on that cpu is processing inode A's ioend work, it tries to
      grab inode A's i_mutex lock. Since the i_mutex lock of inode A is
      still hold before the page fault happened, we enter a deadlock.
      
      Also moves ext4_flush_completed_IO and ext4_ioend_wait from
      ext4_destroy_inode() to ext4_evict_inode(). During inode deleteion,
      ext4_evict_inode() is called before ext4_destroy_inode() and in
      ext4_evict_inode(), we may call ext4_truncate() without holding
      i_mutex lock. As a result, there is a race between flush_completed_IO
      that is called from ext4_ext_truncate() and ext4_end_io_work, which
      may cause corruption on an io_end structure. This change moves
      ext4_flush_completed_IO and ext4_ioend_wait from ext4_destroy_inode()
      to ext4_evict_inode() to resolve the race between ext4_truncate() and
      ext4_end_io_work during inode deletion.
      Signed-off-by: NJiaying Zhang <jiayingz@google.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      2581fdc8
  2. 13 8月, 2011 1 次提交
    • C
      ext4: Fix ext4_should_writeback_data() for no-journal mode · 441c8508
      Curt Wohlgemuth 提交于
      ext4_should_writeback_data() had an incorrect sequence of
      tests to determine if it should return 0 or 1: in
      particular, even in no-journal mode, 0 was being returned
      for a non-regular-file inode.
      
      This meant that, in non-journal mode, we would use
      ext4_journalled_aops for directories, symlinks, and other
      non-regular files.  However, calling journalled aop
      callbacks when there is no valid handle, can cause problems.
      
      This would cause a kernel crash with Jan Kara's commit
      2d859db3 ("ext4: fix data corruption in inodes with
      journalled data"), because we now dereference 'handle' in
      ext4_journalled_write_end().
      
      I also added BUG_ONs to check for a valid handle in the
      obviously journal-only aops callbacks.
      
      I tested this running xfstests with a scratch device in
      these modes:
      
         - no-journal
         - data=ordered
         - data=writeback
         - data=journal
      
      All work fine; the data=journal run has many failures and a
      crash in xfstests 074, but this is no different from a
      vanilla kernel.
      Signed-off-by: NCurt Wohlgemuth <curtw@google.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      441c8508
  3. 04 8月, 2011 1 次提交
  4. 02 8月, 2011 3 次提交
  5. 01 8月, 2011 5 次提交
  6. 31 7月, 2011 2 次提交
    • D
      ext4: add missing kfree() on error return path in add_new_gdb() · c49bafa3
      Dan Carpenter 提交于
      We added some more error handling in b4097142 "ext4: add error
      checking to calls to ext4_handle_dirty_metadata()".  But we need to
      call kfree() as well to avoid a memory leak.
      Signed-off-by: NDan Carpenter <error27@gmail.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      c49bafa3
    • T
      ext4: fix races in ext4_sync_parent() · d59729f4
      Theodore Ts'o 提交于
      Fix problems if fsync() races against a rename of a parent directory
      as pointed out by Al Viro in his own inimitable way:
      
      >While we are at it, could somebody please explain what the hell is ext4
      >doing in
      >static int ext4_sync_parent(struct inode *inode)
      >{
      >        struct writeback_control wbc;
      >        struct dentry *dentry = NULL;
      >        int ret = 0;
      >
      >        while (inode && ext4_test_inode_state(inode, EXT4_STATE_NEWENTRY)) {
      >                ext4_clear_inode_state(inode, EXT4_STATE_NEWENTRY);
      >                dentry = list_entry(inode->i_dentry.next,
      >                                    struct dentry, d_alias);
      >                if (!dentry || !dentry->d_parent || !dentry->d_parent->d_inode)
      >                        break;
      >                inode = dentry->d_parent->d_inode;
      >                ret = sync_mapping_buffers(inode->i_mapping);
      >                ...
      >Note that dentry obviously can't be NULL there.  dentry->d_parent is never
      >NULL.  And dentry->d_parent would better not be negative, for crying out
      >loud!  What's worse, there's no guarantees that dentry->d_parent will
      >remain our parent over that sync_mapping_buffers() *and* that inode won't
      >just be freed under us (after rename() and memory pressure leading to
      >eviction of what used to be our dentry->d_parent)......
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      d59729f4
  7. 28 7月, 2011 5 次提交
  8. 27 7月, 2011 8 次提交
  9. 26 7月, 2011 5 次提交
  10. 24 7月, 2011 6 次提交
  11. 21 7月, 2011 1 次提交
    • J
      fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers · 02c24a82
      Josef Bacik 提交于
      Btrfs needs to be able to control how filemap_write_and_wait_range() is called
      in fsync to make it less of a painful operation, so push down taking i_mutex and
      the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
      file systems can drop taking the i_mutex altogether it seems, like ext3 and
      ocfs2.  For correctness sake I just pushed everything down in all cases to make
      sure that we keep the current behavior the same for everybody, and then each
      individual fs maintainer can make up their mind about what to do from there.
      Thanks,
      Acked-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      02c24a82