1. 06 10月, 2012 29 次提交
  2. 05 10月, 2012 1 次提交
    • D
      ext4: fix ext4_flush_completed_IO wait semantics · c278531d
      Dmitry Monakhov 提交于
      BUG #1) All places where we call ext4_flush_completed_IO are broken
          because buffered io and DIO/AIO goes through three stages
          1) submitted io,
          2) completed io (in i_completed_io_list) conversion pended
          3) finished  io (conversion done)
          And by calling ext4_flush_completed_IO we will flush only
          requests which were in (2) stage, which is wrong because:
           1) punch_hole and truncate _must_ wait for all outstanding unwritten io
            regardless to it's state.
           2) fsync and nolock_dio_read should also wait because there is
              a time window between end_page_writeback() and ext4_add_complete_io()
              As result integrity fsync is broken in case of buffered write
              to fallocated region:
              fsync                                      blkdev_completion
      	 ->filemap_write_and_wait_range
                                                         ->ext4_end_bio
                                                           ->end_page_writeback
                <-- filemap_write_and_wait_range return
      	 ->ext4_flush_completed_IO
         	 sees empty i_completed_io_list but pended
         	 conversion still exist
                                                           ->ext4_add_complete_io
      
      BUG #2) Race window becomes wider due to the 'ext4: completed_io
      locking cleanup V4' patch series
      
      This patch make following changes:
      1) ext4_flush_completed_io() now first try to flush completed io and when
         wait for any outstanding unwritten io via ext4_unwritten_wait()
      2) Rename function to more appropriate name.
      3) Assert that all callers of ext4_flush_unwritten_io should hold i_mutex to
         prevent endless wait
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      c278531d
  3. 03 10月, 2012 4 次提交
  4. 02 10月, 2012 2 次提交
  5. 01 10月, 2012 3 次提交
    • T
      ext4: fix mtime update in nodelalloc mode · 041bbb6d
      Theodore Ts'o 提交于
      Commits 5e8830dc and 41c4d25f introduced a regression into
      v3.6-rc1 for ext4 in nodealloc mode, such that mtime updates would not
      take place for files modified via mmap if the page was already in the
      page cache.  This would also affect ext3 file systems mounted using
      the ext4 file system driver.
      
      The problem was that ext4_page_mkwrite() had a shortcut which would
      avoid calling __block_page_mkwrite() under some circumstances, and the
      above two commit transferred the responsibility of calling
      file_update_time() to __block_page_mkwrite --- which woudln't get
      called in some circumstances.
      
      Since __block_page_mkwrite() only has three callers,
      block_page_mkwrite(), ext4_page_mkwrite, and nilfs_page_mkwrite(), the
      best way to solve this is to move the responsibility for calling
      file_update_time() to its caller.
      
      This problem was found via xfstests #215 with a file system mounted
      with -o nodelalloc.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: stable@vger.kernel.org
      041bbb6d
    • D
      ext4: fix ext_remove_space for punch_hole case · 6f2080e6
      Dmitry Monakhov 提交于
      Inode is allowed to have empty leaf only if it this is blockless inode.
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      6f2080e6
    • D
      ext4: punch_hole should wait for DIO writers · 02d262df
      Dmitry Monakhov 提交于
      punch_hole is the place where we have to wait for all existing writers
      (writeback, aio, dio), but currently we simply flush pended end_io request
      which is not sufficient. Other issue is that punch_hole performed w/o i_mutex
      held which obviously result in dangerous data corruption due to
      write-after-free.
      
      This patch performs following changes:
      - Guard punch_hole with i_mutex
      - Recheck inode flags under i_mutex
      - Block all new dio readers in order to prevent information leak caused by
        read-after-free pattern.
      - punch_hole now wait for all writers in flight
        NOTE: XXX write-after-free race is still possible because new dirty pages
        may appear due to mmap(), and currently there is no easy way to stop
        writeback while punch_hole is in progress.
      
      [ Fixed error return from ext4_ext_punch_hole() to make sure that we
        release i_mutex before returning EPERM or ETXTBUSY -- Ted ]
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      02d262df
  6. 30 9月, 2012 1 次提交
    • M
      vfs: dcache: fix deadlock in tree traversal · 8110e16d
      Miklos Szeredi 提交于
      IBM reported a deadlock in select_parent().  This was found to be caused
      by taking rename_lock when already locked when restarting the tree
      traversal.
      
      There are two cases when the traversal needs to be restarted:
      
       1) concurrent d_move(); this can only happen when not already locked,
          since taking rename_lock protects against concurrent d_move().
      
       2) racing with final d_put() on child just at the moment of ascending
          to parent; rename_lock doesn't protect against this rare race, so it
          can happen when already locked.
      
      Because of case 2, we need to be able to handle restarting the traversal
      when rename_lock is already held.  This patch fixes all three callers of
      try_to_ascend().
      
      IBM reported that the deadlock is gone with this patch.
      
      [ I rewrote the patch to be smaller and just do the "goto again" if the
        lock was already held, but credit goes to Miklos for the real work.
         - Linus ]
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8110e16d