1. 10 Feb, 2008 3 commits
    • JBD2: Clear buffer_ordered flag for barried IO request on success · c4e35e07
      Committed by Dave Kleikamp
      In JBD2 jbd2_journal_write_commit_record(), clear the buffer_ordered
      flag for the bh after the barrier IO has succeeded. That way, if the
      same buffer head is later submitted to an underlying device that has
      been reconfigured to not support barrier requests, the JBD2 commit code
      can treat it as a normal IO (without barrier).
      
      This is a port from JBD/ext3 fix from Neil Brown.
      
      More details from Neil:
      
      Some devices - notably dm and md - can change their behaviour in
      response to BIO_RW_BARRIER requests.  They might start out accepting
      such requests but on reconfiguration, they find out that they cannot
      any more. JBD2 deals with this by always testing if BIO_RW_BARRIER
      requests fail with EOPNOTSUPP, and retrying the write
      requests without the barrier (probably after waiting for any pending
      writes to complete).
      
      However, there is a bug in the handling of this in JBD2 for ext4.
      
      When ext4/JBD2 submits a BIO_RW_BARRIER request,
      it sets the buffer_ordered flag on the buffer head.
      If the request completes successfully, the flag STAYS SET.
      
      Other code might then write the same buffer_head after the device has
      been reconfigured to not accept barriers.  This write will then fail,
      but the "other code" is not ready to handle EOPNOTSUPP errors and the
      error will be treated as fatal.
      
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
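The flag handling described in this commit can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not the kernel code: `sim_device`, `sim_bh`, `sim_submit`, `sim_write_commit_record` and `sim_plain_write` are hypothetical stand-ins invented for illustration. It shows a barrier write that retries without the barrier on EOPNOTSUPP and clears the ordered flag on success too, so a later plain write of the same buffer does not fail once the device stops accepting barriers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a block device and a buffer head. */
struct sim_device { bool supports_barrier; };
struct sim_bh { bool ordered; };

/* Simulated submission: a barrier (ordered) write fails with -EOPNOTSUPP
 * when the device cannot honour barriers; everything else succeeds. */
int sim_submit(const struct sim_device *dev, const struct sim_bh *bh)
{
    assert(dev && bh);
    if (bh->ordered && !dev->supports_barrier)
        return -EOPNOTSUPP;
    return 0;
}

/* Commit-record write: try with a barrier, retry without one if the
 * device rejects it, and leave the ordered flag cleared in every case. */
int sim_write_commit_record(const struct sim_device *dev, struct sim_bh *bh)
{
    int ret;

    bh->ordered = true;
    ret = sim_submit(dev, bh);
    if (ret == -EOPNOTSUPP) {
        bh->ordered = false;   /* fall back to a normal write */
        ret = sim_submit(dev, bh);
    }
    bh->ordered = false;       /* the point of the fix: clear on success too */
    return ret;
}

/* "Other code" that writes the same buffer and has no EOPNOTSUPP retry. */
int sim_plain_write(const struct sim_device *dev, const struct sim_bh *bh)
{
    return sim_submit(dev, bh);
}
```

In this model, if the final `bh->ordered = false` is dropped, a commit on a barrier-capable device followed by a reconfiguration makes `sim_plain_write()` return -EOPNOTSUPP, which mirrors the failure Neil describes.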
    • ext4: Fix Direct I/O locking · 7fb5409d
      Committed by Jan Kara
      We cannot start a transaction in ext4_direct_IO() and just let it last
      for the whole write, because dio_get_page() acquires mmap_sem, which
      ranks above transaction start (e.g. because we have the dependency chain
      mmap_sem->PageLock->journal_start, or because we update atime while
      holding mmap_sem), and thus deadlocks could happen. We solve the problem
      by starting a transaction separately for each ext4_get_block() call.
      
      We *could* have a problem where we allocate a block and the machine
      crashes before its data is written out, exposing stale data. But that
      does not happen: for hole-filling, generic code falls back to buffered
      writes, and for file extension, we add the inode to the orphan list, so
      in case of a crash, journal replay will truncate the inode back to its
      original size.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
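The "one transaction per ext4_get_block() call" idea can be illustrated with a toy user-space model. Everything here is an assumption made for illustration: `sim_journal`, `sim_pin_user_pages`, `sim_get_block` and `sim_direct_write` are hypothetical names, and the model only counts open handles rather than taking real locks. It shows that when each block-mapping call opens and closes its own handle, no handle is ever open while user pages are being pinned (the step that takes mmap_sem in the kernel).

```c
#include <assert.h>

/* Hypothetical journal model: it only counts open handles (transactions). */
struct sim_journal {
    int open_handles;           /* handles currently open */
    int pins_with_handle_open;  /* page pins that happened inside a handle */
};

void sim_journal_start(struct sim_journal *j)
{
    j->open_handles++;
}

void sim_journal_stop(struct sim_journal *j)
{
    assert(j->open_handles > 0);  /* every stop must pair with a start */
    j->open_handles--;
}

/* Stand-in for pinning user pages; records whether a handle was open,
 * which is the lock-ordering violation the commit avoids. */
void sim_pin_user_pages(struct sim_journal *j)
{
    if (j->open_handles > 0)
        j->pins_with_handle_open++;
}

/* Stand-in for a block-mapping call that runs in its own short handle. */
void sim_get_block(struct sim_journal *j)
{
    sim_journal_start(j);
    /* block mapping or allocation would happen here */
    sim_journal_stop(j);
}

/* Direct write split into chunks: pin pages first, then map blocks. */
void sim_direct_write(struct sim_journal *j, int nr_chunks)
{
    for (int i = 0; i < nr_chunks; i++) {
        sim_pin_user_pages(j);
        sim_get_block(j);
    }
}
```

Wrapping the whole loop in a single `sim_journal_start()`/`sim_journal_stop()` pair instead would make `pins_with_handle_open` non-zero, which corresponds to the ordering the commit message rules out.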
    • ext4: Fix circular locking dependency with migrate and rm. · 8009f9fb
      Committed by Aneesh Kumar K.V
      In order to prevent a circular locking dependency when an unlink
      operation is racing with an ext4 migration, we delay taking i_data_sem
      until just before switching the inode format, and use i_mutex to prevent
      writes and truncates during the first part of the migration operation.
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  2. 06 Feb, 2008 1 commit
    • allow in-inode EAs on ext4 root inode · 0040d987
      Committed by Eric Sandeen
      The ext3 root inode was treated specially with respect
      to in-inode extended attributes, for reasons detailed
      in the removed comment below.  The first mkfs-created
      inodes would not get extra_i_size or the EXT3_STATE_XATTR
      flag set in ext3_read_inode, which disallowed reading or
      setting in-inode EAs on the root.
      
      However, in ext4, ext4_mark_inode_dirty calls
      ext4_expand_extra_isize for all inodes; once this is done
      EAs may be placed in the root ext4 inode body.
      
      But for the reasons above, they won't be found after a reboot.
      
      testcase:
      
      setfattr -n user.name -v value mntpt/
      setfattr -n user.name2 -v value2 mntpt/
      umount mntpt/; remount mntpt/
      getfattr -d mntpt/
      
      name2/value2 has gone missing; debugfs shows it in the
      inode body, but it is not found there by getattr.
      
      The following fixes it up; newer mkfs appears to properly
      zero the inodes, so this workaround isn't needed for ext4.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  3. 10 Feb, 2008 1 commit
  4. 05 Feb, 2008 3 commits
  5. 01 Feb, 2008 1 commit
  6. 05 Feb, 2008 1 commit
    • jbd2: Add error check to journal_wait_on_commit_record to avoid oops · b048d846
      Committed by Mingming Cao
      The buffer head pointer passed to journal_wait_on_commit_record() could
      be NULL if the previous journal_submit_commit_record() failed or the
      journal has already aborted.
      
      Looking at the jbd2 debug messages from before the oops, the journal
      was aborted because it tried to access the next log block beyond the
      end of the device. This might be caused by using a corrupted image.
      
      We need to check the error returns from journal_submit_commit_record()
      and avoid calling journal_wait_on_commit_record() in the failure case.
      
      This addresses Kernel Bugzilla #9849
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
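The check this commit adds follows a common pattern: do not wait on a buffer that was never successfully submitted. The sketch below is a user-space illustration only; `sim_commit_bh`, `sim_submit_commit_record`, `sim_wait_on_commit_record` and `sim_commit` are hypothetical names, and the error values are assumptions rather than what the kernel functions return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the commit record's buffer head. */
struct sim_commit_bh { int uptodate; };

/* Submit the commit record. On failure (here: the journal is already
 * aborted) it returns an error and leaves *cbh set to NULL. */
int sim_submit_commit_record(int journal_aborted,
                             struct sim_commit_bh *storage,
                             struct sim_commit_bh **cbh)
{
    *cbh = NULL;
    if (journal_aborted)
        return -EIO;
    storage->uptodate = 1;  /* pretend the write completed */
    *cbh = storage;
    return 0;
}

/* Wait for the commit record; the caller must pass a valid buffer. */
int sim_wait_on_commit_record(struct sim_commit_bh *bh)
{
    assert(bh != NULL);     /* a NULL here is the oops being avoided */
    return bh->uptodate ? 0 : -EIO;
}

/* Commit path with the added check: skip the wait if submit failed. */
int sim_commit(int journal_aborted)
{
    struct sim_commit_bh storage = { 0 };
    struct sim_commit_bh *cbh;
    int err;

    err = sim_submit_commit_record(journal_aborted, &storage, &cbh);
    if (err)
        return err;         /* never reach the wait with cbh == NULL */
    return sim_wait_on_commit_record(cbh);
}
```

Calling `sim_commit(1)` returns the submit error without touching the NULL pointer, while `sim_commit(0)` goes through the normal submit-then-wait sequence.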
  7. 10 Feb, 2008 24 commits
  8. 09 Feb, 2008 6 commits