1. 25 5月, 2011 5 次提交
    • K
      ext4: ensure f_bfree returned by ext4_statfs() is non-negative · d02a9391
      Kazuya Mio 提交于
      I found the issue that the number of free blocks went negative.
      # stat -f /mnt/mp1/
        File: "/mnt/mp1/"
          ID: e175ccb83a872efe Namelen: 255     Type: ext2/ext3
      Block size: 4096       Fundamental block size: 4096
      Blocks: Total: 258022     Free: -15        Available: -13122
      Inodes: Total: 65536      Free: 63029
      
      f_bfree in struct statfs will go negative when the filesystem has
      few free blocks. Because the number of dirty blocks is bigger than
      the number of free blocks in the following two cases.
      
      CASE 1:
      ext4_da_writepages
        mpage_da_map_and_submit
          ext4_map_blocks
            ext4_ext_map_blocks
              ext4_mb_new_blocks
                ext4_mb_diskspace_used
                  percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len);
              <--- interrupt statfs systemcall --->
              ext4_da_update_reserve_space
                  percpu_counter_sub(&sbi->s_dirtyblocks_counter,
                                  used + ei->i_allocated_meta_blocks);
      
      CASE 2:
      ext4_write_begin
        __block_write_begin
          ext4_map_blocks
            ext4_ext_map_blocks
              ext4_mb_new_blocks
                ext4_mb_diskspace_used
                  percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len);
                  <--- interrupt statfs systemcall --->
                  percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
      
      To avoid the issue, this patch ensures that f_bfree is non-negative.
      Signed-off-by: NKazuya Mio <k-mio@sx.jp.nec.com>
      d02a9391
    • L
      ext4: protect bb_first_free in ext4_trim_all_free() with group lock · 28739eea
      Lukas Czerner 提交于
      We should protect reading bd_info->bb_first_free with the group lock
      because otherwise we might miss some free blocks. This is not a big deal
      at all, but the change to do right thing is really simple, so lets do
      that.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      28739eea
    • L
      ext4: only load buddy bitmap in ext4_trim_fs() when it is needed · 78944086
      Lukas Czerner 提交于
      Currently we are loading buddy ext4_mb_load_buddy() for every block
      group we are going through in ext4_trim_fs() in many cases just to find
      out that there is not enough space to be bothered with. As Amir Goldstein
      suggested we can use bb_free information directly from ext4_group_info.
      
      This commit removes ext4_mb_load_buddy() from ext4_trim_fs() and rather
      get the ext4_group_info via ext4_get_group_info() and use the bb_free
      information directly from that. This avoids unnecessary call to load
      buddy in the case the group does not have enough free space to trim.
      Loading buddy is now moved to ext4_trim_all_free().
      
      Tested by me with xfstests 251.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      78944086
    • E
      jbd2: Fix comment to match the code in jbd2__journal_start() · c867516d
      Eryu Guan 提交于
      jbd2__journal_start() returns an ERR_PTR() value rather than NULL on
      failure.
      Signed-off-by: NEryu Guan <guaneryu@gmail.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      c867516d
    • J
      ext4: fix waiting and sending of a barrier in ext4_sync_file() · 93628ffb
      Jan Kara 提交于
      jbd2_log_start_commit() returns 1 only when we really start a
      transaction.  But we also need to wait for a transaction when the
      commit is already running.  Fix this problem by waiting for
      transaction commit unconditionally (which is just a quick check if the
      transaction is already committed).
      
      Also we have to be more careful with sending of a barrier because when
      transaction is being committed in parallel to ext4_sync_file()
      running, we cannot be sure that the barrier the journalling code sends
      happens after we wrote all the data for fsync (note that not every
      data writeout needs to trigger metadata changes thus commit of some
      metadata changes can be running while other data is still written
      out). So use jbd2_will_send_data_barrier() helper to detect the common
      cases when we can be sure barrier will be issued by the commit code
      and issue the barrier ourselves in the remaining cases.
      Reported-by: NEdward Goggin <egoggin@vmware.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      93628ffb
  2. 24 5月, 2011 4 次提交
    • J
      jbd2: Add function jbd2_trans_will_send_data_barrier() · bbd2be36
      Jan Kara 提交于
      Provide a function which returns whether a transaction with given tid
      will send a flush to the filesystem device.  The function will be used
      by ext4 to detect whether fsync needs to send a separate flush or not.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      bbd2be36
    • J
      jbd2: fix sending of data flush on journal commit · 81be12c8
      Jan Kara 提交于
      
      In data=ordered mode, it's theoretically possible (however rare) that
      an inode is filed to transaction's t_inode_list and a flusher thread
      writes all the data and inode is reclaimed before the transaction
      starts to commit.  In such a case, we could erroneously omit sending a
      flush to file system device when it is different from the journal
      device (because data can still be in disk cache only).
      
      Fix the problem by setting a flag in a transaction when some inode is added
      to it and then send disk flush in the commit code when the flag is set.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      81be12c8
    • Y
      ext4: fix ext4_ext_fiemap_cb() to handle blocks before request range correctly · b221349f
      Yongqiang Yang 提交于
      To get delayed-extent information, ext4_ext_fiemap_cb() looks up
      pagecache, it thus collects information starting from a page's
      head block.
      
      If blocksize < pagesize, the beginning blocks of a page may lies
      before the request range. So ext4_ext_fiemap_cb() should proceed
      ignoring them, because they has been handled before. If no mapped
      buffer in the range is found in the 1st page, we need to look up
      the 2nd page, otherwise delayed-extents after a hole will be ignored.
      
      Without this patch, xfstests 225 will hung on ext4 with 1K block.
      Reported-by: NAmir Goldstein <amir73il@users.sourceforge.net>
      Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      b221349f
    • T
      ext4: use truncate_setsize() unconditionally · 072bd7ea
      Theodore Ts'o 提交于
      In commit c8d46e41 (ext4: Add flag to files with blocks intentionally
      past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate()
      before calling vmtruncate().  This caused any allocated but unwritten
      blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE
      flag to be dropped.  This was done to make to make sure that
      EOFBLOCKS_FL would not be cleared while still leaving blocks past
      i_size allocated.  This was not necessary, since ext4_truncate()
      guarantees that blocks past i_size will be dropped, even in the case
      where truncate() has increased i_size before calling ext4_truncate().
      
      So fix this by removing the EOFBLOCKS_FL special case treatment in
      ext4_setattr().  In addition, use truncate_setsize() followed by a
      call to ext4_truncate() instead of using vmtruncate().  This is more
      efficient since it skips the call to inode_newsize_ok(), which has
      been checked already by inode_change_ok().  This is also in a win in
      the case where EOFBLOCKS_FL is set since it avoids calling
      ext4_truncate() twice.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      072bd7ea
  3. 23 5月, 2011 5 次提交
  4. 21 5月, 2011 4 次提交
  5. 19 5月, 2011 3 次提交
  6. 16 5月, 2011 2 次提交
  7. 15 5月, 2011 1 次提交
  8. 10 5月, 2011 4 次提交
  9. 09 5月, 2011 8 次提交
  10. 04 5月, 2011 2 次提交
  11. 03 5月, 2011 2 次提交
    • Y
      ext4: add a function merging extents right and left · 197217a5
      Yongqiang Yang 提交于
      1) Rename ext4_ext_try_to_merge() to ext4_ext_try_to_merge_right().
      
      2) Add a new function ext4_ext_try_to_merge() which tries to merge
         an extent both left and right.
      
      3) Use the new function in ext4_ext_convert_unwritten_endio() and
         ext4_ext_insert_extent().
      Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
      Tested-by: NAllison Henderson <achender@linux.vnet.ibm.com>
      197217a5
    • J
      ext4: fix deadlock in ext4_symlink() in ENOSPC conditions · df5e6223
      Jan Kara 提交于
      ext4_symlink() cannot call __page_symlink() with transaction open.
      __page_symlink() calls ext4_write_begin() which can wait for
      transaction commit if we are running out of space thus causing a
      deadlock. Also error recovery in ext4_truncate_failed_write() does not
      count with the transaction being already started (although I'm not
      aware of any particular deadlock here).
      
      Fix the problem by stopping a transaction before calling
      __page_symlink() (we have to be careful and put inode to orphan list
      so that it gets deleted in case of crash) and starting another one
      after __page_symlink() returns for addition of symlink into a
      directory.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      df5e6223