1. 27 7月, 2013 1 次提交
    • T
      ext4: make sure group number is bumped after a inode allocation race · a34eb503
      Theodore Ts'o 提交于
      When we try to allocate an inode, and there is a race between two
      CPU's trying to grab the same inode, _and_ this inode is the last free
      inode in the block group, make sure the group number is bumped before
      we continue searching the rest of the block groups.  Otherwise, we end
      up searching the current block group twice, and we end up skipping
      searching the last block group.  So in the unlikely situation where
      almost all of the inodes are allocated, it's possible that we will
      return ENOSPC even though there might be free inodes in that last
      block group.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      a34eb503
  2. 05 6月, 2013 1 次提交
  3. 21 4月, 2013 1 次提交
  4. 20 4月, 2013 1 次提交
    • J
      ext4: move quota initialization out of inode allocation transaction · eb9cc7e1
      Jan Kara 提交于
      Inode allocation transaction is pretty heavy (246 credits with quotas
      and extents before previous patch, still around 200 after it).  This is
      mostly due to credits required for allocation of quota structures
      (credits there are heavily overestimated but it's difficult to make
      better estimates if we don't want to wire non-trivial assumptions about
      quota format into filesystem).
      
      So move quota initialization out of allocation transaction. That way
      transaction for quota structure allocation will be started only if we
      need to look up quota structure on disk (rare) and furthermore it will
      be started for each quota type separately, not for all of them at once.
      This reduces maximum transaction size to 34 is most cases and to 73 in
      the worst case.
      
      [ Modified by tytso to clean up the cleanup paths for error handling.
        Also use a separate call to ext4_std_error() for each failure so it
        is easier for someone who is debugging a problem in this function to
        determine which function call failed. ]
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      eb9cc7e1
  5. 10 4月, 2013 1 次提交
  6. 12 3月, 2013 1 次提交
  7. 15 2月, 2013 1 次提交
  8. 10 2月, 2013 1 次提交
    • T
      ext4: start handle at the last possible moment when creating inodes · 1139575a
      Theodore Ts'o 提交于
      In ext4_{create,mknod,mkdir,symlink}(), don't start the journal handle
      until the inode has been succesfully allocated.  In order to do this,
      we need to start the handle in the ext4_new_inode().  So create a new
      variant of this function, ext4_new_inode_start_handle(), so the handle
      can be created at the last possible minute, before we need to modify
      the inode allocation bitmap block.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      1139575a
  9. 09 2月, 2013 1 次提交
    • T
      ext4: pass context information to jbd2__journal_start() · 9924a92a
      Theodore Ts'o 提交于
      So we can better understand what bits of ext4 are responsible for
      long-running jbd2 handles, use jbd2__journal_start() so we can pass
      context information for logging purposes.
      
      The recommended way for finding the longer-running handles is:
      
         T=/sys/kernel/debug/tracing
         EVENT=$T/events/jbd2/jbd2_handle_stats
         echo "interval > 5" > $EVENT/filter
         echo 1 > $EVENT/enable
      
         ./run-my-fs-benchmark
      
         cat $T/trace > /tmp/problem-handles
      
      This will list handles that were active for longer than 20ms.  Having
      longer-running handles is bad, because a commit started at the wrong
      time could stall for those 20+ milliseconds, which could delay an
      fsync() or an O_SYNC operation.  Here is an example line from the
      trace file describing a handle which lived on for 311 jiffies, or over
      1.2 seconds:
      
      postmark-2917  [000] ....   196.435786: jbd2_handle_stats: dev 254,32 
         tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
         dirtied_blocks 0
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      9924a92a
  10. 11 12月, 2012 1 次提交
  11. 30 11月, 2012 1 次提交
  12. 29 10月, 2012 1 次提交
    • E
      ext4: fix unjournaled inode bitmap modification · ffb5387e
      Eric Sandeen 提交于
      commit 119c0d44 changed
      ext4_new_inode() such that the inode bitmap was being modified
      outside a transaction, which could lead to corruption, and was
      discovered when journal_checksum found a bad checksum in the
      journal during log replay.
      
      Nix ran into this when using the journal_async_commit mount
      option, which enables journal checksumming.  The ensuing
      journal replay failures due to the bad checksums led to
      filesystem corruption reported as the now infamous
      "Apparent serious progressive ext4 data corruption bug"
      
      [ Changed by tytso to only call ext4_journal_get_write_access() only
        when we're fairly certain that we're going to allocate the inode. ]
      
      I've tested this by mounting with journal_checksum and
      running fsstress then dropping power; I've also tested by
      hacking DM to create snapshots w/o first quiescing, which
      allows me to test journal replay repeatedly w/o actually
      power-cycling the box.  Without the patch I hit a journal
      checksum error every time.  With this fix it survives
      many iterations.
      Reported-by: NNix <nix@esperi.org.uk>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      ffb5387e
  13. 22 10月, 2012 1 次提交
  14. 24 9月, 2012 1 次提交
  15. 23 7月, 2012 1 次提交
  16. 01 7月, 2012 1 次提交
  17. 29 5月, 2012 2 次提交
    • T
      ext4: protect group inode free counting with group lock · 6f2e9f0e
      Tao Ma 提交于
      Now when we set the group inode free count, we don't have a proper
      group lock so that multiple threads may decrease the inode free
      count at the same time. And e2fsck will complain something like:
      
      Free inodes count wrong for group #1 (1, counted=0).
      Fix? no
      
      Free inodes count wrong for group #2 (3, counted=0).
      Fix? no
      
      Directories count wrong for group #2 (780, counted=779).
      Fix? no
      
      Free inodes count wrong for group #3 (2272, counted=2273).
      Fix? no
      
      So this patch try to protect it with the ext4_lock_group.
      
      btw, it is found by xfstests test case 269 and the volume is
      mkfsed with the parameter
      "-O ^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,ext_attr"
      and I have run it 100 times and the error in e2fsck doesn't
      show up again.
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      6f2e9f0e
    • D
      ext4: fix potential NULL dereference in ext4_free_inodes_counts() · bb3d132a
      Dan Carpenter 提交于
      The ext4_get_group_desc() function returns NULL on error, and
      ext4_free_inodes_count() function dereferences it without checking.
      There is a check on the next line, but it's too late.
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      bb3d132a
  18. 16 5月, 2012 1 次提交
  19. 30 4月, 2012 4 次提交
  20. 20 3月, 2012 2 次提交
  21. 21 2月, 2012 1 次提交
    • T
      ext4: fix race when setting bitmap_uptodate flag · 813e5727
      Theodore Ts'o 提交于
      In ext4_read_{inode,block}_bitmap() we were setting bitmap_uptodate()
      before submitting the buffer for read.  The is bad, since we check
      bitmap_uptodate() without locking the buffer, and so if another
      process is racing with us, it's possible that they will think the
      bitmap is uptodate even though the read has not completed yet,
      resulting in inodes and blocks potentially getting allocated more than
      once if we get really unlucky.
      
      Addresses-Google-Bug: 2828254
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      813e5727
  22. 07 2月, 2012 1 次提交
  23. 04 1月, 2012 1 次提交
  24. 29 12月, 2011 2 次提交
  25. 19 12月, 2011 1 次提交
    • J
      ext4: fix error handling on inode bitmap corruption · acd6ad83
      Jan Kara 提交于
      When insert_inode_locked() fails in ext4_new_inode() it most likely means inode
      bitmap got corrupted and we allocated again inode which is already in use. Also
      doing unlock_new_inode() during error recovery is wrong since the inode does
      not have I_NEW set. Fix the problem by jumping to fail: (instead of fail_drop:)
      which declares filesystem error and does not call unlock_new_inode().
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      acd6ad83
  26. 02 11月, 2011 1 次提交
  27. 01 11月, 2011 1 次提交
  28. 29 10月, 2011 1 次提交
  29. 18 10月, 2011 1 次提交
  30. 09 10月, 2011 1 次提交
  31. 10 9月, 2011 4 次提交