1. 21 2月, 2013 9 次提交
    • M
      Btrfs: serialize unlocked dio reads with truncate · 2e60a51e
      Miao Xie 提交于
      Currently, we can do unlocked dio reads, but the following race
      is possible:
      
      dio_read_task			truncate_task
      				->btrfs_setattr()
      ->btrfs_direct_IO
          ->__blockdev_direct_IO
            ->btrfs_get_block
      				  ->btrfs_truncate()
      				 #alloc truncated blocks
      				 #to other inode
            ->submit_io()
           #INFORMATION LEAK
      
      In order to avoid this problem, we must serialize unlocked dio reads with
      truncate. There are two approaches:
      - use extent lock to protect the extent that we truncate
      - use inode_dio_wait() to make sure the truncating task will wait for
        the read DIO.
      
      If we use the 1st one, we will meet the endless truncation problem due to
      the nonlocked read DIO after we implement the nonlocked write DIO. It is
      because we still need invoke inode_dio_wait() avoid the race between write
      DIO and truncation. By that time, we have to introduce
      
        btrfs_inode_{block, resume}_nolock_dio()
      
      again. That is we have to implement this patch again, so I choose the 2nd
      way to fix the problem.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      2e60a51e
    • M
      Btrfs: fix deadlock due to unsubmitted · 0934856d
      Miao Xie 提交于
      The deadlock problem happened when running fsstress(a test program in LTP).
      
      Steps to reproduce:
       # mkfs.btrfs -b 100M <partition>
       # mount <partition> <mnt>
       # <Path>/fsstress -p 3 -n 10000000 -d <mnt>
      
      The reason is:
      btrfs_direct_IO()
       |->do_direct_IO()
           |->get_page()
           |->get_blocks()
           |	 |->btrfs_delalloc_resereve_space()
           |	 |->btrfs_add_ordered_extent() -------	Add a new ordered extent
           |->dio_send_cur_page(page0) --------------	We didn't submit bio here
           |->get_page()
           |->get_blocks()
      	 |->btrfs_delalloc_resereve_space()
      	     |->flush_space()
      		 |->btrfs_start_ordered_extent()
      		     |->wait_event() ----------	Wait the completion of
      						the ordered extent that is
      						mentioned above
      
      But because we didn't submit the bio that is mentioned above, the ordered
      extent can not complete, we would wait for its completion forever.
      
      There are two methods which can fix this deadlock problem:
      1. submit the bio before we invoke get_blocks()
      2. reserve the space before we do dio
      
      Though the 1st is the simplest way, we need modify the code of VFS, and it
      is likely to break contiguous requests, and introduce performance regression
      for the other filesystems.
      
      So we have to choose the 2nd way.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      0934856d
    • J
      Btrfs: cleanup orphan reservation if truncate fails · 4a7d0f68
      Josef Bacik 提交于
      I noticed we were getting lots of warnings with xfstest 83 because we have
      reservations outstanding.  This is because we moved the orphan add outside
      of the truncate, but we don't actually cleanup our reservation if something
      fails.  This fixes the problem and I no longer see warnings.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      4a7d0f68
    • J
      Btrfs: steal from global reserve if we are cleaning up orphans · 5d80366e
      Josef Bacik 提交于
      Sometimes xfstest 83 will fail to remount the scratch device because we've
      gotten ourselves so full that we cannot cleanup the orphan items.  In this
      case check to see if we're doing the orphan cleanup and if we are allow us
      to steal our reservation from the global block rsv.  With this patch I've
      not been able to reproduce the failed mount problem.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      5d80366e
    • J
      Btrfs: handle errors in compression submission path · 3e04e7f1
      Josef Bacik 提交于
      I noticed we would deadlock if we aborted a transaction while doing
      compressed io.  This is because we don't unlock our pages if something goes
      horribly wrong.  To fix this we need to make sure that we call
      extent_clear_unlock_delalloc in order to unlock all the pages.  If we have
      to cow in the async submission thread we need to make sure to unlock our
      locked_page as the cow error path will not unlock the locked page as it
      depends on the caller to unlock that page.  With this patch we no longer
      deadlock on the page lock when we have an aborted transaction.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      3e04e7f1
    • J
      Btrfs: account for orphan inodes properly during cleanup · 925396ec
      Josef Bacik 提交于
      Dave sent me a panic where we were doing the orphan cleanup and panic'ed
      trying to release our reservation from the orphan block rsv.  The reason for
      this is because our orphan block rsv had been free'd out from underneath us
      because the transaction commit found that there were no orphan inodes
      according to its count and decided to free it.  This is incorrect so make
      sure we inc the orphan inodes count so the accounting is all done properly.
      This would also cause the warning in the orphan commit code normally if you
      had any orphans to cleanup as they would only decrement the orphan count so
      you'd get a negative orphan count which could cause problems during runtime.
      Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      925396ec
    • J
      Btrfs: unreserve space if our ordered extent fails to work · 0bec9ef5
      Josef Bacik 提交于
      When a transaction aborts or there's an EIO on an ordered extent or any
      error really we will not free up the space we reserved for this ordered
      extent.  This results in warnings from the block group cache cleanup in the
      case of a transaction abort, or leaking space in the case of EIO on an
      ordered extent.  Fix this up by free'ing the reserved space if we have an
      error at all trying to complete an ordered extent.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      0bec9ef5
    • M
      Btrfs: use the inode own lock to protect its delalloc_bytes · df0af1a5
      Miao Xie 提交于
      We need not use a global lock to protect the delalloc_bytes of the
      inode, just use its own lock. In this way, we can reduce the lock
      contention and ->delalloc_lock will just protect delalloc inode
      list.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      df0af1a5
    • M
      Btrfs: use percpu counter for fs_info->delalloc_bytes · 963d678b
      Miao Xie 提交于
      fs_info->delalloc_bytes is accessed very frequently, so use percpu
      counter instead of the u64 variant for it to reduce the lock
      contention.
      
      This patch also fixed the problem that we access the variant
      without the lock protection.At worst, we would not flush the
      delalloc inodes, and just return ENOSPC error when we still have
      some free space in the fs.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      963d678b
  2. 20 2月, 2013 7 次提交
  3. 25 1月, 2013 1 次提交
  4. 15 1月, 2013 4 次提交
  5. 18 12月, 2012 2 次提交
    • L
      Btrfs: fix a bug of per-file nocow · 213490b3
      Liu Bo 提交于
      Users report a bug, the reproducer is:
      $ mkfs.btrfs /dev/loop0
      $ mount /dev/loop0 /mnt/btrfs/
      $ mkdir /mnt/btrfs/dir
      $ chattr +C /mnt/btrfs/dir/
      $ dd if=/dev/zero of=/mnt/btrfs/dir/foo bs=4K count=10;
      $ lsattr /mnt/btrfs/dir/foo
      ---------------C- /mnt/btrfs/dir/foo
      $ filefrag /mnt/btrfs/dir/foo
      /mnt/btrfs/dir/foo: 1 extent found    ---> an extent
      $ dd if=/dev/zero of=/mnt/btrfs/dir/foo bs=4K count=1 seek=5 conv=notrunc,nocreat; sync
      $ filefrag /mnt/btrfs/dir/foo
      /mnt/btrfs/dir/foo: 3 extents found   ---> with nocow, btrfs breaks the extent into three parts
      
      The new created file should not only inherit the NODATACOW flag, but also
      honor NODATASUM flag, because we must do COW on a file extent with checksum.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      213490b3
    • C
      Btrfs: fix hash overflow handling · 9c52057c
      Chris Mason 提交于
      The handling for directory crc hash overflows was fairly obscure,
      split_leaf returns EOVERFLOW when we try to extend the item and that is
      supposed to bubble up to userland.  For a while it did so, but along the
      way we added better handling of errors and forced the FS readonly if we
      hit IO errors during the directory insertion.
      
      Along the way, we started testing only for EEXIST and the EOVERFLOW case
      was dropped.  The end result is that we may force the FS readonly if we
      catch a directory hash bucket overflow.
      
      This fixes a few problem spots.  First I add tests for EOVERFLOW in the
      places where we can safely just return the error up the chain.
      
      btrfs_rename is harder though, because it tries to insert the new
      directory item only after it has already unlinked anything the rename
      was going to overwrite.  Rather than adding very complex logic, I added
      a helper to test for the hash overflow case early while it is still safe
      to bail out.
      
      Snapshot and subvolume creation had a similar problem, so they are using
      the new helper now too.
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      Reported-by: NPascal Junod <pascal@junod.info>
      9c52057c
  6. 17 12月, 2012 12 次提交
  7. 13 12月, 2012 5 次提交
反馈
建议
客服 返回
顶部