1. 15 12月, 2011 7 次提交
    • C
      BTRFS: Establish i_ops before calling d_instantiate · ad19db71
      Casey Schaufler 提交于
      The Smack LSM hook for security_d_instantiate checks
      the inode's i_op->getxattr value to determine if the
      containing filesystem supports extended attributes.
      The BTRFS filesystem sets the inode's i_op value only
      after it has instantiated the inode. This results in
      Smack incorrectly giving new BTRFS inodes attributes
      from the filesystem defaults on the assumption that
      values can't be stored on the filesystem. This patch
      moves the assignment of inode operation vectors ahead
      of the calls to d_instantiate, letting Smack know that
      the filesystem supports extended attributes. There
      should be no impact on the performance or behavior of
      BTRFS.
      Signed-off-by: NCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      ad19db71
    • C
      Btrfs: add a cond_resched() into the worker loop · 8f3b65a3
      Chris Mason 提交于
      If we have a constant stream of end_io completions or crc work,
      we can hit softlockup messages from the async helper threads.  This
      adds a cond_resched() into the loop to avoid them.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      8f3b65a3
    • L
      Btrfs: fix ctime update of on-disk inode · 306424cc
      Li Zefan 提交于
      To reproduce the bug:
      
          # touch /mnt/tmp
          # stat /mnt/tmp | grep Change
          Change: 2011-12-09 09:32:23.412105981 +0800
          # chattr +i /mnt/tmp
          # stat /mnt/tmp | grep Change
          Change: 2011-12-09 09:32:43.198105295 +0800
          # umount /mnt
          # mount /dev/loop1 /mnt
          # stat /mnt/tmp | grep Change
          Change: 2011-12-09 09:32:23.412105981 +0800
      
      We should update ctime of in-memory inode before calling
      btrfs_update_inode().
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      306424cc
    • A
      btrfs: keep orphans for subvolume deletion · f8e9e0b0
      Arne Jansen 提交于
      Since we have the free space caches, btrfs_orphan_cleanup also runs for
      the tree_root. Unfortunately this also cleans up the orphans used to mark
      subvol deletions in progress.
      
      Currently if a subvol deletion gets interrupted twice by umount/mount, the
      deletion will not be continued and the space permanently lost, though it
      would be possible to write a tool to recover those lost subvol deletions.
      This patch checks if the orphan belongs to a subvol (dead root) and skips
      the deletion.
      Signed-off-by: NArne Jansen <sensille@gmx.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      f8e9e0b0
    • M
      Btrfs: fix inaccurate available space on raid0 profile · 39fb26c3
      Miao Xie 提交于
      When we use raid0 as the data profile, df command may show us a very
      inaccurate value of the available space, which may be much less than the
      real one. It may make the users puzzled. Fix it by changing the calculation
      of the available space, and making it be more similar to a fake chunk
      allocation.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      39fb26c3
    • M
      Btrfs: fix wrong disk space information of the files · 3642320e
      Miao Xie 提交于
      Btrfsck report errors after the 83th case of xfstests was run, The error
      number is 400, it means the used disk space of the file is wrong.
      
      The reason of this bug is that:
      The file truncation may fail when the space of the file system is not enough,
      and leave some file extents, whose offset are beyond the end of the files.
      When we want to expand those files, we will drop those file extents, and
      put in dummy file extents, and then we should update the i-node. But btrfs
      forgets to do it.
      
      This patch adds the forgotten i-node update.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      3642320e
    • M
      Btrfs: fix wrong i_size when truncating a file to a larger size · f4a2f4c5
      Miao Xie 提交于
      Btrfsck report error 100 after the 83th case of xfstests was run, it means
      the i_size of the file is wrong.
      
      The reason of this bug is that:
      Btrfs increased i_size of the file at the beginning, but it failed to expand
      the file, and failed to update the i_size to the old size because there is no
      enough space in the file system, so we found a wrong i_size.
      
      This patch fixes this bug by updating the i_size just when we pass the file
      expanding and get enough space to update i-node.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      f4a2f4c5
  2. 10 12月, 2011 1 次提交
    • C
      Btrfs: fix btrfs_end_bio to deal with write errors to a single mirror · 5dbc8fca
      Chris Mason 提交于
      btrfs_end_bio checks the number of errors on a bio against the max
      number of errors allowed before sending any EIOs up to the higher
      levels.
      
      If we got enough copies of the bio done for a given raid level, it is
      supposed to clear the bio error flag and return success.
      
      We have pointers to the original bio sent down by the higher layers and
      pointers to any cloned bios we made for raid purposes.  If the original
      bio happens to be the one that got an io error, but not the last one to
      finish, it might not have the BIO_UPTODATE bit set.
      
      Then, when the last bio does finish, we'll call bio_end_io on the
      original bio.  It won't have the uptodate bit set and we'll end up
      sending EIO to the higher layers.
      
      We already had a check for this, it just was conditional on getting the
      IO error on the very last bio.  Make the check unconditional so we eat
      the EIOs properly.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      5dbc8fca
  3. 08 12月, 2011 4 次提交
  4. 01 12月, 2011 11 次提交
  5. 22 11月, 2011 1 次提交
    • C
      Btrfs: remove free-space-cache.c WARN during log replay · 24a70313
      Chris Mason 提交于
      The log replay code only partially loads block groups, since
      the block group caching code is able to detect and deal with
      extents the logging code has pinned down.
      
      While the logging code is pinning down block groups, there is
      a bogus WARN_ON we're hitting if the code wasn't able to find
      an extent in the cache.  This commit removes the warning because
      it can happen any time there isn't a valid free space cache
      for that block group.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      24a70313
  6. 20 11月, 2011 10 次提交
    • J
      Btrfs: sectorsize align offsets in fiemap · 4d479cf0
      Josef Bacik 提交于
      We've been hitting BUG()'s in btrfs_cont_expand and btrfs_fallocate and anywhere
      else that calls btrfs_get_extent while running xfstests 13 in a loop.  This is
      because fiemap is calling btrfs_get_extent with non-sectorsize aligned offsets,
      which will end up adding mappings that are not sectorsize aligned, which will
      cause problems in some cases for subsequent calls to btrfs_get_extent for
      similar areas that are sectorsize aligned.  With this patch I ran xfstests 13 in
      a loop for a couple of hours and didn't hit the problem that I could previously
      hit in at most 20 minutes.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      4d479cf0
    • J
      Btrfs: clear pages dirty for io and set them extent mapped · f7d61dcd
      Josef Bacik 提交于
      When doing the io_ctl helpers to clean up the free space cache stuff I stopped
      using our normal prepare_pages stuff, which means I of course forgot to do
      things like set the pages extent mapped, which will cause us all sorts of
      wonderful propblems.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      f7d61dcd
    • J
      Btrfs: wait on caching if we're loading the free space cache · 291c7d2f
      Josef Bacik 提交于
      We've been hitting panics when running xfstest 13 in a loop for long periods of
      time.  And actually this problem has always existed so we've been hitting these
      things randomly for a while.  Basically what happens is we get a thread coming
      into the allocator and reading the space cache off of disk and adding the
      entries to the free space cache as we go.  Then we get another thread that comes
      in and tries to allocate from that block group.  Since block_group->cached !=
      BTRFS_CACHE_NO it goes ahead and tries to do the allocation.  We do this because
      if we're doing the old slow way of caching we don't want to hold people up and
      wait for everything to finish.  The problem with this is we could end up
      discarding the space cache at some arbitrary point in the future, which means we
      could very well end up allocating space that is either bad, or when the real
      caching happens it could end up thinking the space isn't in use when it really
      is and cause all sorts of other problems.
      
      The solution is to add a new flag to indicate we are loading the free space
      cache from disk, and always try to cache the block group if cache->cached !=
      BTRFS_CACHE_FINISHED.  That way if we are loading the space cache anybody else
      who tries to allocate from the block group will have to wait until it's finished
      to make sure it completes successfully.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      291c7d2f
    • A
      Btrfs: prefix resize related printks with btrfs: · 5bb14682
      Arnd Hannemann 提交于
      For the user it is confusing to find something like:
      [10197.627710] new size for /dev/mapper/vg0-usr_share is 3221225472
      in kernel log, because it doesn't point directly to btrfs.
      
      This patch prefixes those messages with "btrfs:" like other btrfs
      related printks.
      Signed-off-by: NArnd Hannemann <arnd@arndnet.de>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      5bb14682
    • D
      btrfs: fix stat blocks accounting · fadc0d8b
      David Sterba 提交于
      Round inode bytes and delalloc bytes up to real blocksize before
      converting to sector size. Otherwise eg. files smaller than 512
      are reported with zero blocks due to incorrect rounding.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      fadc0d8b
    • L
      Btrfs: avoid unnecessary bitmap search for cluster setup · 52621cb6
      Li Zefan 提交于
      setup_cluster_no_bitmap() searches all the extents and bitmaps starting
      from offset. Therefore if it returns -ENOSPC, all the bitmaps starting
      from offset are in the bitmaps list, so it's sufficient to search from
      this list in setup_cluser_bitmap().
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      52621cb6
    • L
      Btrfs: fix to search one more bitmap for cluster setup · 0f0fbf1d
      Li Zefan 提交于
      Suppose there are two bitmaps [0, 256], [256, 512] and one extent
      [100, 120] in the free space cache, and we want to setup a cluster
      with offset=100, bytes=50.
      
      In this case, there will be only one bitmap [256, 512] in the temporary
      bitmaps list, and then setup_cluster_bitmap() won't search bitmap [0, 256].
      
      The cause is, the list is constructed in setup_cluster_no_bitmap(),
      and only bitmaps with bitmap_entry->offset >= offset will be added
      into the list, and the very bitmap that convers offset has
      bitmap_entry->offset <= offset.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      0f0fbf1d
    • J
      btrfs: mirror_num should be int, not u64 · 32240a91
      Jan Schmidt 提交于
      My previous patch introduced some u64 for failed_mirror variables, this one
      makes it consistent again.
      Signed-off-by: NJan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      32240a91
    • J
      btrfs: Fix up 32/64-bit compatibility for new ioctls · 745c4d8e
      Jeff Mahoney 提交于
       This patch casts to unsigned long before casting to a pointer and fixes
       the following warnings:
      fs/btrfs/extent_io.c:2289:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      fs/btrfs/ioctl.c:2933:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      fs/btrfs/ioctl.c:2937:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      fs/btrfs/ioctl.c:3020:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      fs/btrfs/scrub.c:275:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      fs/btrfs/backref.c:686:27: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      Signed-off-by: NJeff Mahoney <jeffm@suse.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      745c4d8e
    • C
      Btrfs: fix barrier flushes · 387125fc
      Chris Mason 提交于
      When btrfs is writing the super blocks, it send barrier flushes to make
      sure writeback caching drives get all the metadata on disk in the
      right order.
      
      But, we have two bugs in the way these are sent down.  When doing
      full commits (not via the tree log), we are sending the barrier down
      before the last super when it should be going down before the first.
      
      In multi-device setups, we should be waiting for the barriers to
      complete on all devices before writing any of the supers.
      
      Both of these bugs can cause corruptions on power failures.  We fix it
      with some new code to send down empty barriers to all devices before
      writing the first super.
      
      Alexandre Oliva found the multi-device bug.  Arne Jansen did the async
      barrier loop.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      Reported-by: NAlexandre Oliva <oliva@lsd.ic.unicamp.br>
      387125fc
  7. 15 11月, 2011 1 次提交
    • L
      Btrfs: fix tree corruption after multi-thread snapshots and inode_cache flush · f1ebcc74
      Liu Bo 提交于
      The btrfs snapshotting code requires that once a root has been
      snapshotted, we don't change it during a commit.
      
      But there are two cases to lead to tree corruptions:
      
      1) multi-thread snapshots can commit serveral snapshots in a transaction,
         and this may change the src root when processing the following pending
         snapshots, which lead to the former snapshots corruptions;
      
      2) the free inode cache was changing the roots when it root the cache,
         which lead to corruptions.
      
      This fixes things by making sure we force COW the block after we create a
      snapshot during commiting a transaction, then any changes to the roots
      will result in COW, and we get all the fs roots and snapshot roots to be
      consistent.
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      f1ebcc74
  8. 11 11月, 2011 5 次提交