1. 07 5月, 2013 2 次提交
    • J
      Btrfs: fix bad extent logging · 09a2a8f9
      Josef Bacik 提交于
      A user sent me a btrfs-image of a file system that was panicing on mount during
      the log recovery.  I had originally thought these problems were from a bug in
      the free space cache code, but that was just a symptom of the problem.  The
      problem is if your application does something like this
      
      [prealloc][prealloc][prealloc]
      
      the internal extent maps will merge those all together into one extent map, even
      though on disk they are 3 separate extents.  So if you go to write into one of
      these ranges the extent map will be right since we use the physical extent when
      doing the write, but when we log the extents they will use the wrong sizes for
      the remainder prealloc space.  If this doesn't happen to trip up the free space
      cache (which it won't in a lot of cases) then you will get bogus entries in your
      extent tree which will screw stuff up later.  The data and such will still work,
      but everything else is broken.  This patch fixes this by not allowing extents
      that are on the modified list to be merged.  This has the side effect that we
      are no longer adding everything to the modified list all the time, which means
      we now have to call btrfs_drop_extents every time we log an extent into the
      tree.  So this allows me to drop all this speciality code I was using to get
      around calling btrfs_drop_extents.  With this patch the testcase I've created no
      longer creates a bogus file system after replaying the log.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      09a2a8f9
    • S
      Btrfs: Include the device in most error printk()s · c2cf52eb
      Simon Kirby 提交于
      With more than one btrfs volume mounted, it can be very difficult to find
      out which volume is hitting an error. btrfs_error() will print this, but
      it is currently rigged as more of a fatal error handler, while many of
      the printk()s are currently for debugging and yet-unhandled cases.
      
      This patch just changes the functions where the device information is
      already available. Some cases remain where the root or fs_info is not
      passed to the function emitting the error.
      
      This may introduce some confusion with volumes backed by multiple devices
      emitting errors referring to the primary device in the set instead of the
      one on which the error occurred.
      
      Use btrfs_printk(fs_info, format, ...) rather than writing the device
      string every time, and introduce macro wrappers ala XFS for brevity.
      Since the function already cannot be used for continuations, print a
      newline as part of the btrfs_printk() message rather than at each caller.
      Signed-off-by: NSimon Kirby <sim@hostway.ca>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      c2cf52eb
  2. 22 3月, 2013 1 次提交
    • J
      Btrfs: handle a bogus chunk tree nicely · 835d974f
      Josef Bacik 提交于
      If you restore a btrfs-image file system and try to mount that file system we'll
      panic.  That's because btrfs-image restores and just makes one big chunk to
      envelope the whole disk, since they are really only meant to be messed with by
      our btrfs-progs.  So fix up btrfs_rmap_block and the callers of it for mount so
      that we no longer panic but instead just return an error and fail to mount.
      Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      835d974f
  3. 15 3月, 2013 1 次提交
    • E
      btrfs: use rcu_barrier() to wait for bdev puts at unmount · bc178622
      Eric Sandeen 提交于
      Doing this would reliably fail with -EBUSY for me:
      
      # mount /dev/sdb2 /mnt/scratch; umount /mnt/scratch; mkfs.btrfs -f /dev/sdb2
      ...
      unable to open /dev/sdb2: Device or resource busy
      
      because mkfs.btrfs tries to open the device O_EXCL, and somebody still has it.
      
      Using systemtap to track bdev gets & puts shows a kworker thread doing a
      blkdev put after mkfs attempts a get; this is left over from the unmount
      path:
      
      btrfs_close_devices
      	__btrfs_close_devices
      		call_rcu(&device->rcu, free_device);
      			free_device
      				INIT_WORK(&device->rcu_work, __free_device);
      				schedule_work(&device->rcu_work);
      
      so unmount might complete before __free_device fires & does its blkdev_put.
      
      Adding an rcu_barrier() to btrfs_close_devices() causes unmount to wait
      until all blkdev_put()s are done, and the device is truly free once
      unmount completes.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      bc178622
  4. 07 3月, 2013 1 次提交
  5. 05 3月, 2013 1 次提交
  6. 01 3月, 2013 1 次提交
  7. 27 2月, 2013 1 次提交
    • Q
      btrfs: cleanup for open-coded alignment · fda2832f
      Qu Wenruo 提交于
      Though most of the btrfs codes are using ALIGN macro for page alignment,
      there are still some codes using open-coded alignment like the
      following:
      ------
              u64 mask = ((u64)root->stripesize - 1);
              u64 ret = (val + mask) & ~mask;
      ------
      Or even hidden one:
      ------
              num_bytes = (end - start + blocksize) & ~(blocksize - 1);
      ------
      
      Sometimes these open-coded alignment is not so easy to understand for
      newbie like me.
      
      This commit changes the open-coded alignment to the ALIGN macro for a
      better readability.
      
      Also there is a previous patch from David Sterba with similar changes,
      but the patch is for 3.2 kernel and seems not merged.
      http://www.spinics.net/lists/linux-btrfs/msg12747.html
      
      Cc: David Sterba <dave@jikos.cz>
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      fda2832f
  8. 21 2月, 2013 9 次提交
  9. 20 2月, 2013 1 次提交
  10. 16 2月, 2013 1 次提交
    • D
      btrfs: access superblock via pagecache in scan_one_device · 6f60cbd3
      David Sterba 提交于
      btrfs_scan_one_device is calling set_blocksize() which can race
      with a concurrent process making dirty page cache pages.  It can end up
      dropping dirty page cache pages on the floor, which isn't very nice when
      someone is just running btrfs dev scan to find filesystems on the
      box.
      
      Now that udev is registering btrfs devices as it discovers them, we can
      actually end up racing with our own mkfs program too.  When this
      happens, we drop some of the important blocks written by mkfs.
      
      This commit changes scan_one_device to read the super out of the page
      cache instead of trying to use bread.  This way we don't have to care
      about the blocksize of the device.
      
      This also drops the invalidate_bdev() call.  It wasn't very polite to
      invalidate during the scan either.  mkfs is putting the super into the
      page cache, there's no reason to invalidate at this point.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      6f60cbd3
  11. 05 2月, 2013 1 次提交
  12. 02 2月, 2013 2 次提交
    • D
      Btrfs: RAID5 and RAID6 · 53b381b3
      David Woodhouse 提交于
      This builds on David Woodhouse's original Btrfs raid5/6 implementation.
      The code has changed quite a bit, blame Chris Mason for any bugs.
      
      Read/modify/write is done after the higher levels of the filesystem have
      prepared a given bio.  This means the higher layers are not responsible
      for building full stripes, and they don't need to query for the topology
      of the extents that may get allocated during delayed allocation runs.
      It also means different files can easily share the same stripe.
      
      But, it does expose us to incorrect parity if we crash or lose power
      while doing a read/modify/write cycle.  This will be addressed in a
      later commit.
      
      Scrub is unable to repair crc errors on raid5/6 chunks.
      
      Discard does not work on raid5/6 (yet)
      
      The stripe size is fixed at 64KiB per disk.  This will be tunable
      in a later commit.
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      53b381b3
    • E
      btrfs: don't try to notify udev about missing devices · 3c911608
      Eric Sandeen 提交于
      If we remove a missing device, bdev is null, and if we
      send that off to btrfs_kobject_uevent we'll panic.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      3c911608
  13. 25 1月, 2013 1 次提交
  14. 22 1月, 2013 1 次提交
  15. 20 1月, 2013 1 次提交
    • I
      Btrfs: bring back balance pause/resume logic · ed0fb78f
      Ilya Dryomov 提交于
      Balance pause/resume logic got broken by 5ac00add (went in into 3.8-rc1
      as part of dev-replace merge).  Offending commit took a stab at making
      mutually exclusive volume operations (add_dev, rm_dev, resize, balance,
      replace_dev) not block behind volume_mutex if another such operation is
      in progress and instead return an error right away.  Balancing front-end
      relied on the blocking behaviour, so the fix is ugly, but short of a
      complete rework, it's the best we can do.
      Reported-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      ed0fb78f
  16. 15 1月, 2013 1 次提交
  17. 17 12月, 2012 4 次提交
    • L
      Btrfs: put raid properties into global table · 31e50229
      Liu Bo 提交于
      Raid properties can be shared among raid calculation code, we can put
      them into a global table to keep it simple.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      31e50229
    • J
      Btrfs: log changed inodes based on the extent map tree · 70c8a91c
      Josef Bacik 提交于
      We don't really need to copy extents from the source tree since we have all
      of the information already available to us in the extent_map tree.  So
      instead just write the extents straight to the log tree and don't bother to
      copy the extent items from the source tree.
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      70c8a91c
    • L
      btrfs: Notify udev when removing device · b8b8ff59
      Lukas Czerner 提交于
      Currently udev does not know about the device being removed from the
      file system. This may result in the situation where we're unable to
      mount the file system by UUID or by LABEL because the by-uuid and
      by-label links may still point to the device which is no longer part of
      the btrfs file system and hence does not have any btrfs super block.
      
      It can be easily reproduced by the following:
      
      mkfs.btrfs -L bugfs /dev/loop[0-6]
      mount /dev/loop0 /mnt/test
      btrfs device delete /dev/loop0 /mnt/test
      umount /mnt/test
      
      mount LABEL=bugfs /mnt/test <---- this fails
      
      then see:
      
      ls -l /dev/disk/by-label/bugfs
      
      which will still point to the /dev/loop0
      
      We did not noticed this before because libblkid would send the udev
      event for us when it notice that the link does not fit the reality,
      however it does not do that anymore and completely relies on udev
      information.
      
      Fix this by sending the KOBJ_CHANGE event to the bdev kobject after
      successful device removal.
      
      Note that this does not affect device addition, because we will open the
      device prior the addition from userspace and udev will notice that and
      reread the device afterwards.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      b8b8ff59
    • S
      Btrfs: fix a build warning for an unused label · f9c83748
      Stefan Behrens 提交于
      This issue was detected by the "0-DAY kernel build testing".
      
      fs/btrfs/volumes.c: In function 'btrfs_rm_device':
      fs/btrfs/volumes.c:1505:1: warning: label 'error_close' defined but not used [-Wunused-label]
      Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      f9c83748
  18. 13 12月, 2012 10 次提交