1. 12 11月, 2013 7 次提交
    • D
      btrfs: Use WARN_ON()'s return value in place of WARN_ON(1) · fae7f21c
      Dulshani Gunawardhana 提交于
      Use WARN_ON()'s return value in place of WARN_ON(1) for cleaner source
      code that outputs a more descriptive warnings. Also fix the styling
      warning of redundant braces that came up as a result of this fix.
      Signed-off-by: NDulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
      Reviewed-by: NZach Brown <zab@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      fae7f21c
    • W
      Btrfs: use 'u64' rather than 'int' to get extent's generation · 7fdf4b60
      Wang Shilong 提交于
      We define a 'int' to get extent's generation by mistake,fix it.
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      7fdf4b60
    • J
      Btrfs: stop committing the transaction so much during relocate · 9e6a0c52
      Josef Bacik 提交于
      I noticed with my horrible snapshot excercisor that we were taking forever to
      relocate the larger the file system got.  This appeared to be because we were
      committing the transaction _constantly_.  There were a few places where we do
      braindead things with metadata reservation, like start a transaction and then
      try to refill the block rsv, which not only keeps us from committing a
      transaction during the enospc stuff, but keeps us from doing some of the harder
      flushing work which will make us more likely to need to commit the transaction.
      We also were checking the block rsv and committing the transaction if the block
      rsv was below a certain threshold, but we were doing this in a place where we
      don't actually keep anything in the block rsv so this was always ending up false
      so we always committed the transaction in this case.  I tested this to make sure
      it didn't break anything, but it takes about 10 hours to get the box to this
      state so I don't know how much of an impact it will really make.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      9e6a0c52
    • J
      Btrfs: return an error from btrfs_wait_ordered_range · 0ef8b726
      Josef Bacik 提交于
      I noticed that if the free space cache has an error writing out it's data it
      won't actually error out, it will just carry on.  This is because it doesn't
      check the return value of btrfs_wait_ordered_range, which didn't actually return
      anything.  So fix this in order to keep us from making free space cache look
      valid when it really isnt.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      0ef8b726
    • M
      Btrfs: fix BUG_ON() casued by the reserved space migration · 20dd2cbf
      Miao Xie 提交于
      When we did space balance and snapshot creation at the same time, we might
      meet the following oops:
       kernel BUG at fs/btrfs/inode.c:3038!
       [SNIP]
       Call Trace:
       [<ffffffffa0411ec7>] btrfs_orphan_cleanup+0x293/0x407 [btrfs]
       [<ffffffffa042dc45>] btrfs_mksubvol.isra.28+0x259/0x373 [btrfs]
       [<ffffffffa042de85>] btrfs_ioctl_snap_create_transid+0x126/0x156 [btrfs]
       [<ffffffffa042dff1>] btrfs_ioctl_snap_create_v2+0xd0/0x121 [btrfs]
       [<ffffffffa0430b2c>] btrfs_ioctl+0x414/0x1854 [btrfs]
       [<ffffffff813b60b7>] ? __do_page_fault+0x305/0x379
       [<ffffffff811215a9>] vfs_ioctl+0x1d/0x39
       [<ffffffff81121d7c>] do_vfs_ioctl+0x32d/0x3e2
       [<ffffffff81057fe7>] ? finish_task_switch+0x80/0xb8
       [<ffffffff81121e88>] SyS_ioctl+0x57/0x83
       [<ffffffff813b39ff>] ? do_device_not_available+0x12/0x14
       [<ffffffff813b99c2>] system_call_fastpath+0x16/0x1b
       [SNIP]
       RIP  [<ffffffffa040da40>] btrfs_orphan_add+0xc3/0x126 [btrfs]
      
      The reason of the problem is that the relocation root creation stole
      the reserved space, which was reserved for orphan item deletion.
      
      There are several ways to fix this problem, one is to increasing
      the reserved space size of the space balace, and then we can use
      that space to create the relocation tree for each fs/file trees.
      But it is hard to calculate the suitable size because we doesn't
      know how many fs/file trees we need relocate.
      
      We fixed this problem by reserving the space for relocation root creation
      actively since the space it need is very small (one tree block, used for
      root node copy), then we use that reserved space to create the
      relocation tree. If we don't reserve space for relocation tree creation,
      we will use the reserved space of the balance.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      20dd2cbf
    • J
      Btrfs: relocate csums properly with prealloc extents · 4577b014
      Josef Bacik 提交于
      A user reported a problem where they were getting csum errors when running a
      balance and running systemd's journal.  This is because systemd is awesome and
      fallocate()'s its log space and writes into it.  Unfortunately we assume that
      when we read in all the csums for an extent that they are sequential starting at
      the bytenr we care about.  This obviously isn't the case for prealloc extents,
      where we could have written to the middle of the prealloc extent only, which
      means the csum would be for the bytenr in the middle of our range and not the
      front of our range.  Fix this by offsetting the new bytenr we are logging to
      based on the original bytenr the csum was for.  With this patch I no longer see
      the csum errors I was seeing.  Thanks,
      
      Cc: stable@vger.kernel.org
      Reported-by: NChris Murphy <lists@colorremedies.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      4577b014
    • F
      Btrfs: remove path arg from btrfs_truncate_free_space_cache · 74514323
      Filipe David Borba Manana 提交于
      Not used for anything, and removing it avoids caller's need to
      allocate a path structure.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      74514323
  2. 11 10月, 2013 1 次提交
    • M
      Btrfs: fix oops caused by the space balance and dead roots · c00869f1
      Miao Xie 提交于
      When doing space balance and subvolume destroy at the same time, we met
      the following oops:
      
      kernel BUG at fs/btrfs/relocation.c:2247!
      RIP: 0010: [<ffffffffa04cec16>] prepare_to_merge+0x154/0x1f0 [btrfs]
      Call Trace:
       [<ffffffffa04b5ab7>] relocate_block_group+0x466/0x4e6 [btrfs]
       [<ffffffffa04b5c7a>] btrfs_relocate_block_group+0x143/0x275 [btrfs]
       [<ffffffffa0495c56>] btrfs_relocate_chunk.isra.27+0x5c/0x5a2 [btrfs]
       [<ffffffffa0459871>] ? btrfs_item_key_to_cpu+0x15/0x31 [btrfs]
       [<ffffffffa048b46a>] ? btrfs_get_token_64+0x7e/0xcd [btrfs]
       [<ffffffffa04a3467>] ? btrfs_tree_read_unlock_blocking+0xb2/0xb7 [btrfs]
       [<ffffffffa049907d>] btrfs_balance+0x9c7/0xb6f [btrfs]
       [<ffffffffa049ef84>] btrfs_ioctl_balance+0x234/0x2ac [btrfs]
       [<ffffffffa04a1e8e>] btrfs_ioctl+0xd87/0x1ef9 [btrfs]
       [<ffffffff81122f53>] ? path_openat+0x234/0x4db
       [<ffffffff813c3b78>] ? __do_page_fault+0x31d/0x391
       [<ffffffff810f8ab6>] ? vma_link+0x74/0x94
       [<ffffffff811250f5>] vfs_ioctl+0x1d/0x39
       [<ffffffff811258c8>] do_vfs_ioctl+0x32d/0x3e2
       [<ffffffff811259d4>] SyS_ioctl+0x57/0x83
       [<ffffffff813c3bfa>] ? do_page_fault+0xe/0x10
       [<ffffffff813c73c2>] system_call_fastpath+0x16/0x1b
      
      It is because we returned the error number if the reference of the root was 0
      when doing space relocation. It was not right here, because though the root
      was dead(refs == 0), but the space it held still need be relocated, or we
      could not remove the block group. So in this case, we should return the root
      no matter it is dead or not.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      c00869f1
  3. 21 9月, 2013 2 次提交
  4. 01 9月, 2013 6 次提交
  5. 02 7月, 2013 1 次提交
    • M
      Btrfs: remove btrfs_sector_sum structure · f51a4a18
      Miao Xie 提交于
      Using the structure btrfs_sector_sum to keep the checksum value is
      unnecessary, because the extents that btrfs_sector_sum points to are
      continuous, we can find out the expected checksums by btrfs_ordered_sum's
      bytenr and the offset, so we can remove btrfs_sector_sum's bytenr. After
      removing bytenr, there is only one member in the structure, so it makes
      no sense to keep the structure, just remove it, and use a u32 array to
      store the checksum value.
      
      By this change, we don't use the while loop to get the checksums one by
      one. Now, we can get several checksum value at one time, it improved the
      performance by ~74% on my SSD (31MB/s -> 54MB/s).
      
      test command:
       # dd if=/dev/zero of=/mnt/btrfs/file0 bs=1M count=1024 oflag=sync
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      f51a4a18
  6. 01 7月, 2013 2 次提交
    • J
      Btrfs: fix not being able to find skinny extents during relocate · aee68ee5
      Josef Bacik 提交于
      We unconditionally search for the EXTENT_ITEM_KEY for metadata during balance,
      and then check the key that we found to see if it is actually a
      METADATA_ITEM_KEY, but this doesn't work right because METADATA is a higher key
      value, so if what we are looking for happens to be the first item in the leaf
      the search will dump us out at the previous leaf, and we won't find our item.
      So instead do what we do everywhere else, search for the skinny extent first and
      if we don't find it go back and re-search for the extent item.  This patch fixes
      the panic I was hitting when balancing a large file system with skinny extents.
      Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      aee68ee5
    • M
      Btrfs: fix broken nocow after balance · 5bc7247a
      Miao Xie 提交于
      Balance will create reloc_root for each fs root, and it's going to
      record last_snapshot to filter shared blocks.  The side effect of
      setting last_snapshot is to break nocow attributes of files.
      
      Since the extents are not shared by the relocation tree after the balance,
      we can recover the old last_snapshot safely if no one snapshoted the
      source tree. We fix the above problem by this way.
      Reported-by: NKyle Gates <kylegates@hotmail.com>
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      5bc7247a
  7. 14 6月, 2013 3 次提交
  8. 09 6月, 2013 1 次提交
  9. 18 5月, 2013 2 次提交
  10. 07 5月, 2013 6 次提交
    • D
      34c2b290
    • E
      btrfs: make static code static & remove dead code · 48a3b636
      Eric Sandeen 提交于
      Big patch, but all it does is add statics to functions which
      are in fact static, then remove the associated dead-code fallout.
      
      removed functions:
      
      btrfs_iref_to_path()
      __btrfs_lookup_delayed_deletion_item()
      __btrfs_search_delayed_insertion_item()
      __btrfs_search_delayed_deletion_item()
      find_eb_for_page()
      btrfs_find_block_group()
      range_straddles_pages()
      extent_range_uptodate()
      btrfs_file_extent_length()
      btrfs_scrub_cancel_devid()
      btrfs_start_transaction_lflush()
      
      btrfs_print_tree() is left because it is used for debugging.
      btrfs_start_transaction_lflush() and btrfs_reada_detach() are
      left for symmetry.
      
      ulist.c functions are left, another patch will take care of those.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      48a3b636
    • J
      Btrfs: fix all callers of read_tree_block · 416bc658
      Josef Bacik 提交于
      We kept leaking extent buffers when mounting a broken file system and it turns
      out it's because not everybody uses read_tree_block properly.  You need to check
      and make sure the extent_buffer is uptodate before you use it.  This patch fixes
      everybody who calls read_tree_block directly to make sure they check that it is
      uptodate and free it and return an error if it is not.  With this we no longer
      leak EB's when things go horribly wrong.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      416bc658
    • J
      Btrfs: fix bad extent logging · 09a2a8f9
      Josef Bacik 提交于
      A user sent me a btrfs-image of a file system that was panicing on mount during
      the log recovery.  I had originally thought these problems were from a bug in
      the free space cache code, but that was just a symptom of the problem.  The
      problem is if your application does something like this
      
      [prealloc][prealloc][prealloc]
      
      the internal extent maps will merge those all together into one extent map, even
      though on disk they are 3 separate extents.  So if you go to write into one of
      these ranges the extent map will be right since we use the physical extent when
      doing the write, but when we log the extents they will use the wrong sizes for
      the remainder prealloc space.  If this doesn't happen to trip up the free space
      cache (which it won't in a lot of cases) then you will get bogus entries in your
      extent tree which will screw stuff up later.  The data and such will still work,
      but everything else is broken.  This patch fixes this by not allowing extents
      that are on the modified list to be merged.  This has the side effect that we
      are no longer adding everything to the modified list all the time, which means
      we now have to call btrfs_drop_extents every time we log an extent into the
      tree.  So this allows me to drop all this speciality code I was using to get
      around calling btrfs_drop_extents.  With this patch the testcase I've created no
      longer creates a bogus file system after replaying the log.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      09a2a8f9
    • D
      btrfs: clean snapshots one by one · 9d1a2a3a
      David Sterba 提交于
      Each time pick one dead root from the list and let the caller know if
      it's needed to continue. This should improve responsiveness during
      umount and balance which at some point waits for cleaning all currently
      queued dead roots.
      
      A new dead root is added to the end of the list, so the snapshots
      disappear in the order of deletion.
      
      The snapshot cleaning work is now done only from the cleaner thread and the
      others wake it if needed.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      9d1a2a3a
    • J
      Btrfs: add a incompatible format change for smaller metadata extent refs · 3173a18f
      Josef Bacik 提交于
      We currently store the first key of the tree block inside the reference for the
      tree block in the extent tree.  This takes up quite a bit of space.  Make a new
      key type for metadata which holds the level as the offset and completely removes
      storing the btrfs_tree_block_info inside the extent ref.  This reduces the size
      from 51 bytes to 33 bytes per extent reference for each tree block.  In practice
      this results in a 30-35% decrease in the size of our extent tree, which means we
      COW less and can keep more of the extent tree in memory which makes our heavy
      metadata operations go much faster.  This is not an automatic format change, you
      must enable it at mkfs time or with btrfstune.  This patch deals with having
      metadata stored as either the old format or the new format so it is easy to
      convert.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      3173a18f
  11. 05 3月, 2013 5 次提交
  12. 20 2月, 2013 1 次提交
  13. 09 1月, 2013 1 次提交
  14. 13 12月, 2012 1 次提交
  15. 12 12月, 2012 1 次提交