1. 29 Jan, 2014 11 commits
    • Btrfs: fix extent state leak on transaction abortion · 1a4319cc
      Committed by Liu Bo
      When a transaction is aborted, we fail to commit it and do cleanup work
      instead.  Later, when we umount btrfs, we free each fs root's log tree,
      but that happens after we unpin extents, so the extents pinned by freeing
      the log trees remain in memory and leak.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: Add noinode_cache mount option · 3818aea2
      Committed by Qu Wenruo
      Add a noinode_cache mount option for btrfs.

      The inode map cache is used by all of the btrfs_find_free_ino/return_ino
      paths, so if we simply toggled the mount_opt bit, an inode number taken
      from the inode map cache might never be returned to it.

      To keep finding and returning inode numbers consistent, a new bit in
      mount_opt, CHANGE_INODE_CACHE, is introduced.  CHANGE_INODE_CACHE is
      set/cleared on remount, and the original INODE_MAP_CACHE bit is
      set/cleared according to CHANGE_INODE_CACHE after a successful
      transaction.
      Since finding and returning an inode number both happen between
      btrfs_start_transaction and btrfs_commit_transaction, this keeps the
      behavior consistent.

      Also, the noinode_cache mount option will not stop the caching_kthread.
      
      Cc: David Sterba <dsterba@suse.cz>
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
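The deferred toggle described in the noinode_cache commit can be sketched roughly as below. This is a user-space illustration, not the kernel code: the struct, the function names, and the bit values are all hypothetical, and only the two-bit idea (a "requested" bit copied into the "active" bit after a successful commit) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the kernel's real definitions may differ. */
#define OPT_INODE_MAP_CACHE    (1u << 0) /* state the allocator actually uses */
#define OPT_CHANGE_INODE_CACHE (1u << 1) /* state requested by the last (re)mount */

struct fs_state {
    uint32_t mount_opt;
};

/* Remount only records the request; the active bit is left untouched. */
static void remount_request(struct fs_state *fs, bool enable)
{
    assert(fs != NULL);
    if (enable)
        fs->mount_opt |= OPT_CHANGE_INODE_CACHE;
    else
        fs->mount_opt &= ~OPT_CHANGE_INODE_CACHE;
}

/* After a successful commit, make the active bit match the requested bit. */
static void commit_apply(struct fs_state *fs)
{
    assert(fs != NULL);
    if (fs->mount_opt & OPT_CHANGE_INODE_CACHE)
        fs->mount_opt |= OPT_INODE_MAP_CACHE;
    else
        fs->mount_opt &= ~OPT_INODE_MAP_CACHE;
}

/* What the find/return inode paths would consult inside a transaction. */
static bool inode_cache_active(const struct fs_state *fs)
{
    assert(fs != NULL);
    return (fs->mount_opt & OPT_INODE_MAP_CACHE) != 0;
}
```

With this split, a remount that arrives in the middle of a transaction does not change what `inode_cache_active()` reports until `commit_apply()` runs, so an inode number taken from the cache is returned under the same setting.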
    • Btrfs: throttle delayed refs better · 0a2b2a84
      Committed by Josef Bacik
      On one of our gluster clusters we noticed some pretty big lag spikes.  This
      turned out to be because our transaction commit was taking like 3 minutes to
      complete.  This is because we have like 30 gigs of metadata, so our global
      reserve would end up being the max which is like 512 mb.  So our throttling code
      would allow a ridiculous amount of delayed refs to build up and then they'd all
      get run at transaction commit time, and for a cold mounted file system that
      could take up to 3 minutes to run.  So fix the throttling to be based on both
      the size of the global reserve and how long it takes us to run delayed refs.
      This patch tracks the time it takes to run delayed refs and then only allows 1
      second's worth of outstanding delayed refs at a time.  This way it will auto-tune
      itself from cold cache up to when everything is in memory and it no longer has
      to go to disk.  This makes our transaction commits take much less time to run.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
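The time-based throttle described in this commit can be illustrated with a small sketch. Everything here is an assumption made for illustration: the struct, the function names, and in particular the 3/4 to 1/4 smoothing of the average are not taken from the commit message, which only states that runtime is tracked and that about one second's worth of delayed refs may be outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct ref_stats {
    uint64_t avg_runtime_ns; /* smoothed cost of running one delayed ref */
    uint64_t pending;        /* delayed refs currently queued */
};

/* Fold one measured batch into the running average.
 * The 3:1 weighting is an illustrative choice. */
static void record_batch(struct ref_stats *s, uint64_t batch_ns, uint64_t refs_run)
{
    assert(s != NULL);
    if (refs_run == 0)
        return;
    uint64_t per_ref = batch_ns / refs_run;
    s->avg_runtime_ns = (s->avg_runtime_ns * 3 + per_ref) / 4;
}

/* Throttle once the queued work is estimated to take a second or more.
 * Overflow of the product is ignored in this sketch. */
static bool should_throttle(const struct ref_stats *s)
{
    assert(s != NULL);
    return s->pending * s->avg_runtime_ns >= NSEC_PER_SEC;
}
```

Because the limit is expressed in time rather than in a fixed count, the allowed backlog grows as the measured per-ref cost falls, which matches the auto-tuning from cold cache to warm cache that the message describes.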
    • Btrfs: attach delayed ref updates to delayed ref heads · d7df2c79
      Committed by Josef Bacik
      Currently we have two rb-trees, one for delayed ref heads and one for all of the
      delayed refs, including the delayed ref heads.  When we process the delayed refs
      we have to hold onto the delayed ref lock for all of the selecting and merging
      and such, which results in quite a bit of lock contention.  This was solved by
      having a waitqueue and only one flusher at a time, however this hurts if we get
      a lot of delayed refs queued up.
      
      So instead just have an rb tree for the delayed ref heads, and then attach the
      delayed ref updates to an rb tree that is per delayed ref head.  Then we only
      need to take the delayed ref lock when adding new delayed refs and when
      selecting a delayed ref head to process, all the rest of the time we deal with a
      per delayed ref head lock which will be much less contentious.
      
      The locking rules for this get a little more complicated since we have to lock
      up to 3 things to properly process delayed refs, but I will address that problem
      later.  For now this passes all of xfstests and my overnight stress tests.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
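The two-level layout from this commit (one index of heads, with each head owning its own updates) can be sketched as follows. This is a simplified user-space model under stated assumptions: sorted linked lists stand in for the rb-trees, locking is only described in comments, and every type and function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One queued reference change against an extent. */
struct ref_update {
    uint64_t seq;
    int delta;               /* +1 adds a reference, -1 drops one */
    struct ref_update *next;
};

/* One head per extent. In the design described above, each head would
 * carry its own lock guarding `updates`; that lock is omitted here. */
struct ref_head {
    uint64_t bytenr;
    struct ref_update *updates;
    struct ref_head *next;   /* stand-in for the heads tree, sorted by bytenr */
};

/* Only this top-level index would need the global delayed ref lock. */
struct ref_root {
    struct ref_head *heads;
};

static void insert_head(struct ref_root *root, struct ref_head *head)
{
    assert(root != NULL && head != NULL);
    struct ref_head **p = &root->heads;
    while (*p && (*p)->bytenr < head->bytenr)
        p = &(*p)->next;
    head->next = *p;
    *p = head;
}

/* Selecting a head walks heads only, never individual updates. */
static struct ref_head *find_head(struct ref_root *root, uint64_t bytenr)
{
    assert(root != NULL);
    for (struct ref_head *h = root->heads; h && h->bytenr <= bytenr; h = h->next)
        if (h->bytenr == bytenr)
            return h;
    return NULL;
}

static void add_update(struct ref_head *head, struct ref_update *u)
{
    assert(head != NULL && u != NULL);
    u->next = head->updates;
    head->updates = u;
}

/* Processing a head touches only that head's own updates. */
static int net_delta(const struct ref_head *head)
{
    assert(head != NULL);
    int sum = 0;
    for (const struct ref_update *u = head->updates; u; u = u->next)
        sum += u->delta;
    return sum;
}
```

The point of the split is visible in `net_delta()`: once a head has been selected, the work on its updates needs no access to the shared index, so it can proceed under a per-head lock.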
    • Btrfs: only fua the first superblock when writting supers · e8117c26
      Committed by Wang Shilong
      According to the comments, we only intend to FUA the first superblock on
      every device; fix the code to match.
      Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
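The intent of this fix, forcing unit access only for the first superblock copy on each device, might look like the sketch below. The flag names and values and the helper function are hypothetical; they only illustrate choosing write flags per copy and are not the kernel's block-layer API.

```c
#include <assert.h>

/* Hypothetical write flags for illustration only. */
#define WRITE_FLAG_SYNC 0x1u
#define WRITE_FLAG_FUA  0x2u

/* Pick the flags for superblock copy number `mirror` on one device:
 * every copy is written synchronously, but only copy 0 asks for FUA. */
static unsigned int super_write_flags(int mirror)
{
    assert(mirror >= 0);
    unsigned int flags = WRITE_FLAG_SYNC;
    if (mirror == 0)
        flags |= WRITE_FLAG_FUA;
    return flags;
}
```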
    • Btrfs: convert printk to btrfs_ and fix BTRFS prefix · efe120a0
      Committed by Frank Holton
      Convert all applicable cases of printk and pr_* to the btrfs_* macros.
      
      Fix all uses of the BTRFS prefix.
      Signed-off-by: Frank Holton <fholton@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: move the extent buffer radix tree into the fs_info · f28491e0
      Committed by Josef Bacik
      I need to create a fake tree to test qgroups and I don't want to have to setup a
      fake btree_inode.  The fact is we only use the radix tree for the fs_info, so
      everybody else who allocates an extent_io_tree is just wasting the space anyway.
      This patch moves the radix tree and its lock into btrfs_fs_info so there is less
      stuff I have to fake to do qgroup sanity tests.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: expand btrfs_find_item() to include find_orphan_item functionality · 3f870c28
      Committed by Kelley Nielsen
      This is the third step in bootstrapping the btrfs_find_item interface.
      The function find_orphan_item(), in orphan.c, is similar to the two
      functions already replaced by the new interface. It uses two parameters,
      which are already present in the interface, and is nearly identical to
      the function brought in in the previous patch.
      
      Replace the two calls to find_orphan_item() with calls to
      btrfs_find_item(), with the defined objectid and type that was used
      internally by find_orphan_item(), a null path, and a null key. Add a
      test for a null path to btrfs_find_item, and if it passes, allocate and
      free the path. Finally, remove find_orphan_item().
      Signed-off-by: Kelley Nielsen <kelleynnn@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <clm@fb.com>
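The "null path" convention this commit adds, where the function allocates and frees a path itself when the caller passes none, can be sketched in user-space C as below. The types, the array lookup standing in for a tree search, and the return convention (0 found, 1 not found, negative on error) are assumptions for illustration, not the signature of btrfs_find_item().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a btree path. */
struct path {
    int slot;
};

static int live_paths; /* outstanding allocations, tracked for demonstration */

static struct path *path_alloc(void)
{
    struct path *p = calloc(1, sizeof(*p));
    if (p)
        live_paths++;
    return p;
}

static void path_free(struct path *p)
{
    if (p) {
        free(p);
        live_paths--;
    }
}

/* Search `keys` for `wanted`. If the caller passes a path, the result slot
 * is left in it; if the caller passes NULL, a temporary path is allocated
 * here and freed before returning. */
static int find_item(struct path *caller_path, const int *keys, size_t n, int wanted)
{
    assert(keys != NULL);
    struct path *path = caller_path;
    int ret = 1; /* not found */

    if (!path) {
        path = path_alloc();
        if (!path)
            return -1;
    }
    for (size_t i = 0; i < n; i++) {
        if (keys[i] == wanted) {
            path->slot = (int)i;
            ret = 0;
            break;
        }
    }
    if (!caller_path)
        path_free(path); /* only free what this function allocated */
    return ret;
}
```

Callers that only need a yes/no answer, as the two former find_orphan_item() call sites do according to the message, can pass NULL and skip path management entirely.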
    • btrfs: remove unused variables from disk-io.c · 71db2a77
      Committed by Valentina Giusti
      Remove unused variables:
      * tree from csum_dirty_buffer,
      * tree from btree_readpage_end_io_hook,
      * tree from btree_writepages,
      * bytenr from btrfs_create_tree,
      * fs_info from end_workqueue_fn.
      Signed-off-by: Valentina Giusti <valentina.giusti@microon.de>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: publish per-super attributes in sysfs · 5ac1d209
      Committed by Jeff Mahoney
      This patch adds per-super attributes to sysfs.
      
      It doesn't publish any attributes yet, but does the proper lifetime
      handling as well as the basic infrastructure to add new attributes.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: introduce a head ref rbtree · c46effa6
      Committed by Liu Bo
      We process delayed refs as follows:
      1) get a bunch of head refs,
      2) pick one head ref,
      3) go one node back for any delayed ref updates.

      Head refs are linked in the same rbtree as the delayed refs, so in
      stage 1) we have to walk the nodes one by one, visiting not only head refs
      but also delayed refs.

      When a great number of delayed refs are pending, this costs a lot of time.

      Here we introduce an rbtree specific to head refs.  It contains only head
      refs, so the problem goes away.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <clm@fb.com>
  2. 21 Nov, 2013 1 commit
    • Btrfs: avoid heavy operations in btrfs_commit_super · d52c1bcc
      Committed by Liu Bo
      The 'git blame' history shows that the old transaction commit code had to
      commit twice to ensure roots were updated, and we had to flush metadata and
      the super block manually.  Now all of this is handled inside the
      transaction commit code without extra effort.

      The error handling stays the same as in the current code: return to the
      caller as soon as we get an error.

      This saves us a transaction commit and a flush of the super block, both of
      which are heavy operations according to ftrace output analysis.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
  3. 12 Nov, 2013 16 commits
  4. 11 Oct, 2013 1 commit
    • Btrfs: fix oops caused by the space balance and dead roots · c00869f1
      Committed by Miao Xie
      When doing space balance and subvolume destroy at the same time, we met
      the following oops:
      
      kernel BUG at fs/btrfs/relocation.c:2247!
      RIP: 0010: [<ffffffffa04cec16>] prepare_to_merge+0x154/0x1f0 [btrfs]
      Call Trace:
       [<ffffffffa04b5ab7>] relocate_block_group+0x466/0x4e6 [btrfs]
       [<ffffffffa04b5c7a>] btrfs_relocate_block_group+0x143/0x275 [btrfs]
       [<ffffffffa0495c56>] btrfs_relocate_chunk.isra.27+0x5c/0x5a2 [btrfs]
       [<ffffffffa0459871>] ? btrfs_item_key_to_cpu+0x15/0x31 [btrfs]
       [<ffffffffa048b46a>] ? btrfs_get_token_64+0x7e/0xcd [btrfs]
       [<ffffffffa04a3467>] ? btrfs_tree_read_unlock_blocking+0xb2/0xb7 [btrfs]
       [<ffffffffa049907d>] btrfs_balance+0x9c7/0xb6f [btrfs]
       [<ffffffffa049ef84>] btrfs_ioctl_balance+0x234/0x2ac [btrfs]
       [<ffffffffa04a1e8e>] btrfs_ioctl+0xd87/0x1ef9 [btrfs]
       [<ffffffff81122f53>] ? path_openat+0x234/0x4db
       [<ffffffff813c3b78>] ? __do_page_fault+0x31d/0x391
       [<ffffffff810f8ab6>] ? vma_link+0x74/0x94
       [<ffffffff811250f5>] vfs_ioctl+0x1d/0x39
       [<ffffffff811258c8>] do_vfs_ioctl+0x32d/0x3e2
       [<ffffffff811259d4>] SyS_ioctl+0x57/0x83
       [<ffffffff813c3bfa>] ? do_page_fault+0xe/0x10
       [<ffffffff813c73c2>] system_call_fastpath+0x16/0x1b
      
      This happened because we returned an error if the root's reference count
      was 0 during space relocation.  That was wrong: although the root was dead
      (refs == 0), the space it held still needed to be relocated, or we could
      not remove the block group.  So in this case we should return the root
      whether it is dead or not.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
  5. 21 Sep, 2013 2 commits
  6. 01 Sep, 2013 9 commits