1. 29 1月, 2014 5 次提交
    • J
      Btrfs: fix extent_from_logical to deal with skinny metadata · 580f0a67
      Josef Bacik 提交于
      I don't think this is an issue and I've not seen it in practice but
      extent_from_logical will fail to find a skinny extent because it uses
      btrfs_previous_item and gives it the normal extent item type.  This is just not
      a place to use btrfs_previous_item since we care about either normal extents or
      skinny extents, so open code btrfs_previous_item to properly check.  This would
      only affect metadata and the only place this is used for metadata is scrub and
      I'm pretty sure it's just for printing stuff out, not actually doing any work so
      hopefully it was never a problem other than a cosmetic one.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      580f0a67
    • J
      Btrfs: attach delayed ref updates to delayed ref heads · d7df2c79
      Josef Bacik 提交于
      Currently we have two rb-trees, one for delayed ref heads and one for all of the
      delayed refs, including the delayed ref heads.  When we process the delayed refs
      we have to hold onto the delayed ref lock for all of the selecting and merging
      and such, which results in quite a bit of lock contention.  This was solved by
      having a waitqueue and only one flusher at a time, however this hurts if we get
      a lot of delayed refs queued up.
      
      So instead just have an rb tree for the delayed ref heads, and then attach the
      delayed ref updates to an rb tree that is per delayed ref head.  Then we only
      need to take the delayed ref lock when adding new delayed refs and when
      selecting a delayed ref head to process, all the rest of the time we deal with a
      per delayed ref head lock which will be much less contentious.
      
      The locking rules for this get a little more complicated since we have to lock
      up to 3 things to properly process delayed refs, but I will address that problem
      later.  For now this passes all of xfstests and my overnight stress tests.
      Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      d7df2c79
    • F
      Btrfs: fix deadlock when iterating inode refs and running delayed inodes · 3fe81ce2
      Filipe David Borba Manana 提交于
      While running btrfs/004 from xfstests, after 503 iterations, dmesg reported
      a deadlock between tasks iterating inode refs and tasks running delayed inodes
      (during a transaction commit).
      
      It turns out that iterating inode refs implies doing one tree search and
      release all nodes in the path except the leaf node, and then passing that
      leaf node to btrfs_ref_to_path(), which in turn does another tree search
      without releasing the lock on the leaf node it received as parameter.
      
      This is a problem when other task wants to write to the btree as well and
      ends up updating the leaf that is read locked - the writer task locks the
      parent of the leaf and then blocks waiting for the leaf's lock to be
      released - at the same time, the task executing btrfs_ref_to_path()
      does a second tree search, without releasing the lock on the first leaf,
      and wants to access a leaf (the same or another one) that is a child of
      the same parent, resulting in a deadlock.
      
      The trace reported by lockdep follows.
      
      [84314.936373] INFO: task fsstress:11930 blocked for more than 120 seconds.
      [84314.936381]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [84314.936383] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [84314.936386] fsstress        D ffff8806e1bf8000     0 11930  11926 0x00000000
      [84314.936393]  ffff8804d6d89b78 0000000000000046 ffff8804d6d89b18 ffffffff810bd8bd
      [84314.936399]  ffff8806e1bf8000 ffff8804d6d89fd8 ffff8804d6d89fd8 ffff8804d6d89fd8
      [84314.936405]  ffff880806308000 ffff8806e1bf8000 ffff8804d6d89c08 ffff8804deb8f190
      [84314.936410] Call Trace:
      [84314.936421]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
      [84314.936428]  [<ffffffff81774269>] schedule+0x29/0x70
      [84314.936451]  [<ffffffffa0715bf5>] btrfs_tree_lock+0x75/0x270 [btrfs]
      [84314.936457]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
      [84314.936470]  [<ffffffffa06ba231>] btrfs_search_slot+0x7f1/0x930 [btrfs]
      [84314.936489]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
      [84314.936504]  [<ffffffffa06d2e1f>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
      [84314.936510]  [<ffffffff810bd6ef>] ? trace_hardirqs_on_caller+0x1f/0x1e0
      [84314.936528]  [<ffffffffa073173c>] __btrfs_update_delayed_inode+0x4c/0x1d0 [btrfs]
      [84314.936543]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
      [84314.936558]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
      [84314.936573]  [<ffffffffa0731c82>] __btrfs_run_delayed_items+0x192/0x1e0 [btrfs]
      [84314.936589]  [<ffffffffa0731d03>] btrfs_run_delayed_items+0x13/0x20 [btrfs]
      [84314.936604]  [<ffffffffa06dbcd4>] btrfs_flush_all_pending_stuffs+0x24/0x80 [btrfs]
      [84314.936620]  [<ffffffffa06ddc13>] btrfs_commit_transaction+0x223/0xa20 [btrfs]
      [84314.936630]  [<ffffffffa06ae5ae>] btrfs_sync_fs+0x6e/0x110 [btrfs]
      [84314.936635]  [<ffffffff811d0b50>] ? __sync_filesystem+0x60/0x60
      [84314.936639]  [<ffffffff811d0b50>] ? __sync_filesystem+0x60/0x60
      [84314.936643]  [<ffffffff811d0b70>] sync_fs_one_sb+0x20/0x30
      [84314.936648]  [<ffffffff811a3541>] iterate_supers+0xf1/0x100
      [84314.936652]  [<ffffffff811d0c45>] sys_sync+0x55/0x90
      [84314.936658]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [84314.936660] INFO: lockdep is turned off.
      [84314.936663] INFO: task btrfs:11955 blocked for more than 120 seconds.
      [84314.936666]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [84314.936668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [84314.936670] btrfs           D ffff880541729a88     0 11955  11608 0x00000000
      [84314.936674]  ffff880541729a38 0000000000000046 ffff8805417299d8 ffffffff810bd8bd
      [84314.936680]  ffff88075430c8a0 ffff880541729fd8 ffff880541729fd8 ffff880541729fd8
      [84314.936685]  ffffffff81c104e0 ffff88075430c8a0 ffff8804de8b00b8 ffff8804de8b0000
      [84314.936690] Call Trace:
      [84314.936695]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
      [84314.936700]  [<ffffffff81774269>] schedule+0x29/0x70
      [84314.936717]  [<ffffffffa0715815>] btrfs_tree_read_lock+0xd5/0x140 [btrfs]
      [84314.936721]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
      [84314.936733]  [<ffffffffa06ba201>] btrfs_search_slot+0x7c1/0x930 [btrfs]
      [84314.936746]  [<ffffffffa06bd505>] btrfs_find_item+0x55/0x160 [btrfs]
      [84314.936763]  [<ffffffffa06ff689>] ? free_extent_buffer+0x49/0xc0 [btrfs]
      [84314.936780]  [<ffffffffa073c9ca>] btrfs_ref_to_path+0xba/0x1e0 [btrfs]
      [84314.936797]  [<ffffffffa06f9719>] ? release_extent_buffer+0xb9/0xe0 [btrfs]
      [84314.936813]  [<ffffffffa06ff689>] ? free_extent_buffer+0x49/0xc0 [btrfs]
      [84314.936830]  [<ffffffffa073cb50>] inode_to_path+0x60/0xd0 [btrfs]
      [84314.936846]  [<ffffffffa073d365>] paths_from_inode+0x115/0x3c0 [btrfs]
      [84314.936851]  [<ffffffff8118dd44>] ? kmem_cache_alloc_trace+0x114/0x200
      [84314.936868]  [<ffffffffa0714494>] btrfs_ioctl+0xf14/0x2030 [btrfs]
      [84314.936873]  [<ffffffff817762db>] ? _raw_spin_unlock+0x2b/0x50
      [84314.936877]  [<ffffffff8116598f>] ? handle_mm_fault+0x34f/0xb00
      [84314.936882]  [<ffffffff81075563>] ? up_read+0x23/0x40
      [84314.936886]  [<ffffffff8177a41c>] ? __do_page_fault+0x20c/0x5a0
      [84314.936892]  [<ffffffff811b2946>] do_vfs_ioctl+0x96/0x570
      [84314.936896]  [<ffffffff81776e23>] ? error_sti+0x5/0x6
      [84314.936901]  [<ffffffff810b71e8>] ? trace_hardirqs_off_caller+0x28/0xd0
      [84314.936906]  [<ffffffff81776a09>] ? retint_swapgs+0xe/0x13
      [84314.936910]  [<ffffffff811b2eb1>] SyS_ioctl+0x91/0xb0
      [84314.936915]  [<ffffffff813eecde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [84314.936920]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [84314.936922] INFO: lockdep is turned off.
      [84434.866873] INFO: task btrfs-transacti:11921 blocked for more than 120 seconds.
      [84434.866881]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [84434.866883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [84434.866886] btrfs-transacti D ffff880755b6a478     0 11921      2 0x00000000
      [84434.866893]  ffff8800735b9ce8 0000000000000046 ffff8800735b9c88 ffffffff810bd8bd
      [84434.866899]  ffff8805a1b848a0 ffff8800735b9fd8 ffff8800735b9fd8 ffff8800735b9fd8
      [84434.866904]  ffffffff81c104e0 ffff8805a1b848a0 ffff880755b6a478 ffff8804cece78f0
      [84434.866910] Call Trace:
      [84434.866920]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
      [84434.866927]  [<ffffffff81774269>] schedule+0x29/0x70
      [84434.866948]  [<ffffffffa06dd2ef>] wait_current_trans.isra.33+0xbf/0x120 [btrfs]
      [84434.866954]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
      [84434.866970]  [<ffffffffa06dec18>] start_transaction+0x388/0x5a0 [btrfs]
      [84434.866985]  [<ffffffffa06db9b5>] ? transaction_kthread+0xb5/0x280 [btrfs]
      [84434.866999]  [<ffffffffa06dee97>] btrfs_attach_transaction+0x17/0x20 [btrfs]
      [84434.867012]  [<ffffffffa06dba9e>] transaction_kthread+0x19e/0x280 [btrfs]
      [84434.867026]  [<ffffffffa06db900>] ? open_ctree+0x2260/0x2260 [btrfs]
      [84434.867030]  [<ffffffff81070dad>] kthread+0xed/0x100
      [84434.867035]  [<ffffffff81070cc0>] ? flush_kthread_worker+0x190/0x190
      [84434.867040]  [<ffffffff8177ee6c>] ret_from_fork+0x7c/0xb0
      [84434.867044]  [<ffffffff81070cc0>] ? flush_kthread_worker+0x190/0x190
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      3fe81ce2
    • K
      btrfs: bootstrap generic btrfs_find_item interface · e33d5c3d
      Kelley Nielsen 提交于
      There are many btrfs functions that manually search the tree for an
      item. They all reimplement the same mechanism and differ in the
      conditions that they use to find the item. __inode_info() is one such
      example. Zach Brown proposed creating a new interface to take the place
      of these functions.
      
      This patch is the first step to creating the interface. A new function,
      btrfs_find_item, has been added to ctree.c and prototyped in ctree.h.
      It is identical to __inode_info, except that the order of the parameters
      has been rearranged to more closely those of similar functions elsewhere
      in the code (now, root and path come first, then the objectid, offset
      and type, and the key to be filled in last). __inode_info's callers have
      been set to call this new function instead, and __inode_info itself has
      been removed.
      Signed-off-by: NKelley Nielsen <kelleynnn@gmail.com>
      Suggested-by: NZach Brown <zab@redhat.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e33d5c3d
    • V
  2. 12 11月, 2013 3 次提交
    • D
      btrfs: Use WARN_ON()'s return value in place of WARN_ON(1) · fae7f21c
      Dulshani Gunawardhana 提交于
      Use WARN_ON()'s return value in place of WARN_ON(1) for cleaner source
      code that outputs a more descriptive warnings. Also fix the styling
      warning of redundant braces that came up as a result of this fix.
      Signed-off-by: NDulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
      Reviewed-by: NZach Brown <zab@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      fae7f21c
    • L
      Btrfs: fix a crash when running balance and defrag concurrently · 48ec4736
      Liu Bo 提交于
      Running balance and defrag concurrently can end up with a crash:
      
      kernel BUG at fs/btrfs/relocation.c:4528!
      RIP: 0010:[<ffffffffa01ac33b>]  [<ffffffffa01ac33b>] btrfs_reloc_cow_block+ 0x1eb/0x230 [btrfs]
      Call Trace:
        [<ffffffffa01398c1>] ? update_ref_for_cow+0x241/0x380 [btrfs]
        [<ffffffffa0180bad>] ? copy_extent_buffer+0xad/0x110 [btrfs]
        [<ffffffffa0139da1>] __btrfs_cow_block+0x3a1/0x520 [btrfs]
        [<ffffffffa013a0b6>] btrfs_cow_block+0x116/0x1b0 [btrfs]
        [<ffffffffa013ddad>] btrfs_search_slot+0x43d/0x970 [btrfs]
        [<ffffffffa0153c57>] btrfs_lookup_file_extent+0x37/0x40 [btrfs]
        [<ffffffffa0172a5e>] __btrfs_drop_extents+0x11e/0xae0 [btrfs]
        [<ffffffffa013b3fd>] ? generic_bin_search.constprop.39+0x8d/0x1a0 [btrfs]
        [<ffffffff8117d14a>] ? kmem_cache_alloc+0x1da/0x200
        [<ffffffffa0138e7a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
        [<ffffffffa0173ef0>] btrfs_drop_extents+0x60/0x90 [btrfs]
        [<ffffffffa016b24d>] relink_extent_backref+0x2ed/0x780 [btrfs]
        [<ffffffffa0162fe0>] ? btrfs_submit_bio_hook+0x1e0/0x1e0 [btrfs]
        [<ffffffffa01b8ed7>] ? iterate_inodes_from_logical+0x87/0xa0 [btrfs]
        [<ffffffffa016b909>] btrfs_finish_ordered_io+0x229/0xac0 [btrfs]
        [<ffffffffa016c3b5>] finish_ordered_fn+0x15/0x20 [btrfs]
        [<ffffffffa018cbe5>] worker_loop+0x125/0x4e0 [btrfs]
        [<ffffffffa018cac0>] ? btrfs_queue_worker+0x300/0x300 [btrfs]
        [<ffffffff81075ea0>] kthread+0xc0/0xd0
        [<ffffffff81075de0>] ? insert_kthread_work+0x40/0x40
        [<ffffffff8164796c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81075de0>] ? insert_kthread_work+0x40/0x40
      ----------------------------------------------------------------------
      
      It turns out to be that balance operation will bump root's @last_snapshot,
      which enables snapshot-aware defrag path, and backref walking stuff will
      find data reloc tree as refs' parent, and hit the BUG_ON() during COW.
      
      As data reloc tree's data is just for relocation purpose, and will be deleted right
      after relocation is done, it's unnecessary to walk those refs belonged to data reloc
      tree, it'd be better to skip them.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      48ec4736
    • R
      btrfs: drop unused parameter from btrfs_item_nr · dd3cc16b
      Ross Kirk 提交于
      Remove unused eb parameter from btrfs_item_nr
      Signed-off-by: NRoss Kirk <ross.kirk@gmail.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      dd3cc16b
  3. 01 9月, 2013 6 次提交
  4. 10 8月, 2013 2 次提交
    • J
      Btrfs: make sure the backref walker catches all refs to our extent · ed8c4913
      Josef Bacik 提交于
      Because we don't mess with the offset into the extent for compressed we will
      properly find both extents for this case
      
      [extent a][extent b][rest of extent a]
      
      but because we already added a ref for the front half we won't add the inode
      information for the second half.  This causes us to leak that memory and not
      print out the other offset when we do logical-resolve.  So fix this by calling
      ulist_add_merge and then add our eie to the existing entry if there is one.
      With this patch we get both offsets out of logical-resolve.  With this and the
      other 2 patches I've sent we now pass btrfs/276 on my vm with compress-force=lzo
      set.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      ed8c4913
    • J
      Btrfs: fix backref walking when we hit a compressed extent · 8ca15e05
      Josef Bacik 提交于
      If you do btrfs inspect-internal logical-resolve on a compressed extent that has
      been partly overwritten it won't find anything.  This is because we try and
      match the extent offset we've searched for based on the extent offset in the
      data extent entry.  However this doesn't work for compressed extents because the
      offsets are for the uncompressed size, not the compressed size.  So instead only
      do this check if we are not compressed, that way we can get an actual entry for
      the physical offset rather than nothing for compressed.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      8ca15e05
  5. 02 7月, 2013 1 次提交
  6. 01 7月, 2013 1 次提交
    • J
      Btrfs: cleanup backref search commit root flag stuff · da61d31a
      Josef Bacik 提交于
      Looking into this backref problem I noticed we're using a macro to what turns
      out to essentially be a NULL check to see if we need to search the commit root.
      I'm killing this, let's just do what everybody else does and checks if trans ==
      NULL.  I've also made it so we pass in the path to __resolve_indirect_refs which
      will have the search_commit_root flag set properly already and that way we can
      avoid allocating another path when we have a perfectly good one to use.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      da61d31a
  7. 18 5月, 2013 1 次提交
  8. 07 5月, 2013 8 次提交
  9. 27 2月, 2013 1 次提交
  10. 13 12月, 2012 2 次提交
  11. 26 10月, 2012 2 次提交
  12. 24 10月, 2012 1 次提交
  13. 09 10月, 2012 2 次提交
    • M
      btrfs: extended inode ref iteration · d24bec3a
      Mark Fasheh 提交于
      The iterate_irefs in backref.c is used to build path components from inode
      refs. This patch adds code to iterate extended refs as well.
      
      I had modify the callback function signature to abstract out some of the
      differences between ref structures. iref_to_path() also needed similar
      changes.
      Signed-off-by: NMark Fasheh <mfasheh@suse.de>
      d24bec3a
    • M
      btrfs: extended inode refs · f186373f
      Mark Fasheh 提交于
      This patch adds basic support for extended inode refs. This includes support
      for link and unlink of the refs, which basically gets us support for rename
      as well.
      
      Inode creation does not need changing - extended refs are only added after
      the ref array is full.
      Signed-off-by: NMark Fasheh <mfasheh@suse.de>
      f186373f
  14. 02 10月, 2012 4 次提交
  15. 29 8月, 2012 1 次提交