1. 09 8月, 2015 2 次提交
    • L
      Btrfs: fix warning in backref walking · acdf898d
      Liu Bo 提交于
      When we do backref walking, we search firstly in queued delayed refs
      and then the on-disk backrefs, but we parse differently for shared
      references, for delayed refs we also add 'ref->root' while for on-disk
      backrefs we don't, this can prevent us from merging refs indexed
      by the same bytenr and cause find_parent_nodes() to throw a warning at
      'WARN_ON(ref->count < 0)', for example, when we have a shared data extent
      with 'ref_cnt=1' and a delayed shared data with a BTRFS_DROP_DELAYED_REF,
      that happens.
      
      For shared references, no matter if it's delayed or on-disk, ref->root is
      not at all used, instead it's ref->parent that really matters, so this has
      delayed refs handled as the same way as on-disk refs.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      acdf898d
    • F
      Btrfs: teach backref walking about backrefs with underflowed offset values · d6589101
      Filipe Manana 提交于
      When cloning/deduplicating file extents (through the clone and extent_same
      ioctls) we can get data back references with offset values that are a
      result of an unsigned integer arithmetic underflow, that is, values that
      are much larger then they could be otherwise.
      
      This is not a problem when decrementing or dropping the back references
      (happens when we overwrite the extents or punch a hole for example, through
      __btrfs_drop_extents()), since we compute the same too large offset value,
      but it is a problem for the backref walking code, used by an incremental
      send and the ioctls that are used by the btrfs tool "inspect-internal"
      commands, as it makes it miss the corresponding file extent items because
      the search key is set for an extent item that starts at an offset matching
      the exceptionally large offset value of the data back reference. For an
      incremental send this causes the send ioctl to fail with -EIO.
      
      So teach the backref walking code to deal with these cases by setting the
      search key's offset to 0 if the backref's offset value is larger than
      LLONG_MAX (the largest possible file offset). This makes sure the backref
      walking code finds the corresponding file extent items at the expense of
      scanning more items and leafs in the btree.
      
      Fixing the clone/dedup ioctls to not produce such underflowed results would
      require major changes breaking backward compatibility, updating user space
      tools, etc.
      
      Simple reproducer case for fstests:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
      
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -fr $send_files_dir
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
        _need_to_be_root
      
        send_files_dir=$TEST_DIR/btrfs-test-$seq
      
        rm -f $seqres.full
        rm -fr $send_files_dir
        mkdir $send_files_dir
      
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        # Create our test file with a single extent of 64K starting at file
        # offset 128K.
        $XFS_IO_PROG -f -c "pwrite -S 0xaa 128K 64K" $SCRATCH_MNT/foo \
            | _filter_xfs_io
      
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
            $SCRATCH_MNT/mysnap1
      
        # Now clone parts of the original extent into lower offsets of the file.
        #
        # The first clone operation adds a file extent item to file offset 0
        # that points to our initial extent with a data offset of 16K. The
        # corresponding data back reference in the extent tree has an offset of
        # 18446744073709535232, which is the result of file_offset - data_offset
        # = 0 - 16K.
        #
        # The second clone operation adds a file extent item to file offset 16K
        # that points to our initial extent with a data offset of 48K. The
        # corresponding data back reference in the extent tree has an offset of
        # 18446744073709518848, which is the result of file_offset - data_offset
        # = 16K - 48K.
        #
        # Those large back reference offsets (result of unsigned arithmetic
        # underflow) confused the back reference walking code (used by an
        # incremental send and the multiple inspect-internal ioctls) and made it
        # miss the back references, which for the case of an incremental send it
        # made it fail with -EIO and print a message like the following to
        # dmesg:
        #
        # "BTRFS error (device sdc): did not find backref in send_root. \
        #  inode=257, offset=0, disk_byte=12845056 found extent=12845056"
        #
        $CLONER_PROG -s $(((128 + 16) * 1024)) -d 0 -l $((16 * 1024)) \
            $SCRATCH_MNT/foo $SCRATCH_MNT/foo
        $CLONER_PROG -s $(((128 + 48) * 1024)) -d $((16 * 1024)) \
            -l $((16 * 1024)) $SCRATCH_MNT/foo $SCRATCH_MNT/foo
      
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
            $SCRATCH_MNT/mysnap2
      
        _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap
        _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \
            -f $send_files_dir/2.snap
      
        echo "File digest in the original filesystem:"
        md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
      
        # Now recreate the filesystem by receiving both send streams and verify
        # we get the same file contents that the original filesystem had.
        _scratch_unmount
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap
        _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/2.snap
      
        echo "File digest in the new filesystem:"
        md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
      
        status=0
        exit
      
      The test's expected golden output is:
      
        wrote 65536/65536 bytes at offset 131072
        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        File digest in the original filesystem:
        6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
        File digest in the new filesystem:
        6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
      
      But it failed with:
      
          (...)
          @@ -1,7 +1,5 @@
           QA output created by 097
           wrote 65536/65536 bytes at offset 131072
           XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
          -File digest in the original filesystem:
          -6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
          -File digest in the new filesystem:
          -6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
          ...
      
        $ cat /home/fdmanana/git/hub/xfstests/results//btrfs/097.full
        (...)
        ERROR: send ioctl failed with -5: Input/output error
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      d6589101
  2. 11 6月, 2015 3 次提交
  3. 03 6月, 2015 1 次提交
  4. 20 5月, 2015 1 次提交
    • M
      btrfs: clear 'ret' in btrfs_check_shared() loop · 2c2ed5aa
      Mark Fasheh 提交于
      btrfs_check_shared() is leaking a return value of '1' from
      find_parent_nodes(). As a result, callers (in this case, extent_fiemap())
      are told extents are shared when they are not. This in turn broke fiemap on
      btrfs for kernels v3.18 and up.
      
      The fix is simple - we just have to clear 'ret' after we are done processing
      the results of find_parent_nodes().
      
      It wasn't clear to me at first what was happening with return values in
      btrfs_check_shared() and find_parent_nodes() - thanks to Josef for the help
      on irc. I added documentation to both functions to make things more clear
      for the next hacker who might come across them.
      
      If we could queue this up for -stable too that would be great.
      Signed-off-by: NMark Fasheh <mfasheh@suse.de>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      2c2ed5aa
  5. 04 3月, 2015 1 次提交
  6. 15 1月, 2015 2 次提交
  7. 03 1月, 2015 1 次提交
    • F
      Btrfs: correctly get tree level in tree_backref_for_extent · a1317f45
      Filipe Manana 提交于
      If we are using skinny metadata, the block's tree level is in the offset
      of the key and not in a btrfs_tree_block_info structure following the
      extent item (it doesn't exist). Therefore fix it.
      
      Besides returning the correct level in the tree, this also prevents reading
      past the leaf's end in the case where the extent item is the last item in
      the leaf (eb) and it has only 1 inline reference - this is because
      sizeof(struct btrfs_tree_block_info) is greater than
      sizeof(struct btrfs_extent_inline_ref).
      
      Got it while running a scrub which produced the following warning:
      
          BTRFS: checksum error at logical 42123264 on dev /dev/sde, sector 15840: metadata node (level 24) in tree 5
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NSatoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      a1317f45
  8. 02 10月, 2014 1 次提交
  9. 18 9月, 2014 3 次提交
  10. 15 8月, 2014 2 次提交
    • T
      Btrfs: Fix memory corruption by ulist_add_merge() on 32bit arch · 4eb1f66d
      Takashi Iwai 提交于
      We've got bug reports that btrfs crashes when quota is enabled on
      32bit kernel, typically with the Oops like below:
       BUG: unable to handle kernel NULL pointer dereference at 00000004
       IP: [<f9234590>] find_parent_nodes+0x360/0x1380 [btrfs]
       *pde = 00000000
       Oops: 0000 [#1] SMP
       CPU: 0 PID: 151 Comm: kworker/u8:2 Tainted: G S      W 3.15.2-1.gd43d97e-default #1
       Workqueue: btrfs-qgroup-rescan normal_work_helper [btrfs]
       task: f1478130 ti: f147c000 task.ti: f147c000
       EIP: 0060:[<f9234590>] EFLAGS: 00010213 CPU: 0
       EIP is at find_parent_nodes+0x360/0x1380 [btrfs]
       EAX: f147dda8 EBX: f147ddb0 ECX: 00000011 EDX: 00000000
       ESI: 00000000 EDI: f147dda4 EBP: f147ddf8 ESP: f147dd38
        DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
       CR0: 8005003b CR2: 00000004 CR3: 00bf3000 CR4: 00000690
       Stack:
        00000000 00000000 f147dda4 00000050 00000001 00000000 00000001 00000050
        00000001 00000000 d3059000 00000001 00000022 000000a8 00000000 00000000
        00000000 000000a1 00000000 00000000 00000001 00000000 00000000 11800000
       Call Trace:
        [<f923564d>] __btrfs_find_all_roots+0x9d/0xf0 [btrfs]
        [<f9237bb1>] btrfs_qgroup_rescan_worker+0x401/0x760 [btrfs]
        [<f9206148>] normal_work_helper+0xc8/0x270 [btrfs]
        [<c025e38b>] process_one_work+0x11b/0x390
        [<c025eea1>] worker_thread+0x101/0x340
        [<c026432b>] kthread+0x9b/0xb0
        [<c0712a71>] ret_from_kernel_thread+0x21/0x30
        [<c0264290>] kthread_create_on_node+0x110/0x110
      
      This indicates a NULL corruption in prefs_delayed list.  The further
      investigation and bisection pointed that the call of ulist_add_merge()
      results in the corruption.
      
      ulist_add_merge() takes u64 as aux and writes a 64bit value into
      old_aux.  The callers of this function in backref.c, however, pass a
      pointer of a pointer to old_aux.  That is, the function overwrites
      64bit value on 32bit pointer.  This caused a NULL in the adjacent
      variable, in this case, prefs_delayed.
      
      Here is a quick attempt to band-aid over this: a new function,
      ulist_add_merge_ptr() is introduced to pass/store properly a pointer
      value instead of u64.  There are still ugly void ** cast remaining
      in the callers because void ** cannot be taken implicitly.  But, it's
      safer than explicit cast to u64, anyway.
      
      Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=887046
      Cc: <stable@vger.kernel.org> [v3.11+]
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NChris Mason <clm@fb.com>
      4eb1f66d
    • F
      Btrfs: read lock extent buffer while walking backrefs · 6f7ff6d7
      Filipe Manana 提交于
      Before processing the extent buffer, acquire a read lock on it, so
      that we're safe against concurrent updates on the extent buffer.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      6f7ff6d7
  11. 10 6月, 2014 4 次提交
  12. 07 4月, 2014 1 次提交
    • J
      Btrfs: remove transaction from send · 9e351cc8
      Josef Bacik 提交于
      Lets try this again.  We can deadlock the box if we send on a box and try to
      write onto the same fs with the app that is trying to listen to the send pipe.
      This is because the writer could get stuck waiting for a transaction commit
      which is being blocked by the send.  So fix this by making sure looking at the
      commit roots is always going to be consistent.  We do this by keeping track of
      which roots need to have their commit roots swapped during commit, and then
      taking the commit_root_sem and swapping them all at once.  Then make sure we
      take a read lock on the commit_root_sem in cases where we search the commit root
      to make sure we're always looking at a consistent view of the commit roots.
      Previously we had problems with this because we would swap a fs tree commit root
      and then swap the extent tree commit root independently which would cause the
      backref walking code to screw up sometimes.  With this patch we no longer
      deadlock and pass all the weird send/receive corner cases.  Thanks,
      Reportedy-by: NHugo Mills <hugo@carfax.org.uk>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      9e351cc8
  13. 22 3月, 2014 1 次提交
  14. 11 3月, 2014 3 次提交
  15. 29 1月, 2014 10 次提交
    • W
      Btrfs: fix memory leaks on walking backrefs failure · f05c4746
      Wang Shilong 提交于
      When walking backrefs, we may iterate every inode's extent
      and add/merge them into ulist, and the caller will free memory
      from ulist.
      
      However, if we fail to allocate inode's extents element
      memory or ulist_add() fail to allocate memory, we won't
      add allocated memory into ulist, and the caller won't
      free some allocated memory thus memory leaks happen.
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      f05c4746
    • W
      Btrfs: add a reschedule point in btrfs_find_all_roots() · bca1a290
      Wang Shilong 提交于
      I can easily trigger the following warnings when enabling quota
      in my virtual machine(running Opensuse), Steps are firstly creating
      a subvolume full of fragment extents, and then create many snapshots
      (500 in my test case).
      
      [ 2362.808459] BUG: soft lockup - CPU#0 stuck for 22s! [btrfs-qgroup-re:1970]
      
      [ 2362.809023] task: e4af8450 ti: e371c000 task.ti: e371c000
      [ 2362.809026] EIP: 0060:[<fa38f4ae>] EFLAGS: 00000246 CPU: 0
      [ 2362.809049] EIP is at __merge_refs+0x5e/0x100 [btrfs]
      [ 2362.809051] EAX: 00000000 EBX: cfadbcf0 ECX: 00000000 EDX: cfadbcb0
      [ 2362.809052] ESI: dd8d3370 EDI: e371dde0 EBP: e371dd6c ESP: e371dd5c
      [ 2362.809054]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      [ 2362.809055] CR0: 80050033 CR2: ac454d50 CR3: 009a9000 CR4: 001407d0
      [ 2362.809099] Stack:
      [ 2362.809100]  00000001 e371dde0 dfcc6890 f29f8000 e371de28 fa39016d 00000011 00000001
      [ 2362.809105]  99bfc000 00000000 93928000 00000000 00000001 00000050 e371dda8 00000001
      [ 2362.809109]  f3a31000 f3413000 00000001 e371ddb8 000040a8 00000202 00000000 00000023
      [ 2362.809113] Call Trace:
      [ 2362.809136]  [<fa39016d>] find_parent_nodes+0x34d/0x1280 [btrfs]
      [ 2362.809156]  [<fa391172>] btrfs_find_all_roots+0xb2/0x110 [btrfs]
      [ 2362.809174]  [<fa3934a8>] btrfs_qgroup_rescan_worker+0x358/0x7a0 [btrfs]
      [ 2362.809180]  [<c024d0ce>] ? lock_timer_base.isra.39+0x1e/0x40
      [ 2362.809199]  [<fa3648df>] worker_loop+0xff/0x470 [btrfs]
      [ 2362.809204]  [<c027a88a>] ? __wake_up_locked+0x1a/0x20
      [ 2362.809221]  [<fa3647e0>] ? btrfs_queue_worker+0x2b0/0x2b0 [btrfs]
      [ 2362.809225]  [<c025ebbc>] kthread+0x9c/0xb0
      [ 2362.809229]  [<c06b487b>] ret_from_kernel_thread+0x1b/0x30
      [ 2362.809233]  [<c025eb20>] ? kthread_create_on_node+0x110/0x110
      
      By adding a reschedule point at the end of btrfs_find_all_roots(), i no longer
      hit these warnings.
      
      Cc: Josef Bacik <jbacik@fb.com>
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      bca1a290
    • W
      Btrfs: fix to catch all errors when resolving indirect ref · 95def2ed
      Wang Shilong 提交于
      We can only tolerate ENOENT here, for other errors, we should
      return directly.
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      95def2ed
    • W
      Btrfs: fix protection between walking backrefs and root deletion · 538f72cd
      Wang Shilong 提交于
      There is a race condition between resolving indirect ref and root deletion,
      and we should gurantee that root can not be destroyed to avoid accessing
      broken tree here.
      
      Here we fix it by holding @subvol_srcu, and we will release it as soon
      as we have held root node lock.
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      538f72cd
    • J
      Btrfs: only process as many file extents as there are refs · 7ef81ac8
      Josef Bacik 提交于
      The backref walking code will search down to the key it is looking for and then
      proceed to walk _all_ of the extents on the file until it hits the end.  This is
      suboptimal with large files, we only need to look for as many extents as we have
      references for that inode.  I have a testcase that creates a randomly written 4
      gig file and before this patch it took 6min 30sec to do the initial send, with
      this patch it takes 2min 30sec to do the intial send.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      7ef81ac8
    • J
      Btrfs: fix extent_from_logical to deal with skinny metadata · 580f0a67
      Josef Bacik 提交于
      I don't think this is an issue and I've not seen it in practice but
      extent_from_logical will fail to find a skinny extent because it uses
      btrfs_previous_item and gives it the normal extent item type.  This is just not
      a place to use btrfs_previous_item since we care about either normal extents or
      skinny extents, so open code btrfs_previous_item to properly check.  This would
      only affect metadata and the only place this is used for metadata is scrub and
      I'm pretty sure it's just for printing stuff out, not actually doing any work so
      hopefully it was never a problem other than a cosmetic one.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      580f0a67
    • J
      Btrfs: attach delayed ref updates to delayed ref heads · d7df2c79
      Josef Bacik 提交于
      Currently we have two rb-trees, one for delayed ref heads and one for all of the
      delayed refs, including the delayed ref heads.  When we process the delayed refs
      we have to hold onto the delayed ref lock for all of the selecting and merging
      and such, which results in quite a bit of lock contention.  This was solved by
      having a waitqueue and only one flusher at a time, however this hurts if we get
      a lot of delayed refs queued up.
      
      So instead just have an rb tree for the delayed ref heads, and then attach the
      delayed ref updates to an rb tree that is per delayed ref head.  Then we only
      need to take the delayed ref lock when adding new delayed refs and when
      selecting a delayed ref head to process, all the rest of the time we deal with a
      per delayed ref head lock which will be much less contentious.
      
      The locking rules for this get a little more complicated since we have to lock
      up to 3 things to properly process delayed refs, but I will address that problem
      later.  For now this passes all of xfstests and my overnight stress tests.
      Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      d7df2c79
    • F
      Btrfs: fix deadlock when iterating inode refs and running delayed inodes · 3fe81ce2
      Filipe David Borba Manana 提交于
      While running btrfs/004 from xfstests, after 503 iterations, dmesg reported
      a deadlock between tasks iterating inode refs and tasks running delayed inodes
      (during a transaction commit).
      
      It turns out that iterating inode refs implies doing one tree search and
      release all nodes in the path except the leaf node, and then passing that
      leaf node to btrfs_ref_to_path(), which in turn does another tree search
      without releasing the lock on the leaf node it received as parameter.
      
      This is a problem when other task wants to write to the btree as well and
      ends up updating the leaf that is read locked - the writer task locks the
      parent of the leaf and then blocks waiting for the leaf's lock to be
      released - at the same time, the task executing btrfs_ref_to_path()
      does a second tree search, without releasing the lock on the first leaf,
      and wants to access a leaf (the same or another one) that is a child of
      the same parent, resulting in a deadlock.
      
      The trace reported by lockdep follows.
      
      [84314.936373] INFO: task fsstress:11930 blocked for more than 120 seconds.
      [84314.936381]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [84314.936383] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [84314.936386] fsstress        D ffff8806e1bf8000     0 11930  11926 0x00000000
      [84314.936393]  ffff8804d6d89b78 0000000000000046 ffff8804d6d89b18 ffffffff810bd8bd
      [84314.936399]  ffff8806e1bf8000 ffff8804d6d89fd8 ffff8804d6d89fd8 ffff8804d6d89fd8
      [84314.936405]  ffff880806308000 ffff8806e1bf8000 ffff8804d6d89c08 ffff8804deb8f190
      [84314.936410] Call Trace:
      [84314.936421]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
      [84314.936428]  [<ffffffff81774269>] schedule+0x29/0x70
      [84314.936451]  [<ffffffffa0715bf5>] btrfs_tree_lock+0x75/0x270 [btrfs]
      [84314.936457]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
      [84314.936470]  [<ffffffffa06ba231>] btrfs_search_slot+0x7f1/0x930 [btrfs]
      [84314.936489]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
      [84314.936504]  [<ffffffffa06d2e1f>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
      [84314.936510]  [<ffffffff810bd6ef>] ? trace_hardirqs_on_caller+0x1f/0x1e0
      [84314.936528]  [<ffffffffa073173c>] __btrfs_update_delayed_inode+0x4c/0x1d0 [btrfs]
      [84314.936543]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
      [84314.936558]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
      [84314.936573]  [<ffffffffa0731c82>] __btrfs_run_delayed_items+0x192/0x1e0 [btrfs]
      [84314.936589]  [<ffffffffa0731d03>] btrfs_run_delayed_items+0x13/0x20 [btrfs]
      [84314.936604]  [<ffffffffa06dbcd4>] btrfs_flush_all_pending_stuffs+0x24/0x80 [btrfs]
      [84314.936620]  [<ffffffffa06ddc13>] btrfs_commit_transaction+0x223/0xa20 [btrfs]
      [84314.936630]  [<ffffffffa06ae5ae>] btrfs_sync_fs+0x6e/0x110 [btrfs]
      [84314.936635]  [<ffffffff811d0b50>] ? __sync_filesystem+0x60/0x60
      [84314.936639]  [<ffffffff811d0b50>] ? __sync_filesystem+0x60/0x60
      [84314.936643]  [<ffffffff811d0b70>] sync_fs_one_sb+0x20/0x30
      [84314.936648]  [<ffffffff811a3541>] iterate_supers+0xf1/0x100
      [84314.936652]  [<ffffffff811d0c45>] sys_sync+0x55/0x90
      [84314.936658]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [84314.936660] INFO: lockdep is turned off.
      [84314.936663] INFO: task btrfs:11955 blocked for more than 120 seconds.
      [84314.936666]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [84314.936668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [84314.936670] btrfs           D ffff880541729a88     0 11955  11608 0x00000000
      [84314.936674]  ffff880541729a38 0000000000000046 ffff8805417299d8 ffffffff810bd8bd
      [84314.936680]  ffff88075430c8a0 ffff880541729fd8 ffff880541729fd8 ffff880541729fd8
      [84314.936685]  ffffffff81c104e0 ffff88075430c8a0 ffff8804de8b00b8 ffff8804de8b0000
      [84314.936690] Call Trace:
      [84314.936695]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
      [84314.936700]  [<ffffffff81774269>] schedule+0x29/0x70
      [84314.936717]  [<ffffffffa0715815>] btrfs_tree_read_lock+0xd5/0x140 [btrfs]
      [84314.936721]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
      [84314.936733]  [<ffffffffa06ba201>] btrfs_search_slot+0x7c1/0x930 [btrfs]
      [84314.936746]  [<ffffffffa06bd505>] btrfs_find_item+0x55/0x160 [btrfs]
      [84314.936763]  [<ffffffffa06ff689>] ? free_extent_buffer+0x49/0xc0 [btrfs]
      [84314.936780]  [<ffffffffa073c9ca>] btrfs_ref_to_path+0xba/0x1e0 [btrfs]
      [84314.936797]  [<ffffffffa06f9719>] ? release_extent_buffer+0xb9/0xe0 [btrfs]
      [84314.936813]  [<ffffffffa06ff689>] ? free_extent_buffer+0x49/0xc0 [btrfs]
      [84314.936830]  [<ffffffffa073cb50>] inode_to_path+0x60/0xd0 [btrfs]
      [84314.936846]  [<ffffffffa073d365>] paths_from_inode+0x115/0x3c0 [btrfs]
      [84314.936851]  [<ffffffff8118dd44>] ? kmem_cache_alloc_trace+0x114/0x200
      [84314.936868]  [<ffffffffa0714494>] btrfs_ioctl+0xf14/0x2030 [btrfs]
      [84314.936873]  [<ffffffff817762db>] ? _raw_spin_unlock+0x2b/0x50
      [84314.936877]  [<ffffffff8116598f>] ? handle_mm_fault+0x34f/0xb00
      [84314.936882]  [<ffffffff81075563>] ? up_read+0x23/0x40
      [84314.936886]  [<ffffffff8177a41c>] ? __do_page_fault+0x20c/0x5a0
      [84314.936892]  [<ffffffff811b2946>] do_vfs_ioctl+0x96/0x570
      [84314.936896]  [<ffffffff81776e23>] ? error_sti+0x5/0x6
      [84314.936901]  [<ffffffff810b71e8>] ? trace_hardirqs_off_caller+0x28/0xd0
      [84314.936906]  [<ffffffff81776a09>] ? retint_swapgs+0xe/0x13
      [84314.936910]  [<ffffffff811b2eb1>] SyS_ioctl+0x91/0xb0
      [84314.936915]  [<ffffffff813eecde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [84314.936920]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [84314.936922] INFO: lockdep is turned off.
      [84434.866873] INFO: task btrfs-transacti:11921 blocked for more than 120 seconds.
      [84434.866881]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [84434.866883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [84434.866886] btrfs-transacti D ffff880755b6a478     0 11921      2 0x00000000
      [84434.866893]  ffff8800735b9ce8 0000000000000046 ffff8800735b9c88 ffffffff810bd8bd
      [84434.866899]  ffff8805a1b848a0 ffff8800735b9fd8 ffff8800735b9fd8 ffff8800735b9fd8
      [84434.866904]  ffffffff81c104e0 ffff8805a1b848a0 ffff880755b6a478 ffff8804cece78f0
      [84434.866910] Call Trace:
      [84434.866920]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
      [84434.866927]  [<ffffffff81774269>] schedule+0x29/0x70
      [84434.866948]  [<ffffffffa06dd2ef>] wait_current_trans.isra.33+0xbf/0x120 [btrfs]
      [84434.866954]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
      [84434.866970]  [<ffffffffa06dec18>] start_transaction+0x388/0x5a0 [btrfs]
      [84434.866985]  [<ffffffffa06db9b5>] ? transaction_kthread+0xb5/0x280 [btrfs]
      [84434.866999]  [<ffffffffa06dee97>] btrfs_attach_transaction+0x17/0x20 [btrfs]
      [84434.867012]  [<ffffffffa06dba9e>] transaction_kthread+0x19e/0x280 [btrfs]
      [84434.867026]  [<ffffffffa06db900>] ? open_ctree+0x2260/0x2260 [btrfs]
      [84434.867030]  [<ffffffff81070dad>] kthread+0xed/0x100
      [84434.867035]  [<ffffffff81070cc0>] ? flush_kthread_worker+0x190/0x190
      [84434.867040]  [<ffffffff8177ee6c>] ret_from_fork+0x7c/0xb0
      [84434.867044]  [<ffffffff81070cc0>] ? flush_kthread_worker+0x190/0x190
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      3fe81ce2
    • K
      btrfs: bootstrap generic btrfs_find_item interface · e33d5c3d
      Kelley Nielsen 提交于
      There are many btrfs functions that manually search the tree for an
      item. They all reimplement the same mechanism and differ in the
      conditions that they use to find the item. __inode_info() is one such
      example. Zach Brown proposed creating a new interface to take the place
      of these functions.
      
      This patch is the first step to creating the interface. A new function,
      btrfs_find_item, has been added to ctree.c and prototyped in ctree.h.
      It is identical to __inode_info, except that the order of the parameters
      has been rearranged to more closely those of similar functions elsewhere
      in the code (now, root and path come first, then the objectid, offset
      and type, and the key to be filled in last). __inode_info's callers have
      been set to call this new function instead, and __inode_info itself has
      been removed.
      Signed-off-by: NKelley Nielsen <kelleynnn@gmail.com>
      Suggested-by: NZach Brown <zab@redhat.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e33d5c3d
    • V
  16. 12 11月, 2013 3 次提交
    • D
      btrfs: Use WARN_ON()'s return value in place of WARN_ON(1) · fae7f21c
      Dulshani Gunawardhana 提交于
      Use WARN_ON()'s return value in place of WARN_ON(1) for cleaner source
      code that outputs a more descriptive warnings. Also fix the styling
      warning of redundant braces that came up as a result of this fix.
      Signed-off-by: NDulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
      Reviewed-by: NZach Brown <zab@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      fae7f21c
    • L
      Btrfs: fix a crash when running balance and defrag concurrently · 48ec4736
      Liu Bo 提交于
      Running balance and defrag concurrently can end up with a crash:
      
      kernel BUG at fs/btrfs/relocation.c:4528!
      RIP: 0010:[<ffffffffa01ac33b>]  [<ffffffffa01ac33b>] btrfs_reloc_cow_block+ 0x1eb/0x230 [btrfs]
      Call Trace:
        [<ffffffffa01398c1>] ? update_ref_for_cow+0x241/0x380 [btrfs]
        [<ffffffffa0180bad>] ? copy_extent_buffer+0xad/0x110 [btrfs]
        [<ffffffffa0139da1>] __btrfs_cow_block+0x3a1/0x520 [btrfs]
        [<ffffffffa013a0b6>] btrfs_cow_block+0x116/0x1b0 [btrfs]
        [<ffffffffa013ddad>] btrfs_search_slot+0x43d/0x970 [btrfs]
        [<ffffffffa0153c57>] btrfs_lookup_file_extent+0x37/0x40 [btrfs]
        [<ffffffffa0172a5e>] __btrfs_drop_extents+0x11e/0xae0 [btrfs]
        [<ffffffffa013b3fd>] ? generic_bin_search.constprop.39+0x8d/0x1a0 [btrfs]
        [<ffffffff8117d14a>] ? kmem_cache_alloc+0x1da/0x200
        [<ffffffffa0138e7a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
        [<ffffffffa0173ef0>] btrfs_drop_extents+0x60/0x90 [btrfs]
        [<ffffffffa016b24d>] relink_extent_backref+0x2ed/0x780 [btrfs]
        [<ffffffffa0162fe0>] ? btrfs_submit_bio_hook+0x1e0/0x1e0 [btrfs]
        [<ffffffffa01b8ed7>] ? iterate_inodes_from_logical+0x87/0xa0 [btrfs]
        [<ffffffffa016b909>] btrfs_finish_ordered_io+0x229/0xac0 [btrfs]
        [<ffffffffa016c3b5>] finish_ordered_fn+0x15/0x20 [btrfs]
        [<ffffffffa018cbe5>] worker_loop+0x125/0x4e0 [btrfs]
        [<ffffffffa018cac0>] ? btrfs_queue_worker+0x300/0x300 [btrfs]
        [<ffffffff81075ea0>] kthread+0xc0/0xd0
        [<ffffffff81075de0>] ? insert_kthread_work+0x40/0x40
        [<ffffffff8164796c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81075de0>] ? insert_kthread_work+0x40/0x40
      ----------------------------------------------------------------------
      
      It turns out to be that balance operation will bump root's @last_snapshot,
      which enables snapshot-aware defrag path, and backref walking stuff will
      find data reloc tree as refs' parent, and hit the BUG_ON() during COW.
      
      As data reloc tree's data is just for relocation purpose, and will be deleted right
      after relocation is done, it's unnecessary to walk those refs belonged to data reloc
      tree, it'd be better to skip them.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      48ec4736
    • R
      btrfs: drop unused parameter from btrfs_item_nr · dd3cc16b
      Ross Kirk 提交于
      Remove unused eb parameter from btrfs_item_nr
      Signed-off-by: NRoss Kirk <ross.kirk@gmail.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      dd3cc16b
  17. 01 9月, 2013 1 次提交