1. 05 Oct, 2013 1 commit
    • J
      Btrfs: fix transid verify errors when recovering log tree · 60e7cd3a
      Josef Bacik committed
      If we crash with a log, remount and recover that log, and then crash before we
      can commit another transaction we will get transid verify errors on the next
      mount.  This is because we were not zero'ing out the log when we committed the
      transaction after recovery.  This is ok as long as we commit another transaction
      at some point in the future, but if you abort or something else goes wrong you
      can end up in this weird state because the recovery stuff says that the tree log
      should have a generation+1 of the super generation, which won't be the case of
      the transaction that was started for recovery.  Fix this by removing the check
      and _always_ zero out the log portion of the super when we commit a transaction.
      This fixes the transid verify issues I was seeing with my force errors tests.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
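The fix described in this commit message can be pictured with a toy model of the superblock. This is only a sketch: `struct super_model`, its field names, and `commit_transaction()` are hypothetical stand-ins, not the kernel's structures or functions. It shows the one rule the message states, that every commit clears the log portion of the super, with no check deciding whether to do so.

```c
#include <assert.h>

/* Hypothetical, simplified model of the fields involved. */
struct super_model {
    unsigned long long generation;   /* generation of the last commit */
    unsigned long long log_root;     /* 0 means "no log tree to replay" */
    unsigned char log_root_level;
};

/*
 * Toy commit: record the new generation and unconditionally clear the
 * log pointer, so the commit that follows log recovery cannot leave a
 * stale log behind.
 */
static void commit_transaction(struct super_model *sb,
                               unsigned long long transid)
{
    assert(transid > sb->generation);
    sb->generation = transid;
    sb->log_root = 0;
    sb->log_root_level = 0;
}
```

With this rule, a crash right after the recovery commit finds no log in the super, so the next mount has nothing to replay against a mismatched generation.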
  2. 21 Sep, 2013 26 commits
  3. 03 Sep, 2013 4 commits
  4. 01 Sep, 2013 9 commits
    • F
      Btrfs: optimize key searches in btrfs_search_slot · d7396f07
      Filipe David Borba Manana committed
      When the binary search returns 0 (exact match), the target key
      will necessarily be at slot 0 of all nodes below the current one,
      so in this case the binary search is not needed because it will
      always return 0, and we waste time doing it, holding node locks
      for longer than necessary, etc.
      
      Below follow histograms with the times spent on the current approach of
      doing a binary search when the previous binary search returned 0, and
      times for the new approach, which directly picks the first item/child
      node in the leaf/node.
      
      Current approach:
      
      Count: 6682
      Range: 35.000 - 8370.000; Mean: 85.837; Median: 75.000; Stddev: 106.429
      Percentiles:  90th: 124.000; 95th: 145.000; 99th: 206.000
        35.000 -   61.080:  1235 ################
        61.080 -  106.053:  4207 #####################################################
       106.053 -  183.606:  1122 ##############
       183.606 -  317.341:   111 #
       317.341 -  547.959:     6 |
       547.959 - 8370.000:     1 |
      
      Approach proposed by this patch:
      
      Count: 6682
      Range:  6.000 - 135.000; Mean: 16.690; Median: 16.000; Stddev:  7.160
      Percentiles:  90th: 23.000; 95th: 27.000; 99th: 40.000
         6.000 -    8.418:    58 #
         8.418 -   11.670:  1149 #########################
        11.670 -   16.046:  2418 #####################################################
        16.046 -   21.934:  2098 ##############################################
        21.934 -   29.854:   744 ################
        29.854 -   40.511:   154 ###
        40.511 -   54.848:    41 #
        54.848 -   74.136:     5 |
        74.136 -  100.087:     9 |
       100.087 -  135.000:     6 |
      
      These samples were captured during a run of the btrfs tests 001, 002 and
      004 in the xfstests, with a leaf/node size of 4 KB.
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
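The idea in this commit message can be sketched in userspace C. This is a minimal illustration, not the patch itself: `struct node`, `bin_search()` and `key_search()` are simplified, hypothetical stand-ins that use plain integer keys. Once one level reports an exact match, every level below takes slot 0 directly instead of searching again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a b-tree node: a sorted array of keys. */
struct node {
    const unsigned long *keys;
    int nritems;
};

/*
 * Plain binary search. Returns 0 and sets *slot on an exact match,
 * otherwise returns 1 and sets *slot to the insertion position.
 */
static int bin_search(const struct node *n, unsigned long key, int *slot)
{
    int low = 0, high = n->nritems;

    while (low < high) {
        int mid = low + (high - low) / 2;

        if (n->keys[mid] < key) {
            low = mid + 1;
        } else if (n->keys[mid] > key) {
            high = mid;
        } else {
            *slot = mid;
            return 0;
        }
    }
    *slot = low;
    return 1;
}

/*
 * Search one level of the tree. *prev_cmp carries the result from the
 * level above and must start out non-zero at the root. If it is 0, the
 * target key is the first key of this node, so the search is skipped.
 */
static int key_search(const struct node *n, unsigned long key,
                      int *prev_cmp, int *slot)
{
    if (*prev_cmp != 0) {
        *prev_cmp = bin_search(n, key, slot);
        return *prev_cmp;
    }
    /* Exact match above: the first key here equals the target. */
    assert(n->nritems > 0 && n->keys[0] == key);
    *slot = 0;
    return 0;
}
```

A caller would initialise `prev_cmp` to a non-zero value before descending from the root, then pass the same variable to `key_search()` at each level.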
    • J
      Btrfs: don't use an async starter for most of our workers · 45d5fd14
      Josef Bacik committed
      We only need an async starter if we can't make a GFP_NOFS allocation in our
      current path.  This is the case for the endio stuff since it happens in IRQ
      context, but for things like the caching thread workers and the delalloc flushers
      we can easily make this allocation and start threads right away.  Also change the
      worker count for the caching thread pool.  Traditionally we limited this to 2
      since we took read locks while caching, but nowadays we do this lockless so
      there's no reason to limit the number of caching threads.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • J
      Btrfs: only update disk_i_size as we remove extents · 7f4f6e0a
      Josef Bacik committed
      This fixes a problem where if we fail a truncate we will leave the i_size set
      where we wanted to truncate to instead of where we were able to truncate to.
      Fix this by making btrfs_truncate_inode_items do the disk_i_size update as it
      removes extents, that way it will always be consistent with where its extents
      are.  Then if the truncate fails at all we can update the in-ram i_size with
      what we have on disk and delete the orphan item.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
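The bookkeeping this commit message describes can be illustrated with a toy model. Everything here is an assumption made for illustration: `struct inode_model`, `truncate_items()`, `truncate_file()` and the `fail_after` failure-injection parameter are hypothetical and much simpler than the kernel code. The on-disk size is lowered each time an extent is removed, and a failed truncate copies that on-disk size back into the in-memory size.

```c
#include <assert.h>
#include <stddef.h>

struct extent {
    unsigned long long start, len;
};

/* Hypothetical, simplified inode: sizes plus extents sorted by start. */
struct inode_model {
    unsigned long long i_size;       /* in-memory size */
    unsigned long long disk_i_size;  /* size recorded on disk */
    struct extent *extents;
    size_t nr_extents;
};

/*
 * Remove or trim extents beyond new_size, from the end of the file.
 * fail_after is the number of extents that may be processed before a
 * simulated failure; a negative value means "never fail".
 */
static int truncate_items(struct inode_model *ino,
                          unsigned long long new_size, int fail_after)
{
    while (ino->nr_extents > 0) {
        struct extent *last = &ino->extents[ino->nr_extents - 1];

        if (last->start + last->len <= new_size)
            break;
        if (fail_after == 0)
            return -1;
        if (fail_after > 0)
            fail_after--;

        if (last->start >= new_size) {
            /* Whole extent goes away; the disk size follows it down. */
            ino->disk_i_size = last->start;
            ino->nr_extents--;
        } else {
            /* Extent straddles new_size; keep only the front part. */
            last->len = new_size - last->start;
            ino->disk_i_size = new_size;
        }
    }
    ino->disk_i_size = new_size;
    return 0;
}

static int truncate_file(struct inode_model *ino,
                         unsigned long long new_size, int fail_after)
{
    int ret;

    assert(new_size <= ino->i_size);
    ino->i_size = new_size;
    ret = truncate_items(ino, new_size, fail_after);
    if (ret)
        ino->i_size = ino->disk_i_size; /* fall back to what is on disk */
    return ret;
}
```

After a failure part-way through, both sizes in this model point at the end of the extents that still exist, which is the consistency property the message is after.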
    • F
      Btrfs: fix deadlock in uuid scan kthread · f45388f3
      Filipe David Borba Manana committed
      If there's an ongoing transaction when the uuid scan kthread attempts
      to create one, the kthread will block, waiting for that transaction to
      finish while it's keeping locks on the tree root, and in turn the existing
      transaction is waiting for those locks to be free.
      
      The stack trace reported by the kernel follows.
      
      [36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
      [36700.671602] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [36700.671602] btrfs-uuid      D 0000000000000000     0 15480      2 0x00000000
      [36700.671604]  ffff880710bd5b88 0000000000000046 ffff8803d36ba850 0000000000030000
      [36700.671605]  ffff8806d76dc530 ffff880710bd5fd8 ffff880710bd5fd8 ffff880710bd5fd8
      [36700.671607]  ffff8808098ac530 ffff8806d76dc530 ffff880710bd5b98 ffff8805e4508e40
      [36700.671608] Call Trace:
      [36700.671610]  [<ffffffff816f36b9>] schedule+0x29/0x70
      [36700.671620]  [<ffffffffa05a3bdf>] wait_current_trans.isra.33+0xbf/0x120 [btrfs]
      [36700.671623]  [<ffffffff81066760>] ? add_wait_queue+0x60/0x60
      [36700.671629]  [<ffffffffa05a5b06>] start_transaction+0x3d6/0x530 [btrfs]
      [36700.671636]  [<ffffffffa05bb1f4>] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
      [36700.671642]  [<ffffffffa05a5fbb>] btrfs_start_transaction+0x1b/0x20 [btrfs]
      [36700.671649]  [<ffffffffa05c8a81>] btrfs_uuid_scan_kthread+0x211/0x3d0 [btrfs]
      [36700.671655]  [<ffffffffa05c8870>] ? __btrfs_open_devices+0x2a0/0x2a0 [btrfs]
      [36700.671657]  [<ffffffff81065fa0>] kthread+0xc0/0xd0
      [36700.671659]  [<ffffffff81065ee0>] ? flush_kthread_worker+0xb0/0xb0
      [36700.671661]  [<ffffffff816fcd1c>] ret_from_fork+0x7c/0xb0
      [36700.671662]  [<ffffffff81065ee0>] ? flush_kthread_worker+0xb0/0xb0
      [36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
      [36700.671664] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [36700.671665] btrfs           D 0000000000000000     0 15481  15212 0x00000004
      [36700.671666]  ffff880248cbf4c8 0000000000000086 ffff8803d36ba700 ffff8801dbd5c280
      [36700.671668]  ffff880807815c40 ffff880248cbffd8 ffff880248cbffd8 ffff880248cbffd8
      [36700.671669]  ffff8805e86a0000 ffff880807815c40 ffff880248cbf4d8 ffff8801dbd5c280
      [36700.671670] Call Trace:
      [36700.671672]  [<ffffffff816f36b9>] schedule+0x29/0x70
      [36700.671679]  [<ffffffffa05d9b0d>] btrfs_tree_lock+0x6d/0x230 [btrfs]
      [36700.671680]  [<ffffffff81066760>] ? add_wait_queue+0x60/0x60
      [36700.671685]  [<ffffffffa0582829>] btrfs_search_slot+0x999/0xb00 [btrfs]
      [36700.671691]  [<ffffffffa05bd9de>] ? btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
      [36700.671698]  [<ffffffffa05e3e54>] __btrfs_write_out_cache+0x8c4/0xa80 [btrfs]
      [36700.671704]  [<ffffffffa05e4362>] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
      [36700.671710]  [<ffffffffa05c4441>] ? free_extent_buffer+0x61/0xc0 [btrfs]
      [36700.671716]  [<ffffffffa0594c82>] btrfs_write_dirty_block_groups+0x562/0x650 [btrfs]
      [36700.671723]  [<ffffffffa0610092>] commit_cowonly_roots+0x171/0x24b [btrfs]
      [36700.671729]  [<ffffffffa05a4dde>] btrfs_commit_transaction+0x4fe/0xa10 [btrfs]
      [36700.671735]  [<ffffffffa0610af3>] create_subvol+0x5c0/0x636 [btrfs]
      [36700.671742]  [<ffffffffa05d49ff>] btrfs_mksubvol.isra.60+0x33f/0x3f0 [btrfs]
      [36700.671747]  [<ffffffffa05d4bf2>] btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
      [36700.671752]  [<ffffffffa05d4c6c>] ? btrfs_ioctl_snap_create+0x2c/0x80 [btrfs]
      [36700.671757]  [<ffffffffa05d4c9e>] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
      [36700.671759]  [<ffffffff8113a764>] ? handle_pte_fault+0x84/0x920
      [36700.671764]  [<ffffffffa05d87eb>] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
      [36700.671766]  [<ffffffff8113c120>] ? handle_mm_fault+0x210/0x310
      [36700.671768]  [<ffffffff816f83a4>] ? __do_page_fault+0x284/0x4e0
      [36700.671770]  [<ffffffff81180aa6>] do_vfs_ioctl+0x96/0x550
      [36700.671772]  [<ffffffff81170fe3>] ? __sb_end_write+0x33/0x70
      [36700.671774]  [<ffffffff81180ff1>] SyS_ioctl+0x91/0xb0
      [36700.671775]  [<ffffffff816fcdc2>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • I
      Btrfs: stop refusing the relocation of chunk 0 · 795a3321
      Ilya Dryomov committed
      AFAICT chunk 0 is no longer special, and so it should be restriped just
      like every other chunk.  One reason for this change is that refusing the
      relocation can lead to filesystems that can only be mounted ro, and
      never rw -- see the bugzilla [1] for details.  The other reason is that
      device removal code is already doing this: it will happily relocate
      chunk 0 as part of shrinking the device.
      
      [1] https://bugzilla.kernel.org/show_bug.cgi?id=60594
      
      Reported-by: Xavier Bassery <xavier@bartica.org>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • A
      btrfs: reuse kbasename helper · ed84885d
      Andy Shevchenko committed
      To get the name of the file from a pathname, let's use the kbasename()
      helper. This simplifies the code a bit.
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
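For readers unfamiliar with the helper named in this commit, the following is a userspace sketch of the same idea: return the part of a pathname after the last '/'. The name `path_basename()` is hypothetical and chosen to avoid implying that this is the kernel's own definition.

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer to the last component of path, i.e. the text after
 * the final '/'. If there is no '/', the whole string is the name.
 * The result points into the caller's string; nothing is copied.
 */
static const char *path_basename(const char *path)
{
    const char *tail;

    assert(path != NULL);
    tail = strrchr(path, '/');
    return tail ? tail + 1 : path;
}
```

Because the result aliases the input, it stays valid only as long as the original string does. A path ending in '/' yields an empty string.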
    • A
      btrfs: return btrfs error code for dev excl ops err · e57138b3
      Anand Jain committed
      Threads can now return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS,
      as defined in btrfs.h, for the dev excl operation error in
      the FS. With this, the kernel stops logging what is almost
      a user error to /var/log/messages.
      
      v2: accepts Josef's comment
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • J
      Btrfs: allow partial ordered extent completion · 77cef2ec
      Josef Bacik committed
      We currently have this problem where you can truncate pages that have not yet
      been written for an ordered extent.  We do this because the truncate will be
      coming behind to clean us up anyway so what's the harm right?  Well if truncate
      fails for whatever reason we leave an orphan item around for the file to be
      cleaned up later.  But if the user goes and truncates up the file and tries to
      read from the area that had been discarded previously they will get a csum error
      because we never actually wrote that data out.
      
      This patch fixes this by allowing us to either discard the ordered extent
      completely, by which I mean we just free up the space we had allocated and not
      add the file extent, or adjust the length of the file extent we write.  We do
      this by setting the length we truncated down to in the ordered extent, and then
      we set the file extent length and ram bytes to this length.  The total disk
      space stays unchanged since we may be compressed and we can't just chop off the
      disk space, but at least this way the file extent only points to the valid data.
      Then when the file extent is free'd the extent and csums will be freed normally.
      
      This patch is needed for the next series which will give us more graceful
      recovery of failed truncates.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
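The two outcomes this commit message describes for a truncated ordered extent can be sketched as a small decision function. The structures and the name `finish_ordered()` are hypothetical simplifications, not the kernel's types. A truncated length of zero means the extent is discarded. Otherwise the file extent length and ram bytes take the truncated length, while the disk space stays as allocated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified ordered extent. */
struct ordered_extent_model {
    unsigned long long len;            /* logical length as allocated */
    unsigned long long disk_len;       /* space allocated on disk */
    bool truncated;                    /* a truncate hit this extent */
    unsigned long long truncated_len;  /* length kept after truncate */
};

/* Hypothetical, simplified file extent item. */
struct file_extent_model {
    unsigned long long num_bytes;
    unsigned long long ram_bytes;
    unsigned long long disk_num_bytes;
};

/*
 * Returns false if the ordered extent should be discarded (its space
 * freed and no file extent added). Returns true and fills *fe if a
 * file extent should be written.
 */
static bool finish_ordered(const struct ordered_extent_model *oe,
                           struct file_extent_model *fe)
{
    unsigned long long logical_len = oe->len;

    if (oe->truncated) {
        if (oe->truncated_len == 0)
            return false;
        assert(oe->truncated_len <= oe->len);
        logical_len = oe->truncated_len;
    }

    fe->num_bytes = logical_len;
    fe->ram_bytes = logical_len;
    /* Disk space is not chopped: the data may be compressed. */
    fe->disk_num_bytes = oe->disk_len;
    return true;
}
```

Keeping `disk_num_bytes` unchanged mirrors the message's point that the file extent only needs to stop pointing at data that was never written.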