1. 01 Sep, 2013 1 commit
    • Btrfs: stop using GFP_ATOMIC for the tree mod log allocations · c8cc6341
      Josef Bacik committed
      Previously we held the tree mod lock when adding stuff because we use it to
      check and see if we truly do want to track tree modifications.  This is
      admirable, but GFP_ATOMIC in a critical area that is going to get hit pretty
      hard and often is not nice.  So instead do our basic checks to see if we don't
      need to track modifications, and if those pass then do our allocation, and then
      when we go to insert the new modification check if we still care, and if we
      don't just free up our mod and return.  Otherwise we're good to go and we can
      carry on.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
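The allocate-outside-the-lock pattern this commit describes can be sketched in userspace C. This is only an illustration under stated assumptions: `struct mod_log`, `struct mod_entry`, and `mod_log_insert()` are hypothetical names, a pthread mutex stands in for the tree mod lock, and `malloc()` stands in for a kernel allocation that is allowed to sleep. It is not the kernel's implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the tree mod log; not the kernel's types. */
struct mod_entry {
    unsigned long long seq;
    struct mod_entry *next;
};

struct mod_log {
    pthread_mutex_t lock;
    atomic_bool tracking;        /* true while modifications should be logged */
    unsigned long long next_seq; /* protected by lock */
    struct mod_entry *head;      /* protected by lock */
};

/* Returns 1 if an entry was logged, 0 if logging was not needed, -1 on OOM. */
int mod_log_insert(struct mod_log *log)
{
    struct mod_entry *e;

    assert(log != NULL);

    /* Cheap check without the lock: skip the allocation if nobody cares. */
    if (!atomic_load(&log->tracking))
        return 0;

    /* Allocate with no lock held, so the allocator is allowed to block. */
    e = malloc(sizeof(*e));
    if (!e)
        return -1;

    pthread_mutex_lock(&log->lock);
    /* Re-check under the lock: tracking may have stopped since the first check. */
    if (!atomic_load(&log->tracking)) {
        pthread_mutex_unlock(&log->lock);
        free(e);
        return 0;
    }
    e->seq = log->next_seq++;
    e->next = log->head;
    log->head = e;
    pthread_mutex_unlock(&log->lock);
    return 1;
}
```

The second check is what makes the unlocked first check safe: a stale "yes" costs one wasted allocation, never a stray log entry.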
  2. 10 Aug, 2013 1 commit
  3. 02 Jul, 2013 2 commits
    • Btrfs: only do the tree_mod_log_free_eb if this is our last ref · 7fb7d76f
      Josef Bacik committed
      There is another bug in the tree mod log stuff in that we're calling
      tree_mod_log_free_eb every single time a block is cow'ed.  The problem with this
      is that if this block is shared by multiple snapshots we will call this multiple
      times per block, so if we go to rewind the mod log for this block we'll BUG_ON()
      in __tree_mod_log_rewind because we try to rewind a free twice.  We only want to
      call tree_mod_log_free_eb if we are actually freeing the block.  With this patch
      I no longer hit the panic in __tree_mod_log_rewind.  Thanks,
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
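The rule in this commit, logging the free only when the last reference goes away, can be modeled with a simple counter. This is a toy sketch: `struct shared_block` and `cow_drop_ref()` are hypothetical names, and the real kernel determines the last reference differently from a bare decrement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a tree block shared by several snapshots. */
struct shared_block {
    int refs;         /* number of trees still referencing the block */
    int frees_logged; /* how many "free" records the mod log holds for it */
};

/*
 * Called when one tree COWs the block and drops its reference.
 * Returns true when this was the last reference and a free was logged.
 */
bool cow_drop_ref(struct shared_block *b)
{
    bool last_ref;

    assert(b->refs > 0);
    last_ref = (--b->refs == 0);

    /* Log the free only when the block is really going away. */
    if (last_ref)
        b->frees_logged++;
    return last_ref;
}
```

With three snapshots sharing a block, only the third call logs a free, so a later rewind sees exactly one free record for the block.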
    • Btrfs: hold the tree mod lock in __tree_mod_log_rewind · f1ca7e98
      Josef Bacik committed
      We need to hold the tree mod log lock in __tree_mod_log_rewind since we walk
      forward in the tree mod entries, otherwise we'll end up with random entries and
      trip the BUG_ON() at the front of __tree_mod_log_rewind.  This fixes the panics
      people were seeing when running
      
      find /whatever -type f -exec btrfs fi defrag {} \;
      
      Thanks,
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
  4. 01 Jul, 2013 2 commits
    • Btrfs: optimize reada_for_balance · 0b08851f
      Josef Bacik committed
      This patch does two things.  First we no longer explicitly read in the blocks
      we're trying to readahead.  For things like balance_level we may never actually
      use the blocks so this just adds unneeded latency, and balance_level and
      split_node will both read in the blocks they care about explicitly so if the
      blocks need to be waited on it will be done there.  Secondly we no longer drop
      the path if we do readahead, we just set the path blocking before we call
      reada_for_balance() and then we're good to go.  Hopefully this will cut down on
      the number of re-searches.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: optimize read_block_for_search · bdf7c00e
      Josef Bacik committed
      This patch does two things, first it only does one call to
      btrfs_buffer_uptodate() with the gen specified instead of once with 0 and then
      again with gen specified.  The other thing is to call btrfs_read_buffer() on the
      buffer we've found instead of dropping it and then calling read_tree_block().
      This will keep us from doing yet another radix tree lookup for a buffer we've
      already found.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
  5. 14 Jun, 2013 3 commits
  6. 28 May, 2013 1 commit
  7. 18 May, 2013 1 commit
    • Btrfs: handle running extent ops with skinny metadata · b1c79e09
      Josef Bacik committed
      Chris hit a bug where we weren't finding extent records when running extent ops.
      This is because we use the delayed_ref_head when running the extent op, which
      means we can't use the ->type checks to see if we are metadata.  We also lose
      the level of the metadata we are working on.  So to fix this we can just check
      the ->is_data section of the extent_op, and we can store the level of the buffer
      we were modifying in the extent_op.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
  8. 07 May, 2013 11 commits
    • btrfs: make static code static & remove dead code · 48a3b636
      Eric Sandeen committed
      Big patch, but all it does is add statics to functions which
      are in fact static, then remove the associated dead-code fallout.
      
      removed functions:
      
      btrfs_iref_to_path()
      __btrfs_lookup_delayed_deletion_item()
      __btrfs_search_delayed_insertion_item()
      __btrfs_search_delayed_deletion_item()
      find_eb_for_page()
      btrfs_find_block_group()
      range_straddles_pages()
      extent_range_uptodate()
      btrfs_file_extent_length()
      btrfs_scrub_cancel_devid()
      btrfs_start_transaction_lflush()
      
      btrfs_print_tree() is left because it is used for debugging.
      btrfs_start_transaction_lflush() and btrfs_reada_detach() are
      left for symmetry.
      
      ulist.c functions are left, another patch will take care of those.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: separate sequence numbers for delayed ref tracking and tree mod log · fc36ed7e
      Jan Schmidt committed
      Sequence numbers for delayed refs have been introduced in the first version
      of the qgroup patch set. To solve the problem of find_all_roots on a busy
      file system, the tree mod log was introduced. The sequence numbers for that
      were simply shared between those two users.
      
      However, at one point in qgroup's quota accounting, there's a statement
      accessing the previous sequence number, that's still just doing (seq - 1)
      just as it would have to in the very first version.
      
      To satisfy that requirement, this patch makes the sequence number counter 64
      bit and splits it into a major part (used for qgroup sequence number
      counting) and a minor part (incremented for each tree modification in the
      log). This enables us to go exactly one major step backwards, as required
      for qgroups, while still incrementing the sequence counter for tree mod log
      insertions to keep track of their order. Keeping them in a single variable
      means there's no need to change all the code dealing with comparisons of two
      sequence numbers.
      
      The sequence number is reset to 0 on commit (not new in this patch), which
      ensures we won't overflow the two 32 bit counters.
      
      Without this fix, the qgroup tracking can occasionally go wrong and WARN_ONs
      from the tree mod log code may happen.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
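A 64-bit counter split into a major and a minor half, as this commit describes, might look like the sketch below. All names are hypothetical and the helpers are plain functions on a value. The kernel operates on a shared atomic counter, and the choice of "previous major" here (the largest value belonging to the preceding major step) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ_MINOR_BITS 32
#define SEQ_MINOR_MASK ((UINT64_C(1) << SEQ_MINOR_BITS) - 1)
#define SEQ_MAJOR_MASK (~SEQ_MINOR_MASK)

/* Upper 32 bits: the qgroup step. Lower 32 bits: the per-modification count. */
uint64_t seq_major(uint64_t seq) { return seq >> SEQ_MINOR_BITS; }
uint64_t seq_minor(uint64_t seq) { return seq & SEQ_MINOR_MASK; }

/* One tree modification logged: bump the minor part only. */
uint64_t seq_inc_minor(uint64_t seq)
{
    /* The minor part must never carry into the major part. */
    assert(seq_minor(seq) != SEQ_MINOR_MASK);
    return seq + 1;
}

/* One qgroup step: move to the next major value and clear the minor part. */
uint64_t seq_inc_major(uint64_t seq)
{
    return (seq & SEQ_MAJOR_MASK) + (UINT64_C(1) << SEQ_MINOR_BITS);
}

/* Largest sequence number that belongs to the previous major step. */
uint64_t seq_prev_major(uint64_t seq)
{
    assert(seq_major(seq) > 0);
    return (seq & SEQ_MAJOR_MASK) - 1;
}
```

Because both halves live in one integer, an ordinary `<` or `<=` comparison still orders any two sequence numbers, which is why the comparison code did not need to change.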
    • Btrfs: fix all callers of read_tree_block · 416bc658
      Josef Bacik committed
      We kept leaking extent buffers when mounting a broken file system and it turns
      out it's because not everybody uses read_tree_block properly.  You need to check
      and make sure the extent_buffer is uptodate before you use it.  This patch fixes
      everybody who calls read_tree_block directly to make sure they check that it is
      uptodate and free it and return an error if it is not.  With this we no longer
      leak EB's when things go horribly wrong.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
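The caller-side check described above (verify the buffer is up to date, and free it on the error path) follows a general shape that can be sketched in plain C. `struct ebuf`, `read_block()`, `free_block()`, and `read_level()` are hypothetical stand-ins, not the btrfs functions. The `live_bufs` counter exists only to make a leak observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical buffer type; a stand-in for an extent buffer. */
struct ebuf {
    bool uptodate;
    int level;
};

static int live_bufs; /* buffers allocated and not yet freed */

/* Hypothetical reader: returns a buffer even when the read itself failed. */
static struct ebuf *read_block(bool io_ok)
{
    struct ebuf *eb = calloc(1, sizeof(*eb));

    if (!eb)
        return NULL;
    eb->uptodate = io_ok;
    eb->level = 1;
    live_bufs++;
    return eb;
}

static void free_block(struct ebuf *eb)
{
    assert(live_bufs > 0);
    live_bufs--;
    free(eb);
}

/* Caller pattern: verify the buffer before use, release it on every path. */
int read_level(bool io_ok, int *level)
{
    struct ebuf *eb = read_block(io_ok);

    if (!eb)
        return -ENOMEM;
    if (!eb->uptodate) {
        free_block(eb); /* without this the buffer would leak */
        return -EIO;
    }
    *level = eb->level;
    free_block(eb);
    return 0;
}
```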
    • Btrfs: remove unused argument of btrfs_extend_item() · 4b90c680
      Tsutomu Itoh committed
      Argument 'trans' is not used in btrfs_extend_item().
      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: cleanup of function where fixup_low_keys() is called · afe5fea7
      Tsutomu Itoh committed
      If argument 'trans' is unnecessary in the function where
      fixup_low_keys() is called, 'trans' is deleted.
      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: remove unused argument of fixup_low_keys() · d6a0a126
      Tsutomu Itoh committed
      Argument 'trans' is not used in fixup_low_keys(). So, remove it.
      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix unlock after free on rewinded tree blocks · 47fb091f
      Jan Schmidt committed
      When tree_mod_log_rewind decides to make a copy of the current tree buffer
      for its modifications, it subsequently freed the buffer before unlocking it.
      Obviously, those operations are required in reverse order.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix accessing the root pointer in tree mod log functions · 30b0463a
      Jan Schmidt committed
      The tree mod log functions were accessing root->node->... directly, without
      use of btrfs_root_node() or explicit rcu locking. This could lead to an
      extent buffer reference being leaked and another reference being freed too
      early when preemption was enabled.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix tree mod log regression on root split operations · 90f8d62e
      Jan Schmidt committed
      Commit d9abbf1c changed tree mod log locking around ROOT_REPLACE operations.
      When a tree root is split, however, we were logging removal of all elements
      from the root node before logging removal of half of the elements for the
      split operation. This leads to a BUG_ON when rewinding.
      
      This commit removes the erroneous logging of removal of all elements.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix bad extent logging · 09a2a8f9
      Josef Bacik committed
      A user sent me a btrfs-image of a file system that was panicking on mount during
      the log recovery.  I had originally thought these problems were from a bug in
      the free space cache code, but that was just a symptom of the problem.  The
      problem is if your application does something like this
      
      [prealloc][prealloc][prealloc]
      
      the internal extent maps will merge those all together into one extent map, even
      though on disk they are 3 separate extents.  So if you go to write into one of
      these ranges the extent map will be right since we use the physical extent when
      doing the write, but when we log the extents they will use the wrong sizes for
      the remainder prealloc space.  If this doesn't happen to trip up the free space
      cache (which it won't in a lot of cases) then you will get bogus entries in your
      extent tree which will screw stuff up later.  The data and such will still work,
      but everything else is broken.  This patch fixes this by not allowing extents
      that are on the modified list to be merged.  This has the side effect that we
      are no longer adding everything to the modified list all the time, which means
      we now have to call btrfs_drop_extents every time we log an extent into the
      tree.  So this allows me to drop all this speciality code I was using to get
      around calling btrfs_drop_extents.  With this patch the testcase I've created no
      longer creates a bogus file system after replaying the log.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: add a incompatible format change for smaller metadata extent refs · 3173a18f
      Josef Bacik committed
      We currently store the first key of the tree block inside the reference for the
      tree block in the extent tree.  This takes up quite a bit of space.  Make a new
      key type for metadata which holds the level as the offset and completely removes
      storing the btrfs_tree_block_info inside the extent ref.  This reduces the size
      from 51 bytes to 33 bytes per extent reference for each tree block.  In practice
      this results in a 30-35% decrease in the size of our extent tree, which means we
      COW less and can keep more of the extent tree in memory which makes our heavy
      metadata operations go much faster.  This is not an automatic format change, you
      must enable it at mkfs time or with btrfstune.  This patch deals with having
      metadata stored as either the old format or the new format so it is easy to
      convert.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
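One accounting that reproduces the 51 and 33 byte figures quoted in this commit is shown below. The struct layouts are assumptions of this sketch, modeled on packed on-disk structures: a 24-byte extent item, an 18-byte tree block info (17-byte key plus 1-byte level), and a 9-byte inline reference. The names are simplified and `__attribute__((packed))` requires GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layouts for illustration; check them against the real headers. */
struct disk_key {
    uint64_t objectid;
    uint8_t  type;
    uint64_t offset;
} __attribute__((packed));

struct extent_item {
    uint64_t refs;
    uint64_t generation;
    uint64_t flags;
} __attribute__((packed));

/* Present only in the old format: first key of the block plus its level. */
struct tree_block_info {
    struct disk_key key;
    uint8_t level;
} __attribute__((packed));

struct extent_inline_ref {
    uint8_t  type;
    uint64_t offset;
} __attribute__((packed));

enum {
    /* Old format: extent item + tree block info + one inline ref. */
    OLD_REF_SIZE = sizeof(struct extent_item) +
                   sizeof(struct tree_block_info) +
                   sizeof(struct extent_inline_ref),
    /* New format: the level moves into the key offset, the info is dropped. */
    NEW_REF_SIZE = sizeof(struct extent_item) +
                   sizeof(struct extent_inline_ref),
};

static_assert(OLD_REF_SIZE == 51, "old metadata ref size");
static_assert(NEW_REF_SIZE == 33, "skinny metadata ref size");
```

Under these assumptions the saving is 18 bytes per tree block reference, about 35% of the old 51 bytes, which is in line with the 30-35% extent tree reduction the message reports.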
  9. 22 Mar, 2013 1 commit
    • Btrfs: fix locking on ROOT_REPLACE operations in tree mod log · d9abbf1c
      Jan Schmidt committed
      To resolve backrefs, ROOT_REPLACE operations in the tree mod log are
      required to be tied to at least one KEY_REMOVE_WHILE_FREEING operation.
      Therefore, those operations must be enclosed by tree_mod_log_write_lock()
      and tree_mod_log_write_unlock() calls.
      
      Those calls are private to the tree_mod_log_* functions, which means that
      removal of the elements of an old root node must be logged from
      tree_mod_log_insert_root. This partly reverts and corrects commit ba1bfbd5
      (Btrfs: fix a tree mod logging issue for root replacement operations).
      
      This fixes the brand-new version of xfstest 276 as of commit cfe73f71.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
  10. 21 Feb, 2013 2 commits
  11. 15 Feb, 2013 1 commit
  12. 19 Dec, 2012 2 commits
  13. 17 Dec, 2012 5 commits
  14. 13 Dec, 2012 2 commits
  15. 12 Dec, 2012 4 commits
  16. 26 Oct, 2012 1 commit