1. 01 Sep 2013 (1 commit)
  2. 02 Jul 2013 (2 commits)
    • Btrfs: check if we can nocow if we don't have data space · 7ee9e440
      Committed by Josef Bacik
      We always just try to reserve data space when we write, but if we are out of
      space yet have prealloc'ed extents, the write should still succeed.  This
      patch checks whether we can write into prealloc'ed space and, if so, allows
      the write to continue.  With this patch we now pass xfstests
      generic/274.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      7ee9e440
    • Btrfs: use a percpu to keep track of possibly pinned bytes · b150a4f1
      Committed by Josef Bacik
      There are all of these checks in the ENOSPC code to see if committing the
      transaction would free up enough space to make the allocation.  This is because
      early on we just committed the transaction and hoped and prayed, which resulted
      in cases where it took _forever_ to get an ENOSPC when we really were out of
      space.  So we check space_info->bytes_pinned, except this isn't completely
      accurate because it doesn't account for space we may free that is still stuck
      in delayed refs.  So tests like xfstests 226 would fail because we wouldn't
      commit the transaction to free up the data space.  Instead, add a percpu
      counter that is a little fuzzier: it adds bytes as soon as we try to free up
      the space, and removes any space it doesn't actually free up when we get
      around to doing the actual free.  We then zero out this counter every
      transaction period so we have a better idea of how much space we will actually
      free up by committing this transaction.  With this patch we now pass
      xfstests 226.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      b150a4f1
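      A minimal user-space sketch of the counting scheme described above; the kernel
      side uses a percpu counter in the space_info, while here a plain 64-bit
      counter and made-up helper names stand in for it.

        #include <inttypes.h>
        #include <stdint.h>
        #include <stdio.h>

        static int64_t maybe_pinned;   /* fuzzy "bytes we may free this transaction" */

        /* add bytes as soon as we queue a free (e.g. drop a ref) */
        static void queue_free(uint64_t bytes)
        {
                maybe_pinned += (int64_t)bytes;
        }

        /* when the delayed ref actually runs, take back whatever was not freed */
        static void ran_delayed_ref(uint64_t queued, uint64_t actually_freed)
        {
                maybe_pinned -= (int64_t)(queued - actually_freed);
        }

        /* zeroed every transaction period, so the estimate cannot drift */
        static void transaction_committed(void)
        {
                maybe_pinned = 0;
        }

        int main(void)
        {
                queue_free(1 << 20);                  /* think we can free 1 MiB */
                ran_delayed_ref(1 << 20, 512 << 10);  /* only 512 KiB came back  */
                /* the ENOSPC path would compare this estimate against the shortfall */
                printf("fuzzy pinned estimate: %" PRId64 " bytes\n", maybe_pinned);
                transaction_committed();
                return 0;
        }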
  3. 01 Jul 2013 (1 commit)
    • Btrfs: fix transaction throttling for delayed refs · 1be41b78
      Committed by Josef Bacik
      Dave has this fs_mark script that can make btrfs abort with a sufficient
      amount of RAM.  This is because with more RAM we can keep more dirty metadata
      in cache, which in a roundabout way makes for many more pending delayed refs.
      What happens is we end up not throttling the transaction enough, so when we go
      to commit the transaction when we've completely filled the file system we'll
      abort() because we use all of the space in the global reserve and we still have
      delayed refs to run.  To fix this we need to make the delayed ref flushing and
      the transaction throttling dependent upon the number of delayed refs that we
      have instead of how much reserved space is left in the global reserve.  With
      this patch we not only stop aborting transactions, we also get a smoother run
      speed with fs_mark, and it makes us about 10% faster.  Thanks,
      Reported-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      1be41b78
  4. 14 Jun 2013 (12 commits)
    • Btrfs: exclude logged extents before replying when we are mixed · 8c2a1a30
      Committed by Josef Bacik
      With non-mixed block groups we replay the logs before we're allowed to do any
      writes, so we get away with not pinning/removing the data extents until right
      when we replay them.  However with mixed block groups we allocate out of the
      same pool, so we could easily allocate a metadata block that was logged in our
      tree log.  To deal with this we just need to notice that we have mixed block
      groups and do the normal excluding/removal dance during the pin stage of the log
      replay, so that we don't allocate metadata blocks from areas where we have
      logged data extents.  With this patch we now pass xfstests generic/311 with
      mixed block groups turned on.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      8c2a1a30
    • Btrfs: fix qgroup rescan resume on mount · b382a324
      Committed by Jan Schmidt
      When called during mount, we cannot start the rescan worker thread until
      open_ctree is done.  This commit restructures the qgroup rescan internals to
      enable a clean deferral of the rescan resume operation.
      
      First of all, the struct qgroup_rescan is removed, saving us a malloc and
      some initialization synchronization problems.  Its only element (the worker
      struct) now lives within fs_info, just as the rest of the rescan code does.
      
      Then setting up a rescan worker is split into several reusable stages.
      Currently we have three different rescan startup scenarios:
      	(A) rescan ioctl
      	(B) rescan resume by mount
      	(C) rescan by quota enable
      
      Each case needs its own combination of the four following steps:
      	(1) set the progress [A, C: zero; B: state of umount]
      	(2) commit the transaction [A]
      	(3) set the counters [A, C: zero; B: state of umount]
      	(4) start worker [A, B, C]
      
      qgroup_rescan_init does step (1).  There's no extra function added to commit
      a transaction; we've got that already.  qgroup_rescan_zero_tracking does
      step (3).  Step (4) is nothing more than a call to the generic
      btrfs_queue_worker.
      
      We also get rid of a double check for the rescan progress during
      btrfs_qgroup_account_ref, which is no longer required due to having step 2
      from the list above.
      
      As a side effect, this commit prepares to move the rescan start code from
      btrfs_run_qgroups (which is run during commit) to a less time critical
      section.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      b382a324
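      A small sketch of how the three startup scenarios above combine the four
      steps.  The step_* functions are illustrative stand-ins (only the combination
      and order matter), not the btrfs functions themselves.

        #include <stdio.h>

        static void step1_init_progress(const char *from) { printf("(1) progress from %s\n", from); }
        static void step2_commit_transaction(void)        { printf("(2) commit transaction\n"); }
        static void step3_set_counters(const char *from)  { printf("(3) counters from %s\n", from); }
        static void step4_start_worker(void)              { printf("(4) queue rescan worker\n"); }

        static void rescan_ioctl(void)           /* (A): steps 1, 2, 3, 4 */
        {
                step1_init_progress("zero");
                step2_commit_transaction();
                step3_set_counters("zero");
                step4_start_worker();
        }

        static void rescan_resume_on_mount(void) /* (B): steps 1, 3, 4 (umount state) */
        {
                step1_init_progress("umount state");
                step3_set_counters("umount state");
                step4_start_worker();
        }

        static void rescan_on_quota_enable(void) /* (C): steps 1, 3, 4 (zeroed state) */
        {
                step1_init_progress("zero");
                step3_set_counters("zero");
                step4_start_worker();
        }

        int main(void)
        {
                rescan_ioctl();
                rescan_resume_on_mount();
                rescan_on_quota_enable();
                return 0;
        }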
    • Btrfs: simplify unlink reservations · d52be818
      Committed by Josef Bacik
      Dave pointed out a problem where if you filled up a file system as much as
      possible you couldn't remove any files.  The whole unlink reservation thing is
      convoluted because it tries to guess if it's going to add space to unlink
      something or not, and has all these odd uncommented cases where it simply does
      not try.  So to fix this I've added a way to conditionally steal from the global
      reserve if we can't make our normal reservation.  If we have more than half the
      space in the global reserve free we will go ahead and steal from the global
      reserve.  With this patch Dave's reproducer now works and I can rm all the files
      on the file system.  Thanks,
      Reported-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      d52be818
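      A stand-alone sketch of the "steal only if more than half of the global
      reserve is free" rule described above; the struct is a simplified model for
      illustration, not the btrfs block_rsv.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        struct reserve { uint64_t size; uint64_t used; };   /* simplified model */

        static bool may_steal_for_unlink(const struct reserve *global, uint64_t need)
        {
                uint64_t free_bytes = global->size - global->used;

                /* fall back to the global reserve only while most of it is unused */
                return free_bytes > global->size / 2 && free_bytes >= need;
        }

        int main(void)
        {
                struct reserve global = { .size = 64 << 20, .used = 20 << 20 };

                printf("steal 1 MiB: %s\n",
                       may_steal_for_unlink(&global, 1 << 20) ? "yes" : "no");
                return 0;
        }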
    • Btrfs: make the state of the transaction more readable · 4a9d8bde
      Committed by Miao Xie
      We used three variables to track the state of the transaction, which was
      complex and wasted memory.  Besides that, it was hard to understand which
      types of transaction handles should be blocked in each transaction state, so
      developers often made mistakes.

      This patch improves that.  We define 6 states for the transaction,
        enum btrfs_trans_state {
      	TRANS_STATE_RUNNING		= 0,
      	TRANS_STATE_BLOCKED		= 1,
      	TRANS_STATE_COMMIT_START	= 2,
      	TRANS_STATE_COMMIT_DOING	= 3,
      	TRANS_STATE_UNBLOCKED		= 4,
      	TRANS_STATE_COMPLETED		= 5,
      	TRANS_STATE_MAX			= 6,
        }
      and use just one variable to track the state.

      In order to make the blocked handle types for each state clearer,
      we introduce an array:
        unsigned int btrfs_blocked_trans_types[TRANS_STATE_MAX] = {
      	[TRANS_STATE_RUNNING]		= 0U,
      	[TRANS_STATE_BLOCKED]		= (__TRANS_USERSPACE |
      					   __TRANS_START),
      	[TRANS_STATE_COMMIT_START]	= (__TRANS_USERSPACE |
      					   __TRANS_START |
      					   __TRANS_ATTACH),
      	[TRANS_STATE_COMMIT_DOING]	= (__TRANS_USERSPACE |
      					   __TRANS_START |
      					   __TRANS_ATTACH |
      					   __TRANS_JOIN),
      	[TRANS_STATE_UNBLOCKED]		= (__TRANS_USERSPACE |
      					   __TRANS_START |
      					   __TRANS_ATTACH |
      					   __TRANS_JOIN |
      					   __TRANS_JOIN_NOLOCK),
      	[TRANS_STATE_COMPLETED]		= (__TRANS_USERSPACE |
      					   __TRANS_START |
      					   __TRANS_ATTACH |
      					   __TRANS_JOIN |
      					   __TRANS_JOIN_NOLOCK),
        }
      This is far more intuitive.
      
      Besides that, because ->in_commit is removed from the transaction structure,
      the ->commit_lock that was used to protect it is unnecessary, so remove
      ->commit_lock as well.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      4a9d8bde
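      Roughly how the table above would be consulted (a sketch, not the helper the
      patch actually adds): a handle type is refused while its bit is set in the
      mask for the current transaction state.

        static inline int trans_type_blocked(enum btrfs_trans_state state,
                                             unsigned int type)
        {
                return (btrfs_blocked_trans_types[state] & type) != 0;
        }

        /* e.g. __TRANS_JOIN is still allowed in TRANS_STATE_COMMIT_START but
         * refused once the transaction reaches TRANS_STATE_COMMIT_DOING. */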
    • Btrfs: introduce per-subvolume ordered extent list · 199c2a9c
      Committed by Miao Xie
      The reason we introduce a per-subvolume ordered extent list is the same
      as for the per-subvolume delalloc inode list.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      199c2a9c
    • Btrfs: introduce per-subvolume delalloc inode list · eb73c1b7
      Committed by Miao Xie
      When we create a snapshot, we currently have to flush all delalloc inodes in
      the fs, while just flushing the inodes in the source tree would be enough.  So
      we introduce a per-subvolume delalloc inode list.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      eb73c1b7
    • Btrfs: introduce grab/put functions for the root of the fs/file tree · b0feb9d9
      Committed by Miao Xie
      The grab/put functions will be used in the next patch, which needs to grab
      the root object and ensure it is not freed.  We use a reference counter
      instead of the srcu lock to avoid blocking the memory reclaim task,
      which invokes synchronize_srcu().
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      b0feb9d9
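      A user-space sketch of the grab/put idea: a plain reference count keeps the
      root object alive without taking the srcu lock.  Function names here are
      illustrative, not the btrfs helpers the patch adds.

        #include <stdatomic.h>
        #include <stdlib.h>

        struct root {
                atomic_int refs;
                /* ... the actual root fields ... */
        };

        static struct root *grab_root(struct root *root)
        {
                if (root)
                        atomic_fetch_add(&root->refs, 1);   /* pin it */
                return root;
        }

        static void put_root(struct root *root)
        {
                /* free only when the last reference is dropped */
                if (root && atomic_fetch_sub(&root->refs, 1) == 1)
                        free(root);
        }

        int main(void)
        {
                struct root *root = calloc(1, sizeof(*root));

                atomic_store(&root->refs, 1);   /* creator's reference */
                grab_root(root);                /* e.g. a reclaim-safe user */
                put_root(root);
                put_root(root);                 /* last put frees it */
                return 0;
        }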
    • Btrfs: cleanup the similar code of the fs root read · cb517eab
      Committed by Miao Xie
      There are several functions whose code is similar, such as
        btrfs_find_last_root()
        btrfs_read_fs_root_no_radix()
      
      Besides that, some functions are invoked twice, which is unnecessary; for
      example, we are sure that all roots found in
        btrfs_find_orphan_roots()
      have their orphan items, so it is unnecessary to check the orphan
      item again.

      So clean this up.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      cb517eab
    • Btrfs: make the snap/subv deletion end more early when the fs is R/O · babbf170
      Committed by Miao Xie
      The snapshot/subvolume deletion might take a long time, which would make the
      remount task wait for a long time.  This patch improves on that: we break off
      the deletion if the fs is remounted to be R/O.  It will make the users happy.

      Cc: David Sterba <dsterba@suse.cz>
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      babbf170
    • Minor format cleanup. · 1c89cdd1
      Committed by Andreas Philipp
      Clean up the format of the definitions of BTRFS_BLOCK_GROUP_RAID5 and
      BTRFS_BLOCK_GROUP_RAID6.
      Signed-off-by: Andreas Philipp <philipp.andreas@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      1c89cdd1
    • Btrfs: add ioctl to wait for qgroup rescan completion · 57254b6e
      Committed by Jan Schmidt
      btrfs_qgroup_wait_for_completion waits until the currently running qgroup
      operation completes. It returns immediately when no rescan process is in
      progress. This is useful to automate things around the rescan process (e.g.
      testing).
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      57254b6e
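      For example, a test harness could use the new ioctl from user space roughly
      like this (assuming kernel headers that already export
      BTRFS_IOC_QUOTA_RESCAN_WAIT):

        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/ioctl.h>
        #include <unistd.h>
        #include <linux/btrfs.h>

        int main(int argc, char **argv)
        {
                const char *path = argc > 1 ? argv[1] : "/mnt";
                int fd = open(path, O_RDONLY);

                if (fd < 0) {
                        perror(path);
                        return 1;
                }

                /* blocks until a running rescan finishes, returns at once otherwise */
                if (ioctl(fd, BTRFS_IOC_QUOTA_RESCAN_WAIT) < 0)
                        perror("BTRFS_IOC_QUOTA_RESCAN_WAIT");

                close(fd);
                return 0;
        }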
    • Btrfs: introduce qgroup_ulist to avoid frequently allocating/freeing ulist · 1e8f9158
      Committed by Wang Shilong
      When doing qgroup accounting, we call ulist_alloc()/ulist_free() every time
      we want to walk the qgroup tree.

      By introducing 'qgroup_ulist', we only need to call ulist_alloc()/ulist_free()
      once.  This reduces the sys time spent allocating memory; see the measurements
      below:
      
      fsstress -p 4 -n 10000 -d $dir
      
      With this patch:
      
      real    0m50.153s
      user    0m0.081s
      sys     0m6.294s
      
      real    0m51.113s
      user    0m0.092s
      sys     0m6.220s
      
      real    0m52.610s
      user    0m0.096s
      sys     0m6.125s	avg 6.213
      -----------------------------------------------------
      Without the patch:
      
      real    0m54.825s
      user    0m0.061s
      sys     0m10.665s
      
      real    1m6.401s
      user    0m0.089s
      sys     0m11.218s
      
      real    1m13.768s
      user    0m0.087s
      sys     0m10.665s       avg 10.849
      
      We can see the sys time is reduced by ~43%.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      1e8f9158
  5. 18 May 2013 (2 commits)
    • Btrfs: handle running extent ops with skinny metadata · b1c79e09
      Committed by Josef Bacik
      Chris hit a bug where we weren't finding extent records when running extent ops.
      This is because we use the delayed_ref_head when running the extent op, which
      means we can't use the ->type checks to see if we are metadata.  We also lose
      the level of the metadata we are working on.  So to fix this we can just check
      the ->is_data section of the extent_op, and we can store the level of the buffer
      we were modifying in the extent_op.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      b1c79e09
    • btrfs: annotate quota tree for lockdep · 60b62978
      Committed by David Sterba
      The quota tree has been missing from lockdep annotations, though no warning
      has been seen in the wild.

      There's currently one entry that does not belong there,
      BTRFS_ORPHAN_OBJECTID.  No such tree exists; it's probably a copy &
      paste mistake, as the id is defined among the tree ids.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      60b62978
  6. 07 May 2013 (14 commits)
    • btrfs: enhance superblock checks · 1104a885
      Committed by David Sterba
      The superblock checksum is not verified upon mount. <awkward silence>
      
      Add that check and also reorder existing checks to a more logical
      order.
      
      Current mkfs.btrfs does not calculate the correct checksum of the
      super_block, and thus a freshly created filesystem will fail to mount when
      this patch is applied.
      
      First transaction commit calculates correct superblock checksum and
      saves it to disk.
      
      Reproducer:
      $ mkfs.btrfs /dev/sda
      $ mount /dev/sda /mnt
      $ btrfs scrub start /mnt
      $ sleep 5
      $ btrfs scrub status /mnt
      ... super:2 ...
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      1104a885
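      A hypothetical user-space checker for the situation described above, assuming
      the default crc32c checksum type: read the primary superblock at 64 KiB and
      compare the stored checksum with a CRC-32C computed over everything after the
      32-byte csum field.  Offsets and sizes reflect my understanding of the on-disk
      format; double-check them before relying on this.

        #include <fcntl.h>
        #include <stddef.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <unistd.h>

        #define SB_OFFSET 65536         /* primary superblock location */
        #define SB_SIZE   4096
        #define CSUM_SIZE 32            /* csum field at the very start */

        static uint32_t crc32c(const uint8_t *p, size_t len)
        {
                uint32_t crc = 0xFFFFFFFFu;

                while (len--) {
                        crc ^= *p++;
                        for (int i = 0; i < 8; i++)
                                crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
                }
                return ~crc;
        }

        int main(int argc, char **argv)
        {
                uint8_t sb[SB_SIZE];
                uint32_t stored, computed;
                int fd;

                if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0 ||
                    pread(fd, sb, SB_SIZE, SB_OFFSET) != SB_SIZE) {
                        fprintf(stderr, "usage: %s <device>\n", argv[0]);
                        return 1;
                }
                /* stored checksum is a little-endian u32 in csum[0..3] */
                stored   = sb[0] | sb[1] << 8 | sb[2] << 16 | (uint32_t)sb[3] << 24;
                computed = crc32c(sb + CSUM_SIZE, SB_SIZE - CSUM_SIZE);
                printf("stored 0x%08x, computed 0x%08x: %s\n", stored, computed,
                       stored == computed ? "OK" : "MISMATCH");
                close(fd);
                return 0;
        }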
    • btrfs: fix misleading variable name for flags · b6919a58
      Committed by David Sterba
      The variable was named 'data' in btrfs_reserve_extent and that's the
      only function that actually uses it to let btrfs_get_alloc_profile know
      what profile we want. Then it's passed down as u64 flags.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      b6919a58
    • btrfs: make static code static & remove dead code · 48a3b636
      Committed by Eric Sandeen
      Big patch, but all it does is add statics to functions which
      are in fact static, then remove the associated dead-code fallout.
      
      removed functions:
      
      btrfs_iref_to_path()
      __btrfs_lookup_delayed_deletion_item()
      __btrfs_search_delayed_insertion_item()
      __btrfs_search_delayed_deletion_item()
      find_eb_for_page()
      btrfs_find_block_group()
      range_straddles_pages()
      extent_range_uptodate()
      btrfs_file_extent_length()
      btrfs_scrub_cancel_devid()
      btrfs_start_transaction_lflush()
      
      btrfs_print_tree() is left because it is used for debugging.
      btrfs_start_transaction_lflush() and btrfs_reada_detach() are
      left for symmetry.
      
      ulist.c functions are left, another patch will take care of those.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      48a3b636
    • Btrfs: rescan for qgroups · 2f232036
      Committed by Jan Schmidt
      If qgroup tracking is out of sync, a rescan operation can be started. It
      iterates the complete extent tree and recalculates all qgroup tracking data.
      This is an expensive operation and should not be used unless required.
      
      A filesystem under rescan can still be unmounted.  The rescan continues on the
      next mount.  Status information is provided with a separate ioctl while a
      rescan operation is in progress.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      2f232036
    • Btrfs: separate sequence numbers for delayed ref tracking and tree mod log · fc36ed7e
      Committed by Jan Schmidt
      Sequence numbers for delayed refs have been introduced in the first version
      of the qgroup patch set. To solve the problem of find_all_roots on a busy
      file system, the tree mod log was introduced. The sequence numbers for that
      were simply shared between those two users.
      
      However, at one point in qgroup's quota accounting, there's a statement
      accessing the previous sequence number that still just does (seq - 1),
      just as it had to in the very first version.
      
      To satisfy that requirement, this patch makes the sequence number counter 64
      bit and splits it into a major part (used for qgroup sequence number
      counting) and a minor part (incremented for each tree modification in the
      log). This enables us to go exactly one major step backwards, as required
      for qgroups, while still incrementing the sequence counter for tree mod log
      insertions to keep track of their order. Keeping them in a single variable
      means there's no need to change all the code dealing with comparisons of two
      sequence numbers.
      
      The sequence number is reset to 0 on commit (not new in this patch), which
      ensures we won't overflow the two 32 bit counters.
      
      Without this fix, the qgroup tracking can occasionally go wrong and WARN_ONs
      from the tree mod log code may happen.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      fc36ed7e
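      A sketch of the split described above: the upper 32 bits serve as the qgroup
      ("major") sequence, the lower 32 bits count tree mod log insertions, and
      keeping both in one u64 preserves ordinary comparisons.  Helper names are
      illustrative.

        #include <stdint.h>
        #include <stdio.h>

        #define SEQ_MAJOR_ONE  (1ULL << 32)
        #define SEQ_MAJOR_MASK 0xFFFFFFFF00000000ULL

        /* qgroup accounting: advance to the next major step (minor restarts at 0) */
        static uint64_t seq_next_major(uint64_t seq)
        {
                return (seq & SEQ_MAJOR_MASK) + SEQ_MAJOR_ONE;
        }

        /* tree mod log: one more insertion, keeps ordering within the major step */
        static uint64_t seq_inc_minor(uint64_t seq)
        {
                return seq + 1;
        }

        /* "exactly one major step backwards", as the accounting code needs */
        static uint64_t seq_prev_major(uint64_t seq)
        {
                return (seq & SEQ_MAJOR_MASK) - SEQ_MAJOR_ONE;
        }

        int main(void)
        {
                uint64_t seq = seq_next_major(0);   /* first major step     */

                seq = seq_inc_minor(seq);           /* two tree mods logged */
                seq = seq_inc_minor(seq);
                printf("seq=%#llx prev major=%#llx\n",
                       (unsigned long long)seq,
                       (unsigned long long)seq_prev_major(seq));
                return 0;
        }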
    • Btrfs: remove unused argument of btrfs_extend_item() · 4b90c680
      Committed by Tsutomu Itoh
      Argument 'trans' is not used in btrfs_extend_item().
      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      4b90c680
    • Btrfs: cleanup of function where fixup_low_keys() is called · afe5fea7
      Committed by Tsutomu Itoh
      If the argument 'trans' is unnecessary in a function where fixup_low_keys()
      is called, 'trans' is deleted from that function.
      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      afe5fea7
    • Btrfs: use a lock to protect incompat/compat flag of the super block · ceda0864
      Committed by Miao Xie
      In the following case, an update to the incompat/compat flags of the super
      block can be lost:
       Task1					|Task2
       flags = btrfs_super_incompat_flags();	|
      					|flags = btrfs_super_incompat_flags();
       flags |= new_flag1;			|
      					|flags |= new_flag2;
       btrfs_set_super_incompat_flags(flags);	|
      					|btrfs_set_super_incompat_flags(flags);
      Here the update adding new_flag1 is lost.
      
      In order to avoid this problem, we introduce a lock named super_lock into
      the btrfs_fs_info structure. If we want to update incompat/compat flags
      of the super block, we must hold it.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      ceda0864
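      A user-space sketch of the fix: the read-modify-write of the flags has to
      happen as one unit under a lock so a concurrent updater cannot overwrite it.
      A pthread mutex stands in for the new super_lock.

        #include <pthread.h>
        #include <stdint.h>

        static pthread_mutex_t super_lock = PTHREAD_MUTEX_INITIALIZER;
        static uint64_t incompat_flags;

        static void set_incompat_flag(uint64_t new_flag)
        {
                pthread_mutex_lock(&super_lock);
                incompat_flags |= new_flag;     /* read, modify, write as one unit */
                pthread_mutex_unlock(&super_lock);
        }

        int main(void)
        {
                /* two tasks doing this concurrently can no longer lose a flag */
                set_incompat_flag(1ULL << 0);   /* stand-in for new_flag1 */
                set_incompat_flag(1ULL << 1);   /* stand-in for new_flag2 */
                return 0;
        }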
    • Btrfs: introduce a mutex lock for btrfs quota operations · f2f6ed3d
      Committed by Wang Shilong
      The original code has one spinlock, 'qgroup_lock', to protect quota
      configurations in memory.  If we want to add a BTRFS_QGROUP_INFO_KEY, it is
      first added to the btree and then the in-memory configuration is updated;
      however, a race condition may happen between these operations.
      For example:
      	->add_qgroup_info_item()
      		->add_qgroup_rb()
      
      For the above case, del_qgroup_info_item() may happen just before
      add_qgroup_rb().
      
      What's worse, when we want to add a qgroup relation:
      	->add_qgroup_relation_item()
      		->add_qgroup_relations()
      
      We don't check whether 'src' and 'dst' exist before
      add_qgroup_relation_item(), so a race condition can also happen in
      the above case.

      To avoid these race conditions and perform all the necessary checks, we
      introduce a mutex, 'qgroup_ioctl_lock', and make all the user change
      operations take it.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      f2f6ed3d
    • Btrfs: fix bad extent logging · 09a2a8f9
      Committed by Josef Bacik
      A user sent me a btrfs-image of a file system that was panicking on mount during
      the log recovery.  I had originally thought these problems were from a bug in
      the free space cache code, but that was just a symptom of the problem.  The
      problem is if your application does something like this
      
      [prealloc][prealloc][prealloc]
      
      the internal extent maps will merge those all together into one extent map, even
      though on disk they are 3 separate extents.  So if you go to write into one of
      these ranges the extent map will be right since we use the physical extent when
      doing the write, but when we log the extents they will use the wrong sizes for
      the remainder prealloc space.  If this doesn't happen to trip up the free space
      cache (which it won't in a lot of cases) then you will get bogus entries in your
      extent tree which will screw stuff up later.  The data and such will still work,
      but everything else is broken.  This patch fixes this by not allowing extents
      that are on the modified list to be merged.  This has the side effect that we
      are no longer adding everything to the modified list all the time, which means
      we now have to call btrfs_drop_extents every time we log an extent into the
      tree.  So this allows me to drop all this speciality code I was using to get
      around calling btrfs_drop_extents.  With this patch the testcase I've created no
      longer creates a bogus file system after replaying the log.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      09a2a8f9
    • Btrfs: Include the device in most error printk()s · c2cf52eb
      Committed by Simon Kirby
      With more than one btrfs volume mounted, it can be very difficult to find
      out which volume is hitting an error. btrfs_error() will print this, but
      it is currently rigged as more of a fatal error handler, while many of
      the printk()s are currently for debugging and yet-unhandled cases.
      
      This patch just changes the functions where the device information is
      already available. Some cases remain where the root or fs_info is not
      passed to the function emitting the error.
      
      This may introduce some confusion with volumes backed by multiple devices
      emitting errors referring to the primary device in the set instead of the
      one on which the error occurred.
      
      Use btrfs_printk(fs_info, format, ...) rather than writing the device
      string every time, and introduce macro wrappers in the style of XFS for
      brevity.  Since the function already cannot be used for continuations, print
      a newline as part of the btrfs_printk() message rather than at each caller.
      Signed-off-by: Simon Kirby <sim@hostway.ca>
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      c2cf52eb
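      A user-space sketch of the wrapper idea: one printf-style helper takes the fs
      context so every message carries the device name, plus short per-level macros
      in the XFS style.  Names and message layout here are illustrative, not the
      kernel's.

        #include <stdarg.h>
        #include <stdio.h>

        struct fs_info { const char *device_name; };

        static void fs_printk(const struct fs_info *fs, const char *level,
                              const char *fmt, ...)
        {
                va_list ap;

                fprintf(stderr, "BTRFS %s (device %s): ", level, fs->device_name);
                va_start(ap, fmt);
                vfprintf(stderr, fmt, ap);
                va_end(ap);
                fputc('\n', stderr);    /* newline added once, not at each caller */
        }

        #define fs_err(fs, fmt, ...)  fs_printk(fs, "error", fmt, ##__VA_ARGS__)
        #define fs_warn(fs, fmt, ...) fs_printk(fs, "warning", fmt, ##__VA_ARGS__)

        int main(void)
        {
                struct fs_info fs = { .device_name = "sdb1" };

                fs_err(&fs, "csum mismatch on logical %llu", 123456789ULL);
                return 0;
        }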
    • Btrfs: add a incompatible format change for smaller metadata extent refs · 3173a18f
      Committed by Josef Bacik
      We currently store the first key of the tree block inside the reference for the
      tree block in the extent tree.  This takes up quite a bit of space.  Make a new
      key type for metadata which holds the level as the offset and completely removes
      storing the btrfs_tree_block_info inside the extent ref.  This reduces the size
      from 51 bytes to 33 bytes per extent reference for each tree block.  In practice
      this results in a 30-35% decrease in the size of our extent tree, which means we
      COW less and can keep more of the extent tree in memory which makes our heavy
      metadata operations go much faster.  This is not an automatic format change;
      you must enable it at mkfs time or with btrfstune.  This patch deals with
      having metadata stored as either the old format or the new format so it is
      easy to convert.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      3173a18f
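      Where the 51 -> 33 bytes quoted above plausibly comes from (my own breakdown,
      not spelled out in the commit):

        old ref:  btrfs_extent_item (24) + btrfs_tree_block_info (17-byte key + 1-byte level)
                  + inline backref (1-byte type + 8-byte offset)         = 51 bytes
        new ref:  btrfs_extent_item (24) + inline backref (1 + 8)        = 33 bytes

      That is, the skinny format drops the embedded first key and level and instead
      encodes the level in the offset of the new metadata key type.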
    • btrfs: clean up transaction abort messages · 08748810
      Committed by David Sterba
      The transaction abort stacktrace is printed only once per module
      lifetime, but we'd like to see it each time it happens per mounted
      filesystem.  Introduce a fs_state flag that records it.
      
      Tweak the messages around abort:
      * add error number to the first abort
      * print the exact negative errno from btrfs_decode_error
      * clean up btrfs_decode_error and callers
      * no dots at the end of the messages
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      08748810
  7. 01 Mar 2013 (1 commit)
  8. 21 Feb 2013 (7 commits)
    • Btrfs: fix remount vs autodefrag · dc81cdc5
      Committed by Miao Xie
      If we remount the fs to turn off auto defragment, or remount the fs R/O,
      we should stop the auto defragment.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      dc81cdc5
    • btrfs: define BTRFS_MAGIC as a u64 value · cdb4c574
      Committed by Zach Brown
      super.magic is an le64 but it's treated as an unterminated string when
      compared against BTRFS_MAGIC which is defined as a string.  Instead
      define BTRFS_MAGIC as a normal hex value and use endian helpers to
      compare it to the super's magic.
      
      I tested this by mounting an fs made before the change and made sure
      that it didn't introduce sparse errors.  This matches a similar cleanup
      that is pending in btrfs-progs.  David Sterba pointed out that we should
      fix the kernel side as well :).
      Signed-off-by: Zach Brown <zab@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      cdb4c574
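      A sketch of the comparison after this change: decode the on-disk le64 and
      compare it numerically instead of memcmp()ing a string.  The constant is the
      value I believe the patch defines ("_BHRfS_M" read as a little-endian u64);
      verify it against the tree you actually build.

        #include <stdint.h>
        #include <stdio.h>

        #define BTRFS_MAGIC 0x4D5F53665248425FULL   /* "_BHRfS_M" as an LE u64 */

        static uint64_t get_le64(const uint8_t *p)
        {
                uint64_t v = 0;

                for (int i = 7; i >= 0; i--)
                        v = (v << 8) | p[i];
                return v;
        }

        int main(void)
        {
                const uint8_t disk_magic[] = "_BHRfS_M";   /* as stored in the super */

                printf("magic %s\n",
                       get_le64(disk_magic) == BTRFS_MAGIC ? "ok" : "bad");
                return 0;
        }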
    • Btrfs: place ordered operations on a per transaction list · 569e0f35
      Committed by Josef Bacik
      Miao made the ordered operations stuff run async, which introduced a
      deadlock where we could get somebody (sync) racing in and committing the
      transaction while a commit was already happening.  The new committer would
      try and flush ordered operations which would hang waiting for the commit to
      finish because it is done asynchronously and no longer inherits the caller's
      trans handle.  To fix this we need to make the ordered operations list a per
      transaction list.  We can get new inodes added to the ordered operation list
      by truncating them and then having another process writing to them, so this
      makes it so that anybody trying to add an ordered operation _must_ start a
      transaction in order to add itself to the list, which will keep new inodes
      from getting added to the ordered operations list after we start committing.
      This should fix the deadlock and also keeps us from doing a lot more work
      than we need to during commit.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      569e0f35
    • btrfs: add cancellation points to defrag · 210549eb
      Committed by David Sterba
      The defrag operation can take a very long time, so we want a way to cancel
      it.  The code checks for a pending signal at safe points in the defrag loops
      and returns EAGAIN.  This means a user can press ^C after running
      'btrfs fi defrag'; this works for both defrag modes, files and root.
      
      Returning from the command was instant in my light tests, but may take
      longer depending on the aging factor of the filesystem.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      210549eb
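      A user-space sketch of a cancellation point like the ones described above: at
      a safe spot in the loop, bail out with EAGAIN once the user has hit ^C.  The
      kernel code checks signal_pending() instead of installing a handler.

        #include <errno.h>
        #include <signal.h>
        #include <stdio.h>

        static volatile sig_atomic_t interrupted;

        static void on_sigint(int sig)
        {
                (void)sig;
                interrupted = 1;
        }

        static int defrag_range(int clusters)
        {
                for (int i = 0; i < clusters; i++) {
                        /* ... defragment one cluster of extents ... */
                        if (interrupted)
                                return -EAGAIN;   /* safe point: stop, caller may retry */
                }
                return 0;
        }

        int main(void)
        {
                signal(SIGINT, on_sigint);
                printf("defrag returned %d\n", defrag_range(1000));
                return 0;
        }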
    • Btrfs: steal from global reserve if we are cleaning up orphans · 5d80366e
      Committed by Josef Bacik
      Sometimes xfstest 83 will fail to remount the scratch device because we've
      gotten ourselves so full that we cannot cleanup the orphan items.  In this
      case, check to see if we're doing the orphan cleanup and, if we are, allow us
      to steal our reservation from the global block rsv.  With this patch I've
      not been able to reproduce the failed mount problem.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      5d80366e
    • btrfs: remove cache only arguments from defrag path · de78b51a
      Committed by Eric Sandeen
      The entry point at the defrag ioctl always sets "cache only" to 0;
      the codepaths haven't run for a long time as far as I can
      tell.  Chris says they're dead code, so remove them.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      de78b51a
    • btrfs: handle null fs_info in btrfs_panic() · aa43a17c
      Committed by Eric Sandeen
      At least backref_tree_panic() can apparently pass
      in a null fs_info, so handle that in __btrfs_panic
      to get the message out on the console.
      
      The btrfs_panic macro also uses fs_info, but that's
      largely pointless; it's testing to see if
      BTRFS_MOUNT_PANIC_ON_FATAL_ERROR is not set.
      But if it *were* set, __btrfs_panic() would have,
      well, panicked and we wouldn't be here, testing it!
      So just BUG() at this point.
      
      And since we only use fs_info once now, just use it
      directly.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      aa43a17c