1. 02 10月, 2012 8 次提交
    • L
      Btrfs: use larger limit for translation of logical to inode · 425d17a2
      Liu Bo 提交于
      This is the change of the kernel side.
      
      Translation of logical to inode used to have an upper limit 4k on
      inode container's size, but the limit is not large enough for a data
      with a great many of refs, so when resolving logical address,
      we can end up with
      "ioctl ret=0, bytes_left=0, bytes_missing=19944, cnt=510, missed=2493"
      
      This changes to regard 64k as the upper limit and use vmalloc instead of
      kmalloc to get memory more easily.
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      425d17a2
    • L
      Btrfs: use helper for logical resolve · df031f07
      Liu Bo 提交于
      We already have a helper, iterate_inodes_from_logical(), for logical resolve,
      so just use it.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      df031f07
    • L
      Btrfs: fix a bug in parsing return value in logical resolve · 69917e43
      Liu Bo 提交于
      In logical resolve, we parse extent_from_logical()'s 'ret' as a kind of flag.
      
      It is possible to lose our errors because
      (-EXXXX & BTRFS_EXTENT_FLAG_TREE_BLOCK) is true.
      
      I'm not sure if it is on purpose, it just looks too hacky if it is.
      I'd rather use a real flag and a 'ret' to catch errors.
      Acked-by: NJan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: NLiu Bo <liub.liubo@gmail.com>
      69917e43
    • L
      Btrfs: use flag EXTENT_DEFRAG for snapshot-aware defrag · 9e8a4a8b
      Liu Bo 提交于
      We're going to use this flag EXTENT_DEFRAG to indicate which range
      belongs to defragment so that we can implement snapshow-aware defrag:
      
      We set the EXTENT_DEFRAG flag when dirtying the extents that need
      defragmented, so later on writeback thread can differentiate between
      normal writeback and writeback started by defragmentation.
      Original-Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      9e8a4a8b
    • M
      Btrfs: fix wrong size for the reservation of the, snapshot creation · 48c03c4b
      Miao Xie 提交于
      We should insert/update 6 items(root ref, root backref, dir item, dir index,
      root item and parent inode) when creating a snapshot, not 5 items, fix it.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      48c03c4b
    • M
      Btrfs: add a new "type" field into the block reservation structure · 66d8f3dd
      Miao Xie 提交于
      Sometimes we need choose the method of the reservation according to the type
      of the block reservation, such as the reservation for the delayed inode update.
      Now we identify the type just by comparing the address of the reservation
      variants, it is very ugly if it is a temporary one because we need compare it
      with all the common reservation variants. So we add a new "type" field to keep
      the type the reservation variants.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      66d8f3dd
    • J
      Btrfs: remove unused hint byte argument for btrfs_drop_extents · 2671485d
      Josef Bacik 提交于
      I audited all users of btrfs_drop_extents and found that nobody actually uses
      the hint_byte argument.  I'm sure it was used for something at some point but
      it's not used now, and the way the pinning works the disk bytenr would never be
      immediately useful anyway so lets just remove it.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      2671485d
    • J
      Btrfs: turbo charge fsync · 5dc562c5
      Josef Bacik 提交于
      At least for the vm workload.  Currently on fsync we will
      
      1) Truncate all items in the log tree for the given inode if they exist
      
      and
      
      2) Copy all items for a given inode into the log
      
      The problem with this is that for things like VMs you can have lots of
      extents from the fragmented writing behavior, and worst yet you may have
      only modified a few extents, not the entire thing.  This patch fixes this
      problem by tracking which transid modified our extent, and then when we do
      the tree logging we find all of the extents we've modified in our current
      transaction, sort them and commit them.  We also only truncate up to the
      xattrs of the inode and copy that stuff in normally, and then just drop any
      extents in the range we have that exist in the log already.  Here are some
      numbers of a 50 meg fio job that does random writes and fsync()s after every
      write
      
      		Original	Patched
      SATA drive	82KB/s		140KB/s
      Fusion drive	431KB/s		2532KB/s
      
      So around 2-6 times faster depending on your hardware.  There are a few
      corner cases, for example if you truncate at all we have to do it the old
      way since there is no way to be sure what is in the log is ok.  This
      probably could be done smarter, but if you write-fsync-truncate-write-fsync
      you deserve what you get.  All this work is in RAM of course so if your
      inode gets evicted from cache and you read it in and fsync it we'll do it
      the slow way if we are still in the same transaction that we last modified
      the inode in.
      
      The biggest cool part of this is that it requires no changes to the recovery
      code, so if you fsync with this patch and crash and load an old kernel, it
      will run the recovery and be a-ok.  I have tested this pretty thoroughly
      with an fsync tester and everything comes back fine, as well as xfstests.
      Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      5dc562c5
  2. 29 8月, 2012 1 次提交
  3. 09 8月, 2012 1 次提交
  4. 31 7月, 2012 1 次提交
  5. 26 7月, 2012 3 次提交
  6. 25 7月, 2012 1 次提交
    • D
      btrfs: allow cross-subvolume file clone · 362a20c5
      David Sterba 提交于
      Lift the EXDEV condition and allow different root trees for files being
      cloned, then pass source inode's root when searching for extents.
      Cloning is not allowed to cross vfsmounts, ie. when two subvolumes from
      one filesystem are mounted separately.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      362a20c5
  7. 24 7月, 2012 6 次提交
  8. 23 7月, 2012 1 次提交
  9. 12 7月, 2012 2 次提交
  10. 16 6月, 2012 1 次提交
  11. 15 6月, 2012 3 次提交
    • L
      Btrfs: do not resize a seeding device · 4e42ae1b
      Liu Bo 提交于
      Seeding devices are not supposed to change any more.
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      4e42ae1b
    • L
      Btrfs: fix defrag regression · 6c282eb4
      Li Zefan 提交于
      If a file has 3 small extents:
      
      | ext1 | ext2 | ext3 |
      
      Running "btrfs fi defrag" will only defrag the last two extents, if those
      extent mappings hasn't been read into memory from disk.
      
      This bug was introduced by commit 17ce6ef8
      ("Btrfs: add a check to decide if we should defrag the range")
      
      The cause is, that commit looked into previous and next extents using
      lookup_extent_mapping() only.
      
      While at it, remove the code that checks the previous extent, since
      it's sufficient to check the next extent.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      6c282eb4
    • J
      Btrfs: use rcu to protect device->name · 606686ee
      Josef Bacik 提交于
      Al pointed out that we can just toss out the old name on a device and add a
      new one arbitrarily, so anybody who uses device->name in printk could
      possibly use free'd memory.  Instead of adding locking around all of this he
      suggested doing it with RCU, so I've introduced a struct rcu_string that
      does just that and have gone through and protected all accesses to
      device->name that aren't under the uuid_mutex with rcu_read_lock().  This
      protects us and I will use it for dealing with removing the device that we
      used to mount the file system in a later patch.  Thanks,
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      606686ee
  12. 30 5月, 2012 5 次提交
  13. 26 5月, 2012 1 次提交
  14. 19 4月, 2012 1 次提交
  15. 29 3月, 2012 5 次提交