1. 22 Dec, 2011 1 commit
    • Btrfs: integrate integrity check module into btrfs · 21adbd5c
      Stefan Behrens committed
      This is the last part of the patch series. It modifies the btrfs
      code to use the integrity check module if configured to do so
      with the define BTRFS_FS_CHECK_INTEGRITY. If this define is not set,
      the only effective change is that code is added that handles the
      mount option to activate the integrity check. If the mount option is
      set and the define BTRFS_FS_CHECK_INTEGRITY is not set, that code
      complains in the log and the mount fails with EINVAL.
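The mount-time behaviour described above can be sketched as follows. This is an illustrative Python model, not the kernel code: the option name `check_int`, the function `parse_mount_options` and the log text are assumptions made for the example; only the EINVAL outcome is taken from the commit message.

```python
import errno


def parse_mount_options(options, check_integrity_built_in):
    """Parse a comma-separated mount option string.

    Returns (status, settings, log), where status is 0 on success or a
    negative errno value on failure.
    """
    settings = {"check_integrity": False}
    log = []
    for opt in filter(None, options.split(",")):
        if opt == "check_int":  # hypothetical option name
            if not check_integrity_built_in:
                # Stands in for "BTRFS_FS_CHECK_INTEGRITY is not set":
                # complain in the log and fail the mount.
                log.append("integrity check requested but not compiled in")
                return -errno.EINVAL, settings, log
            settings["check_integrity"] = True
    return 0, settings, log
```

With the feature compiled out, `parse_mount_options("check_int", False)` fails with `-errno.EINVAL`, while any option string that does not ask for the integrity check is unaffected.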
      
      Add the mount option to activate the usage of the integrity check
      code.
      Add invocation of btrfs integrity check code init and cleanup
      function on mount and umount, respectively.
      Add hook to call btrfs integrity check code version of
      submit_bh/submit_bio.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
  2. 16 Dec, 2011 1 commit
    • Btrfs: deal with enospc from dirtying inodes properly · 22c44fe6
      Josef Bacik committed
      Now that we're properly keeping track of delayed inode space we've been getting
      a lot of warnings out of btrfs_dirty_inode() when running xfstest 83.  This is
      because a bunch of people call mark_inode_dirty, which is void so we can't
      return ENOSPC.  This needs to be fixed in a few areas
      
      1) file_update_time - this updates the mtime and such when writing to a file,
      which will call mark_inode_dirty.  So copy file_update_time into btrfs so we can
      call btrfs_dirty_inode directly and return an error if we get one appropriately.
      
      2) fix symlinks to use btrfs_setattr for ->setattr.  For some reason we weren't
      setting ->setattr for symlinks, even though we should have been.  This catches
      one of the cases where we were getting errors in mark_inode_dirty.
      
      3) Fix btrfs_setattr and btrfs_setsize to call btrfs_dirty_inode directly
      instead of mark_inode_dirty.  This lets us return errors properly for truncate
      and chown/anything related to setattr.
      
      4) Add a new btrfs_fs_dirty_inode which will just call btrfs_dirty_inode and
      print an error if we have one.  The only remaining user we can't control for
      this is touch_atime(), but we don't really want to keep people from walking
      down the tree if we don't have space to save the atime update, so just complain
      but don't worry about it.
      
      With this patch xfstests 83 complains a handful of times instead of hundreds of
      times.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
  3. 15 Dec, 2011 1 commit
  4. 01 Dec, 2011 1 commit
    • Btrfs: fix oops when calling statfs on readonly device · b772a86e
      Li Zefan committed
      To reproduce this bug:
      
        # dd if=/dev/zero of=img bs=1M count=256
        # mkfs.btrfs img
        # losetup -r /dev/loop1 img
        # mount /dev/loop1 /mnt
        OOPS!!
      
      It triggered BUG_ON(!nr_devices) in btrfs_calc_avail_data_space().
      
      To fix this, instead of counting only writable devices, we count all open
      devices:
      
        # df -h /dev/loop1
        Filesystem            Size  Used Avail Use% Mounted on
        /dev/loop1            250M   28K  238M   1% /mnt
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
  5. 11 Nov, 2011 1 commit
  6. 10 Nov, 2011 3 commits
  7. 08 Nov, 2011 1 commit
    • btrfs: fix double-free 'tree_root' in 'btrfs_mount()' · 45ea6095
      slyich@gmail.com committed
      On the error path 'tree_root' is freed in 'free_fs_info()'.
      No need to free it explicitly. Noticed by SLUB in debug mode:
      
      Complete reproducer under usermode linux (discovered on real
      machine):
      
          bdev=/dev/ubda
          btr_root=/btr
          /mkfs.btrfs $bdev
          mount $bdev $btr_root
          mkdir $btr_root/subvols/
          cd $btr_root/subvols/
          /btrfs su cr foo
          /btrfs su cr bar
          mount $bdev -osubvol=subvols/foo $btr_root/subvols/bar
          umount $btr_root/subvols/bar
      
      which gives
      
      device fsid 4d55aa28-45b1-474b-b4ec-da912322195e devid 1 transid 7 /dev/ubda
      =============================================================================
      BUG kmalloc-2048: Object already free
      -----------------------------------------------------------------------------
      
      INFO: Allocated in btrfs_mount+0x389/0x7f0 age=0 cpu=0 pid=277
      INFO: Freed in btrfs_mount+0x51c/0x7f0 age=0 cpu=0 pid=277
      INFO: Slab 0x0000000062886200 objects=15 used=9 fp=0x0000000070b4d2d0 flags=0x4081
      INFO: Object 0x0000000070b4d2d0 @offset=21200 fp=0x0000000070b4a968
      ...
      Call Trace:
      70b31948:  [<6008c522>] print_trailer+0xe2/0x130
      70b31978:  [<6008c5aa>] object_err+0x3a/0x50
      70b319a8:  [<6008e242>] free_debug_processing+0x142/0x2a0
      70b319e0:  [<600ebf6f>] btrfs_mount+0x55f/0x7f0
      70b319f8:  [<6008e5c1>] __slab_free+0x221/0x2d0
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      Cc: Arne Jansen <sensille@gmx.net>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  8. 06 Nov, 2011 2 commits
    • Btrfs: add a log of past tree roots · af31f5e5
      Chris Mason committed
      This takes some of the free space in the btrfs super block
      to record information about most of the roots in the last four
      commits.
      
      It also adds a -o recovery to use the root history log when
      we're not able to read the tree of tree roots, the extent
      tree root, the device tree root or the csum root.
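As a rough illustration of the idea, the root history can be modelled as a small ring of the last four commits that is walked newest-first during recovery. The class `RootHistory`, its method names and the newest-first search order are assumptions for this sketch; the commit message does not describe the on-disk layout or the exact recovery order.

```python
from collections import deque

NUM_BACKUP_ROOTS = 4  # "the last four commits" from the commit message


class RootHistory:
    """Hypothetical in-memory model of the super block's root history log."""

    def __init__(self):
        # A bounded deque drops the oldest entry once four are stored.
        self._slots = deque(maxlen=NUM_BACKUP_ROOTS)

    def record_commit(self, generation, roots):
        """Remember the root locations written by one commit."""
        entry = dict(roots)
        entry["generation"] = generation
        self._slots.append(entry)

    def recover(self, is_readable):
        """Return the newest recorded entry whose roots can still be read."""
        for entry in reversed(self._slots):
            if is_readable(entry):
                return entry
        return None
```

After six commits only generations 3 to 6 remain in the log, and `recover()` falls back to older entries until the supplied predicate accepts one.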
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: separate superblock items out of fs_info · 6c41761f
      David Sterba committed
      fs_info is now ~9kb, more than fits into one page. This will cause
      mount failure when memory is too fragmented. Top space consumers are
      super block structures super_copy and super_for_commit, ~2.8kb each.
      Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)
      
      Add a wrapper for freeing fs_info and all of its dynamically allocated
      members.
      Signed-off-by: David Sterba <dsterba@suse.cz>
  9. 24 Oct, 2011 1 commit
    • btrfs: do not allow mounting non-subvolumes via subvol option · f9d9ef62
      David Sterba committed
      There is no test of whether the path passed to the subvol=path option
      during mount is a real subvolume, so any directory located in the
      default subvolume can be passed and is accepted for mount.
      
      (current btrfs progs prevent this early)
      $ btrfs subvol snapshot . p1-snap
      ERROR: '.' is not a subvolume
      
      (with "is subvolume?" test bypassed)
      $ btrfs subvol snapshot . p1-snap
      Create a snapshot of '.' in './p1-snap'
      
      $ btrfs subvol list -p .
      ID 258 parent 5 top level 5 path subvol
      ID 259 parent 5 top level 5 path subvol1
      ID 260 parent 5 top level 5 path default-subvol1
      ID 262 parent 5 top level 5 path p1/p1-snapshot
      ID 263 parent 259 top level 5 path subvol1/subvol1-snap
      
      The problem I see is that this gives a false impression of snapshotting the
      given subvolume while in fact it snapshots the default one: a user expects an
      outcome like ID 263 but in fact gets ID 262.
      
      This patch makes mount fail with EINVAL with a message in syslog.
      Signed-off-by: David Sterba <dsterba@suse.cz>
  10. 21 Oct, 2011 2 commits
  11. 20 Oct, 2011 3 commits
    • Btrfs: introduce mount option no_space_cache · 73bc1876
      Josef Bacik committed
      Some users have requested this and I've found I needed a way to disable cache
      loading without actually clearing the cache, so introduce the no_space_cache
      option.  Previously we checked the super block's cache generation field and, if
      it was populated, we always turned space caching on.  Now we check this and set
      the space cache option on, and then parse the mount options so that if we want
      it off it gets turned off.  Then we check the mount option in all the places we
      do the caching work instead of checking the super's cache generation.  This
      makes things more consistent and lets us turn space caching off.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
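The ordering described in the no_space_cache message (seed the setting from the super block, then let mount options override it) can be sketched like this. The function name `resolve_space_cache` is hypothetical, and treating a "populated" cache generation as any non-zero value is an assumption of this model.

```python
def resolve_space_cache(cache_generation, options):
    """Decide whether space caching is on for this mount.

    cache_generation: the super block's cache generation field
    options: comma-separated mount option string
    """
    # Step 1: a populated cache generation turns the option on by default.
    enabled = cache_generation != 0
    # Step 2: mount options are parsed afterwards, so they can override it.
    for opt in filter(None, options.split(",")):
        if opt == "space_cache":
            enabled = True
        elif opt == "no_space_cache":
            enabled = False
    # Step 3: callers consult this flag rather than the super block field.
    return enabled
```

Because the option is resolved once, `resolve_space_cache(5, "no_space_cache")` is false even though the on-disk cache is left in place.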
    • Btrfs: fix how we mount subvol=<whatever> · 830c4adb
      Josef Bacik committed
      We've only been able to mount with subvol=<whatever> where whatever was a subvol
      within whatever root we had as the default.  This allows us to mount -o
      subvol=path/to/subvol/you/want relative from the normal fs_tree root.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: use d_obtain_alias when mounting subvol/subvolid · ba5b8958
      Josef Bacik committed
      Currently what we do is just wrong.  We either
      
      1) Alloc a new "root" dentry with sb->s_root as its parent, which is just wrong
      as we could walk into this subvol later on via another path and hilarity could
      ensue.  Also we don't check the return value of d_splice_alias which isn't good
      either.
      
      or
      
      2) Do a d_find_alias(), which can find nothing if our dentry has been dropped
      from the cache by this point.
      
      So use d_obtain_alias().  In the case that we already have the inode/dentry in
      cache we will get the correct dentry.  If not we will get a disconnected dentry
      tree so if we walk into it later on everything will be connected up properly.
      Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
  12. 07 Jul, 2011 1 commit
  13. 04 Jun, 2011 2 commits
    • Btrfs: add mount -o inode_cache · 4b9465cb
      Chris Mason committed
      This makes the inode map cache default to off until we
      fix the overflow problem when the free space crcs don't fit
      inside a single page.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • more conservative S_NOSEC handling · 9e1f1de0
      Al Viro committed
      Caching "we have already removed suid/caps" was overenthusiastic as merged.
      On network filesystems we might have had suid/caps set on another client,
      silently picked by this client on revalidate, all of that *without* clearing
      the S_NOSEC flag.
      
      AFAICS, the only reasonably sane way to deal with that is
      	* new superblock flag; unless set, S_NOSEC is not going to be set.
      	* local block filesystems set it in their ->mount() (more accurately,
      mount_bdev() does, so does btrfs ->mount(), users of mount_bdev() other than
      local block ones clear it)
      	* if any network filesystem (or a cluster one) wants to use S_NOSEC,
      it'll need to set MS_NOSEC in sb->s_flags *AND* take care to clear S_NOSEC when
      inode attribute changes are picked from other clients.
      
      It's not an earth-shattering hole (anybody that can set suid on another client
      will almost certainly be able to write to the file before doing that anyway),
      but it's a bug that needs fixing.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 27 May, 2011 2 commits
    • Btrfs: add mount -o auto_defrag · 4cb5300b
      Chris Mason committed
      This will detect small random writes into files and
      queue them up for an auto defrag process.  It isn't well suited to
      database workloads yet, but works for smaller files such as rpm, sqlite
      or bdb databases.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: add cleancache support · 90a887c9
      Dan Magenheimer committed
      This sixth patch of eight in this cleancache series "opts-in"
      cleancache for btrfs.  Filesystems must explicitly enable
      cleancache by calling cleancache_init_fs anytime an instance
      of the filesystem is mounted.  Btrfs uses its own readpage
      which must be hooked, but all other cleancache hooks are in
      the VFS layer including the matching cleancache_flush_fs hook
      which must be called on unmount.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v6-v8: no changes]
      [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
  15. 24 May, 2011 1 commit
    • fs/btrfs: Add missing btrfs_free_path · b0839166
      Julia Lawall committed
      btrfs_alloc_path should be matched with btrfs_free_path in error-handling code.
      
      A simplified version of the semantic match that finds this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @r exists@
      local idexpression struct btrfs_path * x;
      expression ra,rb;
      position p1,p2;
      @@
      
      x = btrfs_alloc_path@p1(...)
      ...  when != btrfs_free_path(x,...)
           when != if (...) { ... btrfs_free_path(x,...) ...}
           when != x = ra
      if(...) { ... when != x = rb
           when forall
           when != btrfs_free_path(x,...)
       \(return <+...x...+>; \| return@p2...; \) }
      
      @script:python@
      p1 << r.p1;
      p2 << r.p2;
      @@
      
      cocci.print_main("alloc",p1)
      cocci.print_secs("return",p2)
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  16. 21 May, 2011 1 commit
    • btrfs: implement delayed inode items operation · 16cdcec7
      Miao Xie committed
      Changelog V5 -> V6:
      - Fix oom when the memory load is high, by storing the delayed nodes into the
        root's radix tree, and letting btrfs inodes go.
      
      Changelog V4 -> V5:
      - Fix the race on adding the delayed node to the inode, which is spotted by
        Chris Mason.
      - Merge Chris Mason's incremental patch into this patch.
      - Fix deadlock between readdir() and memory fault, which is reported by
        Itaru Kitayama.
      
      Changelog V3 -> V4:
      - Fix nested lock, which is reported by Itaru Kitayama, by updating space cache
        inode in time.
      
      Changelog V2 -> V3:
      - Fix the race between the delayed worker and the task which does delayed items
        balance, which is reported by Tsutomu Itoh.
      - Modify the patch to address David Sterba's comments.
      - Fix the bug of the cpu recursion spinlock, reported by Chris Mason
      
      Changelog V1 -> V2:
      - break up the global rb-tree, use a list to manage the delayed nodes,
        which is created for every directory and file, and used to manage the
        delayed directory name index items and the delayed inode item.
      - introduce a worker to deal with the delayed nodes.
      
      Compared with Ext3/4, the performance of file creation and deletion on btrfs
      is very poor. The reason is that btrfs must do a lot of b+ tree insertions,
      such as the inode item, directory name item, directory name index and so on.

      If we can delay some b+ tree insertions or deletions, we can improve the
      performance, so we made this patch, which implements delayed directory name
      index insertion/deletion and delayed inode update.
      
      Implementation:
      - Introduce a delayed root object into the filesystem, which uses two lists to
        manage the delayed nodes created for every file/directory. One list holds
        all the delayed nodes that have delayed items; the other holds the delayed
        nodes that are waiting to be dealt with by the worker thread.
      - Every delayed node has two rb-trees: one manages the directory name index
        items that are going to be inserted into the b+ tree, and the other manages
        the directory name index items that are going to be deleted from the b+ tree.
      - Introduce a worker to deal with the delayed operations: the insertion and
        deletion of delayed directory name index items and the delayed inode update.
        When the number of delayed items exceeds the lower limit, we create work
        items for some delayed nodes, insert them into the worker's work queue,
        and then return.
        When the number of delayed items exceeds the upper bound, we create work
        items for all the delayed nodes that haven't been dealt with, insert them
        into the worker's work queue, and then wait until the number of untreated
        items drops below some threshold value.
      - When we want to insert a directory name index into the b+ tree, we just add
        the information to the delayed inserting rb-tree. Then we check the number
        of delayed items and do the delayed items balance. (The balance policy is
        described above.)
      - When we want to delete a directory name index from the b+ tree, we first
        search for it in the inserting rb-tree. If we find it, we just drop it. If
        not, we add its key to the delayed deleting rb-tree.
        As with the delayed inserting rb-tree, we also check the number of
        delayed items and do the delayed items balance.
        (The same as for insertion.)
      - When we want to update the metadata of some inode, we cache the inode's
        data in the delayed node. The worker will flush it into the b+ tree after
        dealing with the delayed insertion and deletion.
      - We move the delayed node to the tail of the list after we access it. This
        way, we can cache more delayed items and merge more inode updates.
      - If we want to commit a transaction, we deal with all the delayed nodes.
      - The delayed node is freed when we free the btrfs inode.
      - Before we log the inode items, we commit all the directory name index items
        and the delayed inode update.
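A minimal sketch of two pieces of the design above: cancelling a pending insertion when the same key is deleted, and the two-threshold balance policy. This is a Python model rather than the kernel implementation; `DelayedNode`, `balance_action`, the returned action names and the use of a dict and a set in place of the two rb-trees are all assumptions made for illustration.

```python
class DelayedNode:
    """Hypothetical model of one delayed node (one per file/directory)."""

    def __init__(self):
        self.pending_inserts = {}     # stands in for the inserting rb-tree
        self.pending_deletes = set()  # stands in for the deleting rb-tree

    def insert_index(self, key, item):
        # Insertions are only recorded here, not written to the b+ tree.
        self.pending_inserts[key] = item

    def delete_index(self, key):
        if key in self.pending_inserts:
            # The item never reached the b+ tree: just drop the insertion.
            del self.pending_inserts[key]
        else:
            # The item is already in the b+ tree: remember to delete it.
            self.pending_deletes.add(key)

    def item_count(self):
        return len(self.pending_inserts) + len(self.pending_deletes)


def balance_action(item_count, lower_limit, upper_bound):
    """Pick the balance step for the current number of delayed items."""
    if item_count > upper_bound:
        return "queue_all_and_wait"  # queue every untreated node, then wait
    if item_count > lower_limit:
        return "queue_some"          # queue work for some nodes and return
    return "nothing"
```

Creating and then deleting a file before the worker runs leaves `item_count()` at zero, so under this model the pair costs no b+ tree operation at all.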
      
      I did a quick test with the benchmark tool[1] and found we can improve the
      performance of file creation by ~15%, and file deletion by ~20%.
      
      Before applying this patch:
      Create files:
              Total files: 50000
              Total time: 1.096108
              Average time: 0.000022
      Delete files:
              Total files: 50000
              Total time: 1.510403
              Average time: 0.000030
      
      After applying this patch:
      Create files:
              Total files: 50000
              Total time: 0.932899
              Average time: 0.000019
      Delete files:
              Total files: 50000
              Total time: 1.215732
              Average time: 0.000024
      
      [1] http://marc.info/?l=linux-btrfs&m=128212635122920&q=p3
      
      Many thanks for Kitayama-san's help!
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Reviewed-by: David Sterba <dave@jikos.cz>
      Tested-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Tested-by: Itaru Kitayama <kitayama@cl.bb4u.ne.jp>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  17. 13 May, 2011 1 commit
  18. 02 May, 2011 1 commit
  19. 12 Apr, 2011 1 commit
  20. 05 Apr, 2011 1 commit
    • Btrfs: fix /proc/mounts info. · 200da64e
      Tsutomu Itoh committed
      Some mount options are not displayed in /proc/mounts.
      This patch makes /proc/mounts display options such as compress_type.
      
      Ex.
        [before]
          $ mount | grep sdc2
          /dev/sdc2 on /test12 type btrfs (rw,space_cache,compress=lzo)
          $ cat /proc/mounts | grep sdc2
          /dev/sdc2 /test12 btrfs rw,relatime,compress 0 0
      
        [after]
          $ mount | grep sdc2
          /dev/sdc2 on /test12 type btrfs (rw,space_cache,compress=lzo)
          $ cat /proc/mounts | grep sdc2
          /dev/sdc2 /test12 btrfs rw,relatime,compress=lzo,space_cache 0 0
      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  21. 28 Mar, 2011 1 commit
    • Btrfs: add initial tracepoint support for btrfs · 1abe9b8a
      liubo committed
      Tracepoints can provide insight into why btrfs hits bugs and can be a great
      help for debugging, e.g.:
                    dd-7822  [000]  2121.641088: btrfs_inode_request: root = 5(FS_TREE), gen = 4, ino = 256, blocks = 8, disk_i_size = 0, last_trans = 8, logged_trans = 0
                    dd-7822  [000]  2121.641100: btrfs_inode_new: root = 5(FS_TREE), gen = 8, ino = 257, blocks = 0, disk_i_size = 0, last_trans = 0, logged_trans = 0
       btrfs-transacti-7804  [001]  2146.935420: btrfs_cow_block: root = 2(EXTENT_TREE), refs = 2, orig_buf = 29368320 (orig_level = 0), cow_buf = 29388800 (cow_level = 0)
       btrfs-transacti-7804  [001]  2146.935473: btrfs_cow_block: root = 1(ROOT_TREE), refs = 2, orig_buf = 29364224 (orig_level = 0), cow_buf = 29392896 (cow_level = 0)
       btrfs-transacti-7804  [001]  2146.972221: btrfs_transaction_commit: root = 1(ROOT_TREE), gen = 8
         flush-btrfs-2-7821  [001]  2155.824210: btrfs_chunk_alloc: root = 3(CHUNK_TREE), offset = 1103101952, size = 1073741824, num_stripes = 1, sub_stripes = 0, type = DATA
         flush-btrfs-2-7821  [001]  2155.824241: btrfs_cow_block: root = 2(EXTENT_TREE), refs = 2, orig_buf = 29388800 (orig_level = 0), cow_buf = 29396992 (cow_level = 0)
         flush-btrfs-2-7821  [001]  2155.824255: btrfs_cow_block: root = 4(DEV_TREE), refs = 2, orig_buf = 29372416 (orig_level = 0), cow_buf = 29401088 (cow_level = 0)
         flush-btrfs-2-7821  [000]  2155.824329: btrfs_cow_block: root = 3(CHUNK_TREE), refs = 2, orig_buf = 20971520 (orig_level = 0), cow_buf = 20975616 (cow_level = 0)
       btrfs-endio-wri-7800  [001]  2155.898019: btrfs_cow_block: root = 5(FS_TREE), refs = 2, orig_buf = 29384704 (orig_level = 0), cow_buf = 29405184 (cow_level = 0)
       btrfs-endio-wri-7800  [001]  2155.898043: btrfs_cow_block: root = 7(CSUM_TREE), refs = 2, orig_buf = 29376512 (orig_level = 0), cow_buf = 29409280 (cow_level = 0)
      
      Here is what I have added:
      
      1) ordered_extent:
              btrfs_ordered_extent_add
              btrfs_ordered_extent_remove
              btrfs_ordered_extent_start
              btrfs_ordered_extent_put
      
      These provide critical information to understand how ordered_extents are
      updated.
      
      2) extent_map:
              btrfs_get_extent
      
      extent_map is used in both read and write cases, and it is useful for tracking
      how btrfs specific IO is running.
      
      3) writepage:
              __extent_writepage
              btrfs_writepage_end_io_hook
      
      Pages are critical resources and produce a lot of corner cases during writeback,
      so it is valuable to know how a page is written to disk.
      
      4) inode:
              btrfs_inode_new
              btrfs_inode_request
              btrfs_inode_evict
      
      These can show where and when an inode is created, and when an inode is evicted.
      
      5) sync:
              btrfs_sync_file
              btrfs_sync_fs
      
      These show sync arguments.
      
      6) transaction:
              btrfs_transaction_commit
      
      In a transaction-based filesystem, it is useful to know the generation and
      who does the commit.
      
      7) back reference and cow:
      	btrfs_delayed_tree_ref
      	btrfs_delayed_data_ref
      	btrfs_delayed_ref_head
      	btrfs_cow_block
      
      Btrfs natively supports back references; these tracepoints are helpful for
      understanding btrfs's COW mechanism.
      
      8) chunk:
      	btrfs_chunk_alloc
      	btrfs_chunk_free
      
      A chunk is a link between physical offset and logical offset, and stands for
      space information in btrfs; these are helpful for tracing space usage.
      
      9) reserved_extent:
      	btrfs_reserved_extent_alloc
      	btrfs_reserved_extent_free
      
      These can show how btrfs uses its space.
      Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  22. 17 Feb, 2011 1 commit
  23. 01 Feb, 2011 1 commit
  24. 27 Jan, 2011 2 commits
  25. 18 Jan, 2011 1 commit
    • Btrfs: forced readonly mounts on errors · acce952b
      liubo committed
      This patch comes from the "Forced readonly mounts on errors" idea.

      This is the first step toward being more tolerant of disk corruption
      instead of just using BUG() statements.

      The major content:
      - add a framework for generating errors that should result in filesystems
        going readonly.
      - keep FS state in disk super block.
      - make sure that all resources will be freed and released at umount time.
      - make sure that after the FS is forced readonly on error, there will be no
        more disk changes before the FS is corrected. For this, we should stop
        write operations.
      
      After this patch is applied, the conversion from BUG() to such a framework can
      happen incrementally.
      Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  26. 17 Jan, 2011 2 commits
    • btrfs: fix wrong free space information of btrfs · 6d07bcec
      Miao Xie committed
      When we store data with a RAID profile on a btrfs filesystem made of two or
      more disks of different sizes, the df command shows some free space in the
      filesystem even though the user cannot in fact write any data: df shows the
      wrong free space information for btrfs.
      
       # mkfs.btrfs -d raid1 /dev/sda9 /dev/sda10
       # btrfs-show
       Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
       	Total devices 2 FS bytes used 28.00KB
       	devid    1 size 5.01GB used 2.03GB path /dev/sda9
       	devid    2 size 10.00GB used 2.01GB path /dev/sda10
       # btrfs device scan /dev/sda9 /dev/sda10
       # mount /dev/sda9 /mnt
       # dd if=/dev/zero of=tmpfile0 bs=4K count=9999999999
         (fill the filesystem)
       # sync
       # df -TH
       Filesystem	Type	Size	Used	Avail	Use%	Mounted on
       /dev/sda9	btrfs	17G	8.6G	5.4G	62%	/mnt
       # btrfs-show
       Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
       	Total devices 2 FS bytes used 3.99GB
       	devid    1 size 5.01GB used 5.01GB path /dev/sda9
       	devid    2 size 10.00GB used 4.99GB path /dev/sda10
      
      This is because btrfs cannot allocate chunks when one of the paired disks has
      no space left. The free space on the other disks can then never be used and
      should be subtracted from the total space, but btrfs doesn't subtract it.
      This is confusing to the user.

      This patch fixes it by calculating the free space that can actually be used
      to allocate chunks.

      Implementation:
      1. Get the free space of all the devices, and align it to the stripe length.
      2. Sort the devices by free space.
      3. Check the free space of each device:
         3.1. If it is not zero, check the number of devices that have more free
              space than this device.
              If that number is beyond the min stripe number, the free space can
              be used, and is added to the total free space.
              If that number is below the min stripe number, we cannot use the
              free space, and the check ends.
         3.2. If the free space is zero, check the next device; goto 3.1.

      This implementation is much like a fake chunk allocation.
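One way to read the steps above is the following Python sketch. It is an approximation for illustration, not the kernel routine: the function name `usable_free_space`, the 64 KiB stripe length and the choice to consume space from the smallest remaining devices are assumptions, and the result is raw space that could still be allocated to chunks.

```python
STRIPE_LEN = 64 * 1024  # assumed stripe length in bytes


def usable_free_space(free_bytes, min_stripes, num_stripes,
                      stripe_len=STRIPE_LEN):
    """Estimate the raw free space that can still be allocated to chunks.

    free_bytes: unallocated bytes on each device
    min_stripes: devices a chunk needs at minimum (e.g. 2 for raid1)
    num_stripes: devices a chunk uses when enough are available
    """
    # Step 1: align each device's free space down to the stripe length
    # and ignore devices that cannot hold even one stripe.
    avail = [(f // stripe_len) * stripe_len for f in free_bytes]
    avail = [a for a in avail if a >= stripe_len]
    # Step 2: sort by free space, largest first.
    avail.sort(reverse=True)

    total = 0
    remaining = len(avail)
    # Step 3: fake the allocation, starting from the smallest device, for
    # as long as enough devices remain to form a chunk.
    while remaining >= min_stripes:
        width = min(num_stripes, remaining)
        smallest = remaining - 1
        size = avail[smallest]
        if size >= stripe_len:
            total += size * width
            # Consume the same amount on every device in this fake chunk.
            for j in range(smallest + 1 - width, smallest + 1):
                avail[j] -= size
        remaining -= 1
    return total
```

For the raid1 example in the message, one full device and one device with about 5 GB unallocated leave fewer than two usable devices, so the function returns 0, in line with the corrected df output.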
      
      After applying this patch, df can show correct space information:
       # df -TH
       Filesystem	Type	Size	Used	Avail	Use%	Mounted on
       /dev/sda9	btrfs	17G	8.6G	0	100%	/mnt
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: fix wrong data space statistics · 299a08b1
      Miao Xie committed
      Josef has implemented mixed data/metadata chunks; we must count those chunks'
      space just like data chunks.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Reviewed-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  27. 13 Jan, 2011 1 commit
  28. 22 Dec, 2010 2 commits
    • btrfs: Add lzo compression support · a6fa6fae
      Li Zefan committed
      LZO is a much faster compression algorithm than gzip, so it would allow
      more users to enable transparent compression, and users can trade off
      compression ratio against speed for different applications.
      
      Usage:
      
       # mount -t btrfs -o compress[=<zlib,lzo>] dev /mnt
      or
       # mount -t btrfs -o compress-force[=<zlib,lzo>] dev /mnt
      
      "-o compress" without argument is still allowed for compatibility.
      
      Compatibility:
      
      If we mount a filesystem with lzo compression, it can no longer be
      mounted by old kernels. One reason is that an old kernel would otherwise
      hand the compressed data held in inline extents directly to the user.
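The option syntax from the Usage section above could be parsed roughly as follows. This is a hedged Python sketch, not the kernel parser: `parse_compress_option` is a hypothetical helper, and mapping a bare `compress` to zlib is an assumption drawn from the note that the argument-less form is kept for compatibility.

```python
def parse_compress_option(option):
    """Parse 'compress[=algo]' or 'compress-force[=algo]'.

    Returns (algorithm, force) or raises ValueError for anything else.
    """
    name, sep, value = option.partition("=")
    if name not in ("compress", "compress-force"):
        raise ValueError("not a compression option: %r" % option)
    force = name == "compress-force"
    if not sep:
        # Bare option: assumed to select zlib, the only algorithm that
        # existed before this patch.
        return "zlib", force
    if value in ("zlib", "lzo"):
        return value, force
    raise ValueError("unknown compression algorithm: %r" % value)
```

For example, `parse_compress_option("compress-force=lzo")` yields `("lzo", True)`, while an unknown algorithm name is rejected instead of being silently ignored.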
      
      Performance:
      
      The test copied a linux source tarball (~400M) from an ext4 partition
      to the btrfs partition, and then extracted it.
      
      (time in seconds)
                 lzo        zlib        nocompress
      copy:      10.6       21.7        14.9
      extract:   70.1       94.4        66.6
      
      (data size in MB)
                 lzo        zlib        nocompress
      copy:      185.87     108.69      394.49
      extract:   193.80     132.36      381.21
      
      Changelog:
      
      v1 -> v2:
      - Select LZO_COMPRESS and LZO_DECOMPRESS in btrfs Kconfig.
      - Add incompatibility flag.
      - Fix error handling in compress code.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • btrfs: Allow to add new compression algorithm · 261507a0
      Li Zefan committed
      Make the code aware of compression type, instead of always assuming
      zlib compression.
      
      Also make the zlib workspace function as common code for all
      compression types.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
  29. 11 Dec, 2010 1 commit