1. 17 8月, 2011 6 次提交
    • L
      Btrfs: use plain page_address() in header fields setget functions · c97c2916
      Li Zefan 提交于
      We've stopped using highmem for extent buffers.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      c97c2916
    • T
      Btrfs: forced readonly when btrfs_drop_snapshot() fails · cb1b69f4
      Tsutomu Itoh 提交于
      The filesystem turns readonly instead of returning the error to the
      caller when detected error in btrfs_drop_snapshot().
      and, because the caller doesn't check the error, the function type is
      changed to 'void'.
      Signed-off-by: NTsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      cb1b69f4
    • L
      Btrfs: check if there is enough space for balancing smarter · cdcb725c
      liubo 提交于
      When checking if there is enough space for balancing a block group,
      since we do not take raid types into consideration, we do not account
      corrent amounts of space that we needed.  This makes us do some extra
      work before we get ENOSPC.
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      cdcb725c
    • L
      Btrfs: fix a bug of balance on full multi-disk partitions · 38c01b96
      liubo 提交于
      When balancing, we'll first try to shrink devices for some space,
      but if it is working on a full multi-disk partition with raid protection,
      we may encounter a bug, that is, while shrinking, total_bytes may be less
      than bytes_used, and btrfs may allocate a dev extent that accesses out of
      device's bounds.
      
      Then we will not be able to write or read the data which stores at the end
      of the device, and get the followings:
      
      device fsid 0939f071-7ea3-46c8-95df-f176d773bfb6 devid 1 transid 10 /dev/sdb5
      Btrfs detected SSD devices, enabling SSD mode
      btrfs: relocating block group 476315648 flags 9
      btrfs: found 4 extents
      attempt to access beyond end of device
      sdb5: rw=145, want=546176, limit=546147
      attempt to access beyond end of device
      sdb5: rw=145, want=546304, limit=546147
      attempt to access beyond end of device
      sdb5: rw=145, want=546432, limit=546147
      attempt to access beyond end of device
      sdb5: rw=145, want=546560, limit=546147
      attempt to access beyond end of device
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      38c01b96
    • L
      Btrfs: fix an oops of log replay · 34f3e4f2
      liubo 提交于
      When btrfs recovers from a crash, it may hit the oops below:
      
      ------------[ cut here ]------------
      kernel BUG at fs/btrfs/inode.c:4580!
      [...]
      RIP: 0010:[<ffffffffa03df251>]  [<ffffffffa03df251>] btrfs_add_link+0x161/0x1c0 [btrfs]
      [...]
      Call Trace:
       [<ffffffffa03e7b31>] ? btrfs_inode_ref_index+0x31/0x80 [btrfs]
       [<ffffffffa04054e9>] add_inode_ref+0x319/0x3f0 [btrfs]
       [<ffffffffa0407087>] replay_one_buffer+0x2c7/0x390 [btrfs]
       [<ffffffffa040444a>] walk_down_log_tree+0x32a/0x480 [btrfs]
       [<ffffffffa0404695>] walk_log_tree+0xf5/0x240 [btrfs]
       [<ffffffffa0406cc0>] btrfs_recover_log_trees+0x250/0x350 [btrfs]
       [<ffffffffa0406dc0>] ? btrfs_recover_log_trees+0x350/0x350 [btrfs]
       [<ffffffffa03d18b2>] open_ctree+0x1442/0x17d0 [btrfs]
      [...]
      
      This comes from that while replaying an inode ref item, we forget to
      check those old conflicting DIR_ITEM and DIR_INDEX items in fs/file tree,
      then we will come to conflict corners which lead to BUG_ON().
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Tested-by: NAndy Lutomirski <luto@mit.edu>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      34f3e4f2
    • J
      Btrfs: detect wether a device supports discard · d5e2003c
      Josef Bacik 提交于
      We have a problem where if a user specifies discard but doesn't actually support
      it we will return EOPNOTSUPP from btrfs_discard_extent.  This is a problem
      because this gets called (in a fashion) from the tree log recovery code, which
      has a nice little BUG_ON(ret) after it, which causes us to fail the tree log
      replay.  So instead detect wether our devices support discard when we're adding
      them and then don't issue discards if we know that the device doesn't support
      it.  And just for good measure set ret = 0 in btrfs_issue_discard just in case
      we still get EOPNOTSUPP so we don't screw anybody up like this again.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      d5e2003c
  2. 06 8月, 2011 1 次提交
  3. 02 8月, 2011 25 次提交
  4. 28 7月, 2011 8 次提交
    • C
      Merge branch 'integration' into for-linus · ff95acb6
      Chris Mason 提交于
      ff95acb6
    • C
      Btrfs: make sure reserve_metadata_bytes doesn't leak out strange errors · 75c195a2
      Chris Mason 提交于
      The btrfs transaction code will return any errors that come from
      reserve_metadata_bytes.  We need to make sure we don't return funny
      things like 1 or EAGAIN.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      75c195a2
    • C
      Btrfs: use the commit_root for reading free_space_inode crcs · 2cf8572d
      Chris Mason 提交于
      Now that we are using regular file crcs for the free space cache,
      we can deadlock if we try to read the free_space_inode while we are
      updating the crc tree.
      
      This commit fixes things by using the commit_root to read the crcs.  This is
      safe because we the free space cache file would already be loaded if
      that block group had been changed in the current transaction.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      2cf8572d
    • C
      Btrfs: reduce extent_state lock contention for metadata · 19b6caf4
      Chris Mason 提交于
      For metadata buffers that don't straddle pages (all of them), btrfs
      can safely use the page uptodate bits and extent_buffer uptodate bit
      instead of needing to use the extent_state tree.
      
      This greatly reduces contention on the state tree lock.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      19b6caf4
    • C
      Btrfs: remove lockdep magic from btrfs_next_leaf · 31533fb2
      Chris Mason 提交于
      Before the reader/writer locks, btrfs_next_leaf needed to keep
      the path blocking to avoid making lockdep upset.
      
      Now that btrfs_next_leaf only takes read locks, this isn't required.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      31533fb2
    • C
      Btrfs: make a lockdep class for each root · 85d4e461
      Chris Mason 提交于
      This patch was originally from Tejun Heo.  lockdep complains about the btrfs
      locking because we sometimes take btree locks from two different trees at the
      same time.  The current classes are based only on level in the btree, which
      isn't enough information for lockdep to figure out if the lock is safe.
      
      This patch makes a class for each type of tree, and lumps all the FS trees that
      actually have files and directories into the same class.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      85d4e461
    • C
      Btrfs: switch the btrfs tree locks to reader/writer · bd681513
      Chris Mason 提交于
      The btrfs metadata btree is the source of significant
      lock contention, especially in the root node.   This
      commit changes our locking to use a reader/writer
      lock.
      
      The lock is built on top of rw spinlocks, and it
      extends the lock tracking to remember if we have a
      read lock or a write lock when we go to blocking.  Atomics
      count the number of blocking readers or writers at any
      given time.
      
      It removes all of the adaptive spinning from the old code
      and uses only the spinning/blocking hints inside of btrfs
      to decide when it should continue spinning.
      
      In read heavy workloads this is dramatically faster.  In write
      heavy workloads we're still faster because of less contention
      on the root node lock.
      
      We suffer slightly in dbench because we schedule more often
      during write locks, but all other benchmarks so far are improved.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      bd681513
    • J
      Btrfs: fix deadlock when throttling transactions · 81317fde
      Josef Bacik 提交于
      Hit this nice little deadlock.  What happens is this
      
      __btrfs_end_transaction with throttle set, --use_count so it equals 0
        btrfs_commit_transaction
          <somebody else actually manages to start the commit>
          btrfs_end_transaction --use_count so now its -1 <== BAD
            we just return and wait on the transaction
      
      This is bad because we just return after our use_count is -1 and don't let go
      of our num_writer count on the transaction, so the guy committing the
      transaction just sits there forever.  Fix this by inc'ing our use_count if we're
      going to call commit_transaction so that if we call btrfs_end_transaction it's
      valid.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      81317fde