1. 20 Oct, 2011 7 commits
    • Btrfs: stop passing a trans handle all around the reservation code · 4a92b1b8
      Josef Bacik committed
      The only place we need a trans handle is in
      reserve_metadata_bytes, and that's to know how much flushing we can do.  So
      instead of passing it around, just check current->journal_info for a
      trans_handle so we know whether we can commit a transaction to try and free up
      space.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: handle enospc accounting for free space inodes · c09544e0
      Josef Bacik committed
      Since free space inodes now use normal checksumming, we need to make sure to
      account for their metadata use.  So reserve metadata space, and then if we fail
      to write out the metadata we can just release it; otherwise it will be freed up
      when the io completes.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: put the block group cache after we commit the super · 300e4f8a
      Josef Bacik committed
      In moving some enospc stuff around I noticed that when we unmount we are often
      evicting the free space cache inodes before we do our last commit.  This isn't
      bad, but it makes us constantly have to re-read the inodes.  So instead
      don't evict the cache until after we do our last commit; this will make things a
      little less crappy and make a future enospc change work properly.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: fix call to btrfs_search_slot in free space cache · a9b5fcdd
      Josef Bacik committed
      We are setting ins_len to 1 even though we are just modifying an item that should
      be there already.  This may cause the search code to split nodes on the way
      down needlessly.  Set this to 0 since we aren't inserting anything.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: allow callers to specify if flushing can occur for btrfs_block_rsv_check · 482e6dc5
      Josef Bacik committed
      If you run xfstests 224 you will get lots of messages about not being able to
      delete inodes and that they will be cleaned up next mount.  This is because
      btrfs_block_rsv_check was not calling reserve_metadata_bytes with the ability to
      flush, so if there was not enough space, it simply failed.  But in the truncate and
      evict cases we could easily flush space to try and get enough space to do our
      work, so make btrfs_block_rsv_check take a flush argument to pass down to
      reserve_metadata_bytes.  Now xfstests 224 runs fine without all those
      complaints.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
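The effect of passing a flush argument down to the reservation code can be illustrated with a small userspace model. This is only a sketch: `struct space_pool`, `pool_flush()` and `pool_reserve()` are hypothetical names, not the Btrfs functions, and "flushing" is reduced to releasing a fixed amount of reclaimable space.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a metadata space pool. */
struct space_pool {
    uint64_t total;        /* bytes the pool can hold */
    uint64_t used;         /* bytes currently reserved */
    uint64_t reclaimable;  /* bytes a flush could give back */
};

/* Model of flushing: release everything that is reclaimable. */
void pool_flush(struct space_pool *p)
{
    assert(p->reclaimable <= p->used);
    p->used -= p->reclaimable;
    p->reclaimable = 0;
}

/*
 * Try to reserve `bytes`.  Returns 0 on success or -ENOSPC when space is
 * short.  If `flush` is true, one flush is attempted before giving up.
 */
int pool_reserve(struct space_pool *p, uint64_t bytes, bool flush)
{
    if (p->total - p->used < bytes && flush)
        pool_flush(p);
    if (p->total - p->used < bytes)
        return -ENOSPC;
    p->used += bytes;
    return 0;
}
```

In this model a caller that cannot tolerate flushing passes `false` and fails early, while a truncate-like or evict-like caller passes `true` and gets a second chance after space is reclaimed.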
    • Btrfs: ratelimit the generation printk for the free space cache · 6ab60601
      Josef Bacik committed
      A user reported getting spammed by this message when moving to 3.0.  Since we
      switched to the normal checksumming infrastructure, all old free space caches
      will be wrong and need to be regenerated, so people are likely to see this
      message a lot.  Ratelimit it so it doesn't fill up their logs and freak them
      out.  Thanks,
      Reported-by: Andrew Lutomirski <luto@mit.edu>
      Signed-off-by: Josef Bacik <josef@redhat.com>
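Rate limiting of a log message generally follows an interval-and-burst scheme: at most a fixed number of messages per time window, with the rest suppressed. The kernel provides helpers such as `printk_ratelimited()` for this; whether this patch uses that exact helper is not stated above. The following is a userspace sketch of the idea only, with hypothetical names (`struct ratelimit`, `ratelimit_ok()`) and an explicit `now` argument instead of a real clock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct ratelimit {
    long interval;   /* window length, in arbitrary time units */
    int burst;       /* messages allowed per window */
    long begin;      /* start time of the current window */
    int printed;     /* messages emitted in the current window */
    int missed;      /* messages suppressed in the current window */
};

/* Returns true if a message may be emitted at time `now`. */
bool ratelimit_ok(struct ratelimit *rl, long now)
{
    assert(rl->interval > 0 && rl->burst > 0);
    if (now - rl->begin >= rl->interval) {
        /* Window expired: start a new one and reset the counters. */
        rl->begin = now;
        rl->printed = 0;
        rl->missed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}
```

A caller would wrap the noisy message in `if (ratelimit_ok(&rl, now))`, so a burst of stale-cache warnings at mount time produces only a handful of log lines.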
    • Btrfs: use bytes_may_use for all ENOSPC reservations · fb25e914
      Josef Bacik committed
      We have been using bytes_reserved for metadata reservations, which is wrong
      since we use that to keep track of outstanding reservations from the allocator.
      This resulted in us doing a lot of silly things to make sure we don't allocate a
      bunch of metadata chunks, since we never had a real view of how much space was
      actually in use by metadata.

      This passes Arne's enospc test and xfstests as well as my own enospc tests.
      Hopefully this will get us moving in the right direction.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
  2. 11 Sep, 2011 1 commit
  3. 17 Aug, 2011 1 commit
  4. 28 Jul, 2011 1 commit
  5. 11 Jul, 2011 1 commit
  6. 25 Jun, 2011 1 commit
  7. 11 Jun, 2011 1 commit
    • Btrfs: make sure to recheck for bitmaps in clusters · 38e87880
      Chris Mason committed
      Josef recently changed the free extent cache to look in
      the block group cluster for any bitmaps before trying to
      add a new bitmap for the same offset.  This avoids BUG_ON()s due
      to covering duplicate ranges.

      But it didn't go quite far enough.  A given free range might span
      one or more bitmaps or free space entries.  The code has
      looping to cover this, but it doesn't check for clustered bitmaps
      every time.

      This shuffles our gotos to check for a bitmap in the cluster
      for every new bitmap entry we try to add.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  8. 09 Jun, 2011 4 commits
    • Btrfs: fix duplicate checking logic · f6a39829
      Josef Bacik committed
      When merging my code into the integration test the second check for duplicate
      entries got screwed up.  This patch fixes it by dropping ret2 and just using ret
      for the return value, and checking if we got an error before adding the bitmap
      to the local list.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: fix bitmap regression · 2cdc342c
      Josef Bacik committed
      In cleaning up the clustering code I accidentally introduced a regression by
      adding bitmap entries to the cluster rb tree.  The problem is that if we've maxed out
      the number of bitmaps we can have for the block group, we can only add free space
      to the bitmaps, but since the bitmap is on the cluster we can't find it and we
      try to create another one.  This would result in a panic because the total
      bitmaps was bigger than the max bitmaps that were allowed.  This patch fixes
      this by checking to see if we have a cluster, and then looking at the cluster rb
      tree to see if it has a bitmap entry; if it does and that space belongs to
      that bitmap, go ahead and add it to that bitmap.

      I could hit this panic every time with an fs_mark test within a couple of
      minutes.  With this patch I no longer hit the panic and fs_mark goes to
      completion.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: noinline the cluster searching functions · 3de85bb9
      Josef Bacik committed
      When profiling the find cluster code it's hard to tell where we are spending our
      time because the bitmap and non-bitmap functions get inlined by the compiler, so
      make that not happen.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: cache bitmaps when searching for a cluster · 86d4a77b
      Josef Bacik committed
      If we are looking for a cluster in a particularly sparse or fragmented block
      group, we will do a lot of looping through the free space tree looking for
      various things, and if we need to look at bitmaps we will end up doing the whole
      dance twice.  So instead add the bitmap entries to a temporary list, so if we
      have to do the bitmap search we can just look through the list of entries we've
      found quickly instead of having to loop through the entire tree again.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
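The idea of remembering bitmap entries during the first pass can be sketched in a few lines. This is a simplified model, not the Btrfs code: the free space tree is replaced by a flat array, and `struct fs_entry`, `MAX_BITMAPS`, `find_extent_collect_bitmaps()` and `find_in_bitmaps()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified free space entry. */
struct fs_entry {
    unsigned long offset;
    unsigned long bytes;
    bool is_bitmap;
};

#define MAX_BITMAPS 16

/*
 * Single pass over `entries`: return the index of the first plain extent
 * with at least `want` bytes, or -1.  Bitmap entries seen on the way are
 * remembered in `bitmaps` so a later bitmap search need not rescan.
 */
int find_extent_collect_bitmaps(const struct fs_entry *entries, size_t n,
                                unsigned long want,
                                const struct fs_entry **bitmaps,
                                size_t *nr_bitmaps)
{
    assert(entries && bitmaps && nr_bitmaps);
    *nr_bitmaps = 0;
    for (size_t i = 0; i < n; i++) {
        if (entries[i].is_bitmap) {
            if (*nr_bitmaps < MAX_BITMAPS)
                bitmaps[(*nr_bitmaps)++] = &entries[i];
            continue;
        }
        if (entries[i].bytes >= want)
            return (int)i;
    }
    return -1;
}

/* Fallback: search only the bitmaps remembered by the first pass. */
const struct fs_entry *find_in_bitmaps(const struct fs_entry **bitmaps,
                                       size_t nr_bitmaps, unsigned long want)
{
    for (size_t i = 0; i < nr_bitmaps; i++)
        if (bitmaps[i]->bytes >= want)
            return bitmaps[i];
    return NULL;
}
```

When the extent search fails, the fallback walks only the short temporary list rather than every entry again, which is where the saving comes from in a fragmented block group.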
  9. 04 Jun, 2011 3 commits
  10. 24 May, 2011 1 commit
  11. 06 May, 2011 1 commit
  12. 02 May, 2011 3 commits
  13. 26 Apr, 2011 2 commits
  14. 25 Apr, 2011 6 commits
    • Btrfs: Support reading/writing on disk free ino cache · 82d5902d
      Li Zefan committed
      This is similar to block group caching.

      We dedicate a special inode in the fs tree to save the free ino cache.

      The first time we create or delete a file after mount, the free ino
      cache will be loaded from disk into memory. When the fs tree is committed,
      the cache will be written back to disk.

      To keep compatibility, we check the root generation against the generation
      of the special inode when loading the cache, so the loading will fail
      if the btrfs filesystem was previously mounted with an older kernel.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: Make the code for reading/writing free space cache generic · 0414efae
      Li Zefan committed
      Extract out block group specific code from lookup_free_space_inode(),
      create_free_space_inode(), load_free_space_cache() and
      btrfs_write_out_cache(), so the code can be used to read/write
      the free ino cache.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: Cache free inode numbers in memory · 581bb050
      Li Zefan committed
      Currently btrfs stores the highest objectid of the fs tree, and it always
      returns (highest+1) as the inode number when we create a file, so inode numbers
      won't be reclaimed when we delete files, and we'll run out of inode numbers
      as we keep creating/deleting files on 32-bit machines.

      This fixes it, and it works similarly to how we cache free space in block
      groups.

      We start a kernel thread to read the file tree. By scanning inode items,
      we know which chunks of inode numbers are free, and we cache them in
      an rb-tree.

      Because we are searching the commit root, we have to carefully handle the
      cross-transaction case.

      The rb-tree is a hybrid extent+bitmap tree, so if we have too many small
      chunks of inode numbers, we'll use bitmaps. Initially we allow 16K of RAM
      for extents, and a bitmap will be used if we exceed this threshold. The
      extents threshold is adjusted at runtime.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: Make free space cache code generic · 34d52cb6
      Li Zefan committed
      So we can re-use the code to cache free inode numbers.

      The change is quite straightforward. Two new structures are introduced.

      - struct btrfs_free_space_ctl

        We move those variables that are used for caching free space from
        struct btrfs_block_group_cache to this new struct.

      - struct btrfs_free_space_op

        We do block group specific work (e.g. calculation of extents threshold)
        through functions registered in this struct.

      And then we can remove references to struct btrfs_block_group_cache.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: Use bitmap_set/clear() · f38b6e75
      Li Zefan committed
      No functional change.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
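The kernel's `bitmap_set()` and `bitmap_clear()` helpers set or clear a run of bits in an array of `unsigned long`. As a rough userspace illustration of what such range helpers do, here is a naive per-bit version. The `sketch_` names are hypothetical, and the assumption that the patch replaced loops of this kind is not confirmed by the commit message.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in: set `nr` bits starting at bit `start`. */
void sketch_bitmap_set(unsigned long *map, unsigned int start, unsigned int nr)
{
    assert(map);
    for (unsigned int i = start; i < start + nr; i++)
        map[i / BITS_PER_LONG] |= 1UL << (i % BITS_PER_LONG);
}

/* Userspace stand-in: clear `nr` bits starting at bit `start`. */
void sketch_bitmap_clear(unsigned long *map, unsigned int start, unsigned int nr)
{
    assert(map);
    for (unsigned int i = start; i < start + nr; i++)
        map[i / BITS_PER_LONG] &= ~(1UL << (i % BITS_PER_LONG));
}

/* Return 1 if bit `i` is set, 0 otherwise. */
int sketch_test_bit(const unsigned long *map, unsigned int i)
{
    return (int)((map[i / BITS_PER_LONG] >> (i % BITS_PER_LONG)) & 1UL);
}
```

A single range call in place of a hand-written loop keeps the behaviour identical, which is consistent with the "no functional change" note.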
    • Btrfs: Remove unused btrfs_block_group_free_space() · 92c42311
      Li Zefan committed
      We've already recorded the value in block_group->frees_space.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
  15. 18 Apr, 2011 1 commit
    • Btrfs: fix free space cache leak · f65647c2
      Chris Mason committed
      The free space caching code was recently reworked to
      cache all the pages it needed instead of using find_get_page everywhere.

      One loop was missed though, so it ended up leaking pages.  This fixes
      it to use our page array instead of find_get_page.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  16. 09 Apr, 2011 1 commit
    • Btrfs: deal with the case that we run out of space in the cache · be1a12a0
      Josef Bacik committed
      Currently we don't handle running out of space in the cache, so to fix this we
      keep track of how far into the cache we are.  Then we only dirty the pages if we
      successfully modify all of them; otherwise, if we have an error or run out of
      space, we can just drop them and not worry about the vm writing them out.
      Thanks,

      Tested-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
      Signed-off-by: Josef Bacik <josef@redhat.com>
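The "dirty only on full success" pattern can be shown with a toy page array. This sketch is an illustration under stated assumptions: `PAGE_SZ`, `struct page_buf` and `write_cache()` are hypothetical, and a `dirty` flag stands in for real page dirtying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 16   /* tiny hypothetical page size for illustration */

struct page_buf {
    char data[PAGE_SZ];
    bool dirty;
};

/*
 * Copy `len` bytes of `src` into `pages`.  The touched pages are marked
 * dirty only if everything fit; otherwise none is marked and -1 is
 * returned, so nothing half-written would be flushed.
 */
int write_cache(struct page_buf *pages, size_t nr_pages,
                const char *src, size_t len)
{
    size_t done = 0;   /* how far into the cache we are */
    size_t used = 0;   /* number of pages touched so far */

    assert(pages && src);
    while (used < nr_pages && done < len) {
        size_t chunk = len - done < PAGE_SZ ? len - done : PAGE_SZ;

        memcpy(pages[used].data, src + done, chunk);
        done += chunk;
        used++;
    }
    if (done < len)
        return -1;     /* ran out of space: leave every page clean */
    for (size_t i = 0; i < used; i++)
        pages[i].dirty = true;
    return 0;
}
```

Because dirtying is deferred to the end, a failed write leaves nothing for the VM to write back.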
  17. 05 Apr, 2011 2 commits
    • Btrfs: fix free space cache when there are pinned extents and clusters V2 · 43be2146
      Josef Bacik committed
      I noticed a huge problem with the free space cache that was presenting
      as an early ENOSPC.  Turns out when writing the free space cache out I
      forgot to take into account pinned extents and, more importantly,
      clusters.  This would result in us leaking free space every time we
      unmounted the filesystem and remounted it.

      I fix this by making sure to check and see if the current block group
      has a cluster and writing out any entries that are in the cluster to the
      cache, as well as writing any pinned extents we currently have to the
      cache, since those will be available for us to use the next time the fs
      mounts.

      This patch also adds a check to the end of load_free_space_cache to make
      sure we got the right amount of free space cache, and if not make sure
      to clear the cache and re-cache the old fashioned way.
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: clear __GFP_FS flag in the space cache inode · adae52b9
      Miao Xie committed
      The object id of the space cache inode's key is allocated from the relative
      root, just like a regular file. So we can't identify the space cache inode by
      checking the object id of the inode's key, and we have to clear the __GFP_FS flag
      at the time we look up the space cache inode.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  18. 28 Mar, 2011 1 commit
  19. 26 Mar, 2011 1 commit
    • Btrfs: cleanup how we setup free space clusters · 4e69b598
      Josef Bacik committed
      This patch makes the free space cluster refilling code a little easier to
      understand, and fixes some things with the bitmap part of it.  Currently we
      either want to refill a cluster with

      1) All normal extent entries (those without bitmaps)
      2) A bitmap entry with enough space

      The current code has this ugly jump-around logic that will first try and fill up
      the cluster with extent entries, and then if it can't do that it will try and
      find a bitmap to use.  So instead split this out into two functions, one that
      tries to find only normal entries, and one that tries to find bitmaps.

      This also fixes a suboptimal thing we would do with bitmaps.  If we used a
      bitmap we would just tell the cluster that we were pointing at a bitmap and it
      would do the tree search in the block group for that entry every time we tried
      to make an allocation.  Instead of doing that, now we just add it to the cluster's
      group.

      I tested this with my ENOSPC tests and xfstests and it survived.
      Signed-off-by: Josef Bacik <josef@redhat.com>
  20. 21 Mar, 2011 1 commit
    • Btrfs: don't be as aggressive about using bitmaps · 32cb0840
      Josef Bacik committed
      We have been creating bitmaps for small extents unconditionally forever.  This
      was great when testing to make sure the bitmap stuff was working, but is
      overkill normally.  So instead of always adding small chunks of free space to
      bitmaps, only start doing it if we go past half of our extent threshold.  This
      will keep us from creating a bitmap for just one small free extent at the front
      of the block group, and will make the allocator a little faster as a result.
      Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
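The "past half of the extent threshold" rule can be written as a small decision function. This is a sketch of the policy as described in the message, not the actual Btrfs implementation: `struct cache_state`, its fields and `should_use_bitmap()` are hypothetical, and the cut-off for what counts as a "small" extent is left as a parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a block group's free space cache. */
struct cache_state {
    unsigned int free_extents;    /* extent entries currently in use */
    unsigned int extents_thresh;  /* max extent entries before bitmaps */
    unsigned long small_limit;    /* extents at or below this are "small" */
};

/* Decide whether a free range of `bytes` should be tracked in a bitmap. */
bool should_use_bitmap(const struct cache_state *c, unsigned long bytes)
{
    assert(c);
    /* Out of extent slots: everything must go into bitmaps. */
    if (c->free_extents >= c->extents_thresh)
        return true;
    /* Large extents keep their own entry while slots remain. */
    if (bytes > c->small_limit)
        return false;
    /* Small extents go to bitmaps only past half of the threshold. */
    return c->free_extents * 2 > c->extents_thresh;
}
```

With plenty of extent slots left, a lone small extent gets a plain entry and no bitmap is created for it; bitmaps only take over once the slots start to run short.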