1. 03 6月, 2016 3 次提交
  2. 26 5月, 2016 1 次提交
  3. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  4. 07 1月, 2016 3 次提交
  5. 16 12月, 2015 1 次提交
    • C
      Btrfs: check for empty bitmap list in setup_cluster_bitmaps · 1b9b922a
      Chris Mason 提交于
      Dave Jones found a warning from kasan in setup_cluster_bitmaps()
      
      ==================================================================
      BUG: KASAN: stack-out-of-bounds in setup_cluster_bitmap+0xc4/0x5a0 at
      addr ffff88039bef6828
      Read of size 8 by task nfsd/1009
      page:ffffea000e6fbd80 count:0 mapcount:0 mapping:          (null)
      index:0x0
      flags: 0x8000000000000000()
      page dumped because: kasan: bad access detected
      CPU: 1 PID: 1009 Comm: nfsd Tainted: G        W
      4.4.0-rc3-backup-debug+ #1
       ffff880065647b50 000000006bb712c2 ffff88039bef6640 ffffffffa680a43e
       0000004559c00000 ffff88039bef66c8 ffffffffa62638d1 ffffffffa61121c0
       ffff8803a5769de8 0000000000000296 ffff8803a5769df0 0000000000046280
      Call Trace:
       [<ffffffffa680a43e>] dump_stack+0x4b/0x6d
       [<ffffffffa62638d1>] kasan_report_error+0x501/0x520
       [<ffffffffa61121c0>] ? debug_show_all_locks+0x1e0/0x1e0
       [<ffffffffa6263948>] kasan_report+0x58/0x60
       [<ffffffffa6814b00>] ? rb_last+0x10/0x40
       [<ffffffffa66f8af4>] ? setup_cluster_bitmap+0xc4/0x5a0
       [<ffffffffa6262ead>] __asan_load8+0x5d/0x70
       [<ffffffffa66f8af4>] setup_cluster_bitmap+0xc4/0x5a0
       [<ffffffffa66f675a>] ? setup_cluster_no_bitmap+0x6a/0x400
       [<ffffffffa66fcd16>] btrfs_find_space_cluster+0x4b6/0x640
       [<ffffffffa66fc860>] ? btrfs_alloc_from_cluster+0x4e0/0x4e0
       [<ffffffffa66fc36e>] ? btrfs_return_cluster_to_free_space+0x9e/0xb0
       [<ffffffffa702dc37>] ? _raw_spin_unlock+0x27/0x40
       [<ffffffffa666a1a1>] find_free_extent+0xba1/0x1520
      
      Andrey noticed this was because we were doing list_first_entry on a list
      that might be empty.  Rework the tests a bit so we don't do that.
      Signed-off-by: NChris Mason <clm@fb.com>
      Reprorted-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
      Reported-by: NDave Jones <dsj@fb.com>
      1b9b922a
  6. 10 12月, 2015 1 次提交
  7. 03 12月, 2015 1 次提交
  8. 07 11月, 2015 1 次提交
  9. 22 10月, 2015 5 次提交
  10. 08 10月, 2015 1 次提交
  11. 29 7月, 2015 1 次提交
  12. 03 6月, 2015 1 次提交
  13. 11 5月, 2015 1 次提交
    • F
      Btrfs: fix crash after inode cache writeback failure · e43699d4
      Filipe Manana 提交于
      If the writeback of an inode cache failed we were unnecessarilly
      attempting to release again the delalloc metadata that we previously
      reserved. However attempting to do this a second time triggers an
      assertion at drop_outstanding_extent() because we have no more
      outstanding extents for our inode cache's inode. If we were able
      to start writeback of the cache the reserved metadata space is
      released at btrfs_finished_ordered_io(), even if an error happens
      during writeback.
      
      So make sure we don't repeat the metadata space release if writeback
      started for our inode cache.
      
      This issue was trivial to reproduce by running the fstest btrfs/088
      with "-o inode_cache", which triggered the assertion leading to a
      BUG() call and requiring a reboot in order to run the remaining
      fstests. Trace produced by btrfs/088:
      
      [255289.385904] BTRFS: assertion failed: BTRFS_I(inode)->outstanding_extents >= num_extents, file: fs/btrfs/extent-tree.c, line: 5276
      [255289.388094] ------------[ cut here ]------------
      [255289.389184] kernel BUG at fs/btrfs/ctree.h:4057!
      [255289.390125] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      (...)
      [255289.392068] Call Trace:
      [255289.392068]  [<ffffffffa035e774>] drop_outstanding_extent+0x3d/0x6d [btrfs]
      [255289.392068]  [<ffffffffa0364988>] btrfs_delalloc_release_metadata+0x54/0xe3 [btrfs]
      [255289.392068]  [<ffffffffa03b4174>] btrfs_write_out_ino_cache+0x95/0xad [btrfs]
      [255289.392068]  [<ffffffffa036f5c4>] btrfs_save_ino_cache+0x275/0x2dc [btrfs]
      [255289.392068]  [<ffffffffa03e2d83>] commit_fs_roots.isra.12+0xaa/0x137 [btrfs]
      [255289.392068]  [<ffffffff8107d33d>] ? trace_hardirqs_on+0xd/0xf
      [255289.392068]  [<ffffffffa037841f>] ? btrfs_commit_transaction+0x4b1/0x9c9 [btrfs]
      [255289.392068]  [<ffffffff814351a4>] ? _raw_spin_unlock+0x32/0x46
      [255289.392068]  [<ffffffffa037842e>] btrfs_commit_transaction+0x4c0/0x9c9 [btrfs]
      (...)
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e43699d4
  14. 07 5月, 2015 1 次提交
  15. 26 4月, 2015 1 次提交
  16. 25 4月, 2015 1 次提交
  17. 24 4月, 2015 1 次提交
    • C
      Btrfs: fix inode cache writeout · 85db36cf
      Chris Mason 提交于
      The code to fix stalls during free spache cache IO wasn't using
      the correct root when waiting on the IO for inode caches.  This
      is only a problem when the inode cache is enabled with
      
      mount -o inode_cache
      
      This fixes the inode cache writeout to preserve any error values and
      makes sure not to override the root when inode cache writeout is done.
      Reported-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      85db36cf
  18. 11 4月, 2015 5 次提交
    • C
      Btrfs: allow block group cache writeout outside critical section in commit · 1bbc621e
      Chris Mason 提交于
      We loop through all of the dirty block groups during commit and write
      the free space cache.  In order to make sure the cache is currect, we do
      this while no other writers are allowed in the commit.
      
      If a large number of block groups are dirty, this can introduce long
      stalls during the final stages of the commit, which can block new procs
      trying to change the filesystem.
      
      This commit changes the block group cache writeout to take appropriate
      locks and allow it to run earlier in the commit.  We'll still have to
      redo some of the block groups, but it means we can get most of the work
      out of the way without blocking the entire FS.
      Signed-off-by: NChris Mason <clm@fb.com>
      1bbc621e
    • C
      Btrfs: don't use highmem for free space cache pages · 2b108268
      Chris Mason 提交于
      In order to create the free space cache concurrently with FS modifications,
      we need to take a few block group locks.
      
      The cache code also does kmap, which would schedule with the locks held.
      Instead of going through kmap_atomic, lets just use lowmem for the cache
      pages.
      Signed-off-by: NChris Mason <clm@fb.com>
      2b108268
    • C
      Btrfs: two stage dirty block group writeout · c9dc4c65
      Chris Mason 提交于
      Block group cache writeout is currently waiting on the pages for each
      block group cache before moving on to writing the next one.  This commit
      switches things around to send down all the caches and then wait on them
      in batches.
      
      The end result is much faster, since we're keeping the disk pipeline
      full.
      Signed-off-by: NChris Mason <clm@fb.com>
      c9dc4c65
    • C
      btrfs: move struct io_ctl into ctree.h and rename it · 4c6d1d85
      Chris Mason 提交于
      We'll need to put the io_ctl into the block_group cache struct, so
      name it struct btrfs_io_ctl and move it into ctree.h
      Signed-off-by: NChris Mason <clm@fb.com>
      4c6d1d85
    • C
      btrfs: actively run the delayed refs while deleting large files · 28ed1345
      Chris Mason 提交于
      When we are deleting large files with large extents, we are building up
      a huge set of delayed refs for processing.  Truncate isn't checking
      often enough to see if we need to back off and process those, or let
      a commit proceed.
      
      The end result is long stalls after the rm, and very long commit times.
      During the commits, other processes back up waiting to start new
      transactions and we get into trouble.
      Signed-off-by: NChris Mason <clm@fb.com>
      28ed1345
  19. 04 3月, 2015 7 次提交
  20. 21 2月, 2015 1 次提交
  21. 03 2月, 2015 1 次提交
  22. 22 1月, 2015 1 次提交
    • J
      Btrfs: track dirty block groups on their own list · ce93ec54
      Josef Bacik 提交于
      Currently any time we try to update the block groups on disk we will walk _all_
      block groups and check for the ->dirty flag to see if it is set.  This function
      can get called several times during a commit.  So if you have several terabytes
      of data you will be a very sad panda as we will loop through _all_ of the block
      groups several times, which makes the commit take a while which slows down the
      rest of the file system operations.
      
      This patch introduces a dirty list for the block groups that we get added to
      when we dirty the block group for the first time.  Then we simply update any
      block groups that have been dirtied since the last time we called
      btrfs_write_dirty_block_groups.  This allows us to clean up how we write the
      free space cache out so it is much cleaner.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      ce93ec54