1. 21 7月, 2016 1 次提交
  2. 08 6月, 2016 2 次提交
  3. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  4. 15 12月, 2015 1 次提交
    • B
      GFS2: Make rgrp reservations part of the gfs2_inode structure · a097dc7e
      Bob Peterson 提交于
      Before this patch, multi-block reservation structures were allocated
      from a special slab. This patch folds the structure into the gfs2_inode
      structure. The disadvantage is that the gfs2_inode needs more memory,
      even when a file is opened read-only. The advantages are: (a) we don't
      need the special slab and the extra time it takes to allocate and
      deallocate from it. (b) we no longer need to worry that the structure
      exists for things like quota management. (c) This also allows us to
      remove the calls to get_write_access and put_write_access since we
      know the structure will exist.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      a097dc7e
  5. 24 11月, 2015 1 次提交
    • B
      GFS2: Extract quota data from reservations structure (revert 5407e242) · b54e9a0b
      Bob Peterson 提交于
      This patch basically reverts the majority of patch 5407e242.
      That patch eliminated the gfs2_qadata structure in favor of just
      using the reservations structure. The problem with doing that is that
      it increases the size of the reservations structure. That is not an
      issue until it comes time to fold the reservations structure into the
      inode in memory so we know it's always there. By separating out the
      quota structure again, we aren't punishing the non-quota users by
      making all the inodes bigger, requiring more slab space. This patch
      creates a new slab area to allocate the quota stuff so it's managed
      a little more sanely.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      b54e9a0b
  6. 19 3月, 2015 1 次提交
    • A
      gfs2: perform quota checks against allocation parameters · b8fbf471
      Abhi Das 提交于
      Use struct gfs2_alloc_parms as an argument to gfs2_quota_check()
      and gfs2_quota_lock_check() to check for quota violations while
      accounting for the new blocks requested by the current operation
      in ap->target.
      
      Previously, the number of new blocks requested during an operation
      were not accounted for during quota_check and would allow these
      operations to exceed quota. This was not very apparent since most
      operations allocated only 1 block at a time and quotas would get
      violated in the next operation. i.e. quota excess would only be by
      1 block or so. With fallocate, (where we allocate a bunch of blocks
      at once) the quota excess is non-trivial and is addressed by this
      patch.
      Signed-off-by: NAbhi Das <adas@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
      b8fbf471
  7. 21 8月, 2014 1 次提交
  8. 16 5月, 2014 1 次提交
  9. 03 3月, 2014 1 次提交
    • S
      GFS2: Clean up journal extent mapping · b50f227b
      Steven Whitehouse 提交于
      This patch fixes a long standing issue in mapping the journal
      extents. Most journals will consist of only a single extent,
      and although the cache took account of that by merging extents,
      it did not actually map large extents, but instead was doing a
      block by block mapping. Since the journal was only being mapped
      on mount, this was not normally noticeable.
      
      With the updated code, it is now possible to use the same extent
      mapping system during journal recovery (which will be added in a
      later patch). This will allow checking of the integrity of the
      journal before any reply of the journal content is attempted. For
      this reason the code is moving to bmap.c, since it will be used
      more widely in due course.
      
      An exercise left for the reader is to compare the new function
      gfs2_map_journal_extents() with gfs2_write_alloc_required()
      
      Additionally, should there be a failure, the error reporting is
      also updated to show more detail about what went wrong.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b50f227b
  10. 02 10月, 2013 1 次提交
    • S
      GFS2: Add allocation parameters structure · 7b9cff46
      Steven Whitehouse 提交于
      This patch adds a structure to contain allocation parameters with
      the intention of future expansion of this structure. The idea is
      that we should be able to add more information about the allocation
      in the future in order to allow the allocator to make a better job
      of placing the requests on-disk.
      
      There is no functional difference from applying this patch.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7b9cff46
  11. 27 9月, 2013 1 次提交
    • S
      GFS2: Clean up reservation removal · af5c2697
      Steven Whitehouse 提交于
      The reservation for an inode should be cleared when it is truncated so
      that we can start again at a different offset for future allocations.
      We could try and do better than that, by resetting the search based on
      where the truncation started from, but this is only a first step.
      
      In addition, there are three callers of gfs2_rs_delete() but only one
      of those should really be testing the value of i_writecount. While
      we get away with that in the other cases currently, I think it would
      be better if we made that test specific to the one case which
      requires it.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      af5c2697
  12. 13 9月, 2013 1 次提交
  13. 28 6月, 2013 1 次提交
  14. 03 6月, 2013 1 次提交
  15. 08 4月, 2013 1 次提交
  16. 13 2月, 2013 1 次提交
  17. 02 2月, 2013 1 次提交
  18. 29 1月, 2013 2 次提交
    • S
      GFS2: Use ->writepages for ordered writes · 45138990
      Steven Whitehouse 提交于
      Instead of using a list of buffers to write ahead of the journal
      flush, this now uses a list of inodes and calls ->writepages
      via filemap_fdatawrite() in order to achieve the same thing. For
      most use cases this results in a shorter ordered write list,
      as well as much larger i/os being issued.
      
      The ordered write list is sorted by inode number before writing
      in order to retain the disk block ordering between inodes as
      per the previous code.
      
      The previous ordered write code used to conflict in its assumptions
      about how to write out the disk blocks with mpage_writepages()
      so that with this updated version we can also use mpage_writepages()
      for GFS2's ordered write, writepages implementation. So we will
      also send larger i/os from writeback too.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      45138990
    • S
      GFS2: Split gfs2_trans_add_bh() into two · 350a9b0a
      Steven Whitehouse 提交于
      There is little common content in gfs2_trans_add_bh() between the data
      and meta classes by the time that the functions which it calls are
      taken into account. The intent here is to split this into two
      separate functions. Stage one is to introduce gfs2_trans_add_data()
      and gfs2_trans_add_meta() and update the callers accordingly.
      
      Later patches will then pull in the content of gfs2_trans_add_bh()
      and its dependent functions in order to clean up the code in this
      area.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      350a9b0a
  19. 13 11月, 2012 1 次提交
  20. 07 11月, 2012 1 次提交
    • S
      GFS2: Add Orlov allocator · 9dbe9610
      Steven Whitehouse 提交于
      Just like ext3, this works on the root directory and any directory
      with the +T flag set. Also, just like ext3, any subdirectory created
      in one of the just mentioned cases will be allocated to a random
      resource group (GFS2 equivalent of a block group).
      
      If you are creating a set of directories, each of which will contain a
      job running on a different node, then by setting +T on the parent
      directory before creating the subdirectories, each will land up in a
      different resource group, and thus resource group contention between
      nodes will be kept to a minimum.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9dbe9610
  21. 24 9月, 2012 1 次提交
    • S
      GFS2: Add structure to contain rgrp, bitmap, offset tuple · 4a993fb1
      Steven Whitehouse 提交于
      This patch introduces a new structure, gfs2_rbm, which is a
      tuple of a resource group, a bitmap within the resource group
      and an offset within that bitmap. This is designed to make
      manipulating these sets of variables easier. There is also a
      new helper function which converts this representation back
      to a disk block address.
      
      In addition, the rbtree nodes which are used for the reservations
      were not being correctly initialised, which is now fixed. Also,
      the tracing was not passing through the inode where it should
      have been. That is mostly fixed aside from one corner case. This
      needs to be revisited since there can also be a NULL rgrp in
      some cases which results in the device being incorrect in the
      trace.
      
      This is intended to be the first step towards cleaning up some
      of the allocation code, and some further bug fixes.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      4a993fb1
  22. 19 7月, 2012 1 次提交
    • B
      GFS2: Reduce file fragmentation · 8e2e0047
      Bob Peterson 提交于
      This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
      resulting improved on disk layout greatly speeds up operations in cases
      which would have resulted in interlaced allocation of blocks previously.
      A typical example of this is 10 parallel dd processes, each writing to a
      file in a common dirctory.
      
      The implementation uses an rbtree of reservations attached to each
      resource group (and each inode).
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      8e2e0047
  23. 06 6月, 2012 1 次提交
  24. 11 5月, 2012 1 次提交
  25. 24 4月, 2012 1 次提交
  26. 05 4月, 2012 1 次提交
  27. 20 3月, 2012 1 次提交
  28. 22 11月, 2011 1 次提交
  29. 21 11月, 2011 1 次提交
    • B
      GFS2: move toward a generic multi-block allocator · 6e87ed0f
      Bob Peterson 提交于
      This patch is a revision of the one I previously posted.
      I tried to integrate all the suggestions Steve gave.
      The purpose of the patch is to change function gfs2_alloc_block
      (allocate either a dinode block or an extent of data blocks)
      to a more generic gfs2_alloc_blocks function that can
      allocate both a dinode _and_ an extent of data blocks in the
      same call. This will ultimately help us create a multi-block
      reservation scheme to reduce file fragmentation.
      
      This patch moves more toward a generic multi-block allocator that
      takes a pointer to the number of data blocks to allocate, plus whether
      or not to allocate a dinode. In theory, it could be called to allocate
      (1) a single dinode block, (2) a group of one or more data blocks, or
      (3) a dinode plus several data blocks.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6e87ed0f
  30. 15 11月, 2011 1 次提交
  31. 08 11月, 2011 1 次提交
  32. 21 10月, 2011 6 次提交
    • S
      GFS2: Move readahead of metadata during deallocation into its own function · b99b98dc
      Steven Whitehouse 提交于
      Move the recently added readahead of the indirect pointer
      tree during deallocation into its own function in order
      that we can use it elsewhere in the future. Also this
      fixes the resetting of the "first" variable in the
      original patch.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b99b98dc
    • B
      GFS2: rewrite fallocate code to write blocks directly · 64dd153c
      Benjamin Marzinski 提交于
      GFS2's fallocate code currently goes through the page cache. Since it's only
      writing to the end of the file or to holes in it, it doesn't need to, and it
      was causing issues on low memory environments. This patch pulls in some of
      Steve's block allocation work, and uses it to simply allocate the blocks for
      the file, and zero them out at allocation time.  It provides a slight
      performance increase, and it dramatically simplifies the code.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      64dd153c
    • B
      GFS2: speed up delete/unlink performance for large files · bd5437a7
      Bob Peterson 提交于
      This patch improves the performance of delete/unlink
      operations in a GFS2 file system where the files are large
      by adding a layer of metadata read-ahead for indirect blocks.
      Mileage will vary, but on my system, deleting an 8.6G file
      dropped from 22 seconds to about 4.5 seconds.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      bd5437a7
    • S
      GFS2: Use cached rgrp in gfs2_rlist_add() · 70b0c365
      Steven Whitehouse 提交于
      Each block which is deallocated, requires a call to gfs2_rlist_add()
      and each of those calls was calling gfs2_blk2rgrpd() in order to
      figure out which rgrp the block belonged in. This can be speeded up
      by making use of the rgrp cached in the inode. We also reset this
      cached rgrp in case the block has changed rgrp. This should provide
      a big reduction in gfs2_blk2rgrpd() calls during deallocation.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      70b0c365
    • S
      GFS2: Call do_strip() directly from recursive_scan() · d56fa8a1
      Steven Whitehouse 提交于
      The recursive_scan() function only ever takes a single "bc"
      argument, so we might as well just call do_strip() directly
      from resource_scan() rather than pass it in as an argument.
      
      Also the "data" argument is always a struct strip_mine, so
      we can pass that in, rather than using a void pointer.
      
      This also moves do_strip() ahead of recursive_scan() so that
      we don't need to add a prototype.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d56fa8a1
    • S
      GFS2: Make resource groups "append only" during life of fs · 8339ee54
      Steven Whitehouse 提交于
      Since we have ruled out supporting online filesystem shrink,
      it is possible to make the resource group list append only
      during the life of a super block. This gives several benefits:
      
      Firstly, we only need to read new rindex elements as they are added
      rather than needing to reread the whole rindex file each time one
      element is added.
      
      Secondly, the rindex glock can be held for much shorter periods of
      time, and is completely removed from the fast path for allocations.
      The lock is taken in shared mode only when updating the resource
      groups when the first allocation occurs, and after a grow has
      taken place.
      
      Thirdly, this results in a reduction in code size, and everything
      gets a lot simpler to understand in this area.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      8339ee54
  33. 21 7月, 2011 1 次提交