1. 12 Mar, 2014 2 commits
  2. 07 Mar, 2014 5 commits
  3. 06 Mar, 2014 1 commit
  4. 03 Mar, 2014 1 commit
    • GFS2: Clean up journal extent mapping · b50f227b
      Steven Whitehouse committed
      This patch fixes a long standing issue in mapping the journal
      extents. Most journals will consist of only a single extent,
      and although the cache took account of that by merging extents,
      it did not actually map large extents, but instead was doing a
      block by block mapping. Since the journal was only being mapped
      on mount, this was not normally noticeable.
      
      With the updated code, it is now possible to use the same extent
      mapping system during journal recovery (which will be added in a
      later patch). This will allow checking of the integrity of the
      journal before any replay of the journal content is attempted. For
      this reason the code is moving to bmap.c, since it will be used
      more widely in due course.
      
      An exercise left for the reader is to compare the new function
      gfs2_map_journal_extents() with gfs2_write_alloc_required()
      
      Additionally, should there be a failure, the error reporting is
      also updated to show more detail about what went wrong.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
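The difference between block-by-block mapping and extent mapping described in this commit can be illustrated with a small userspace sketch. It is not the code from the patch: `struct extent`, `extent_add()` and `extent_lookup()` are hypothetical names chosen for illustration, and the real gfs2_map_journal_extents() works on kernel structures that are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: one run of physically contiguous blocks. */
struct extent {
    uint64_t lblock; /* first logical (journal-relative) block */
    uint64_t dblock; /* first disk block */
    uint64_t blocks; /* run length in blocks */
};

/*
 * Append a freshly mapped run, merging it into the previous extent when
 * it continues that extent both logically and on disk.
 * Returns the new extent count, or -1 if the array is full.
 */
int extent_add(struct extent *ext, size_t n, size_t max,
               uint64_t lblock, uint64_t dblock, uint64_t blocks)
{
    if (n > 0 &&
        ext[n - 1].lblock + ext[n - 1].blocks == lblock &&
        ext[n - 1].dblock + ext[n - 1].blocks == dblock) {
        ext[n - 1].blocks += blocks;
        return (int)n;
    }
    if (n == max)
        return -1;
    ext[n] = (struct extent){ lblock, dblock, blocks };
    return (int)(n + 1);
}

/*
 * Translate a logical journal block to a disk block by scanning the
 * extent list. Returns 0 on success, -1 if the block is not mapped.
 */
int extent_lookup(const struct extent *ext, size_t n,
                  uint64_t lblock, uint64_t *dblock)
{
    for (size_t i = 0; i < n; i++) {
        if (lblock >= ext[i].lblock &&
            lblock - ext[i].lblock < ext[i].blocks) {
            *dblock = ext[i].dblock + (lblock - ext[i].lblock);
            return 0;
        }
    }
    return -1;
}
```

Because a whole run is added per call, a journal that occupies a single extent needs one mapping step rather than one per block, and a lookup that fails can be reported as an integrity problem before any replay starts.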
  5. 27 Feb, 2014 1 commit
  6. 25 Feb, 2014 3 commits
    • GFS2: Remove extra "if" in gfs2_log_flush() · b1ab1e44
      Steven Whitehouse committed
      By reordering some of the assignments in gfs2_log_flush() it
      is possible to remove one of the "if" statements as it can be
      merged with one higher up the function.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Move log buffer accounting to transaction · 022ef4fe
      Steven Whitehouse committed
      Now we have a master transaction into which other transactions
      are merged, the accounting can be done using this master
      transaction. We no longer require the superblock fields which
      were being used for this function.
      
      In addition, this allows for a clean up in calc_reserved()
      making it rather easier to understand. Also, by reducing the
      number of variables used to track the buffers being added
      and removed from the journal, a number of error checks are
      now no longer required.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Move log buffer lists into transaction · d69a3c65
      Steven Whitehouse committed
      Over time, we hope to be able to improve the concurrency available
      in the log code. This is one small step towards that, by moving
      the buffer lists from the super block, and into the transaction
      structure, so that each transaction builds its own buffer lists.
      
      At transaction commit time, the buffer lists are merged into
      the currently accumulating transaction. That transaction then
      is passed into the before and after commit functions at journal
      flush time. Thus there should be no change in overall behaviour
      yet.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
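The commit-time merge described above can be sketched in userspace C as a constant-time list splice. This is only an illustration under simplifying assumptions: the names `struct buf`, `struct buf_list`, `struct trans`, `trans_add_buf()` and `trans_merge()` are hypothetical, a singly linked list stands in for the kernel's list type, and all locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log buffer; only the block number matters here. */
struct buf {
    int blkno;
    struct buf *next;
};

/* Singly linked list with a tail pointer so that splicing is O(1). */
struct buf_list {
    struct buf *head, *tail;
    size_t count;
};

/* Each transaction builds its own buffer list. */
struct trans {
    struct buf_list bufs;
};

/* Append one buffer to the transaction's private list. */
void trans_add_buf(struct trans *tr, struct buf *b)
{
    b->next = NULL;
    if (tr->bufs.tail)
        tr->bufs.tail->next = b;
    else
        tr->bufs.head = b;
    tr->bufs.tail = b;
    tr->bufs.count++;
}

/* At commit time, move every buffer from tr into the accumulating one. */
void trans_merge(struct trans *master, struct trans *tr)
{
    if (!tr->bufs.head)
        return;
    if (master->bufs.tail)
        master->bufs.tail->next = tr->bufs.head;
    else
        master->bufs.head = tr->bufs.head;
    master->bufs.tail = tr->bufs.tail;
    master->bufs.count += tr->bufs.count;
    tr->bufs.head = tr->bufs.tail = NULL;
    tr->bufs.count = 0;
}
```

Keeping the count alongside the list is one way the buffer accounting can live in the transaction rather than in the super block.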
  7. 21 Feb, 2014 1 commit
  8. 17 Feb, 2014 1 commit
  9. 10 Feb, 2014 1 commit
  10. 07 Feb, 2014 1 commit
    • GFS2: Add meta readahead field in directory entries · 44aaada9
      Steven Whitehouse committed
      The intent of this new field in the directory entry is to
      allow a subsequent lookup to know how many blocks, which
      are contiguous with the inode, contain metadata which relates
      to the inode. This will then allow the issuing of a single
      read to read these blocks, rather than reading the inode
      first, and then issuing a second read for the metadata.
      
      This only works under some fairly strict conditions, since
      we do not have back pointers from inodes to directory entries
      we must ensure that the blocks referenced in this way will
      always belong to the inode.
      
      This rules out being able to use this system for indirect
      blocks, as these can change as a result of truncate/rewrite.
      
      So the idea here is to restrict this to xattr blocks only
      for the time being. For most inodes, that means only a
      single block. Also, when using ACLs and/or SELinux or
      other LSMs, these will be added at inode creation time
      so that they will be contiguous with the inode on disk and
      also will almost always be needed when we read the inode in
      for permissions checks.
      
      Once an xattr block for an inode is allocated, it will never
      change until the inode is deallocated.
      
      This patch adds the new field; a further patch will add the
      readahead in due course.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
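How a lookup might use such a field can be sketched as follows. The readahead itself is left to a later patch, so this is purely an assumption about the intended use: `struct dirent_hint`, `struct read_req` and `plan_inode_read()` are hypothetical names, not the on-disk format or any GFS2 function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the directory entry fields relevant here. */
struct dirent_hint {
    uint64_t inode_block; /* disk block holding the inode */
    uint16_t rahead;      /* metadata blocks contiguous with the inode */
};

/* A single read request covering a run of blocks. */
struct read_req {
    uint64_t start;
    uint32_t nblocks;
};

/*
 * Plan one read that covers the inode plus any contiguous metadata
 * (for example a single xattr block) recorded in the directory entry.
 * A hint of zero degrades to reading just the inode block.
 */
struct read_req plan_inode_read(const struct dirent_hint *de)
{
    struct read_req req = { de->inode_block, 1u + de->rahead };
    return req;
}
```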
  11. 06 Feb, 2014 2 commits
    • GFS2: Lock i_mutex and use a local gfs2_holder for fallocate · a0846a53
      Bob Peterson committed
      This patch causes GFS2 to lock the i_mutex during fallocate. It
      also switches from using a dinode's inode glock to using a local
      holder like the other GFS2 i_operations.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: journal data writepages update · 774016b2
      Steven Whitehouse committed
      GFS2 has carried what is more or less a copy of the
      write_cache_pages() for some time. It seems that this
      copy has slipped behind the core code over time. This
      patch brings it back up to date, and in addition adds the
      tracepoint which would otherwise be missing.
      
      We could go further, and eliminate some or all of the
      code duplication here. The issue is that if we do that,
      then the function we need to split out from the existing
      write_cache_pages(), which will look a lot like
      gfs2_jdata_write_pagevec(), would land up putting quite a
      lot of extra variables on the stack. I know that has been
      a problem in the past in the writeback code path, which
      is why I've hesitated to do it here.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  12. 04 Feb, 2014 1 commit
    • GFS2: Allocate block for xattr at inode alloc time, if required · b2c8b3ea
      Steven Whitehouse committed
      This is another step towards improving the allocation of xattr
      blocks at inode allocation time. Here we take advantage of
      Christoph's recent work on ACLs to allocate a block for the
      xattrs early if we know that we will be adding ACLs to the
      inode later on. The advantage of that is that it is much
      more likely that we'll get a contiguous run of two blocks
      where the first is the inode and the second is the xattr block.
      
      We still have to fall back to the original system in case we
      don't get the requested two contiguous blocks, or in case the
      ACLs are too large to fit into the block.
      
      Future patches will move more of the ACL setting code further
      up the gfs2_inode_create() function. Also, I'd like to be
      able to do the same thing with the xattrs from LSMs in
      due course, too. That way we should be able to slowly reduce
      the number of independent transactions, at least in the
      most common cases.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  13. 03 Feb, 2014 1 commit
    • GFS2: Plug on AIL flush · 885bceca
      Steven Whitehouse committed
      When we do a flush of the AIL list, we are writing out what is
      likely to be a lot of small I/Os, which are possibly in an order
      which is not ideal performance-wise. Since this is done by calling
      filemap_fdatawrite for each individual inode's address space there
      is no overall plugging going on.
      
      In addition to that, we do not always wait for AIL i/o when we flush
      it, so that it is possible for things to get left behind on the queue.
      By adding explicit plugging here, we reduce the chances of this
      being an issue. A quick test using the AIL flush tracepoint shows a
      small, but measurable improvement.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  14. 26 Jan, 2014 3 commits
  15. 18 Jan, 2014 1 commit
  16. 16 Jan, 2014 2 commits
  17. 15 Jan, 2014 6 commits
    • GFS2: Fix kbuild test robot reported warning · 1e3d3620
      Steven Whitehouse committed
      Well I don't get the same warning locally as the kbuild
      robot, but I guess this should fix the problem, anyway.
      Here is the warning:
      
      head:   2d9e7230
      commit: ee2411a8 [19/20] GFS2: Clean up quota slot allocation
      config: make ARCH=powerpc allmodconfig
      
      All error/warnings:
      
         fs/gfs2/quota.c: In function 'gfs2_quota_init':
      >> fs/gfs2/quota.c:1246:3: error: implicit declaration of function '__vmalloc' [-Werror=implicit-function-declaration]
            sdp->sd_quota_bitmap = __vmalloc(bm_size, GFP_NOFS, PAGE_KERNEL);
            ^
      >> fs/gfs2/quota.c:1246:24: warning: assignment makes pointer from integer without a cast [enabled by default]
            sdp->sd_quota_bitmap = __vmalloc(bm_size, GFP_NOFS, PAGE_KERNEL);
                                 ^
         fs/gfs2/quota.c: In function 'gfs2_quota_cleanup':
      >> fs/gfs2/quota.c:1361:4: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
             vfree(sdp->sd_quota_bitmap);
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Move quota bitmap operations under their own lock · 2d9e7230
      Steven Whitehouse committed
      Gradually, the global qd_lock is being used for less and less.
      After this patch it will only be used for the per super block
      list whose purpose is to allow syncing of changes back to the
      master quota file from the local quota changes file. Fixing
      up that process to make it more efficient will be the subject
      of a later patch, however this patch removes another barrier
      to doing that.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Abhijith Das <adas@redhat.com>
    • GFS2: Clean up quota slot allocation · ee2411a8
      Steven Whitehouse committed
      Quota slot allocation has historically used a vector of pages
      and a set of homegrown find/test/set/clear bit functions. Since
      the size of the bitmap is likely to be based on the default
      qc file size, that's a couple of pages at most. So we ought
      to be able to allocate that as a single chunk, with a vmalloc
      fallback, just in case of memory fragmentation.
      
      We are then able to use the kernel's own find/test/set/clear
      bit functions, rather than rolling our own.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Abhijith Das <adas@redhat.com>
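The idea of a single-chunk bitmap with find/test/set/clear operations can be sketched in userspace C. This is an illustration only: the kernel code uses its own bit helpers and allocators, whereas here `calloc()` stands in for the allocation, and `struct slot_map`, `slot_map_init()`, `slot_alloc()`, `slot_free()` and `slot_map_destroy()` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One contiguous bitmap; bit i set means slot i is in use. */
struct slot_map {
    unsigned long *bits;
    size_t nslots;
};

/* Allocate a zeroed bitmap large enough for nslots bits. */
int slot_map_init(struct slot_map *m, size_t nslots)
{
    size_t words = (nslots + BITS_PER_WORD - 1) / BITS_PER_WORD;

    m->bits = calloc(words ? words : 1, sizeof(unsigned long));
    m->nslots = nslots;
    return m->bits ? 0 : -1;
}

/* Find the first clear bit, set it, and return its index (-1 if full). */
long slot_alloc(struct slot_map *m)
{
    for (size_t i = 0; i < m->nslots; i++) {
        unsigned long *word = &m->bits[i / BITS_PER_WORD];
        unsigned long mask = 1UL << (i % BITS_PER_WORD);

        if (!(*word & mask)) {
            *word |= mask;
            return (long)i;
        }
    }
    return -1;
}

/* Clear the bit for a slot so that it can be handed out again. */
void slot_free(struct slot_map *m, size_t slot)
{
    m->bits[slot / BITS_PER_WORD] &= ~(1UL << (slot % BITS_PER_WORD));
}

void slot_map_destroy(struct slot_map *m)
{
    free(m->bits);
    m->bits = NULL;
}
```

A flat array of words like this is what makes generic bit helpers applicable, in contrast to a vector of separately allocated pages.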
    • GFS2: Only run logd and quota when mounted read/write · 8ad151c2
      Steven Whitehouse committed
      While investigating a rather strange bit of code in the quota
      clean up function, I spotted that the reason for its existence
      was that when remounting read only, we were not stopping the
      quotad thread, and thus it was possible for it to still have
      a reference to some of the quotas in that case.
      
      This patch moves the logd and quota thread start and stop into
      the make_fs_rw/ro functions, so that we now stop those threads
      when mounted read only.
      
      This means that quotad will always be stopped before we call
      the quota clean up function, and we can thus dispose of the
      (rather hackish) code that waits for it to give up its
      reference on the quotas.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Abhijith Das <adas@redhat.com>
    • GFS2: Use RCU/hlist_bl based hash for quotas · c754fbbb
      Steven Whitehouse committed
      Prior to this patch, GFS2 kept all the quotas for each
      super block in a single linked list. This is rather slow
      when there are large numbers of quotas.
      
      This patch introduces a hlist_bl based hash table, similar
      to the one used for glocks. The initial look up of the quota
      is now lockless in the case where it is already cached,
      although we still have to take the per quota spinlock in
      order to bump the ref count. Either way though, this is a
      big improvement on what was there before.
      
      The qd_lock and the per super block list is preserved, for
      the time being. However it is intended that since this is no
      longer used for its original role, it should be possible to
      shrink the number of items on that list in due course and
      remove the requirement to take qd_lock in qd_get.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Abhijith Das <adas@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
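Why a hash table helps over a single list can be seen in a small userspace sketch. It deliberately leaves out the RCU and hlist_bl machinery that the patch relies on, so it shows only the bucketed lookup and the reference count bump. All names (`struct quota_ent`, `struct quota_table`, `quota_hash()`, `quota_insert()`, `quota_get()`) and the table size are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QD_HASH_BITS 6
#define QD_HASH_SIZE (1u << QD_HASH_BITS)

/* Hypothetical cached quota entry, chained within one hash bucket. */
struct quota_ent {
    uint32_t id;
    int refs;
    struct quota_ent *next;
};

struct quota_table {
    struct quota_ent *bucket[QD_HASH_SIZE];
};

/* Multiplicative hash: keep the top QD_HASH_BITS bits of the product. */
unsigned int quota_hash(uint32_t id)
{
    return (uint32_t)(id * UINT32_C(2654435761)) >> (32 - QD_HASH_BITS);
}

/* Push a new entry onto the head of its bucket chain. */
void quota_insert(struct quota_table *t, struct quota_ent *qd)
{
    unsigned int h = quota_hash(qd->id);

    qd->next = t->bucket[h];
    t->bucket[h] = qd;
}

/*
 * Look up an id; only the entries in one bucket are scanned.
 * On a hit the reference count is bumped, which in the real code is
 * the step that still needs the per quota spinlock.
 */
struct quota_ent *quota_get(struct quota_table *t, uint32_t id)
{
    for (struct quota_ent *qd = t->bucket[quota_hash(id)]; qd; qd = qd->next) {
        if (qd->id == id) {
            qd->refs++;
            return qd;
        }
    }
    return NULL;
}
```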
    • GFS2: No need to invalidate pages for a dio read · 086352f1
      Steven Whitehouse committed
      We recently fixed the writeback of pages prior to performing
      direct i/o, however the initial fix was perhaps a bit heavy
      handed. There is no need to invalidate pages if the direct i/o
      is only a read, since they will be identical to what has been
      flushed to disk anyway.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  18. 10 Jan, 2014 1 commit
  19. 08 Jan, 2014 2 commits
    • GFS2: Add hints to directory leaf blocks · 01bcb0de
      Steven Whitehouse committed
      This patch adds four new fields to directory leaf blocks.
      The intent is not to use them in the kernel itself, although
      perhaps we may be able to use them as hints at some later date,
      but instead to provide more information for debug/fsck use.
      
      One new field adds a pointer to the inode to which the leaf
      belongs. This can be useful if the pointer to the leaf block
      has become corrupt, as it will allow us to know which inode
      this block should be associated with. This field is set when
      the leaf is created and never changed over its lifetime.
      
      The second field is a "distance from the hash table" field.
      The meaning is as follows:
       0  = An old leaf in which this value has not been set
       1  = This leaf is pointed to directly from the hash table
       2+ = This leaf is part of a chain, pointed to by another leaf
            block, the value gives the position in the chain.
      
      The third and fourth fields combine to give a time stamp of
      the most recent directory insertion or deletion from this
      leaf block. The time stamp is not updated when a new leaf
      block is chained from the current one. The code is currently
      written such that the timestamp on the dir inode will match
      that of the leaf block for the most recent insertion/deletion.
      
      For backwards compatibility, any of these new fields which is
      zero should be considered to be "unknown".
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
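A debugging or fsck-style tool might interpret the "distance from the hash table" value along these lines. The struct layout, field names and field widths below are assumptions made for illustration and are not the on-disk leaf format; only the meaning of the values 0, 1 and 2+ comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory copy of the four hint fields. */
struct leaf_hints {
    uint64_t inode; /* inode the leaf belongs to, 0 = unknown */
    uint32_t dist;  /* distance from the hash table, 0 = unknown */
    uint32_t nsec;  /* timestamp, nanoseconds part */
    uint64_t sec;   /* timestamp, seconds part */
};

enum leaf_place {
    LEAF_UNKNOWN, /* old leaf: value was never set */
    LEAF_DIRECT,  /* pointed to directly from the hash table */
    LEAF_CHAINED  /* reached through another leaf block */
};

/* Classify a leaf according to its distance field. */
enum leaf_place classify_leaf(const struct leaf_hints *h)
{
    if (h->dist == 0)
        return LEAF_UNKNOWN;
    return h->dist == 1 ? LEAF_DIRECT : LEAF_CHAINED;
}
```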
    • GFS2: For exhash conversion, only one block is needed · 22b5a6c0
      Steven Whitehouse committed
      For most cases, only a single new block is needed when we reach
      the point of converting from stuffed to exhash directory. The
      exception being when the file name is so long that it will not
      fit within the new leaf block.
      
      So this patch adds a simple test for that situation so that we
      do not need to request the full reservation size in this case.
      
      Potentially we could calculate more accurately the value to use
      in other cases too, but that is much more complicated to do and
      it is doubtful that the benefit would outweigh the extra cost
      in code complexity.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
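The test described above reduces to a single comparison, sketched here in userspace C. The function name and its parameters are hypothetical: how the entry size and the free space in the new leaf are computed is not stated in the commit message, so both are simply passed in.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of blocks to reserve when converting a stuffed directory to
 * exhash form. If the new entry fits in the new leaf block, one block
 * is enough; otherwise fall back to the full reservation size.
 */
unsigned int exhash_blocks_needed(size_t entry_size, size_t leaf_space,
                                  unsigned int full_reservation)
{
    if (entry_size <= leaf_space)
        return 1;
    return full_reservation;
}
```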
  20. 07 Jan, 2014 1 commit
  21. 06 Jan, 2014 3 commits
    • GFS2: Remember directory insert point · 2b47dad8
      Steven Whitehouse committed
      When we look to see if there is enough space to add a dir
      entry without allocation, we have then been repeating the
      same search later when we do the actual insertion. This
      patch caches the details of the location in the gfs2_diradd
      structure, so that we do not have to repeat the search.
      
      This will provide a performance improvement which will be
      greater as the size of the directory increases.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
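The caching pattern can be shown with a toy directory in userspace C. Only the idea of remembering the search result in an "add" structure is taken from the commit message; the fixed-size slot array, the `searches` counter and the names `struct dir`, `struct diradd`, `dir_find_free()`, `dir_check_space()` and `dir_add()` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DIR_SLOTS 8

/* Toy directory: a fixed array of slots plus a search counter. */
struct dir {
    int used[DIR_SLOTS];
    int value[DIR_SLOTS];
    unsigned int searches; /* how many times the slots were scanned */
};

/* Carries the insert point from the space check to the insertion. */
struct diradd {
    int slot; /* -1 when nothing is cached */
};

/* Linear search for a free slot; this is the work worth not repeating. */
int dir_find_free(struct dir *d)
{
    d->searches++;
    for (int i = 0; i < DIR_SLOTS; i++)
        if (!d->used[i])
            return i;
    return -1;
}

/* Check for space and remember where it was found. Returns 1 if found. */
int dir_check_space(struct dir *d, struct diradd *da)
{
    da->slot = dir_find_free(d);
    return da->slot >= 0;
}

/* Insert using the cached slot if there is one, else search again. */
int dir_add(struct dir *d, struct diradd *da, int value)
{
    int slot = da->slot >= 0 ? da->slot : dir_find_free(d);

    if (slot < 0)
        return -1;
    d->used[slot] = 1;
    d->value[slot] = value;
    da->slot = -1; /* the cached location is consumed */
    return slot;
}
```

The saving grows with directory size because the skipped step is the scan itself.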
    • GFS2: Consolidate transaction blocks calculation for dir add · 534cf9ca
      Steven Whitehouse committed
      There are three cases where we need to calculate the number of
      blocks to reserve in a transaction involving linking an inode
      into a directory. The one in rename is a bit more complicated,
      but the basis of it is the same as for link and create. So it
      makes sense to move this calculation into a single function
      rather than repeating it three times.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Add directory addition info structure · 3c1c0ae1
      Steven Whitehouse committed
      The intent is that this structure will hold the information
      required when adding entries to a directory (linking). To
      start with, it will contain only the number of blocks which
      are required to link the new entry into the directory. The
      current calculation returns either 0 or the maximum number of
      blocks that can ever be requested by such a transaction.
      
      The intent is that in a later patch, we can update the dir
      code to calculate this value more accurately. In addition
      further patches will also add further fields to the new
      structure to increase its utility.
      
      In addition this patch fixes a bug where the link used during
      inode creation was requesting too many blocks in some cases.
      This is harmless unless the fs is close to being full.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>