1. 09 6月, 2015 1 次提交
    • N
      ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate · 331573fe
      Namjae Jeon 提交于
      This patch implements fallocate's FALLOC_FL_INSERT_RANGE for Ext4.
      
      1) Make sure that both offset and len are block size aligned.
      2) Update the i_size of inode by len bytes.
      3) Compute the file's logical block number against offset. If the computed
         block number is not the starting block of the extent, split the extent
         such that the block number is the starting block of the extent.
      4) Shift all the extents which are lying between [offset, last allocated extent]
         towards right by len bytes. This step will make a hole of len bytes
         at offset.
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>
      331573fe
  2. 08 6月, 2015 1 次提交
  3. 15 5月, 2015 2 次提交
  4. 03 5月, 2015 1 次提交
  5. 12 4月, 2015 1 次提交
  6. 03 4月, 2015 4 次提交
    • E
      ext4: don't release reserved space for previously allocated cluster · 9d21c9fa
      Eric Whitney 提交于
      When xfstests' auto group is run on a bigalloc filesystem with a
      4.0-rc3 kernel, e2fsck failures and kernel warnings occur for some
      tests. e2fsck reports incorrect iblocks values, and the warnings
      indicate that the space reserved for delayed allocation is being
      overdrawn at allocation time.
      
      Some of these errors occur because the reserved space is incorrectly
      decreased by one cluster when ext4_ext_map_blocks satisfies an
      allocation request by mapping an unused portion of a previously
      allocated cluster.  Because a cluster's worth of reserved space was
      already released when it was first allocated, it should not be released
      again.
      
      This patch appears to correct the e2fsck failure reported for
      generic/232 and the kernel warnings produced by ext4/001, generic/009,
      and generic/033.  Failures and warnings for some other tests remain to
      be addressed.
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      9d21c9fa
    • E
      ext4: fix loss of delalloc extent info in ext4_zero_range() · 94426f4b
      Eric Whitney 提交于
      In ext4_zero_range(), removing a file's entire block range from the
      extent status tree removes all records of that file's delalloc extents.
      The delalloc accounting code uses this information, and its loss can
      then lead to accounting errors and kernel warnings at writeback time and
      subsequent file system damage.  This is most noticeable on bigalloc
      file systems where code in ext4_ext_map_blocks() handles cases where
      delalloc extents share clusters with a newly allocated extent.
      
      Because we're not deleting a block range and are correctly updating the
      status of its associated extent, there is no need to remove anything
      from the extent status tree.
      
      When this patch is combined with an unrelated bug fix for
      ext4_zero_range(), kernel warnings and e2fsck errors reported during
      xfstests runs on bigalloc filesystems are greatly reduced without
      introducing regressions on other xfstests-bld test scenarios.
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      94426f4b
    • L
      ext4: allocate entire range in zero range · 0f2af21a
      Lukas Czerner 提交于
      Currently there is a bug in zero range code which causes zero range
      calls to only allocate block aligned portion of the range, while
      ignoring the rest in some cases.
      
      In some cases, namely if the end of the range is past i_size, we do
      attempt to preallocate the last nonaligned block. However this might
      cause kernel to BUG() in some carefully designed zero range requests
      on setups where page size > block size.
      
      Fix this problem by first preallocating the entire range, including
      the nonaligned edges and converting the written extents to unwritten
      in the next step. This approach will also give us the advantage of
      having the range to be as linearly contiguous as possible.
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      0f2af21a
    • X
      ext4: fix comments in ext4_can_extents_be_merged() · 4255c224
      Xiaoguang Wang 提交于
      Since commit a9b82415, we are allowed to merge unwritten extents,
      so here these comments are wrong, remove it.
      Signed-off-by: NXiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      4255c224
  7. 03 1月, 2015 1 次提交
  8. 03 12月, 2014 2 次提交
  9. 26 11月, 2014 5 次提交
    • J
      ext4: remove never taken branch from ext4_ext_shift_path_extents() · 733ded2a
      Jan Kara 提交于
      path[depth].p_hdr can never be NULL for a path passed to us (and even if
      it could, EXT_LAST_EXTENT() would make something != NULL from it). So
      just remove the branch.
      
      Coverity-id: 1196498
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      733ded2a
    • J
      ext4: move handling of list of shrinkable inodes into extent status code · b0dea4c1
      Jan Kara 提交于
      Currently callers adding extents to extent status tree were responsible
      for adding the inode to the list of inodes with freeable extents. This
      is error prone and puts list handling in unnecessarily many places.
      
      Just add inode to the list automatically when the first non-delay extent
      is added to the tree and remove inode from the list when the last
      non-delay extent is removed.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      b0dea4c1
    • Z
      ext4: change LRU to round-robin in extent status tree shrinker · edaa53ca
      Zheng Liu 提交于
      In this commit we discard the lru algorithm for inodes with extent
      status tree because it takes significant effort to maintain a lru list
      in extent status tree shrinker and the shrinker can take a long time to
      scan this lru list in order to reclaim some objects.
      
      We replace the lru ordering with a simple round-robin.  After that we
      never need to keep a lru list.  That means that the list needn't be
      sorted if the shrinker can not reclaim any objects in the first round.
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      edaa53ca
    • Z
      ext4: cache extent hole in extent status tree for ext4_da_map_blocks() · 2f8e0a7c
      Zheng Liu 提交于
      Currently extent status tree doesn't cache extent hole when a write
      looks up in extent tree to make sure whether a block has been allocated
      or not.  In this case, we don't put extent hole in extent cache because
      later this extent might be removed and a new delayed extent might be
      added back.  But it will cause a defect when we do a lot of writes.  If
      we don't put extent hole in extent cache, the following writes also need
      to access extent tree to look at whether or not a block has been
      allocated.  It brings a cache miss.  This commit fixes this defect.
      Also if the inode doesn't have any extent, this extent hole will be
      cached as well.
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      2f8e0a7c
    • J
      ext4: fix block reservation for bigalloc filesystems · cbd7584e
      Jan Kara 提交于
      For bigalloc filesystems we have to check whether newly requested inode
      block isn't already part of a cluster for which we already have delayed
      allocation reservation. This check happens in ext4_ext_map_blocks() and
      that function sets EXT4_MAP_FROM_CLUSTER if that's the case. However if
      ext4_da_map_blocks() finds in extent cache information about the block,
      we don't call into ext4_ext_map_blocks() and thus we always end up
      getting new reservation even if the space for cluster is already
      reserved. This results in overreservation and premature ENOSPC reports.
      
      Fix the problem by checking for existing cluster reservation already in
      ext4_da_map_blocks(). That simplifies the logic and actually allows us
      to get rid of the EXT4_MAP_FROM_CLUSTER flag completely.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      cbd7584e
  10. 23 11月, 2014 4 次提交
    • E
      ext4: fix end of region partial cluster handling · 0756b908
      Eric Whitney 提交于
      ext4_ext_remove_space() can incorrectly free a partial_cluster if
      EAGAIN is encountered while truncating or punching.  Extent removal
      should be retried in this case.
      
      It also fails to free a partial cluster when the punched region begins
      at the start of a file on that unaligned cluster and where the entire
      file has not been punched.  Remove the requirement that all blocks in
      the file must have been freed in order to free the partial cluster.
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      0756b908
    • E
      ext4: miscellaneous partial cluster cleanups · 345ee947
      Eric Whitney 提交于
      Add some casts and rearrange a few statements for improved readability.
      Some code can also be simplified and made more readable if we set
      partial_cluster to 0 rather than to a negative value when we can tell
      we've hit the left edge of the punched region.
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      345ee947
    • E
      ext4: fix end of leaf partial cluster handling · 5bf43760
      Eric Whitney 提交于
      The fix in commit ad6599ab ("ext4: fix premature freeing of
      partial clusters split across leaf blocks"), intended to avoid
      dereferencing an invalid extent pointer when determining whether a
      partial cluster should be freed, wasn't quite good enough.  Assure that
      at least one extent remains at the start of the leaf once the hole has
      been punched.  Otherwise, the pointer to the extent to the right of the
      hole will be invalid and a partial cluster will be incorrectly freed.
      
      Set partial_cluster to 0 when we can tell we've hit the left edge of
      the punched region within the leaf.  This prevents incorrect freeing
      of a partial cluster when ext4_ext_rm_leaf is called one last time
      during extent tree traversal after the punched region has been removed.
      
      Adjust comments to reflect code changes and a correction.  Remove a bit
      of dead code.
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      5bf43760
    • E
      ext4: fix partial cluster initialization · f4226d9e
      Eric Whitney 提交于
      The partial_cluster variable is not always initialized correctly when
      hole punching on bigalloc file systems.  Although commit c0634493
      ("ext4: fix partial cluster handling for bigalloc file systems")
      addressed the case where the right edge of the punched region and the
      next extent to its right were within the same leaf, it didn't handle
      the case where the next extent to its right is in the next leaf.  This
      causes xfstest generic/300 to fail.
      
      Fix this by replacing the code in c0634493922 with a more general
      solution that can continue the search for the first cluster to the
      right of the punched region into the next leaf if present.  If found,
      partial_cluster is initialized to this cluster's negative value.
      There's no need to determine if that cluster is actually shared;  we
      simply record it so its blocks won't be freed in the event it does
      happen to be shared.
      
      Also, minimize the burden on non-bigalloc file systems with some minor
      code simplification.
      Signed-off-by: NEric Whitney <enwlinux@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      f4226d9e
  11. 30 10月, 2014 1 次提交
  12. 13 10月, 2014 1 次提交
  13. 02 10月, 2014 1 次提交
  14. 05 9月, 2014 1 次提交
    • T
      ext4: prepare to drop EXT4_STATE_DELALLOC_RESERVED · e3cf5d5d
      Theodore Ts'o 提交于
      The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented
      because it was too hard to make sure the mballoc and get_block flags
      could be reliably passed down through all of the codepaths that end up
      calling ext4_mb_new_blocks().
      
      Since then, we have mb_flags passed down through most of the code
      paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky
      as it used to.
      
      This commit plumbs in the last of what is required, and then adds a
      WARN_ON check to make sure we haven't missed anything.  If this passes
      a full regression test run, we can then drop
      EXT4_STATE_DELALLOC_RESERVED.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      e3cf5d5d
  15. 02 9月, 2014 10 次提交
    • T
      ext4: rename ext4_ext_find_extent() to ext4_find_extent() · ed8a1a76
      Theodore Ts'o 提交于
      Make the function name less redundant.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ed8a1a76
    • T
      ext4: reuse path object in ext4_ext_shift_extents() · ee4bd0d9
      Theodore Ts'o 提交于
      Now that the semantics of ext4_ext_find_extent() are much cleaner,
      it's safe and more efficient to reuse the path object across the
      multiple calls to ext4_ext_find_extent() in ext4_ext_shift_extents().
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ee4bd0d9
    • T
      ext4: teach ext4_ext_find_extent() to realloc path if necessary · 10809df8
      Theodore Ts'o 提交于
      This adds additional safety in case for some reason we end reusing a
      path structure which isn't big enough for current depth of the inode.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      10809df8
    • T
      ext4: allow a NULL argument to ext4_ext_drop_refs() · b7ea89ad
      Theodore Ts'o 提交于
      Teach ext4_ext_drop_refs() to accept a NULL argument, much like
      kfree().  This allows us to drop a lot of checks to make sure path is
      non-NULL before calling ext4_ext_drop_refs().
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      b7ea89ad
    • T
      ext4: call ext4_ext_drop_refs() from ext4_ext_find_extent() · 523f431c
      Theodore Ts'o 提交于
      In nearly all of the calls to ext4_ext_find_extent() where the caller
      is trying to recycle the path object, ext4_ext_drop_refs() gets called
      to release the buffer heads before the path object gets overwritten.
      To simplify things for the callers, and to avoid the possibility of a
      memory leak, make ext4_ext_find_extent() responsible for dropping the
      buffers.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      523f431c
    • T
      ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code · dfe50809
      Theodore Ts'o 提交于
      Drop EXT4_EX_NOFREE_ON_ERR from ext4_ext_create_new_leaf(),
      ext4_split_extent(), ext4_convert_unwritten_extents_endio().
      
      This requires fixing all of their callers to potentially
      ext4_ext_find_extent() to free the struct ext4_ext_path object in case
      of an error, and there are interlocking dependencies all the way up to
      ext4_ext_map_blocks(), ext4_swap_extents(), and
      ext4_ext_remove_space().
      
      Once this is done, we can drop the EXT4_EX_NOFREE_ON_ERR flag since it
      is no longer necessary.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      dfe50809
    • T
      ext4: drop EXT4_EX_NOFREE_ON_ERR in convert_initialized_extent() · 4f224b8b
      Theodore Ts'o 提交于
      Transfer responsibility of freeing struct ext4_ext_path on error to
      ext4_ext_find_extent().
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      4f224b8b
    • T
      ext4: collapse ext4_convert_initialized_extents() · e8b83d93
      Theodore Ts'o 提交于
      The function ext4_convert_initialized_extents() is only called by a
      single function --- ext4_ext_convert_initalized_extents().  Inline the
      code and get rid of the unnecessary bits in order to simplify the code.
      
      Rename ext4_ext_convert_initalized_extents() to
      convert_initalized_extents() since it's a static function that is
      actually only used in a single caller, ext4_ext_map_blocks().
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      e8b83d93
    • T
      ext4: teach ext4_ext_find_extent() to free path on error · 705912ca
      Theodore Ts'o 提交于
      Right now, there are a places where it is all to easy to leak memory
      on an error path, via a usage like this:
      
      	struct ext4_ext_path *path = NULL
      
      	while (...) {
      		...
      		path = ext4_ext_find_extent(inode, block, path, 0);
      		if (IS_ERR(path)) {
      			/* oops, if path was non-NULL before the call to
      			   ext4_ext_find_extent, we've leaked it!  :-(  */
      			...
      			return PTR_ERR(path);
      		}
      		...
      	}
      
      Unfortunately, there some code paths where we are doing the following
      instead:
      
      	path = ext4_ext_find_extent(inode, block, orig_path, 0);
      
      and where it's important that we _not_ free orig_path in the case
      where ext4_ext_find_extent() returns an error.
      
      So change the function signature of ext4_ext_find_extent() so that it
      takes a struct ext4_ext_path ** for its third argument, and by
      default, on an error, it will free the struct ext4_ext_path, and then
      zero out the struct ext4_ext_path * pointer.  In order to avoid
      causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes
      ext4_ext_find_extent() to use the original behavior of forcing the
      caller to deal with freeing the original path pointer on the error
      case.
      
      The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this
      allows for a gentle transition and makes the patches easier to verify.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      
      		
      705912ca
    • T
      ext4: fix ZERO_RANGE bug hidden by flag aliasing · 713e8dde
      Theodore Ts'o 提交于
      We accidently aliased EXT4_EX_NOCACHE and EXT4_GET_CONVERT_UNWRITTEN
      falgs, which apparently was hiding a bug that was unmasked when this
      flag aliasing issue was addressed (see the subsequent commit).  The
      reproduction case was:
      
         fsx -N 10000 -l 500000 -r 4096 -t 4096 -w 4096 -Z -R -W /vdb/junk
      
      ... which would cause fsx to report corruption in the data file.
      
      The fix we have is a bit of an overkill, but I'd much rather be
      conservative for now, and we can optimize ZERO_RANGE_FL handling
      later.  The fact that we need to zap the extent_status cache for the
      inode is unfortunate, but correctness is far more important than
      performance.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      713e8dde
  16. 01 9月, 2014 1 次提交
  17. 31 8月, 2014 2 次提交
  18. 28 8月, 2014 1 次提交