- 02 7月, 2015 1 次提交
-
-
由 Nikolay Borisov 提交于
Switch ext4 to using sb_getblk_gfp with GFP_NOFS added to fix possible deadlocks in the page writeback path. Signed-off-by: NNikolay Borisov <kernel@kyup.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 21 6月, 2015 1 次提交
-
-
由 Theodore Ts'o 提交于
In order to prevent quota block tracking to be inaccurate when ext4_quota_write() fails with ENOSPC, we make two changes. The quota file can now use the reserved block (since the quota file is arguably file system metadata), and ext4_quota_write() now uses ext4_should_retry_alloc() to retry the block allocation after a commit has completed and released some blocks for allocation. This fixes failures of xfstests generic/270: Quota error (device vdc): write_blk: dquota write failed Quota error (device vdc): qtree_write_dquot: Error -28 occurred while creating quota Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 15 6月, 2015 2 次提交
-
-
由 Lukas Czerner 提交于
Currently existing dio workers can jump in and potentially increase extent tree depth while we're allocating blocks in ext4_alloc_file_blocks(). This may cause us to underestimate the number of credits needed for the transaction because the extent tree depth can change after our estimation. Fix this by waiting for all the existing dio workers in the same way as we do it in ext4_punch_hole. We've seen errors caused by this in xfstest generic/299, however it's really hard to reproduce. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Lukas Czerner 提交于
Currently in ext4_alloc_file_blocks() the number of credits is calculated only once before we enter the allocation loop. However within the allocation loop the extent tree depth can change, hence the number of credits needed can increase potentially exceeding the number of credits reserved in the handle which can cause journal failures. Fix this by recalculating number of credits when the inode depth changes. Note that even though ext4_alloc_file_blocks() is only currently used by extent base inodes we will avoid recalculating number of credits unnecessarily in the case of indirect based inodes. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 09 6月, 2015 1 次提交
-
-
由 Namjae Jeon 提交于
This patch implements fallocate's FALLOC_FL_INSERT_RANGE for Ext4. 1) Make sure that both offset and len are block size aligned. 2) Update the i_size of inode by len bytes. 3) Compute the file's logical block number against offset. If the computed block number is not the starting block of the extent, split the extent such that the block number is the starting block of the extent. 4) Shift all the extents which are lying between [offset, last allocated extent] towards right by len bytes. This step will make a hole of len bytes at offset. Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>
-
- 08 6月, 2015 1 次提交
-
-
由 David Moore 提交于
During a source code review of fs/ext4/extents.c I noted identical consecutive lines. An assertion is repeated for inode1 and never done for inode2. This is not in keeping with the rest of the code in the ext4_swap_extents function and appears to be a bug. Assert that the inode2 mutex is not locked. Signed-off-by: NDavid Moore <dmoorefo@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NEric Sandeen <sandeen@redhat.com>
-
- 02 6月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
With the planned cgroup writeback support, backing-dev related declarations will be more widely used across block and cgroup; unfortunately, including backing-dev.h from include/linux/blkdev.h makes cyclic include dependency quite likely. This patch separates out backing-dev-defs.h which only has the essential definitions and updates blkdev.h to include it. c files which need access to more backing-dev details now include backing-dev.h directly. This takes backing-dev.h off the common include dependency chain making it a lot easier to use it across block and cgroup. v2: fs/fat build failure fixed. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 15 5月, 2015 2 次提交
-
-
由 Theodore Ts'o 提交于
The xfstests test suite assumes that an attempt to collapse range on the range (0, 1) will return EOPNOTSUPP if the file system does not support collapse range. Commit 280227a7: "ext4: move check under lock scope to close a race" broke this, and this caused xfstests to fail when run when testing file systems that did not have the extents feature enabled. Reported-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eryu Guan 提交于
The following commit introduced a bug when checking for zero length extent 5946d089 ext4: check for overlapping extents in ext4_valid_extent_entries() Zero length extent could pass the check if lblock is zero. Adding the explicit check for zero length back. Signed-off-by: NEryu Guan <guaneryu@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 03 5月, 2015 1 次提交
-
-
由 Davide Italiano 提交于
fallocate() checks that the file is extent-based and returns EOPNOTSUPP in case is not. Other tasks can convert from and to indirect and extent so it's safe to check only after grabbing the inode mutex. Signed-off-by: NDavide Italiano <dccitaliano@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 12 4月, 2015 1 次提交
-
-
由 Michael Halcrow 提交于
Pulls block_write_begin() into fs/ext4/inode.c because it might need to do a low-level read of the existing data, in which case we need to decrypt it. Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NIldar Muslukhov <ildarm@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 03 4月, 2015 4 次提交
-
-
由 Eric Whitney 提交于
When xfstests' auto group is run on a bigalloc filesystem with a 4.0-rc3 kernel, e2fsck failures and kernel warnings occur for some tests. e2fsck reports incorrect iblocks values, and the warnings indicate that the space reserved for delayed allocation is being overdrawn at allocation time. Some of these errors occur because the reserved space is incorrectly decreased by one cluster when ext4_ext_map_blocks satisfies an allocation request by mapping an unused portion of a previously allocated cluster. Because a cluster's worth of reserved space was already released when it was first allocated, it should not be released again. This patch appears to correct the e2fsck failure reported for generic/232 and the kernel warnings produced by ext4/001, generic/009, and generic/033. Failures and warnings for some other tests remain to be addressed. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Whitney 提交于
In ext4_zero_range(), removing a file's entire block range from the extent status tree removes all records of that file's delalloc extents. The delalloc accounting code uses this information, and its loss can then lead to accounting errors and kernel warnings at writeback time and subsequent file system damage. This is most noticeable on bigalloc file systems where code in ext4_ext_map_blocks() handles cases where delalloc extents share clusters with a newly allocated extent. Because we're not deleting a block range and are correctly updating the status of its associated extent, there is no need to remove anything from the extent status tree. When this patch is combined with an unrelated bug fix for ext4_zero_range(), kernel warnings and e2fsck errors reported during xfstests runs on bigalloc filesystems are greatly reduced without introducing regressions on other xfstests-bld test scenarios. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Lukas Czerner 提交于
Currently there is a bug in zero range code which causes zero range calls to only allocate block aligned portion of the range, while ignoring the rest in some cases. In some cases, namely if the end of the range is past i_size, we do attempt to preallocate the last nonaligned block. However this might cause kernel to BUG() in some carefully designed zero range requests on setups where page size > block size. Fix this problem by first preallocating the entire range, including the nonaligned edges and converting the written extents to unwritten in the next step. This approach will also give us the advantage of having the range to be as linearly contiguous as possible. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Xiaoguang Wang 提交于
Since commit a9b82415, we are allowed to merge unwritten extents, so here these comments are wrong, remove it. Signed-off-by: NXiaoguang Wang <wangxg.fnst@cn.fujitsu.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 03 1月, 2015 1 次提交
-
-
由 Theodore Ts'o 提交于
This reverts commit 14516bb7. This was causing regression test failures with generic/285 with an ext3 filesystem using CONFIG_EXT4_USE_FOR_EXT23. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 03 12月, 2014 2 次提交
-
-
由 Dmitry Monakhov 提交于
It is ridiculous practice to scan inode block by block, this technique applicable only for old indirect files. This takes significant amount of time for really large files. Let's reuse ext4_fiemap which already traverse inode-tree in most optimal meaner. TESTCASE: ftruncate64(fd, 0); ftruncate64(fd, 1ULL << 40); /* lseek will spin very long time */ lseek64(fd, 0, SEEK_DATA); lseek64(fd, 0, SEEK_HOLE); Original report: https://lkml.org/lkml/2014/10/16/620Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Dmitry Monakhov 提交于
Currently ext4_inline_data_fiemap ignores requested arguments (start and len) which may lead endless loop if start != 0. Also fix incorrect extent length determination. Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 26 11月, 2014 5 次提交
-
-
由 Jan Kara 提交于
path[depth].p_hdr can never be NULL for a path passed to us (and even if it could, EXT_LAST_EXTENT() would make something != NULL from it). So just remove the branch. Coverity-id: 1196498 Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Jan Kara 提交于
Currently callers adding extents to extent status tree were responsible for adding the inode to the list of inodes with freeable extents. This is error prone and puts list handling in unnecessarily many places. Just add inode to the list automatically when the first non-delay extent is added to the tree and remove inode from the list when the last non-delay extent is removed. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Zheng Liu 提交于
In this commit we discard the lru algorithm for inodes with extent status tree because it takes significant effort to maintain a lru list in extent status tree shrinker and the shrinker can take a long time to scan this lru list in order to reclaim some objects. We replace the lru ordering with a simple round-robin. After that we never need to keep a lru list. That means that the list needn't be sorted if the shrinker can not reclaim any objects in the first round. Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Zheng Liu 提交于
Currently extent status tree doesn't cache extent hole when a write looks up in extent tree to make sure whether a block has been allocated or not. In this case, we don't put extent hole in extent cache because later this extent might be removed and a new delayed extent might be added back. But it will cause a defect when we do a lot of writes. If we don't put extent hole in extent cache, the following writes also need to access extent tree to look at whether or not a block has been allocated. It brings a cache miss. This commit fixes this defect. Also if the inode doesn't have any extent, this extent hole will be cached as well. Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Jan Kara 提交于
For bigalloc filesystems we have to check whether newly requested inode block isn't already part of a cluster for which we already have delayed allocation reservation. This check happens in ext4_ext_map_blocks() and that function sets EXT4_MAP_FROM_CLUSTER if that's the case. However if ext4_da_map_blocks() finds in extent cache information about the block, we don't call into ext4_ext_map_blocks() and thus we always end up getting new reservation even if the space for cluster is already reserved. This results in overreservation and premature ENOSPC reports. Fix the problem by checking for existing cluster reservation already in ext4_da_map_blocks(). That simplifies the logic and actually allows us to get rid of the EXT4_MAP_FROM_CLUSTER flag completely. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 23 11月, 2014 4 次提交
-
-
由 Eric Whitney 提交于
ext4_ext_remove_space() can incorrectly free a partial_cluster if EAGAIN is encountered while truncating or punching. Extent removal should be retried in this case. It also fails to free a partial cluster when the punched region begins at the start of a file on that unaligned cluster and where the entire file has not been punched. Remove the requirement that all blocks in the file must have been freed in order to free the partial cluster. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Whitney 提交于
Add some casts and rearrange a few statements for improved readability. Some code can also be simplified and made more readable if we set partial_cluster to 0 rather than to a negative value when we can tell we've hit the left edge of the punched region. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Whitney 提交于
The fix in commit ad6599ab ("ext4: fix premature freeing of partial clusters split across leaf blocks"), intended to avoid dereferencing an invalid extent pointer when determining whether a partial cluster should be freed, wasn't quite good enough. Assure that at least one extent remains at the start of the leaf once the hole has been punched. Otherwise, the pointer to the extent to the right of the hole will be invalid and a partial cluster will be incorrectly freed. Set partial_cluster to 0 when we can tell we've hit the left edge of the punched region within the leaf. This prevents incorrect freeing of a partial cluster when ext4_ext_rm_leaf is called one last time during extent tree traversal after the punched region has been removed. Adjust comments to reflect code changes and a correction. Remove a bit of dead code. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Whitney 提交于
The partial_cluster variable is not always initialized correctly when hole punching on bigalloc file systems. Although commit c0634493 ("ext4: fix partial cluster handling for bigalloc file systems") addressed the case where the right edge of the punched region and the next extent to its right were within the same leaf, it didn't handle the case where the next extent to its right is in the next leaf. This causes xfstest generic/300 to fail. Fix this by replacing the code in c0634493922 with a more general solution that can continue the search for the first cluster to the right of the punched region into the next leaf if present. If found, partial_cluster is initialized to this cluster's negative value. There's no need to determine if that cluster is actually shared; we simply record it so its blocks won't be freed in the event it does happen to be shared. Also, minimize the burden on non-bigalloc file systems with some minor code simplification. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 30 10月, 2014 1 次提交
-
-
由 Jan Kara 提交于
ext4_ext_convert_to_initialized() can return more blocks than are actually allocated from map->m_lblk in case where initial part of the on-disk extent is zeroed out. Luckily this doesn't have serious consequences because the caller currently uses the return value only to unmap metadata buffers. Anyway this is a data corruption/exposure problem waiting to happen so fix it. Coverity-id: 1226848 Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 13 10月, 2014 1 次提交
-
-
由 Dmitry Monakhov 提交于
Besides the fact that this replacement improves code readability it also protects from errors caused direct EXT4_S(sb)->s_es manipulation which may result attempt to use uninitialized csum machinery. #Testcase_BEGIN IMG=/dev/ram0 MNT=/mnt mkfs.ext4 $IMG mount $IMG $MNT #Enable feature directly on disk, on mounted fs tune2fs -O metadata_csum $IMG # Provoke metadata update, likey result in OOPS touch $MNT/test umount $MNT #Testcase_END # Replacement script @@ expression E; @@ - EXT4_HAS_RO_COMPAT_FEATURE(E, EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) + ext4_has_metadata_csum(E) https://bugzilla.kernel.org/show_bug.cgi?id=82201Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 02 10月, 2014 1 次提交
-
-
由 Dmitry Monakhov 提交于
It is reasonable to prepend newly created index to older one. [ Dropped no longer used function parameter newext. -tytso ] Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 05 9月, 2014 1 次提交
-
-
由 Theodore Ts'o 提交于
The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented because it was too hard to make sure the mballoc and get_block flags could be reliably passed down through all of the codepaths that end up calling ext4_mb_new_blocks(). Since then, we have mb_flags passed down through most of the code paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky as it used to. This commit plumbs in the last of what is required, and then adds a WARN_ON check to make sure we haven't missed anything. If this passes a full regression test run, we can then drop EXT4_STATE_DELALLOC_RESERVED. Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 02 9月, 2014 9 次提交
-
-
由 Theodore Ts'o 提交于
Make the function name less redundant. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Now that the semantics of ext4_ext_find_extent() are much cleaner, it's safe and more efficient to reuse the path object across the multiple calls to ext4_ext_find_extent() in ext4_ext_shift_extents(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This adds additional safety in case for some reason we end reusing a path structure which isn't big enough for current depth of the inode. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Teach ext4_ext_drop_refs() to accept a NULL argument, much like kfree(). This allows us to drop a lot of checks to make sure path is non-NULL before calling ext4_ext_drop_refs(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
In nearly all of the calls to ext4_ext_find_extent() where the caller is trying to recycle the path object, ext4_ext_drop_refs() gets called to release the buffer heads before the path object gets overwritten. To simplify things for the callers, and to avoid the possibility of a memory leak, make ext4_ext_find_extent() responsible for dropping the buffers. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Drop EXT4_EX_NOFREE_ON_ERR from ext4_ext_create_new_leaf(), ext4_split_extent(), ext4_convert_unwritten_extents_endio(). This requires fixing all of their callers to potentially ext4_ext_find_extent() to free the struct ext4_ext_path object in case of an error, and there are interlocking dependencies all the way up to ext4_ext_map_blocks(), ext4_swap_extents(), and ext4_ext_remove_space(). Once this is done, we can drop the EXT4_EX_NOFREE_ON_ERR flag since it is no longer necessary. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Transfer responsibility of freeing struct ext4_ext_path on error to ext4_ext_find_extent(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The function ext4_convert_initialized_extents() is only called by a single function --- ext4_ext_convert_initalized_extents(). Inline the code and get rid of the unnecessary bits in order to simplify the code. Rename ext4_ext_convert_initalized_extents() to convert_initalized_extents() since it's a static function that is actually only used in a single caller, ext4_ext_map_blocks(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Right now, there are a places where it is all to easy to leak memory on an error path, via a usage like this: struct ext4_ext_path *path = NULL while (...) { ... path = ext4_ext_find_extent(inode, block, path, 0); if (IS_ERR(path)) { /* oops, if path was non-NULL before the call to ext4_ext_find_extent, we've leaked it! :-( */ ... return PTR_ERR(path); } ... } Unfortunately, there some code paths where we are doing the following instead: path = ext4_ext_find_extent(inode, block, orig_path, 0); and where it's important that we _not_ free orig_path in the case where ext4_ext_find_extent() returns an error. So change the function signature of ext4_ext_find_extent() so that it takes a struct ext4_ext_path ** for its third argument, and by default, on an error, it will free the struct ext4_ext_path, and then zero out the struct ext4_ext_path * pointer. In order to avoid causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes ext4_ext_find_extent() to use the original behavior of forcing the caller to deal with freeing the original path pointer on the error case. The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this allows for a gentle transition and makes the patches easier to verify. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-