- 05 9月, 2014 1 次提交
-
-
由 Theodore Ts'o 提交于
Instead of initializing the allocation_request structure in ext4_alloc_branch(), set it up in ext4_ind_map_blocks(), and then pass it to ext4_alloc_branch() and ext4_splice_branch(). This allows ext4_ind_map_blocks to pass flags in the allocation request structure without having to add Yet Another argument to ext4_alloc_branch(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 02 9月, 2014 16 次提交
-
-
由 Zheng Liu 提交于
This commit adds some statictics in extent status tree shrinker. The purpose to add these is that we want to collect more details when we encounter a stall caused by extent status tree shrinker. Here we count the following statictics: stats: the number of all objects on all extent status trees the number of reclaimable objects on lru list cache hits/misses the last sorted interval the number of inodes on lru list average: scan time for shrinking some objects the number of shrunk objects maximum: the inode that has max nr. of objects on lru list the maximum scan time for shrinking some objects The output looks like below: $ cat /proc/fs/ext4/sda1/es_shrinker_info stats: 28228 objects 6341 reclaimable objects 5281/631 cache hits/misses 586 ms last sorted interval 250 inodes on lru list average: 153 us scan time 128 shrunk objects maximum: 255 inode (255 objects, 198 reclaimable) 125723 us max scan time If the lru list has never been sorted, the following line will not be printed: 586ms last sorted interval If there is an empty lru list, the following lines also will not be printed: 250 inodes on lru list ... maximum: 255 inode (255 objects, 198 reclaimable) 0 us max scan time Meanwhile in this commit a new trace point is defined to print some details in __ext4_es_shrink(). Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Zheng Liu 提交于
This commit improves the trace point of extents status tree. We rename trace_ext4_es_shrink_enter in ext4_es_count() because it is also used in ext4_es_scan() and we can not identify them from the result. Further this commit fixes a variable name in trace point in order to keep consistency with others. Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Seunghun Lee 提交于
get_blocks is renamed to get_block. Signed-off-by: NSeunghun Lee <waydi1@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Darrick J. Wong 提交于
Enable by default the block_validity feature, which checks for collisions between newly allocated blocks and critical system metadata. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Make the function name less redundant. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Reuse the path object in ext4_move_extents() so we don't unnecessarily free and reallocate it. Also clean up the get_ext_path() wrapper so that it has the same semantics of freeing the path object on error as ext4_ext_find_extent(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Now that the semantics of ext4_ext_find_extent() are much cleaner, it's safe and more efficient to reuse the path object across the multiple calls to ext4_ext_find_extent() in ext4_ext_shift_extents(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This adds additional safety in case for some reason we end reusing a path structure which isn't big enough for current depth of the inode. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Teach ext4_ext_drop_refs() to accept a NULL argument, much like kfree(). This allows us to drop a lot of checks to make sure path is non-NULL before calling ext4_ext_drop_refs(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
In nearly all of the calls to ext4_ext_find_extent() where the caller is trying to recycle the path object, ext4_ext_drop_refs() gets called to release the buffer heads before the path object gets overwritten. To simplify things for the callers, and to avoid the possibility of a memory leak, make ext4_ext_find_extent() responsible for dropping the buffers. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Drop EXT4_EX_NOFREE_ON_ERR from ext4_ext_create_new_leaf(), ext4_split_extent(), ext4_convert_unwritten_extents_endio(). This requires fixing all of their callers to potentially ext4_ext_find_extent() to free the struct ext4_ext_path object in case of an error, and there are interlocking dependencies all the way up to ext4_ext_map_blocks(), ext4_swap_extents(), and ext4_ext_remove_space(). Once this is done, we can drop the EXT4_EX_NOFREE_ON_ERR flag since it is no longer necessary. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Transfer responsibility of freeing struct ext4_ext_path on error to ext4_ext_find_extent(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The function ext4_convert_initialized_extents() is only called by a single function --- ext4_ext_convert_initalized_extents(). Inline the code and get rid of the unnecessary bits in order to simplify the code. Rename ext4_ext_convert_initalized_extents() to convert_initalized_extents() since it's a static function that is actually only used in a single caller, ext4_ext_map_blocks(). Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Right now, there are a places where it is all to easy to leak memory on an error path, via a usage like this: struct ext4_ext_path *path = NULL while (...) { ... path = ext4_ext_find_extent(inode, block, path, 0); if (IS_ERR(path)) { /* oops, if path was non-NULL before the call to ext4_ext_find_extent, we've leaked it! :-( */ ... return PTR_ERR(path); } ... } Unfortunately, there some code paths where we are doing the following instead: path = ext4_ext_find_extent(inode, block, orig_path, 0); and where it's important that we _not_ free orig_path in the case where ext4_ext_find_extent() returns an error. So change the function signature of ext4_ext_find_extent() so that it takes a struct ext4_ext_path ** for its third argument, and by default, on an error, it will free the struct ext4_ext_path, and then zero out the struct ext4_ext_path * pointer. In order to avoid causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes ext4_ext_find_extent() to use the original behavior of forcing the caller to deal with freeing the original path pointer on the error case. The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this allows for a gentle transition and makes the patches easier to verify. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Commit b8a86845 introduced an accidental flag aliasing between EXT4_EX_NOCACHE and EXT4_GET_BLOCKS_CONVERT_UNWRITTEN. Fortunately, this didn't introduce any untorward side effects --- we got lucky. Nevertheless, fix this and leave a warning to hopefully avoid this from happening in the future. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
We accidently aliased EXT4_EX_NOCACHE and EXT4_GET_CONVERT_UNWRITTEN falgs, which apparently was hiding a bug that was unmasked when this flag aliasing issue was addressed (see the subsequent commit). The reproduction case was: fsx -N 10000 -l 500000 -r 4096 -t 4096 -w 4096 -Z -R -W /vdb/junk ... which would cause fsx to report corruption in the data file. The fix we have is a bit of an overkill, but I'd much rather be conservative for now, and we can optimize ZERO_RANGE_FL handling later. The fact that we need to zap the extent_status cache for the inode is unfortunate, but correctness is far more important than performance. Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: Namjae Jeon <namjae.jeon@samsung.com>
-
- 01 9月, 2014 1 次提交
-
-
由 Theodore Ts'o 提交于
If ext4_ext_find_extent() returns an error, we have to clear path1 or path2 or else we would end up trying to free an ERR_PTR, which would be bad. Also eliminate some redundant code and mark the error paths as unlikely() Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 31 8月, 2014 3 次提交
-
-
由 Dmitry Monakhov 提交于
ext4_move_extents is too complex for review. It has duplicate almost each function available in the rest of other codebase. It has useless artificial restriction orig_offset == donor_offset. But in fact logic of ext4_move_extents is very simple: Iterate extents one by one (similar to ext4_fill_fiemap_extents) ->Iterate each page covered extent (similar to generic_perform_write) ->swap extents for covered by page (can be shared with IOC_MOVE_DATA) Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Dmitry Monakhov 提交于
This allows us to make mext_next_extent static and potentially get rid of it. Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Dmitry Monakhov 提交于
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 30 8月, 2014 6 次提交
-
-
由 Wang Shilong 提交于
ext4_journal_get_write_access() has just been called in ext4_append() calling it again here is duplicated. Signed-off-by: NWang Shilong <wshilong@ddn.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 29 8月, 2014 3 次提交
-
-
由 Darrick J. Wong 提交于
When performing a same-directory rename, it's possible that adding or setting the new directory entry will cause the directory to overflow the inline data area, which causes the directory to be converted to an extent-based directory. Under this circumstance it is necessary to re-read the directory when deleting the old dirent because the "old directory" context still points to i_block in the inode table, which is now an extent tree root! The delete fails with an FS error, and the subsequent fsck complains about incorrect link counts and hardlinked directories. Test case (originally found with flat_dir_test in the metadata_csum test program): # mkfs.ext4 -O inline_data /dev/sda # mount /dev/sda /mnt # mkdir /mnt/x # touch /mnt/x/changelog.gz /mnt/x/copyright /mnt/x/README.Debian # sync # for i in /mnt/x/*; do mv $i $i.longer; done # ls -la /mnt/x/ total 0 -rw-r--r-- 1 root root 0 Aug 25 12:03 changelog.gz.longer -rw-r--r-- 1 root root 0 Aug 25 12:03 copyright -rw-r--r-- 1 root root 0 Aug 25 12:03 copyright.longer -rw-r--r-- 1 root root 0 Aug 25 12:03 README.Debian.longer (Hey! Why are there four files now??) Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Darrick J. Wong 提交于
It turns out that there are some serious problems with the on-disk format of journal checksum v2. The foremost is that the function to calculate descriptor tag size returns sizes that are too big. This causes alignment issues on some architectures and is compounded by the fact that some parts of jbd2 use the structure size (incorrectly) to determine the presence of a 64bit journal instead of checking the feature flags. Therefore, introduce journal checksum v3, which enlarges the descriptor block tag format to allow for full 32-bit checksums of journal blocks, fix the journal tag function to return the correct sizes, and fix the jbd2 recovery code to use feature flags to determine 64bitness. Add a few function helpers so we don't have to open-code quite so many pieces. Switching to a 16-byte block size was found to increase journal size overhead by a maximum of 0.1%, to convert a 32-bit journal with no checksumming to a 32-bit journal with checksum v3 enabled. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reported-by: NTR Reardon <thomas_reardon@hotmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Dmitry Monakhov 提交于
In case of delalloc block i_disksize may be less than i_size. So we have to update i_disksize each time we allocated and submitted some blocks beyond i_disksize. We weren't doing this on the error paths, so fix this. testcase: xfstest generic/019 Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 28 8月, 2014 2 次提交
-
-
由 Dmitry Monakhov 提交于
After commit f282ac19 we use different transactions for preallocation and i_disksize update which result in complain from fsck after power-failure. spotted by generic/019. IMHO this is regression because fs becomes inconsistent, even more 'e2fsck -p' will no longer works (which drives admins go crazy) Same transaction requirement applies ctime,mtime updates testcase: xfstest generic/019 Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Dmitry Monakhov 提交于
Currently we reserve only 4 blocks but in worst case scenario ext4_zero_partial_blocks() may want to zeroout and convert two non adjacent blocks. Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 24 8月, 2014 3 次提交
-
-
由 Dmitry Monakhov 提交于
Cc: stable@vger.kernel.org # needed for bug fix patches Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
If we suffer a block allocation failure (for example due to a memory allocation failure), it's possible that we will call ext4_discard_allocated_blocks() before we've actually allocated any blocks. In that case, fe_len and fe_start in ac->ac_f_ex will still be zero, and this will result in mb_free_blocks(inode, e4b, 0, 0) triggering the BUG_ON on mb_free_blocks(): BUG_ON(last >= (sb->s_blocksize << 3)); Fix this by bailing out of ext4_discard_allocated_blocks() if fs_len is zero. Also fix a missing ext4_mb_unload_buddy() call in ext4_discard_allocated_blocks(). Google-Bug-Id: 16844242 Fixes: 86f0afd4Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Theodore Ts'o 提交于
If we run into some kind of error, such as ENOMEM, while calling ext4_getblk() or ext4_dx_find_entry(), we need to make sure this error gets propagated up to ext4_find_entry() and then to its callers. This way, transient errors such as ENOMEM can get propagated to the VFS. This is important so that the system calls return the appropriate error, and also so that in the case of ext4_lookup(), we return an error instead of a NULL inode, since that will result in a negative dentry cache entry that will stick around long past the OOM condition which caused a transient ENOMEM error. Google-Bug-Id: #17142205 Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 08 8月, 2014 1 次提交
-
-
由 Miklos Szeredi 提交于
Christoph Hellwig suggests: 1) make vfs_rename call ->rename2 if it exists instead of ->rename 2) switch all filesystems that you're adding NOREPLACE support for to use ->rename2 3) see how many ->rename instances we'll have left after a few iterations of 2. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 31 7月, 2014 1 次提交
-
-
由 Theodore Ts'o 提交于
If there is a failure while allocating the preallocation structure, a number of blocks can end up getting marked in the in-memory buddy bitmap, and then not getting released. This can result in the following corruption getting reported by the kernel: EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126, 12793 clusters in bitmap, 12729 in gd In that case, we need to release the blocks using mb_free_blocks(). Tested: fs smoke test; also demonstrated that with injected errors, the file system is no longer getting corrupted Google-Bug-Id: 16657874 Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 29 7月, 2014 2 次提交
-
-
由 Namjae Jeon 提交于
Blocks in collapse range should be collapsed per cluster unit when bigalloc is enable. If bigalloc is not enable, EXT4_CLUSTER_SIZE will be same with EXT4_BLOCK_SIZE. With this bug fixed, patch enables COLLAPSE_RANGE for bigalloc, which fixes a large number of xfstest failures which use fsx. Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Darrick J. Wong 提交于
Before converting an inline directory to a regular directory, check the directory entries to make sure they're not obviously broken. This helps us to avoid a BUG_ON if one of the dirents is trashed. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
-
- 28 7月, 2014 1 次提交
-
-
由 Dmitry Monakhov 提交于
If we have to copy data we must drop i_data_sem because of get_blocks() will be called inside mext_page_mkuptodate(), but later we must reacquire it again because we are about to change extent's tree Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-