- 09 2月, 2013 1 次提交
-
-
由 Theodore Ts'o 提交于
So we can better understand what bits of ext4 are responsible for long-running jbd2 handles, use jbd2__journal_start() so we can pass context information for logging purposes. The recommended way for finding the longer-running handles is: T=/sys/kernel/debug/tracing EVENT=$T/events/jbd2/jbd2_handle_stats echo "interval > 5" > $EVENT/filter echo 1 > $EVENT/enable ./run-my-fs-benchmark cat $T/trace > /tmp/problem-handles This will list handles that were active for longer than 20ms. Having longer-running handles is bad, because a commit started at the wrong time could stall for those 20+ milliseconds, which could delay an fsync() or an O_SYNC operation. Here is an example line from the trace file describing a handle which lived on for 311 jiffies, or over 1.2 seconds: postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32 tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1 dirtied_blocks 0 Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 1月, 2013 4 次提交
-
-
由 Guo Chao 提交于
brelse() and ext4_journal_force_commit() are both inlined and able to handle NULL. Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Guo Chao 提交于
Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Guo Chao 提交于
After commit 978fef91 (create __ext4_insert_dentry for dir entry insertion), 'reclen' is not used anymore. Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Guo Chao 提交于
Commit b0336e8d (ext4: calculate and verify checksums of directory leaf blocks) and commit dbe89444 (ext4: Calculate and verify checksums for htree nodes) forget to release buffer when checksum failed, at some places. Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 07 1月, 2013 2 次提交
-
-
由 Guo Chao 提交于
This fixes a buffer cache leak when creating a directory, introduced in commit a774f9c2. Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NTao Ma <boyu.mt@taobao.com>
-
由 Guo Chao 提交于
If checksum fails, we should also release the buffer read from previous iteration. Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>- Cc: stable@vger.kernel.org -- fs/ext4/namei.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
-
- 27 12月, 2012 1 次提交
-
-
由 Theodore Ts'o 提交于
When trying to mount a file system which does not contain a journal, but which does have a orphan list containing an inode which needs to be truncated, the mount call with hang forever in ext4_orphan_cleanup() because ext4_orphan_del() will return immediately without removing the inode from the orphan list, leading to an uninterruptible loop in kernel code which will busy out one of the CPU's on the system. This can be trivially reproduced by trying to mount the file system found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs source tree. If a malicious user were to put this on a USB stick, and mount it on a Linux desktop which has automatic mounts enabled, this could be considered a potential denial of service attack. (Not a big deal in practice, but professional paranoids worry about such things, and have even been known to allocate CVE numbers for such problems.) Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NZheng Liu <wenqing.lz@taobao.com> Cc: stable@vger.kernel.org
-
- 11 12月, 2012 11 次提交
-
-
由 Tao Ma 提交于
Ted has sent out a RFC about removing this feature. Eric and Jan confirmed that both RedHat and SUSE enable this feature in all their product. David also said that "As far as I know, it's enabled in all Android kernels that use ext4." So it seems OK for us. And what's more, as inline data depends its implementation on xattr, and to be frank, I don't run any test again inline data enabled while xattr disabled. So I think we should add inline data and remove this config option in the same release. [ The savings if you disable CONFIG_EXT4_FS_XATTR is only 27k, which isn't much in the grand scheme of things. Since no one seems to be testing this configuration except for some automated compile farms, on balance we are better removing this config option, and so that it is effectively always enabled. -- tytso ] Cc: David Brown <davidb@codeaurora.org> Cc: Eric Sandeen <sandeen@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
In case we rename a directory, ext4_rename has to read the dir block and change its dotdot's information. The old ext4_rename encapsulated the dir_block read into itself. So this patch adds a new function ext4_get_first_dir_block() which gets the dir buffer information so the ext4_rename can handle it properly. As it will also change the parent inode number, we return the parent_de so that ext4_rename() can handle it more easily. ext4_find_entry is also changed so that the caller(rename) can tell whether the found entry is an inlined one or not and journaling the corresponding buffer head. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
empty_dir is used when deleting a dir. So it should handle inline dir properly. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
Currently ext4_delete_entry() is used only for dir entry removing from a dir block. So let us create a new function ext4_generic_delete_entry and this function takes a entry_buf and a buf_size so that it can be used for inline data. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
Create a new function ext4_find_inline_entry() to handle the case of inline data. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
search_dirblock is used to search a dir block, but the code is almost the same for searching an inline dir. So create a new fuction search_dir and let search_dirblock call it. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
This patch let add_dir_entry handle the inline data case. So the dir is initialized as inline dir first and then we can try to add some files to it, when the inline space can't hold all the entries, a dir block will be created and the dir entry will be moved to it. Also for an inlined dir, "." and ".." are removed and we only use 4 bytes to store the parent inode number. These 2 entries will be added when we convert an inline dir to a block-based one. [ Folded in patch from Dan Carpenter to remove an unused variable. ] Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
The old add_dirent_to_buf handles all the work related to the work of adding dir entry to a dir block. Now we have inline data, so create 2 new function __ext4_find_dest_de and __ext4_insert_dentry that do the real work and let add_dirent_to_buf call them. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
The __ext4_check_dir_entry() function() is used to check whether the de is over the block boundary. Now with inline data, it could be within the block boundary while exceeds the inode size. So check this function to check the overflow more precisely. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
Currently, the initialization of dot and dotdot are encapsulated in ext4_mkdir and also bond with dir_block. So create a new function named ext4_init_new_dir and the initialization is moved to ext4_init_dot_dotdot. Now it will called either in the normal non-inline case(rec_len of ".." will cover the whole block) or when we converting an inline dir to a block(rec len of ".." will be the real length). The start of the next entry is also returned for inline dir usage. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 13 11月, 2012 1 次提交
-
-
由 Darrick J. Wong 提交于
During a directory entry lookup of a hashed directory, if the hash-based lookup functions fail and we fall back to a linear scan, don't try to verify the dirent checksum on the internal nodes of the hash tree because they don't store a checksum in a hidden dirent like the leaf nodes do. Reported-by: NGeorge Spelvin <linux@horizon.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 11 11月, 2012 1 次提交
-
-
由 Theodore Ts'o 提交于
If there is no space for a checksum in a directory leaf node, previously we would use EXT4_ERROR_INODE() which would mark the file system as inconsistent. While it would be nice to use e2fsck -D, it certainly isn't required, so just print a warning using ext4_warning(). Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
-
- 27 9月, 2012 2 次提交
-
-
由 Carlos Maiolino 提交于
When ext4_bread() returns NULL and err is set to zero, this means there is no phyical block mapped to the specified logical block number. (Previous to commit 90b0a973, err was uninitialized in this case, which caused other problems.) The directory handling routines use ext4_bread() in many places, the fact that ext4_bread() now returns NULL with err set to zero could cause problems since a number of these functions will simply return the value of err if the result of ext4_bread() was the NULL pointer, causing the caller of the function to think that the function was successful. Since directories should never contain holes, this case can only happen if the file system is corrupted. This commit audits all of the callers of ext4_bread(), and makes sure they do the right thing if a hole in a directory is found by ext4_bread(). Some ext4_bread() callers did not need any changes either because they already had its own hole detector paths. Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Bernd Schubert 提交于
ext4_special_inode_operations have their own ifdef CONFIG_EXT4_FS_XATTR to mask those methods. And ext4_iget also always sets it, so there is an inconsistency. Signed-off-by: NBernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 19 9月, 2012 1 次提交
-
-
由 Anatol Pomozov 提交于
Instead of checking whether the handle is valid, we check if journal is enabled. This avoids taking the s_orphan_lock mutex in all cases when there is no journal in use, including the error paths where ext4_orphan_del() is called with a handle set to NULL. Signed-off-by: NAnatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 18 9月, 2012 1 次提交
-
-
由 Carlos Maiolino 提交于
htree_dirblock_to_tree() declares a non-initialized 'err' variable, which is passed as a reference to another functions expecting them to set this variable with their error codes. It's passed to ext4_bread(), which then passes it to ext4_getblk(). If ext4_map_blocks() returns 0 due to a lookup failure, leaving the ext4_getblk() buffer_head uninitialized, it will make ext4_getblk() return to ext4_bread() without initialize the 'err' variable, and ext4_bread() will return to htree_dirblock_to_tree() with this variable still uninitialized. htree_dirblock_to_tree() will pass this variable with garbage back to ext4_htree_fill_tree(), which expects a number of directory entries added to the rb-tree. which, in case, might return a fake non-zero value due the garbage left in the 'err' variable, leading the kernel to an Oops in ext4_dx_readdir(), once this is expecting a filled rb-tree node, when in turn it will have a NULL-ed one, causing an invalid page request when trying to get a fname struct from this NULL-ed rb-tree node in this line: fname = rb_entry(info->curr_node, struct fname, rb_hash); The patch itself initializes the err variable in htree_dirblock_to_tree() to avoid usage mistakes by the called functions, and also fix ext4_getblk() to return a initialized 'err' variable when ext4_map_blocks() fails a lookup. Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 8月, 2012 1 次提交
-
-
由 Theodore Ts'o 提交于
Very large directories can cause significant performance problems, or perhaps even invoke the OOM killer, if the process is running in a highly constrained memory environment (whether it is VM's with a small amount of memory or in a small memory cgroup). So it is useful, in cloud server/data center environments, to be able to set a filesystem-wide cap on the maximum size of a directory, to ensure that directories never get larger than a sane size. We do this via a new mount option, max_dir_size_kb. If there is an attempt to grow the directory larger than max_dir_size_kb, the system call will return ENOSPC instead. Google-Bug-Id: 6863013 Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 23 7月, 2012 2 次提交
-
-
由 Artem Bityutskiy 提交于
The '__ext4_handle_dirty_metadata()' does not need the 'now' argument anymore and we can kill it. Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Al Viro 提交于
d_instantiate(dentry, inode); unlock_new_inode(inode); is a bad idea; do it the other way round... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 14 7月, 2012 2 次提交
-
-
由 Al Viro 提交于
boolean "does it have to be exclusive?" flag is passed instead; Local filesystem should just ignore it - the object is guaranteed not to be there yet. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Just the flags; only NFS cares even about that, but there are legitimate uses for such argument. And getting rid of that completely would require splitting ->lookup() into a couple of methods (at least), so let's leave that alone for now... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 10 7月, 2012 1 次提交
-
-
由 Tao Ma 提交于
In ext4_rename, when the old name is a dir, we need to change ".." to its new parent and journal the change, so with metadata_csum enabled, we have to re-calc the csum. As the first block of the dir can be either a htree root or a normal directory block and we have different csum calculation for these 2 types, we have to choose the right one in ext4_rename. btw, it is found by xfstests 013. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Acked-by: NDarrick J. Wong <djwong@us.ibm.com>
-
- 29 5月, 2012 1 次提交
-
-
由 Andreas Dilger 提交于
A hard-linked directory to its parent can cause the VFS to deadlock, and is a sign of a corrupted file system. So detect this case in ext4_lookup(), before the rmdir() lockup scenario can take place. Signed-off-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
- 11 5月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
This allows comparing hash and len in one operation on 64-bit architectures. Right now only __d_lookup_rcu() takes advantage of this, since that is the case we care most about. The use of anonymous struct/unions hides the alternate 64-bit approach from most users, the exception being a few cases where we initialize a 'struct qstr' with a static initializer. This makes the problematic cases use a new QSTR_INIT() helper function for that (but initializing just the name pointer with a "{ .name = xyzzy }" initializer remains valid, as does just copying another qstr structure). Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 4月, 2012 5 次提交
-
-
由 Theodore Ts'o 提交于
None of this function callers ever pass in a NULL inode pointer, so this check is unnecessary, and the else clause is dead code. (This change should make the code coverage people a little happier. :-) Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Darrick J. Wong 提交于
Calculate and verify the checksums for directory leaf blocks (i.e. blocks that only contain actual directory entries). The checksum lives in what looks to be an unused directory entry with a 0 name_len at the end of the block. This scheme is not used for internal htree nodes because the mechanism in place there only costs one dx_entry, whereas the "empty" directory entry would cost two dx_entries. Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Darrick J. Wong 提交于
Calculate and verify the checksum for directory index tree (htree) node blocks. The checksum is stored in the last 4 bytes of the htree block and requires the dx_entry array to stop 1 dx_entry short of the end of the block. Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Darrick J. Wong 提交于
Calculate and verify the superblock checksum. Since the UUID and block group number are embedded in each copy of the superblock, we need only checksum the entire block. Refactor some of the code to eliminate open-coding of the checksum update call. Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Darrick J. Wong 提交于
Define flags and change structure definitions to allow checksumming of ext4 metadata. Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 21 2月, 2012 1 次提交
-
-
由 Zheng Liu 提交于
Fix ext4_warning format flag in dx_probe(). CC: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 09 1月, 2012 1 次提交
-
-
由 Al Viro 提交于
Both ext3 and ext4 put the half-created symlink inode into the orphan list for a while (see the comment in ext[34]_symlink() for gory details). Then, if everything went fine, they pull it out of the orphan list and bump the link count back to 1. The thing is, inc_nlink() is going to complain about seeing somebody changing i_nlink from 0 to 1. With a good reason, since normally something like that is a bug. Explicit set_nlink(inode, 1) does the same thing as inc_nlink() here, but it does *not* complain - exactly because it should be usable in strange situations like this one. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-