- 30 4月, 2008 2 次提交
-
-
由 Andi Kleen 提交于
I checked ext4_ioctl and it looked largely safe to not be used without BKL. So convert it over to unlocked_ioctl. Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
When we convert an uninitialized extent to an initialized extent we need to make sure we return the number of blocks in the extent from the file system block corresponding to logical file block. Otherwise we cache wrong extent details and this results in file system corruption. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 4月, 2008 4 次提交
-
-
由 Aneesh Kumar K.V 提交于
ext4_ext_get_blocks() returns the number of blocks allocated with buffer head unmapped for a read from prealloc space. This is needed so that delayed allocation doesn't do block reservation for prealloc space since the blocks are already reserved on disk. Mark the buffer head unwritten. Some code paths try to read the block if the buffer_head is not new and no uptodate. Marking the buffer head unwritten avoids this reading. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
ext4_ext_get_blocks() returns number of blocks allocated with buffer heads unmapped for a read from prealloc space. This is needed so that delayed allocation doesn't do block reservation for prealloc space since the blocks are already resevred on disk. Fix ext4_ext_get_blocks to not return greater than max_blocks, since some of the code paths cannot handle such a return value. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
ext4_fallocate needs to update file size in each transaction. Otherwise if we crash the file size won't be seen. We were also not marking the inode dirty after updating file size before. Also when we try to retry allocation due to ENOSPC, make sure we reset the variable ret so that we actually do a retry. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
Fail migrate if we allocated new blocks via mmap write. If we write to holes in the file via mmap, we end up allocating new blocks. This block allocation happens without taking inode->i_mutex. Since migrate is protected by i_mutex and migrate expects that no new blocks get allocated during migrate, fail migrate if new blocks get allocated. We can't take inode->i_mutex in the mmap write path because that would result in a locking order violation between i_mutex and mmap_sem. Also adding a separate rw_sempahore for protection is really high overhead for a rare operation such as migrate. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 4月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
If the preallocated area is small zero out the full extent instead of splitting them. This should avoid the "write every alternate block" problem that could grow the number of extents dramatically. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 4月, 2008 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
This patch handles possible ENOSPC errors when writing to an uninitialized extent in case the filesystem is full. A write to a prealloc area causes the split of an unititalized extent into initialized and uninitialized extents. If we don't have space to add new extent information, instead of returning error, convert the existing uninitialized extent to initialized one. We need to zero out the blocks corresponding to the entire extent to prevent uninitialized data reaching userspace. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
This patch enables extent-formatted normal symlinks. Using extents format allows a symlink to refer to a block number larger than 2^32 on large filesystems. We still don't enable extent format for fast symlinks, which are contained in the inode itself. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 4月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Put the old extent details back if we fail to split the uninitialized extent. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 30 4月, 2008 1 次提交
-
-
由 Josef Bacik 提交于
The "resize" option won't be noticed as it comes after the NULL option, so if you try to mount (or in this case remount) with that option it won't be recognized. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NJosef Bacik <jbacik@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 4月, 2008 1 次提交
-
-
由 Hisashi Hifumi 提交于
Currently fdatasync is identical to fsync in ext3. I think fdatasync should skip journal flush in data=ordered and data=writeback mode when it overwrites to already-instantiated blocks on HDD. When I_DIRTY_DATASYNC flag is not set, fdatasync should skip journal writeout because this indicates only atime or/and mtime updates. Following patch is the same approach of ext2's fsync code(ext2_sync_file). I did a performance test using the sysbench. #sysbench --num-threads=128 --max-requests=50000 --test=fileio --file-total-size=128G --file-test-mode=rndwr --file-fsync-mode=fdatasync run The result on ext3 was: -2.6.24 Operations performed: 0 Read, 50080 Write, 59600 Other = 109680 Total Read 0b Written 782.5Mb Total transferred 782.5Mb (12.116Mb/sec) 775.45 Requests/sec executed Test execution summary: total time: 64.5814s total number of events: 50080 total time taken by event execution: 3713.9836 per-request statistics: min: 0.0000s avg: 0.0742s max: 0.9375s approx. 95 percentile: 0.2901s Threads fairness: events (avg/stddev): 391.2500/23.26 execution time (avg/stddev): 29.0155/1.99 -2.6.24-patched Operations performed: 0 Read, 50009 Write, 61596 Other = 111605 Total Read 0b Written 781.39Mb Total transferred 781.39Mb (16.419Mb/sec) 1050.83 Requests/sec executed Test execution summary: total time: 47.5900s total number of events: 50009 total time taken by event execution: 2934.5768 per-request statistics: min: 0.0000s avg: 0.0587s max: 0.8938s approx. 95 percentile: 0.1993s Threads fairness: events (avg/stddev): 390.6953/22.64 execution time (avg/stddev): 22.9264/1.17 Filesystem I/O throughput was improved. Signed-off-by :Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Acked-by: NJan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 30 4月, 2008 1 次提交
-
-
由 Josef Bacik 提交于
This patch fix a panic while running fsfuzzer. We are improperly checking the return of ext4_orphan_get. Signed-off-by: NJosef Bacik <jbacik@redhat.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 4月, 2008 2 次提交
-
-
由 Denis V. Lunev 提交于
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: NDenis V. Lunev <den@openvz.org> Cc: <linux-ext4@vger.kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
Use creation by full path instead: "fs/foo". Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 4月, 2008 1 次提交
-
-
由 Jan Kara 提交于
Update ext4 to handle quotaon on remount RW. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 4月, 2008 1 次提交
-
-
由 Benoit Boissinot 提交于
Spelling fix: prefered -> preferred Signed-off-by: NBenoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com>
-
- 19 4月, 2008 1 次提交
-
-
由 Dave Hansen 提交于
Some ioctl()s can cause writes to the filesystem. Take these, and make them use mnt_want/drop_write() instead. [AV: updated] Acked-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Hansen <haveblue@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 16 4月, 2008 1 次提交
-
-
由 Jan Kara 提交于
mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL. But filesystems are calling this function while holding xattr_sem so possible recursion into the fs violates locking ordering of xattr_sem and transaction start / i_mutex for ext2-4. Change mb_cache_entry_alloc() so that filesystems can specify desired gfp mask and use GFP_NOFS from all of them. Signed-off-by: NJan Kara <jack@suse.cz> Reported-by: NDave Jones <davej@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 2月, 2008 1 次提交
-
-
由 Akinobu Mita 提交于
Add missing ext4_journal_stop() in error handling. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: Stephen Tweedie <sct@redhat.com> Cc: adilger@clusterfs.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mingming Cao <cmm@us.ibm.com>
-
- 23 2月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
ext4_find_next_zero_bit and ext4_find_next_bit needs a long aligned address on x8_64. Add mb_find_next_zero_bit and mb_find_next_bit and use them in the mballoc. Fix: https://bugzilla.redhat.com/show_bug.cgi?id=433286 Eric Sandeen debugged the problem and suggested the fix. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 2月, 2008 3 次提交
-
-
由 Aneesh Kumar K.V 提交于
In addition, don't inherit EXT4_EXTENTS_FL from parent directory. If we have a directory with extent flag set and later mount the file system with -o noextents, the files created in that directory will also have extent flag set but we would not have called ext4_ext_tree_init for them. This will cause error later when we are verifying the extent header Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
If we fail to allocate blocks don't call ext4_error. Also don't hide errors from ext4_get_blocks_wrap Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Mingming Cao 提交于
This patch fixes a bug when writing to preallocated but uninitialized blocks, which resulted in a BUG in fs/buffer.c saying that the buffer is not mapped. When writing to a file, ext4_get_block_wrap() is called with create=1 in order to request that blocks be allocated if necessary. It currently calls ext4_get_blocks() with create=0 in order to do a lookup first. If the inode contains an unitialized data block, the buffer head is left unampped, which ext4_get_blocks_wrap() returns, causing the BUG. We fix this by checking to see if the buffer head is unmapped, and if so, we make sure the the buffer head is mapped by calling ext4_ext_get_blocks with create=1. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 16 2月, 2008 3 次提交
-
-
由 Theodore Ts'o 提交于
The ext4_dec_count() function is only needed when dropping the i_nlink count on inodes which are (or which could be) directories. If we *know* that the inode in question can't possibly be a directory, use drop_nlink or clear_nlink() if we know i_nlink is 1. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Valerie Clement 提交于
When a directory inode is allocated in the last group and the last group contains less than s_blocks_per_group blocks, the initial block allocated for the directory is not always allocated in the same group as the directory inode, but in one of the first groups of the filesystem (group 1 for example). Depending on the current process's pid, ext4_find_near() and ext4_ext_find_goal() can return a block number greater than the maximum blocks count in the filesystem and in that case the block will be not allocated in the same group as the inode. The following patch fixes the problem. Should the modification also be done in ext2/3 code? Signed-off-by: NValerie Clement <valerie.clement@bull.net> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
In ext4_mb_complex_scan_group, if the extent length of the newly found extentet is greater than than the total free blocks counted in group info, break without claiming the block. Document different ext4_error usage, explaining the state with which we continue if we mount with errors=continue Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 22 2月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
When the user was writing into an unitialized extent, ext4_ext_convert_to_initialize() was not requesting journal write access before it started to modify the extent tree. Fix this oversight. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 2月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
The path variable returned via ext4_ext_find_extent is a kmalloc variable and needs to be freeded. It also contains a reference to buffer_head which needs to be dropped. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 22 2月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
If ext4_mkdir() fails to allocate the initial block for the directory, don't leave behind a half-created directory inode with the link count left at one. This was caused by an inappropriate call to ext4_dec_count(). Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 16 2月, 2008 2 次提交
-
-
由 Valerie Clement 提交于
With the flex_bg feature enabled, a large file creation oopses the kernel. The BUG_ON is: BUG_ON(len >= EXT4_BLOCKS_PER_GROUP(sb)); As the allocation of the bitmaps and the inode table can be done outside the block group with flex_bg, this allows to allocate up to EXT4_BLOCKS_PER_GROUP blocks in a group. This patch fixes the oops. Signed-off-by: NValerie Clement <valerie.clement@bull.net> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
ext4_fallocate() was trying to acquire i_data_sem outside of jbd2_start_transaction/jbd2_journal_stop, which violates ext4's locking hierarchy. So we take i_mutex to prevent writes and truncates during the complete fallocate operation, and use ext4_get_block_wrap() which acquires and releases i_data_sem for each block allocation. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 2月, 2008 1 次提交
-
-
由 Andi Kleen 提交于
Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 15 2月, 2008 2 次提交
-
-
由 Jan Blunck 提交于
* Add path_put() functions for releasing a reference to the dentry and vfsmount of a struct path in the right order * Switch from path_release(nd) to path_put(&nd->path) * Rename dput_path() to path_put_conditional() [akpm@linux-foundation.org: fix cifs] Signed-off-by: NJan Blunck <jblunck@suse.de> Signed-off-by: NAndreas Gruenbacher <agruen@suse.de> Acked-by: NChristoph Hellwig <hch@lst.de> Cc: <linux-fsdevel@vger.kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Blunck 提交于
This is the central patch of a cleanup series. In most cases there is no good reason why someone would want to use a dentry for itself. This series reflects that fact and embeds a struct path into nameidata. Together with the other patches of this series - it enforced the correct order of getting/releasing the reference count on <dentry,vfsmount> pairs - it prepares the VFS for stacking support since it is essential to have a struct path in every place where the stack can be traversed - it reduces the overall code size: without patch series: text data bss dec hex filename 5321639 858418 715768 6895825 6938d1 vmlinux with patch series: text data bss dec hex filename 5320026 858418 715768 6894212 693284 vmlinux This patch: Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix cifs] [akpm@linux-foundation.org: fix smack] Signed-off-by: NJan Blunck <jblunck@suse.de> Signed-off-by: NAndreas Gruenbacher <agruen@suse.de> Acked-by: NChristoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 2月, 2008 5 次提交
-
-
由 Theodore Tso 提交于
This flag is simply a generic "this is a crash/burn test filesystem" marker. If it is set, then filesystem code which is "in development" will be allowed to mount the filesystem. Filesystem code which is not considered ready for prime-time will check for this flag, and if it is not set, it will refuse to touch the filesystem. As we start rolling ext4 out to distro's like Fedora, et. al, this makes it less likely that a user might accidentally start using ext4 on a production filesystem; a bad thing, since that will essentially make it be unfsckable until e2fsprogs catches up. Signed-off-by: NTheodore Tso <tytso@MIT.EDU> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Multiblock allocator calls BUG_ON in many case if the free and used blocks count obtained looking at the bitmap is different from what the allocator internally accounted for. Use ext4_error in such case and don't panic the system. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Eric Sandeen 提交于
struct ext4_allocation_context is rather large, and this bloats the stack of many functions which use it. Allocating it from a named slab cache will alleviate this. For example, with this change (on top of the noinline patch sent earlier): -ext4_mb_new_blocks 200 +ext4_mb_new_blocks 40 -ext4_mb_free_blocks 344 +ext4_mb_free_blocks 168 -ext4_mb_release_inode_pa 216 +ext4_mb_release_inode_pa 40 -ext4_mb_release_group_pa 192 +ext4_mb_release_group_pa 24 Most of these stack-allocated structs are actually used only for mballoc history; and in those cases often a smaller struct would do. So changing that may be another way around it, at least for those functions, if preferred. For now, in those cases where the ac is only for history, an allocation failure simply skips the history recording, and does not cause any other failures. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
We cannot start transaction in ext4_direct_IO() and just let it last during the whole write because dio_get_page() acquires mmap_sem which ranks above transaction start (e.g. because we have dependency chain mmap_sem->PageLock->journal_start, or because we update atime while holding mmap_sem) and thus deadlocks could happen. We solve the problem by starting a transaction separately for each ext4_get_block() call. We *could* have a problem that we allocate a block and before its data are written out the machine crashes and thus we expose stale data. But that does not happen because for hole-filling generic code falls back to buffered writes and for file extension, we add inode to orphan list and thus in case of crash, journal replay will truncate inode back to the original size. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
In order to prevent a circular locking dependency when an unlink operation is racing with an ext4 migration, we delay taking i_data_sem until just before switch the inode format, and use i_mutex to prevent writes and truncates during the first part of the migration operation. Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-