- 03 3月, 2013 2 次提交
-
-
由 Lukas Czerner 提交于
We're using macro EXT4_B2C() to convert number of blocks to number of clusters for bigalloc file systems. However, we should be using EXT4_NUM_B2C(). Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Wei Yongjun 提交于
'orig_data' is malloced in ext4_remount() and should be freed before leaving from the error handling cases, otherwise it will cause memory leak. Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NLukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org
-
- 02 3月, 2013 1 次提交
-
-
由 Theodore Ts'o 提交于
Use a percpu counter rather than atomic types for shrinker accounting. There's no need for ultimate accuracy in the shrinker, so this should come a little more cheaply. The percpu struct is somewhat large, but there was a big gap before the cache-aligned s_es_lru_lock anyway, and it fits nicely in there. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 18 2月, 2013 2 次提交
-
-
由 Zheng Liu 提交于
Although extent status is loaded on-demand, we also need to reclaim extent from the tree when we are under a heavy memory pressure because in some cases fragmented extent tree causes status tree costs too much memory. Here we maintain a lru list in super_block. When the extent status of an inode is accessed and changed, this inode will be move to the tail of the list. The inode will be dropped from this list when it is cleared. In the inode, a counter is added to count the number of cached objects in extent status tree. Here only written/unwritten/hole extent is counted because delayed extent doesn't be reclaimed due to fiemap, bigalloc and seek_data/hole need it. The counter will be increased as a new extent is allocated, and it will be decreased as a extent is freed. In this commit we use normal shrinker framework to reclaim memory from the status tree. ext4_es_reclaim_extents_count() traverses the lru list to count the number of reclaimable extents. ext4_es_shrink() tries to reclaim written/unwritten/hole extents from extent status tree. The inode that has been shrunk is moved to the tail of lru list. Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: Jan kara <jack@suse.cz>
-
由 Zheng Liu 提交于
Single extent cache could be removed because we have extent status tree as a extent cache, and it would be better. Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: Jan kara <jack@suse.cz>
-
- 09 2月, 2013 2 次提交
-
-
由 Theodore Ts'o 提交于
So we can better understand what bits of ext4 are responsible for long-running jbd2 handles, use jbd2__journal_start() so we can pass context information for logging purposes. The recommended way for finding the longer-running handles is: T=/sys/kernel/debug/tracing EVENT=$T/events/jbd2/jbd2_handle_stats echo "interval > 5" > $EVENT/filter echo 1 > $EVENT/enable ./run-my-fs-benchmark cat $T/trace > /tmp/problem-handles This will list handles that were active for longer than 20ms. Having longer-running handles is bad, because a commit started at the wrong time could stall for those 20+ milliseconds, which could delay an fsync() or an O_SYNC operation. Here is an example line from the trace file describing a handle which lived on for 311 jiffies, or over 1.2 seconds: postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32 tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1 dirtied_blocks 0 Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Move the jbd2 wrapper functions which start and stop handles out of super.c, where they don't really logically belong, and into ext4_jbd2.c. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 03 2月, 2013 4 次提交
-
-
由 Theodore Ts'o 提交于
Check for incompatible mount options when using the ext4 file system driver to mount ext2 or ext3 file systems. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
If argument of inode_readahead_blk is too big, we just bail out without printing any error. Fix this since it could confuse users. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
The loop looking for correct mount option entry is more logical if it is written rewritten as an empty loop looking for correct option entry and then code handling the option. It also saves one level of indentation for a lot of code so we can join a couple of split lines. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
Several mount option (resuid, resgid, journal_dev, journal_ioprio) are currently handled before we enter standard option handling loop. I don't see a reason for this so move them to normal handling loop to make things more regular. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 1月, 2013 1 次提交
-
-
由 Guo Chao 提交于
brelse() and ext4_journal_force_commit() are both inlined and able to handle NULL. Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 28 1月, 2013 2 次提交
-
-
由 Jan Kara 提交于
It does not make much sense to have struct work in ext4_io_end_t because we always use it for only one ext4_io_end_t per inode (the first one in the i_completed_io list). So just move the structure to inode itself. This also allows for a small simplification in processing io_end structures. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
Currently we sometimes used block_write_full_page() and sometimes ext4_bio_write_page() for writeback (depending on mount options and call path). Let's always use ext4_bio_write_page() to simplify things a bit. Reviewed-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 25 1月, 2013 2 次提交
-
-
由 Chen Gang 提交于
When usrjquota or grpjquota mount options are specified several times, we leak memory storing the names. Free the memory correctly. Signed-off-by: NChen Gang <gang.chen@asianux.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Theodore Ts'o 提交于
In addition, print the error returned from ext4_enable_quotas() Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Cc: stable@vger.kernel.org
-
- 13 1月, 2013 1 次提交
-
-
由 Theodore Ts'o 提交于
After we have finished extending the file system, we need to trigger a the lazy inode table thread to zero out the inode tables. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 12月, 2012 1 次提交
-
-
由 Theodore Ts'o 提交于
Commit c278531d added a warning when ext4_flush_unwritten_io() is called without i_mutex being taken. It had previously not been taken during orphan cleanup since races weren't possible at that point in the mount process, but as a result of this c278531d, we will now see a kernel WARN_ON in this case. Take the i_mutex in ext4_orphan_cleanup() to suppress this warning. Reported-by: NAlexander Beregalov <a.beregalov@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NZheng Liu <wenqing.lz@taobao.com> Cc: stable@vger.kernel.org
-
- 26 12月, 2012 2 次提交
-
-
由 Michael Tokarev 提交于
When a journal-less ext4 filesystem is mounted on a read-only block device (blockdev --setro will do), each remount (for other, unrelated, flags, like suid=>nosuid etc) results in a series of scary messages from kernel telling about I/O errors on the device. This is becauese of the following code ext4_remount(): if (sbi->s_journal == NULL) ext4_commit_super(sb, 1); at the end of remount procedure, which forces writing (flushing) of a superblock regardless whenever it is dirty or not, if the filesystem is readonly or not, and whenever the device itself is readonly or not. We only need call ext4_commit_super when the file system had been previously mounted read/write. Thanks to Eric Sandeen for help in diagnosing this issue. Signed-off-By: NMichael Tokarev <mjt@tls.msk.ru> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Eric Sandeen 提交于
To more accurately calculate overhead for "bsd" style df reporting, we should count the journal blocks as overhead as well. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Tested-by: NEric Whitney <enwlinux@gmail.com>
-
- 20 12月, 2012 1 次提交
-
-
由 Jan Kara 提交于
Currently we allow enabling dioread_nolock mount option on remount for filesystems where blocksize < PAGE_CACHE_SIZE. This isn't really supported so fix the bug by moving the check for blocksize != PAGE_CACHE_SIZE into parse_options(). Change the original PAGE_SIZE to PAGE_CACHE_SIZE along the way because that's what we are really interested in. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Cc: stable@vger.kernel.org
-
- 11 12月, 2012 4 次提交
-
-
由 Carlos Maiolino 提交于
Flags being used by atomic operations in inode flags (e.g. ext4_test_inode_flag(), should be consistent with that actually stored in inodes, i.e.: EXT4_XXX_FL. It ensures that this consistency is checked at build-time, not at run-time. Currently, the flags consistency are being checked at run-time, but, there is no real reason to not do a build-time check instead of a run-time check. The code is comparing macro defined values with enum type variables, where both are constants, so, there is no problem in comparing constants at build-time. enum variables are treated as constants by the C compiler, according to the C99 specs (see www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf sec. 6.2.5, item 16), so, there is no real problem in comparing an enumeration type at build time Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
Ted has sent out a RFC about removing this feature. Eric and Jan confirmed that both RedHat and SUSE enable this feature in all their product. David also said that "As far as I know, it's enabled in all Android kernels that use ext4." So it seems OK for us. And what's more, as inline data depends its implementation on xattr, and to be frank, I don't run any test again inline data enabled while xattr disabled. So I think we should add inline data and remove this config option in the same release. [ The savings if you disable CONFIG_EXT4_FS_XATTR is only 27k, which isn't much in the grand scheme of things. Since no one seems to be testing this configuration except for some automated compile farms, on balance we are better removing this config option, and so that it is effectively always enabled. -- tytso ] Cc: David Brown <davidb@codeaurora.org> Cc: Eric Sandeen <sandeen@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Guo Chao 提交于
We use kzalloc() to allocate sbi, no need to zero its field. Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Guo Chao 提交于
inode_init_always() will initialize inode->i_data.writeback_index anyway, no need to do this in ext4_alloc_inode(). Signed-off-by: NGuo Chao <yan@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NLukas Czerner <lczerner@redhat.com>
-
- 29 11月, 2012 2 次提交
-
-
由 Theodore Ts'o 提交于
Previously, ext4_extents.h was being included at the end of ext4.h, which was bad for a number of reasons: (a) it was not being included in the expected place, and (b) it caused the header to be included multiple times. There were #ifdef's to prevent this from causing any problems, but it still was unnecessary. By moving the function declarations that were in ext4_extents.h to ext4.h, which is standard practice for where the function declarations for the rest of ext4.h can be found, we can remove ext4_extents.h from being included in ext4.h at all, and then we can only include ext4_extents.h where it is needed in ext4's source files. It should be possible to move a few more things into ext4.h, and further reduce the number of source files that need to #include ext4_extents.h, but that's a cleanup for another day. Reported-by: NSachin Kamat <sachin.kamat@linaro.org> Reported-by: NWei Yongjun <weiyj.lk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Vahram Martirosyan 提交于
The memset operation before check can cause a BUG if the memory allocation failed. Since we are using get_zeroed_age, there is no need to use memset anyway. Found by the Spruce system in cooperation with the KEDR Framework. Signed-off-by: NVahram Martirosyan <vmartirosyan@linuxtesting.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 09 11月, 2012 5 次提交
-
-
由 Zheng Liu 提交于
This patch lets ext4 maintain extent status tree. Currently it only tracks delay extent status in extent status tree. When a delay allocation is issued, the related delay extent will be inserted into extent status tree. When a delay extent is written out or invalidated, it will be removed from this tree. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: NAllison Henderson <achender@linux.vnet.ibm.com> Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Zheng Liu 提交于
Let ext4 initialize extent status tree of an inode. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: NAllison Henderson <achender@linux.vnet.ibm.com> Signed-off-by: NZheng Liu <wenqing.lz@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Lukas Czerner 提交于
There are some places in ext4_fill_super() where we would not return proper error code if something fails. The confusion is caused probably due to the fact that we have two "kind-of" return variables 'ret'and 'err'. 'ret' is used to return error code from ext4_fill_super() where err is used to store return values from other functions within ext4_fill_super(). However some places were missing the obligatory 'ret = err'. We could put the assignment where it is missing, but we can have better "future proof" solution. Or we could convert the code to use just one, but it would require more rewrites. This commit fixes the problem by returning value from 'err' variable if it is set and 'ret' otherwise in error handling branch of the ext4_fill_super(). The reasoning is that 'ret' value is often set to default "-EINVAL" or explicit value, where 'err' is used to store return value from other functions and should be otherwise zero. https://bugzilla.kernel.org/show_bug.cgi?id=48431Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Lukas Czerner 提交于
Notify user when mounting the file system with -o discard option, but the device does not support discard. Obviously we do not want to fail the mount or disable the options, because the underlying device might change in future even without file system remount. Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Zhao Hongjiang 提交于
Signed-off-by: NZhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 08 11月, 2012 1 次提交
-
-
由 Eric Sandeen 提交于
"overhead" was a write-only variable in this function after commit 952fc18e; we set it to 0 for minixdf, or to sbi->s_overhead if !minixdf, but never read it again after that. We need to use it, not sbi->s_overhead, when subtracting out overhead for f_blocks, or we get the wrong answer for minixdf. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 16 10月, 2012 1 次提交
-
-
由 Lukas Czerner 提交于
The result of the bit shift expression in '1 << sbi->s_log_groups_per_flex' can be undefined in the case that s_log_groups_per_flex is 31 because the result of the shift is bigger than INT_MAX. In reality this probably should not cause much problems since we'll end up with INT_MIN which will then be converted into 'unsigned int' type, but nevertheless according to the ISO C99 the result is actually undefined. Fix this by changing the left operand to 'unsigned int' type. Note that the commit d50f2ab6 already tried to fix the undefined behaviour, but this was missed. Thanks to Laszlo Ersek for pointing this out and suggesting the fix. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reported-by: NLaszlo Ersek <lersek@redhat.com>
-
- 10 10月, 2012 1 次提交
-
-
由 Theodore Ts'o 提交于
The function ext4_handle_dirty_super() was calculating the superblock on the wrong block data. As a result, when the superblock is modified while it is mounted (most commonly, when inodes are added or removed from the orphan list), the superblock checksum would be wrong. We didn't notice because the superblock *was* being correctly calculated in ext4_commit_super(), and this would get called when the file system was unmounted. So the problem only became obvious if the system crashed while the file system was mounted. Fix this by removing the poorly designed function signature for ext4_superblock_csum_set(); if it only took a single argument, the pointer to a struct superblock, the ambiguity which caused this mistake would have been impossible. Reported-by: NGeorge Spelvin <linux@horizon.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 03 10月, 2012 1 次提交
-
-
由 Kirill A. Shutemov 提交于
There's no reason to call rcu_barrier() on every deactivate_locked_super(). We only need to make sure that all delayed rcu free inodes are flushed before we destroy related cache. Removing rcu_barrier() from deactivate_locked_super() affects some fast paths. E.g. on my machine exit_group() of a last process in IPC namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 29 9月, 2012 2 次提交
-
-
由 Dmitry Monakhov 提交于
AIO/DIO prefix is wrong because it account unwritten extents which also may be scheduled from buffered write endio Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Dmitry Monakhov 提交于
Generic inode has unused i_private pointer which may be used as cur_aio_dio storage. TODO: If cur_aio_dio will be passed as an argument to get_block_t this allow to have concurent AIO_DIO requests. Reviewed-by: NZheng Liu <wenqing.lz@taobao.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 9月, 2012 2 次提交
-
-
由 Wei Yongjun 提交于
Convert cpu_to_leXX(leXX_to_cpu(E1) + E2) to use leXX_add_cpu(). dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Eric Sandeen 提交于
If the file system contains errors and it is being mounted read-only, don't clear the orphan list. We should minimize changes to the file system if it is mounted read-only. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-