- 11 10月, 2008 3 次提交
-
-
由 Hidehiro Kawai 提交于
If the journal doesn't abort when it gets an IO error in file data blocks, the file data corruption will spread silently. Because most of applications and commands do buffered writes without fsync(), they don't notice the IO error. It's scary for mission critical systems. On the other hand, if the journal aborts whenever it gets an IO error in file data blocks, the system will easily become inoperable. So this patch introduces a filesystem option to determine whether it aborts the journal or just call printk() when it gets an IO error in file data. If you mount an ext4 fs with data_err=abort option, it aborts on file data write error. If you mount it with data_err=ignore, it doesn't abort, just call printk(). data_err=ignore is the default. Here is the corresponding patch of the ext3 version: http://kerneltrap.org/mailarchive/linux-kernel/2008/9/9/3239374Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Hidehiro Kawai 提交于
If the journal has aborted due to a checkpointing failure, we have to keep the contents of the journal space. Otherwise, the filesystem will lose uncheckpointed metadata completely and become inconsistent. To avoid this, we need to keep needs_recovery flag if checkpoint has failed. With this patch, ext4_put_super() detects a checkpointing failure from the return value of journal_destroy(), then it invokes ext4_abort() to make the filesystem read only and keep needs_recovery flag. Errors from jbd2_journal_flush() are also handled by this patch in some places. Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The ext4 filesystem is getting stable enough that it's time to drop the "dev" prefix. Also remove the requirement for the TEST_FILESYS flag. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 07 10月, 2008 1 次提交
-
-
由 Andi Kleen 提交于
While reading code I noticed that ext4_put_super() dirties the superblock bh twice. It is always done in ext4_commit_super() too. Remove the redundant dirty operation. Should be a nop semantically. Signed-off-by: NAndi Kleen <ak@linux.intel.com>
-
- 06 10月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
This debugging markers are designed to debug problems such as the random filesystem latency problems reported by Arjan. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 10 10月, 2008 2 次提交
-
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
With modern hard drives, reading 64k takes roughly the same time as reading a 4k block. So request readahead for adjacent inode table blocks to reduce the time it takes when iterating over directories (especially when doing this in htree sort order) in a cold cache case. With this patch, the time it takes to run "git status" on a kernel tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches" is reduced by 21%. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 24 9月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
Previously mballoc created a separate set of functions for each proc file. This combines the tunables into a single set of functions which gets used for all of the per-superblock proc files, saving approximately 2k of compiled object code. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 23 9月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
...and into the core setup/teardown code in fs/ext4/super.c so that other parts of ext4 can define tuning parameters. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 07 10月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
This fixes some very common warnings reported by kerneloops.org Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 08 9月, 2008 1 次提交
-
-
由 Li Zefan 提交于
If there group descriptors are corrupted we need unlock the block group lock before returning from the function; else we will oops when freeing a spinlock which is still being held. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 9月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
Calculate the journal device name once and stash it away in the journal_s structure. This avoids needing to call bdevname() everywhere and reduces stack usage by not needing to allocate an on-stack buffer. In addition, we eliminate the '/' that can appear in device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that can cause problems when creating proc directory names, and include the inode number to support ocfs2 which creates multiple journals with different inode numbers. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 08 9月, 2008 1 次提交
-
-
由 Frederic Bohe 提交于
This fixes a bug which prevented the newly created inodes after a resize from being used on filesystems with flex_bg. Signed-off-by: NFrederic Bohe <frederic.bohe@bull.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 10 10月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
This patch adds dirty block accounting using percpu_counters. Delayed allocation block reservation is now done by updating dirty block counter. In a later patch we switch to non delalloc mode if the filesystem free blocks is greater than 150% of total filesystem dirty blocks Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 09 9月, 2008 3 次提交
-
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 01 8月, 2008 1 次提交
-
-
由 Al Viro 提交于
* new helper: vfs_quota_on_path(); equivalent of vfs_quota_on() sans the pathname resolution. * callers of vfs_quota_on() that do their own pathname resolution and checks based on it are switched to vfs_quota_on_path(); that way we avoid the races. * reiserfs leaked dentry/vfsmount references on several failure exits. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 8月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
The write_cache_pages() function uses the mapping->writeback_index as the starting index to write out when range_cyclic is set. Properly initialize writeback_index so that we start the writeout at index 0. This was found when debugging the small file fragmentation on ext4. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 7月, 2008 2 次提交
-
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Alexey Dobriyan 提交于
Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Acked-by: NPekka Enberg <penberg@cs.helsinki.fi> Acked-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 7月, 2008 1 次提交
-
-
由 Li Zefan 提交于
- use kzalloc() instead of kmalloc() + memset() - improve a printk info Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 03 8月, 2008 1 次提交
-
-
由 Eric Sandeen 提交于
I noticed when filling a 1T filesystem with 4 threads using the fs_mark benchmark: fs_mark -d /mnt/test -D 256 -n 100000 -t 4 -s 20480 -F -S 0 that I occasionally got checksum mismatch errors: EXT4-fs error (device sdb): ext4_init_inode_bitmap: Checksum bad for group 6935 etc. I'd reliably get 4-5 of them during the run. It appears that the problem is likely a race to init the bg's when the uninit_bg feature is enabled. With the patch below, which adds sb_bgl_locking around initialization, I was able to complete several runs with no errors or warnings. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 27 7月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
If the block group checksums are corrupted, still allow the mount to succeed, so e2fsck can have a chance to try to fix things up. Add code in the remount r/w path to make sure the block group checksums are valid before allowing the filesystem to be remounted read/write. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 12 7月, 2008 3 次提交
-
-
由 Eric Sandeen 提交于
We've talked for a while about getting rid of any feature- setting from the kernel; this gets rid of the code which would set the INCOMPAT_EXTENTS flag on the first file write when mounted as ext4[dev]. With this patch, if the extents feature is not already set on disk, then mounting as ext4 will fall back to noextents with a warning, and if -o extents is explicitly requested, the mount will fail, also with warning. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
The block mapped inode format can address only blocks within 2**32. This causes a number of issues, the biggest of which is that the block allocator needs to be taught that certain inodes can not utilize block numbers > 2**32. So until this is fixed, it is simplest to fail mounting of file systems with more than 2**32 blocks if the -o noextents option is given. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
Enable delalloc by default to ensure it gets sufficient testing and because it makes the filesystem much more efficient. Add a nodealalloc option to disable delayed allocation, and update ext4_show_options to show delayed allocation off if it is disabled. If the data=journal mount option is used, disable delayed allocation since the delalloc code doesn't support data=journal yet. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
- 15 7月, 2008 1 次提交
-
-
由 Mingming Cao 提交于
This patch does block reservation for delayed allocation, to avoid ENOSPC later at page flush time. Blocks(data and metadata) are reserved at da_write_begin() time, the freeblocks counter is updated by then, and the number of reserved blocks is store in per inode counter. At the writepage time, the unused reserved meta blocks are returned back. At unlink/truncate time, reserved blocks are properly released. Updated fix from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> to fix the oldallocator block reservation accounting with delalloc, added lock to guard the counters and also fix the reservation for meta blocks. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 12 7月, 2008 7 次提交
-
-
由 Alex Tomas 提交于
Updated with fixes from Mingming Cao <cmm@us.ibm.com> to unlock and release the page from page cache if the delalloc write_begin failed, and properly handle preallocated blocks. Also added a fix to clear buffer_delay in block_write_full_page() after allocating a delayed buffer. Updated with fixes from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> to update i_disksize properly and to add bmap support for delayed allocation. Updated with a fix from Valerie Clement <valerie.clement@bull.net> to avoid filesystem corruption when the filesystem is mounted with the delalloc option and blocksize < pagesize. Signed-off-by: NAlex Tomas <alex@clusterfs.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Jan Kara 提交于
This patch makes ext4 use inode-based implementation of data=ordered mode in JBD2. It allows us to unify some data=ordered and data=writeback paths (especially writepage since we don't have to start a transaction anymore) and remove some buffer walking. Updated fix from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> to fix file system hang due to corrupt jinode values. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
Set sbi->s_journal to NULL after we call journal_destroy(). This will be later needed because after journal_destroy() is called, ext4_clear_inode() can still be called for some inodes (e.g. root inode) and we'll need to detect there that journal doesn't exists anymore. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Li Zefan 提交于
The previous sb_min_blocksize() has already set the block size. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jose R. Santos 提交于
This patch mostly controls the way inode are allocated in order to make ialloc aware of flex_bg block group grouping. It achieves this by bypassing the Orlov allocator when block group meta-data are packed toghether through mke2fs. Since the impact on the block allocator is minimal, this patch should have little or no effect on other block allocation algorithms. By controlling the inode allocation, it can basically control where the initial search for new block begins and thus indirectly manipulate the block allocator. This allocator favors data and meta-data locality so the disk will gradually be filled from block group zero upward. This helps improve performance by reducing seek time. Since the group of inode tables within one flex_bg are treated as one giant inode table, uninitialized block groups would not need to partially initialize as many inode table as with Orlov which would help fsck time as the filesystem usage goes up. Signed-off-by: NJose R. Santos <jrs@us.ibm.com> Signed-off-by: NValerie Clement <valerie.clement@bull.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 05 7月, 2008 1 次提交
-
-
由 Jan Kara 提交于
When write in ext4_quota_write() fails, we have to properly release i_mutex. One error path has been missing the unlock... Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 5月, 2008 1 次提交
-
-
由 Eric Sandeen 提交于
I can't think of any valid reason for ext4 to not use barriers when they are available; I believe this is necessary for filesystem integrity in the face of a volatile write cache on storage. An administrator who trusts that the cache is sufficiently battery- backed (and power supplies are sufficiently redundant, etc...) can always turn it back off again. SuSE has carried such a patch for ext3 for quite some time now. Also document the mount option while we're at it. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 5月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
Cc: Andreas Dilger <adilger@clusterfs.com> Cc: Girish Shilamkar <girish@clusterfs.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 07 6月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
If a journal checksum error is detected, the ext4 filesystem will call ext4_error(), and the mount will either continue, become a read-only mount, or cause a kernel panic based on the superblock flags indicating the user's preference of what to do in case of filesystem corruption being detected. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 14 5月, 2008 1 次提交
-
-
由 Jan Kara 提交于
Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-