- 12 7月, 2008 1 次提交
-
-
由 Jose R. Santos 提交于
This patch mostly controls the way inode are allocated in order to make ialloc aware of flex_bg block group grouping. It achieves this by bypassing the Orlov allocator when block group meta-data are packed toghether through mke2fs. Since the impact on the block allocator is minimal, this patch should have little or no effect on other block allocation algorithms. By controlling the inode allocation, it can basically control where the initial search for new block begins and thus indirectly manipulate the block allocator. This allocator favors data and meta-data locality so the disk will gradually be filled from block group zero upward. This helps improve performance by reducing seek time. Since the group of inode tables within one flex_bg are treated as one giant inode table, uninitialized block groups would not need to partially initialize as many inode table as with Orlov which would help fsck time as the filesystem usage goes up. Signed-off-by: NJose R. Santos <jrs@us.ibm.com> Signed-off-by: NValerie Clement <valerie.clement@bull.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 14 7月, 2008 1 次提交
-
-
由 Stoyan Gaydarov 提交于
__FUNCTION__ is gcc-specific, use __func__ instead Signed-off-by: NStoyan Gaydarov <stoyboyker@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 12 7月, 2008 2 次提交
-
-
由 Shen Feng 提交于
Change second/third to fourth. Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Duane Griffin 提交于
If the orphan node list includes valid, untruncatable nodes with nlink > 0 the ext4_orphan_cleanup loop which attempts to delete them will not do so, causing it to loop forever. Fix by checking for such nodes in the ext4_orphan_get function. This patch fixes the second case (image hdb.20000009.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. Signed-off-by: NDuane Griffin <duaneg@dghda.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 30 4月, 2008 1 次提交
-
-
由 Christoph Hellwig 提交于
Move ext4 headers out of include/linux. This is just the trivial move, there's some more thing that could be done later. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 4月, 2008 1 次提交
-
-
由 Joe Perches 提交于
include/linux/ext4_fs_i.h is included in include/linux/ext_fs.h twice Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 30 4月, 2008 1 次提交
-
-
由 Andi Kleen 提交于
I checked ext4_ioctl and it looked largely safe to not be used without BKL. So convert it over to unlocked_ioctl. Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 29 4月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Fail migrate if we allocated new blocks via mmap write. If we write to holes in the file via mmap, we end up allocating new blocks. This block allocation happens without taking inode->i_mutex. Since migrate is protected by i_mutex and migrate expects that no new blocks get allocated during migrate, fail migrate if new blocks get allocated. We can't take inode->i_mutex in the mmap write path because that would result in a locking order violation between i_mutex and mmap_sem. Also adding a separate rw_sempahore for protection is really high overhead for a rare operation such as migrate. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 10 2月, 2008 1 次提交
-
-
由 Theodore Tso 提交于
This flag is simply a generic "this is a crash/burn test filesystem" marker. If it is set, then filesystem code which is "in development" will be allowed to mount the filesystem. Filesystem code which is not considered ready for prime-time will check for this flag, and if it is not set, it will refuse to touch the filesystem. As we start rolling ext4 out to distro's like Fedora, et. al, this makes it less likely that a user might accidentally start using ext4 on a production filesystem; a bad thing, since that will essentially make it be unfsckable until e2fsprogs catches up. Signed-off-by: NTheodore Tso <tytso@MIT.EDU> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
- 08 2月, 2008 1 次提交
-
-
由 David Howells 提交于
Stop the EXT4 filesystem from using iget() and read_inode(). Replace ext4_read_inode() with ext4_iget(), and call that instead of iget(). ext4_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. ext4_fill_super() returns any error incurred when getting the root inode instead of EINVAL. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: N"Theodore Ts'o" <tytso@mit.edu> Acked-by: NJan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 1月, 2008 18 次提交
-
-
由 Alex Tomas 提交于
Signed-off-by: NAlex Tomas <alex@clusterfs.com> Signed-off-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
This function is used by the ext4 multi block allocator patches. Also add generic_find_next_le_bit Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Aneesh Kumar K.V 提交于
The below patch add ioctl for migrating ext3 indirect block mapped inode to ext4 extent mapped inode. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Jean Noel Cordenner 提交于
This patch adds 64-bit inode version support to ext4. The lower 32 bits are stored in the osd1.linux1.l_i_version field while the high 32 bits are stored in the i_version_hi field newly created in the ext4_inode. This field is incremented in case the ext4_inode is large enough. A i_version mount option has been added to enable the feature. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NKalpak Shah <kalpak@clusterfs.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NJean Noel Cordenner <jean-noel.cordenner@bull.net>
-
由 Girish Shilamkar 提交于
The journal checksum feature adds two new flags i.e JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT and JBD2_FEATURE_COMPAT_CHECKSUM. JBD2_FEATURE_CHECKSUM flag indicates that the commit block contains the checksum for the blocks described by the descriptor blocks. Due to checksums, writing of the commit record no longer needs to be synchronous. Now commit record can be sent to disk without waiting for descriptor blocks to be written to disk. This behavior is controlled using JBD2_FEATURE_ASYNC_COMMIT flag. Older kernels/e2fsck should not be able to recover the journal with _ASYNC_COMMIT hence it is made incompat. The commit header has been extended to hold the checksum along with the type of the checksum. For recovery in pass scan checksums are verified to ensure the sanity and completeness(in case of _ASYNC_COMMIT) of every transaction. Signed-off-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NGirish Shilamkar <girish@clusterfs.com> Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
由 Aneesh Kumar K.V 提交于
We are currently taking the truncate_mutex for every read. This would have performance impact on large CPU configuration. Convert the lock to read write semaphore and take read lock when we are trying to read the file. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
When doing a migrate from ext3 to ext4 inode we need to make sure the test for inode type and walking inode data happens inside lock. To make this happen move truncate_mutex early before checking the i_flags. This actually should enable us to remove the verify_chain(). Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Coly Li 提交于
This patch extends bg_itable_unused of ext4 group descriptor from 16bit into 32bit. In order to add bg_itable_unused_hi into struct ext4_group_desc, some extra fields which are already introduced into e2fsprogs are also added in for consistency. Signed-off-by: NColy Li <coyli@suse.de> Cc: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
由 Aneesh Kumar K.V 提交于
This patch converts ext4_inode i_blocks to represent total blocks occupied by the inode in file system block size. Earlier the variable used to represent this in 512 byte block size. This actually limited the total size of the file. The feature is enabled transparently when we write an inode whose i_blocks cannot be represnted as 512 byte units in a 48 bit variable. inode flag EXT4_HUGE_FILE_FL Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Use the __le16 l_i_reserved1 field of the linux2 struct of ext4_inode to represet the higher 16 bits for i_blocks. With this change max_file size becomes (2**48 -1 )* 512 bytes. We add a RO_COMPAT feature to the super block to indicate that inode have i_blocks represented as a split 48 bits. Super block with this feature set cannot be mounted read write on a kernel with CONFIG_LSF disabled. Super block flag EXT4_FEATURE_RO_COMPAT_HUGE_FILE Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Rename ext4_inode.i_dir_acl to i_size_high drop ext4_inode_info.i_dir_acl as it is not used Rename ext4_inode.i_size to ext4_inode.i_size_lo Add helper function for accessing the ext4_inode combined i_size. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Rename i_file_acl to i_file_acl_lo. This helps in finding bugs where we use i_file_acl instead of the combined i_file_acl_lo and i_file_acl_high Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Fix sparse warnings related to static functions and local variables. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Introduce ext4_update_*_feature and use them instead of opencoding. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Avantika Mathur 提交于
In many places variables for block group are of type int, which limits the maximum number of block groups to 2^31. Each block group can have up to 2^15 blocks, with a 4K block size, and the max filesystem size is limited to 2^31 * (2^15 * 2^12) = 2^58 -- or 256 PB This patch introduces a new type ext4_group_t, of type unsigned long, to represent block group numbers in ext4. All occurrences of block group variables are converted to type ext4_group_t. Signed-off-by: NAvantika Mathur <mathur@us.ibm.com>
-
由 Aneesh Kumar K.V 提交于
This patch adds a new data type ext4_lblk_t to represent the logical file blocks. This is the preparatory patch to support large files in ext4 The follow up patch with convert the ext4_inode i_blocks to represent the number of blocks in file system block size. This changes makes it possible to have a block number 2**32 -1 which will result in overflow if the block number is represented by signed long. This patch convert all the block number to type ext4_lblk_t which is typedef to __u32 Also remove dead code ext4_ext_walk_space Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com>
-
由 Jan Kara 提交于
With 64KB blocksize, a directory entry can have size 64KB which does not fit into 16 bits we have for entry lenght. So we store 0xffff instead and convert value when read from / written to disk. The patch also converts some places to use ext4_next_entry() when we are changing them anyway. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
由 Takashi Sato 提交于
This patch set supports large block size(>4k, <=64k) in ext4, just enlarging the block size limit. But it is NOT possible to have 64kB blocksize on ext4 without some changes to the directory handling code. The reason is that an empty 64kB directory block would have a rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in the filesystem. The proposed solution is treat 64k rec_len with a an impossible value like rec_len = 0xffff to handle this. The Patch-set consists of the following 2 patches. [1/2] ext4: enlarge blocksize - Allow blocksize up to pagesize [2/2] ext4: fix rec_len overflow - prevent rec_len from overflow with 64KB blocksize Now on 64k page ppc64 box runs with this patch set we could create a 64k block size ext4dev, and able to handle empty directory block. Signed-off-by: NTakashi Sato <sho@tnes.nec.co.jp> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
- 18 10月, 2007 9 次提交
-
-
由 Aneesh Kumar K.V 提交于
Convert s_r_blocks_count and s_free_blocks_count to s_r_blocks_count_lo and s_free_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
Convert s_blocks_count to s_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Convert bg_inode_bitmap and bg_inode_table to bg_inode_bitmap_lo and bg_inode_table_lo. This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix one direct partial access Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Aneesh Kumar K.V 提交于
Convert bg_block_bitmap to bg_block_bitmap_lo This helps in catching some BUGS due to direct partial access of these split fields. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Jose R. Santos 提交于
This feature relaxes check restrictions on where each block groups meta data is located within the storage media. This allows for the allocation of bitmaps or inode tables outside the block group boundaries in cases where bad blocks forces us to look for new blocks which the owning block group can not satisfy. This will also allow for new meta-data allocation schemes to improve performance and scalability. Signed-off-by: NJose R. Santos <jrs@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Aneesh Kumar K.V 提交于
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Andreas Dilger 提交于
In pass1 of e2fsck, every inode table in the fileystem is scanned and checked, regardless of whether it is in use. This is this the most time consuming part of the filesystem check. The unintialized block group feature can greatly reduce e2fsck time by eliminating checking of uninitialized inodes. With this feature, there is a a high water mark of used inodes for each block group. Block and inode bitmaps can be uninitialized on disk via a flag in the group descriptor to avoid reading or scanning them at e2fsck time. A checksum of each group descriptor is used to ensure that corruption in the group descriptor's bit flags does not cause incorrect operation. The feature is enabled through a mkfs option mke2fs /dev/ -O uninit_groups A patch adding support for uninitialized block groups to e2fsprogs tools has been posted to the linux-ext4 mailing list. The patches have been stress tested with fsstress and fsx. In performance tests testing e2fsck time, we have seen that e2fsck time on ext3 grows linearly with the total number of inodes in the filesytem. In ext4 with the uninitialized block groups feature, the e2fsck time is constant, based solely on the number of used inodes rather than the total inode count. Since typical ext4 filesystems only use 1-10% of their inodes, this feature can greatly reduce e2fsck time for users. With performance improvement of 2-20 times, depending on how full the filesystem is. The attached graph shows the major improvements in e2fsck times in filesystems with a large total inode count, but few inodes in use. In each group descriptor if we have EXT4_BG_INODE_UNINIT set in bg_flags: Inode table is not initialized/used in this group. So we can skip the consistency check during fsck. EXT4_BG_BLOCK_UNINIT set in bg_flags: No block in the group is used. So we can skip the block bitmap verification for this group. We also add two new fields to group descriptor as a part of uninitialized group patch. __le16 bg_itable_unused; /* Unused inodes count */ __le16 bg_checksum; /* crc16(sb_uuid+group+desc) */ bg_itable_unused: If we have EXT4_BG_INODE_UNINIT not set in bg_flags then bg_itable_unused will give the offset within the inode table till the inodes are used. This can be used by fsck to skip list of inodes that are marked unused. bg_checksum: Now that we depend on bg_flags and bg_itable_unused to determine the block and inode usage, we need to make sure group descriptor is not corrupt. We add checksum to group descriptor to detect corruption. If the descriptor is found to be corrupt, we mark all the blocks and inodes in the group used. Signed-off-by: NAvantika Mathur <mathur@us.ibm.com> Signed-off-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Eric Sandeen 提交于
CONFIG_EXT4_INDEX is not an exposed config option in the kernel, and it is unconditionally defined in ext4_fs.h. tune2fs is already able to turn off dir indexing, so at this point it's just cluttering up the code. Remove it. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Coly Li 提交于
Fragment support in ext2/3/4 was never implemented, and it probably will never be implemented. So remove it from ext4. Signed-off-by: NColy Li <coyli@suse.de> Acked-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 18 7月, 2007 3 次提交
-
-
由 Andreas Dilger 提交于
This patch adds support to ext4 for allowing more than 65000 subdirectories. Currently the maximum number of subdirectories is capped at 32000. If we exceed 65000 subdirectories in an htree directory it sets the inode link count to 1 and no longer counts subdirectories. The directory link count is not actually used when determining if a directory is empty, as that only counts subdirectories and not regular files that might be in there. A EXT4_FEATURE_RO_COMPAT_DIR_NLINK flag has been added and it is set if the subdir count for any directory crosses 65000. A later fsck will clear EXT4_FEATURE_RO_COMPAT_DIR_NLINK if there are no longer any directory with >65000 subdirs. Signed-off-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NKalpak Shah <kalpak@clusterfs.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Kalpak Shah 提交于
We need to make sure that existing ext3 filesystems can also avail the new fields that have been added to the ext4 inode. We use s_want_extra_isize and s_min_extra_isize to decide by how much we should expand the inode. If EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature is set then we expand the inode by max(s_want_extra_isize, s_min_extra_isize , sizeof(ext4_inode) - EXT4_GOOD_OLD_INODE_SIZE) bytes. Actually it is still an open question about whether users should be able to set s_*_extra_isize smaller than the known fields or not. This patch also adds the functionality to expand inodes to include the newly added fields. We start by trying to expand by s_want_extra_isize bytes and if its fails we try to expand by s_min_extra_isize bytes. This is done by changing the i_extra_isize if enough space is available in the inode and no EAs are present. If EAs are present and there is enough space in the inode then the EAs in the inode are shifted to make space. If enough space is not available in the inode due to the EAs then 1 or more EAs are shifted to the external EA block. In the worst case when even the external EA block does not have enough space we inform the user that some EA would need to be deleted or s_min_extra_isize would have to be reduced. Signed-off-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NKalpak Shah <kalpak@clusterfs.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Kalpak Shah 提交于
This patch adds nanosecond timestamps for ext4. This involves adding *time_extra fields to the ext4_inode to extend the timestamps to 64-bits. Creation time is also added by this patch. These extended fields will fit into an inode if the filesystem was formatted with large inodes (-I 256 or larger) and there are currently no EAs consuming all of the available space. For new inodes we always reserve enough space for the kernel's known extended fields, but for inodes created with an old kernel this might not have been the case. So this patch also adds the EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature flag(ro-compat so that older kernels can't create inodes with a smaller extra_isize). which indicates if the fields fitting inside s_min_extra_isize are available or not. If the expansion of inodes if unsuccessful then this feature will be disabled. This feature is only enabled if requested by the sysadmin. None of the extended inode fields is critical for correct filesystem operation. Signed-off-by: NAndreas Dilger <adilger@clusterfs.com> Signed-off-by: NKalpak Shah <kalpak@clusterfs.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-