提交 · f3bd1f3fa8ca7ec70cfd87aa94dc5e1a260901f2 · openeuler / raspberrypi-kernel

20 8月, 2008 2 次提交

ext4: journal credits reservation fixes for DIO, fallocate · f3bd1f3f

由 Mingming Cao 提交于 8月 19, 2008

DIO and fallocate credit calculation is different than writepage, as
they do start a new journal right for each call to ext4_get_blocks_wrap().
This patch uses the helper function in DIO and fallocate case, passing
a flag indicating that the modified data are contigous thus could account
less indirect/index blocks.

This patch also fixed the journal credit reservation for direct I/O
(DIO).  Previously the estimated credits for DIO only was calculated for
non-extent files, which was not enough if the file is extent-based.

Also fixed was fallocate double-counting credits for modifying the the
superblock.
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f3bd1f3f

ext4: journal credits calulation cleanup and fix for non-extent writepage · a02908f1

由 Mingming Cao 提交于 8月 19, 2008

When considering how many journal credits are needed for modifying a
chunk of data, we need to account for the super block, inode block,
quota blocks and xattr block, indirect/index blocks, also, group bitmap
and group descriptor blocks for new allocation (including data and
indirect/index blocks). There are many places in ext4 do the calculation
on their own and often missed one or two meta blocks, and often they
assume single block allocation, and did not considering the multile
chunk of allocation case.

This patch is trying to cleanup current journal credit code, provides
some common helper funtion to calculate the journal credits, to be used
for writepage, writepages, DIO, fallocate, migration, defrag, and for
both nonextent and extent files.

This patch modified the writepage/write_begin credit caculation for
nonextent files, to use the new helper function. It also fixed the
problem that writepage on nonextent files did not consider the case
blocksize <pagesize, thus could possibelly need multiple block
allocation in a single transaction.
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

a02908f1

18 7月, 2008 1 次提交

ext4: Cleanup the block reservation code path · 12219aea

由 Aneesh Kumar K.V 提交于 7月 17, 2008

The truncate patch should not use the i_allocated_meta_blocks
value. So add seperate functions to be used in the truncate
and alloc path. We also need to release the meta-data block
that we reserved for the blocks that we are truncating.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

12219aea

12 7月, 2008 1 次提交

ext4: delayed allocation i_blocks fix for stat · 3e3398a0

由 Mingming Cao 提交于 7月 11, 2008

Right now i_blocks is not getting updated until the blocks are actually
allocaed on disk.  This means with delayed allocation, right after files
are copied, "ls -sF" shoes the file as taking 0 blocks on disk.  "du"
also shows the files taking zero space, which is highly confusing to the
user.

Since delayed allocation already keeps track of per-inode total
number of blocks that are subject to delayed allocation, this patch fix
this by using that to adjust the value returned by stat(2). When real
block allocation is done, the i_blocks will get updated. Since the
reserved blocks for delayed allocation will be decreased, this will be
keep value returned by stat(2) consistent.
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3e3398a0

15 7月, 2008 1 次提交

ext4: delayed allocation ENOSPC handling · d2a17637

由 Mingming Cao 提交于 7月 14, 2008

This patch does block reservation for delayed
allocation, to avoid ENOSPC later at page flush time.

Blocks(data and metadata) are reserved at da_write_begin()
time, the freeblocks counter is updated by then, and the number of
reserved blocks is store in per inode counter.
        
At the writepage time, the unused reserved meta blocks are returned
back. At unlink/truncate time, reserved blocks are properly released.

Updated fix from  Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
to fix the oldallocator block reservation accounting with delalloc, added
lock to guard the counters and also fix the reservation for meta blocks.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

d2a17637

12 7月, 2008 8 次提交

ext4: Add delayed allocation support in data=writeback mode · 64769240

由 Alex Tomas 提交于 7月 11, 2008

Updated with fixes from Mingming Cao <cmm@us.ibm.com> to unlock and
release the page from page cache if the delalloc write_begin failed, and
properly handle preallocated blocks.  Also added a fix to clear
buffer_delay in block_write_full_page() after allocating a delayed
buffer.

Updated with fixes from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
to update i_disksize properly and to add bmap support for delayed
allocation.

Updated with a fix from Valerie Clement <valerie.clement@bull.net> to
avoid filesystem corruption when the filesystem is mounted with the
delalloc option and blocksize < pagesize.
Signed-off-by: NAlex Tomas <alex@clusterfs.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

64769240

ext4: Invert the locking order of page_lock and transaction start · cf108bca

由 Jan Kara 提交于 7月 11, 2008

This changes are needed to support data=ordered mode handling via
inodes. This enables us to get rid of the journal heads and buffer
heads for data buffers in the ordered mode. With the changes, during
tranasaction commit we writeout the inode pages using the
writepages()/writepage(). That implies we take page lock during
transaction commit. This can cause a deadlock with the locking order
page_lock -> jbd2_journal_start, since the jbd2_journal_start can wait
for the journal_commit to happen and the journal_commit now needs to
take the page lock. To avoid this dead lock reverse the locking order.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

cf108bca

ext4: Use page_mkwrite vma_operations to get mmap write notification. · 2e9ee850

由 Aneesh Kumar K.V 提交于 7月 11, 2008

We would like to get notified when we are doing a write on mmap section.
This is needed with respect to preallocated area. We split the preallocated
area into initialzed extent and uninitialzed extent in the call back. This
let us handle ENOSPC better. Otherwise we get ENOSPC in the writepage and
that would result in data loss. The changes are also needed to handle ENOSPC
when writing to an mmap section of files with holes.
Acked-by: NJan Kara <jack@suse.cz>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

2e9ee850

ext4: fix online resize with mballoc · 5f21b0e6

由 Frederic Bohe 提交于 7月 11, 2008

Update group infos when updating a group's descriptor.
Add group infos when adding a group's descriptor.
Refresh cache pages used by mb_alloc when changes occur.
This will probably need modifications when META_BG resizing will be allowed.
Signed-off-by: NFrederic Bohe <frederic.bohe@bull.net>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>

5f21b0e6

ext4: mballoc avoid use root reserved blocks for non root allocation · 07031431

由 Mingming Cao 提交于 7月 11, 2008

mballoc allocation missed check for blocks reserved for root users. Add
ext4_has_free_blocks() check before allocation. Also modified
ext4_has_free_blocks() to support multiple block allocation request.
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

07031431

ext4: cleanup block allocator · 654b4908

由 Aneesh Kumar K.V 提交于 7月 11, 2008

Move the code related to block allocation to a single function and add helper
funtions to differient allocation for data and meta data blocks
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

654b4908

ext4: Use inode preallocation with -o noextents · 7061eba7

由 Aneesh Kumar K.V 提交于 7月 11, 2008

When mballoc is enabled, block allocation for old block-based
files are allocated using mballoc allocator instead of old
block-based allocator. The old ext3 block reservation is turned
off when mballoc is turned on.

However, the in-core preallocation is not enabled for block-based/
non-extent based file block allocation. This result in performance
regression, as now we don't have "reservation" ore in-core preallocation
to prevent interleaved fragmentation in multiple writes workload.

This patch fix this by enable per inode in-core preallocation
for non extent files when mballoc is used.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

7061eba7

ext4: New inode allocation for FLEX_BG meta-data groups. · 772cb7c8

由 Jose R. Santos 提交于 7月 11, 2008

This patch mostly controls the way inode are allocated in order to
make ialloc aware of flex_bg block group grouping.  It achieves this
by bypassing the Orlov allocator when block group meta-data are packed
toghether through mke2fs.  Since the impact on the block allocator is
minimal, this patch should have little or no effect on other block
allocation algorithms. By controlling the inode allocation, it can
basically control where the initial search for new block begins and
thus indirectly manipulate the block allocator.

This allocator favors data and meta-data locality so the disk will
gradually be filled from block group zero upward.  This helps improve
performance by reducing seek time.  Since the group of inode tables
within one flex_bg are treated as one giant inode table, uninitialized
block groups would not need to partially initialize as many inode
table as with Orlov which would help fsck time as the filesystem usage
goes up.
Signed-off-by: NJose R. Santos <jrs@us.ibm.com>
Signed-off-by: NValerie Clement <valerie.clement@bull.net>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

772cb7c8

14 7月, 2008 1 次提交

ext4: replace __FUNCTION__ occurrences · 4db9c54a

由 Stoyan Gaydarov 提交于 7月 13, 2008

__FUNCTION__ is gcc-specific, use __func__ instead
Signed-off-by: NStoyan Gaydarov <stoyboyker@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

4db9c54a

12 7月, 2008 2 次提交

ext4: fix comments to say "ext4" · 8a35694e

由 Shen Feng 提交于 7月 11, 2008

Change second/third to fourth.
Signed-off-by: NShen Feng <shen@cn.fujitsu.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

8a35694e

ext4: handle corrupted orphan list at mount · 91ef4caf

由 Duane Griffin 提交于 7月 11, 2008

If the orphan node list includes valid, untruncatable nodes with nlink > 0
the ext4_orphan_cleanup loop which attempts to delete them will not do so,
causing it to loop forever. Fix by checking for such nodes in the
ext4_orphan_get function.

This patch fixes the second case (image hdb.20000009.softlockup.gz)
reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.
Signed-off-by: NDuane Griffin <duaneg@dghda.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

91ef4caf

30 4月, 2008 1 次提交

ext4: move headers out of include/linux · 3dcf5451

由 Christoph Hellwig 提交于 4月 29, 2008

Move ext4 headers out of include/linux.  This is just the trivial move,
there's some more thing that could be done later. 
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3dcf5451

29 4月, 2008 1 次提交

ext4: remove duplicate include of ext4_fs_i.h header file · 418f6e9e

由 Joe Perches 提交于 4月 29, 2008

include/linux/ext4_fs_i.h is included in include/linux/ext_fs.h twice
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

418f6e9e

30 4月, 2008 1 次提交

Convert ext4 to use unlocked_ioctl · 5cdd7b2d

由 Andi Kleen 提交于 4月 29, 2008

I checked ext4_ioctl and it looked largely safe to not be used
without BKL.  So convert it over to unlocked_ioctl.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

5cdd7b2d

29 4月, 2008 1 次提交

ext4: Fix race between migration and mmap write · 267e4db9

由 Aneesh Kumar K.V 提交于 4月 29, 2008

Fail migrate if we allocated new blocks via mmap write.

If we write to holes in the file via mmap, we end up allocating
new blocks. This block allocation happens without taking inode->i_mutex.
Since migrate is protected by i_mutex and migrate expects that no
new blocks get allocated during migrate, fail migrate if new blocks
get allocated.

We can't take inode->i_mutex in the mmap write path because that
would result in a locking order violation between i_mutex and mmap_sem.
Also adding a separate rw_sempahore for protection is really high overhead
for a rare operation such as migrate.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

267e4db9

10 2月, 2008 1 次提交

ext4: Add new "development flag" to the ext4 filesystem · 469108ff

由 Theodore Tso 提交于 2月 10, 2008

This flag is simply a generic "this is a crash/burn test filesystem"
marker.  If it is set, then filesystem code which is "in development"
will be allowed to mount the filesystem.  Filesystem code which is not
considered ready for prime-time will check for this flag, and if it is
not set, it will refuse to touch the filesystem.

As we start rolling ext4 out to distro's like Fedora, et. al, this makes
it less likely that a user might accidentally start using ext4 on a
production filesystem; a bad thing, since that will essentially make it
be unfsckable until e2fsprogs catches up.
Signed-off-by: NTheodore Tso <tytso@MIT.EDU>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>

469108ff

08 2月, 2008 1 次提交

iget: stop EXT4 from using iget() and read_inode() · 1d1fe1ee

由 David Howells 提交于 2月 07, 2008

Stop the EXT4 filesystem from using iget() and read_inode().  Replace
ext4_read_inode() with ext4_iget(), and call that instead of iget().
ext4_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

ext4_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: N"Theodore Ts'o" <tytso@mit.edu>
Acked-by: NJan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Acked-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1d1fe1ee

29 1月, 2008 18 次提交

ext4: Add multi block allocator for ext4 · c9de560d

由 Alex Tomas 提交于 1月 29, 2008

Signed-off-by: NAlex Tomas <alex@clusterfs.com>
Signed-off-by: NAndreas Dilger <adilger@clusterfs.com>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

c9de560d

ext4: Add ext4_find_next_bit() · aa02ad67

由 Aneesh Kumar K.V 提交于 1月 28, 2008

This function is used by the ext4 multi block allocator patches.

Also add generic_find_next_le_bit
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>

aa02ad67

ext4: Add EXT4_IOC_MIGRATE ioctl · c14c6fd5

由 Aneesh Kumar K.V 提交于 1月 28, 2008

The below patch add ioctl for migrating ext3 indirect block mapped inode
to ext4 extent mapped inode.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

c14c6fd5

ext4: Add inode version support in ext4 · 25ec56b5

由 Jean Noel Cordenner 提交于 1月 28, 2008

This patch adds 64-bit inode version support to ext4. The lower 32 bits
are stored in the osd1.linux1.l_i_version field while the high 32 bits
are stored in the i_version_hi field newly created in the ext4_inode.
This field is incremented in case the ext4_inode is large enough. A
i_version mount option has been added to enable the feature.
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: NAndreas Dilger <adilger@clusterfs.com>
Signed-off-by: NKalpak Shah <kalpak@clusterfs.com>
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NJean Noel Cordenner <jean-noel.cordenner@bull.net>

25ec56b5

ext4: Add the journal checksum feature · 818d276c

由 Girish Shilamkar 提交于 1月 28, 2008

The journal checksum feature adds two new flags i.e
JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT and JBD2_FEATURE_COMPAT_CHECKSUM.

JBD2_FEATURE_CHECKSUM flag indicates that the commit block contains the
checksum for the blocks described by the descriptor blocks.
Due to checksums, writing of the commit record no longer needs to be
synchronous. Now commit record can be sent to disk without waiting for
descriptor blocks to be written to disk. This behavior is controlled
using JBD2_FEATURE_ASYNC_COMMIT flag. Older kernels/e2fsck should not be
able to recover the journal with _ASYNC_COMMIT hence it is made
incompat.
The commit header has been extended to hold the checksum along with the
type of the checksum.

For recovery in pass scan checksums are verified to ensure the sanity
and completeness(in case of _ASYNC_COMMIT) of every transaction.
Signed-off-by: NAndreas Dilger <adilger@clusterfs.com>
Signed-off-by: NGirish Shilamkar <girish@clusterfs.com>
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>

818d276c

ext4: Convert truncate_mutex to read write semaphore. · 0e855ac8

由 Aneesh Kumar K.V 提交于 1月 28, 2008

We are currently taking the truncate_mutex for every read. This would have
performance impact on large CPU configuration. Convert the lock to read write
semaphore and take read lock when we are trying to read the file.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

0e855ac8

ext4: Make ext4_get_blocks_wrap take the truncate_mutex early. · c278bfec

由 Aneesh Kumar K.V 提交于 1月 28, 2008

When doing a migrate from ext3 to ext4 inode we need to make sure the test
for inode type and walking inode data happens inside lock. To make this
happen move truncate_mutex early before checking the i_flags.

This actually should enable us to remove the verify_chain().
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

c278bfec

ext4: sync up block group descriptor with e2fsprogs. · 91b51a01

由 Coly Li 提交于 1月 28, 2008

This patch extends bg_itable_unused of ext4 group descriptor
from 16bit into 32bit. In order to add bg_itable_unused_hi into
struct ext4_group_desc, some extra fields which are already introduced into
e2fsprogs are also added in for consistency.
Signed-off-by: NColy Li <coyli@suse.de>
Cc: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>

91b51a01

ext4: Support large files · 8180a562

由 Aneesh Kumar K.V 提交于 1月 28, 2008

This patch converts ext4_inode i_blocks to represent total
blocks occupied by the inode in file system block size.
Earlier the variable used to represent this in 512 byte
block size. This actually limited the total size of the file.

The feature is enabled transparently when we write an inode
whose i_blocks cannot be represnted as 512 byte units in a
48 bit variable.

inode flag  EXT4_HUGE_FILE_FL
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

8180a562

ext4: Add support for 48 bit inode i_blocks. · 0fc1b451

由 Aneesh Kumar K.V 提交于 1月 28, 2008

Use the __le16 l_i_reserved1 field of the linux2 struct of ext4_inode
to represet the higher 16 bits for i_blocks. With this change max_file
size becomes (2**48 -1 )* 512 bytes.

We add a RO_COMPAT feature to the super block to indicate that inode
have i_blocks represented as a split 48 bits. Super block with this
feature set cannot be mounted read write on a kernel with CONFIG_LSF
disabled.

Super block flag EXT4_FEATURE_RO_COMPAT_HUGE_FILE
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

0fc1b451

ext4: Rename i_dir_acl to i_size_high · a48380f7

由 Aneesh Kumar K.V 提交于 1月 28, 2008

Rename ext4_inode.i_dir_acl to i_size_high
drop ext4_inode_info.i_dir_acl as it is not used
Rename ext4_inode.i_size to ext4_inode.i_size_lo
Add helper function for accessing the ext4_inode combined i_size.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

a48380f7

ext4: Rename i_file_acl to i_file_acl_lo · 7973c0c1

由 Aneesh Kumar K.V 提交于 1月 28, 2008

Rename i_file_acl to i_file_acl_lo. This helps
in finding bugs where we use i_file_acl instead
of the combined i_file_acl_lo and i_file_acl_high
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

7973c0c1

ext4: Fix sparse warnings. · 1d03ec98

由 Aneesh Kumar K.V 提交于 1月 28, 2008

Fix sparse warnings related to static functions
and local variables.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

1d03ec98

ext4: Introduce ext4_update_*_feature · 99e6f829

由 Aneesh Kumar K.V 提交于 1月 28, 2008

Introduce ext4_update_*_feature and use them instead
of opencoding.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

99e6f829

ext4: add ext4_group_t, and change all group variables to this type. · fd2d4291

由 Avantika Mathur 提交于 1月 28, 2008

In many places variables for block group are of type int, which limits the
maximum number of block groups to 2^31. Each block group can have up to
2^15 blocks, with a 4K block size, and the max filesystem size is limited to
2^31 * (2^15 * 2^12) = 2^58 -- or 256 PB

This patch introduces a new type ext4_group_t, of type unsigned long, to
represent block group numbers in ext4.
All occurrences of block group variables are converted to type ext4_group_t.
Signed-off-by: NAvantika Mathur <mathur@us.ibm.com>

fd2d4291

ext4: Introduce ext4_lblk_t · 725d26d3

由 Aneesh Kumar K.V 提交于 1月 28, 2008

This patch adds a new data type ext4_lblk_t to represent
the logical file blocks.

This is the preparatory patch to support large files in ext4
The follow up patch with convert the ext4_inode i_blocks to
represent the number of blocks in file system block size. This
changes makes it possible to have a block number 2**32 -1 which
will result in overflow if the block number is represented by
signed long. This patch convert all the block number to type
ext4_lblk_t which is typedef to __u32

Also remove dead code ext4_ext_walk_space
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>
Signed-off-by: NEric Sandeen <sandeen@redhat.com>

725d26d3

ext4: Avoid rec_len overflow with 64KB block size · a72d7f83

由 Jan Kara 提交于 1月 28, 2008

With 64KB blocksize, a directory entry can have size 64KB which does not fit
into 16 bits we have for entry lenght. So we store 0xffff instead and convert
value when read from / written to disk. The patch also converts some places
to use ext4_next_entry() when we are changing them anyway.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>

a72d7f83

ext4: Support large blocksize up to PAGESIZE · afc7cbca

由 Takashi Sato 提交于 1月 28, 2008

This patch set supports large block size(>4k, <=64k) in ext4,
just enlarging the block size limit. But it is NOT possible to have 64kB
blocksize on ext4 without some changes to the directory handling
code.  The reason is that an empty 64kB directory block would have a
rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in
the filesystem.  The proposed solution is treat 64k rec_len
with a an impossible value like rec_len = 0xffff to handle this.

The Patch-set consists of the following 2 patches.
  [1/2]  ext4: enlarge blocksize
         - Allow blocksize up to pagesize

  [2/2]  ext4: fix rec_len overflow
         - prevent rec_len from overflow with 64KB blocksize

Now on 64k page ppc64 box runs with this patch set we could create a 64k
block size ext4dev, and able to handle empty directory block.
Signed-off-by: NTakashi Sato <sho@tnes.nec.co.jp>
Signed-off-by: NMingming Cao <cmm@us.ibm.com>

afc7cbca