Commits · 8df9675f8b498d0bfa1f0b5b06f56bf1ff366dd5 · openanolis / cloud-kernel

01 May, 2009 1 commit

ext4: Avoid races caused by on-line resizing and SMP memory reordering · 8df9675f

Authored by Theodore Ts'o on May 01, 2009

Ext4's on-line resizing adds a new block group and then, only at the
last step, adjusts s_groups_count. However, it's possible on SMP
systems that another CPU could see the updated s_groups_count and
not see the newly initialized data structures for the just-added block
group. For this reason, it's important to insert an SMP read barrier
after reading s_groups_count and before reading, for example, any of
the new block group descriptors allowed by the increased value of
s_groups_count.
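
The ordering requirement described above can be illustrated with a minimal userspace sketch. It uses C11 release/acquire atomics as a stand-in for the kernel's write and read barriers; the names `group_desc`, `descs`, `groups_count`, `add_group()` and `get_group()` are hypothetical and are not taken from the ext4 sources. It assumes a single resizer at a time.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_GROUPS 8                /* hypothetical fixed capacity */

struct group_desc {                 /* hypothetical stand-in for a block group descriptor */
    unsigned free_blocks;
};

static struct group_desc descs[MAX_GROUPS];
static atomic_uint groups_count;    /* plays the role of s_groups_count */

/* Resizer: initialize the new descriptor first, publish the count last. */
static int add_group(unsigned free_blocks)
{
    unsigned n = atomic_load_explicit(&groups_count, memory_order_relaxed);

    if (n >= MAX_GROUPS)
        return -1;
    descs[n].free_blocks = free_blocks;
    /* Release: the descriptor write above is ordered before the new count. */
    atomic_store_explicit(&groups_count, n + 1, memory_order_release);
    return 0;
}

/* Reader: load the count with acquire semantics before touching descs[]. */
static const struct group_desc *get_group(unsigned group)
{
    unsigned n = atomic_load_explicit(&groups_count, memory_order_acquire);

    return group < n ? &descs[group] : NULL;
}
```

A reader that observes the incremented count through the acquire load is then guaranteed to also observe the initialized descriptor, which is the property the read barrier is meant to provide.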

Unfortunately, we rather blatantly violate this locking protocol
documented in fs/ext4/resize.c. Fortunately, (1) on-line resizes
happen relatively rarely, and (2) it seems rare that the filesystem
code will immediately try to use a just-added block group before any
memory ordering issues resolve themselves. So apparently problems
here are relatively hard to hit, since ext3 has been vulnerable to the
same issue for years with no one apparently complaining.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8df9675f

02 May, 2009 1 commit

ext4: Use separate super_operations structure for no_journal filesystems · 9ca92389

Authored by Theodore Ts'o on May 01, 2009

By using a separate super_operations structure for filesystems that
have and don't have journals, we can simplify ext4_write_super() ---
which is only needed when no journal is present --- and ext4_freeze(),
ext4_unfreeze(), and ext4_sync_fs(), which are only needed when the
journal is present.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9ca92389

01 May, 2009 2 commits

ext4: Fix and simplify s_dirt handling · 7234ab2a

Authored by Theodore Ts'o on Apr 30, 2009

The s_dirt flag wasn't completely handled correctly, but it didn't
really matter when journalling was enabled. It turns out that when
ext4 runs without a journal, we don't clear s_dirt in places where we
should have, with the result that the high-level write_super()
function was writing the superblock when it wasn't necessary.

So we fix this by making ext4_commit_super() clear the s_dirt flag,
and removing many of the other places where s_dirt is manipulated.
When journalling is enabled, the s_dirt flag might be left set more
often, but s_dirt really doesn't matter when journalling is enabled.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7234ab2a

ext4: Simplify ext4_commit_super()'s function signature · e2d67052

Authored by Theodore Ts'o on May 01, 2009

The ext4_commit_super() function took both a struct super_block * and
a struct ext4_super_block *, but the struct ext4_super_block can be
derived from the struct super_block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e2d67052

25 Apr, 2009 1 commit

ext4: Use is_power_of_2() for clarity · f7c43950

Authored by Theodore Ts'o on Apr 24, 2009

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f7c43950

28 Apr, 2009 1 commit

ext4: Fallback to vmalloc if kmalloc can't allocate s_flex_groups array · c5ca7c76

Authored by Theodore Ts'o on Apr 27, 2009

For very large filesystems, the s_flex_groups array can get quite big.
For example, a filesystem that can be resized up to 16TB will have
8192 flex groups (assuming the default flex_bg size of 16), so the
array is 96k, which is *very* marginal for kmalloc(). On the other
hand, a 160GB filesystem without the resize_inode feature will only
require 960 bytes. So we try to allocate the array first using
kmalloc(), and if that fails, we'll try to use vmalloc() instead.
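
The two sizes quoted above can be reproduced with a small calculation. This is a sketch under stated assumptions: 128 MiB block groups (4 KiB blocks, 32768 blocks per group) and a 12-byte array entry, the latter inferred from 96k / 8192 rather than taken from the source. The function name `flex_groups_array_bytes()` is hypothetical.

```c
#include <stdint.h>

/*
 * Estimate the size of the flex group array for a filesystem of
 * fs_bytes, rounding up at each step.
 *   group_bytes  - bytes covered by one block group (assumed 128 MiB)
 *   flex_bg_size - block groups per flex group (default 16 per the text)
 *   entry_bytes  - size of one array entry (assumed 12 bytes)
 */
static uint64_t flex_groups_array_bytes(uint64_t fs_bytes, uint64_t group_bytes,
                                        unsigned flex_bg_size, unsigned entry_bytes)
{
    uint64_t groups = (fs_bytes + group_bytes - 1) / group_bytes;
    uint64_t flex_groups = (groups + flex_bg_size - 1) / flex_bg_size;

    return flex_groups * entry_bytes;
}
```

With these assumptions, a 16 TiB limit gives 131072 block groups, 8192 flex groups and 98304 bytes (96 KiB), while 160 GiB gives 1280 block groups, 80 flex groups and 960 bytes.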
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c5ca7c76

13 May, 2009 1 commit

ext4: Mark the unwritten buffer_head as mapped during write_begin · 29fa89d0

Authored by Aneesh Kumar K.V on May 12, 2009

Setting BH_Unwritten buffer_heads as BH_Mapped avoids multiple
(unnecessary) calls to get_block() during the call to the write(2)
system call.  Setting BH_Unwritten buffer heads as BH_Mapped requires
that the writepages() functions can handle BH_Unwritten buffer_heads.

After this commit, things work as follows:

ext4_ext_get_block() returns an unmapped, unwritten buffer_head when
called with create = 0 for prealloc space. This makes sure we handle
the read path and non-delayed allocation case correctly.  Even though
the buffer head is marked unmapped we have valid b_blocknr and b_bdev
values in the buffer_head.

ext4_da_get_block_prep() called for block reservation will now return
a mapped, unwritten, new buffer_head for prealloc space. This avoids
multiple calls to get_block() for writes to the same offset. By marking
such buffers as BH_New, we also ensure that sub-block zeroing of buffered
writes happens correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

29fa89d0

14 May, 2009 1 commit

ext4: Properly initialize the buffer_head state · 79ffab34

Authored by Aneesh Kumar K.V on May 13, 2009

These struct buffer_heads are allocated on the stack (and hence are
initialized with stack garbage).  They are only used to call a
get_blocks() function, so that's mostly OK, but b_state must be
initialized to be 0 so we don't have any unexpected BH_* flags set by
accident, such as BH_Unwritten or BH_Delay.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

79ffab34

15 May, 2009 2 commits

ext4: Fix race in ext4_inode_info.i_cached_extent · 2ec0ae3a

Authored by Theodore Ts'o on May 15, 2009

If two CPUs call ext4_ext_get_blocks() at the same time, there is
nothing protecting the i_cached_extent structure from being used and
updated at the same time. This could potentially cause
the wrong location on disk to be read or written to, including
potentially causing the corruption of the block group descriptors
and/or inode table.

This bug has been in the ext4 code since almost the very beginning of
ext4's development. Fortunately once the data is stored in the page
cache, ext4_get_blocks() doesn't need to be called, so trying to
replicate this problem to the point where we could identify its root
cause was *extremely* difficult. Many thanks to Kevin Shanahan for
working over several months to be able to reproduce this easily so we
could finally nail down the cause of the corruption.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

2ec0ae3a

ext4: Clear the unwritten buffer_head flag after the extent is initialized · 2a8964d6

Authored by Aneesh Kumar K.V on May 14, 2009

The BH_Unwritten flag indicates that the buffer is allocated on disk
but has not been written; that is, the disk block was part of a persistent
preallocation area. That flag should only be set when a get_blocks()
function is looking up an inode's logical to physical block mapping.

When ext4_get_blocks_wrap() is called with create=1, the uninitialized
extent is converted into an initialized one, so the BH_Unwritten flag
is no longer appropriate. Hence, we need to make sure the
BH_Unwritten flag is not left set, since the combination of BH_Mapped and
BH_Unwritten is not allowed; among other things, it will cause ext4's
get_block() to be called over and over again during the write_begin
phase of write(2).
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2a8964d6

13 May, 2009 1 commit

ext4: Use a fake block number for delayed new buffer_head · 33b9817e

Authored by Aneesh Kumar K.V on May 12, 2009

Use a very large unsigned number (~0xffff) as the fake block number
for the delayed new buffer. The VFS should never try to write out this
number, but if it does, this will make it obvious.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

33b9817e

14 May, 2009 1 commit

ext4: Fix sub-block zeroing for writes into preallocated extents · 9c1ee184

Authored by Aneesh Kumar K.V on May 13, 2009

We need to mark the buffer_head mapping preallocated space as new
during write_begin. Otherwise we don't zero out the page cache content
properly for a partial write. This will cause file corruption with
preallocation.

Now that we mark the buffer_head new we also need to have a valid
buffer_head blocknr so that unmap_underlying_metadata() unmaps the
correct block.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9c1ee184

25 Apr, 2009 3 commits

ext4: Do not try to validate extents on special files · c4b5a614

Authored by Theodore Ts'o on Apr 24, 2009

The EXTENTS_FL flag should never be set on special files, but if it
is, don't bother trying to validate that the extents tree is valid,
since only files, directories, and non-fast symlinks will ever have an
extent data structure.  We perhaps should flag the filesystem as being
corrupted if we see a special file (named pipes, device nodes, Unix
domain sockets, etc.) with the EXTENTS_FL flag, but e2fsck doesn't
currently check this case, so we'll just ignore this for now, since
it's harmless.

Without this fix, a special device with the extents flag is flagged as
an error by the kernel, so it is impossible to access or delete the
inode, but e2fsck doesn't see it as a problem, leading to
confused/frustrated users.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c4b5a614

ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is present · a9e81742

Authored by Theodore Ts'o on Apr 24, 2009

Don't try to look at i_file_acl_high unless the INCOMPAT_64BIT feature
bit is set.  The field is normally zero, but older versions of e2fsck
didn't automatically check to make sure of this, so in the spirit of
"be liberal in what you accept", don't look at i_file_acl_high unless
we are using a 64-bit filesystem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

a9e81742

ext4: Fix softlockup caused by illegal i_file_acl value in on-disk inode · 485c26ec

Authored by Theodore Ts'o on Apr 24, 2009

If the block containing external extended attributes (which is stored
in i_file_acl and i_file_acl_high) is larger than the on-disk
filesystem, the process which tried to access the extended attributes
will endlessly issue kernel printks complaining that
"__find_get_block_slow() failed", locking up that CPU until the system
is forcibly rebooted.

So when we read in the inode, make sure the i_file_acl value is legal,
and if not, flag the filesystem as being corrupted.
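
A check of this kind might look roughly like the sketch below. It is not the kernel's code: `struct fs_geom`, `file_acl_block()` and `file_acl_is_valid()` are hypothetical names, and the exact bounds used by ext4 may differ. The sketch also only folds in the high bits when the 64-bit feature is enabled, mirroring the i_file_acl_high handling described in this series.

```c
#include <stdbool.h>
#include <stdint.h>

struct fs_geom {                    /* hypothetical summary of superblock fields */
    uint64_t first_data_block;
    uint64_t blocks_count;
    bool     has_64bit;             /* INCOMPAT_64BIT feature flag */
};

/* Combine the low and high halves; ignore the high half on non-64-bit filesystems. */
static uint64_t file_acl_block(const struct fs_geom *g, uint32_t lo, uint16_t hi)
{
    uint64_t blk = lo;

    if (g->has_64bit)
        blk |= (uint64_t)hi << 32;
    return blk;
}

/* Zero means "no external xattr block"; anything else must lie inside the filesystem. */
static bool file_acl_is_valid(const struct fs_geom *g, uint64_t blk)
{
    if (blk == 0)
        return true;
    return blk >= g->first_data_block && blk < g->blocks_count;
}
```

A caller reading an inode would compute the block number once and treat a `false` result as filesystem corruption instead of attempting to read the block.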
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

485c26ec

23 Apr, 2009 2 commits

ext4: Fix potential inode allocation soft lockup in Orlov allocator · b5451f7b

Authored by Theodore Ts'o on Apr 22, 2009

If the Orlov allocator is having trouble finding an appropriate block
group, the fallback code could loop forever, causing a soft lockup
warning in find_group_orlov():

BUG: soft lockup - CPU#0 stuck for 61s! [cp:11728]
     ...
Pid: 11728, comm: cp Not tainted (2.6.30-rc1-dirty #77) Lenovo          
EIP: 0060:[<c021650e>] EFLAGS: 00000246 CPU: 0
EIP is at ext4_get_group_desc+0x54/0x9d
    ...
Call Trace:
 [<c0218021>] find_group_orlov+0x2ee/0x334
 [<c0120a5f>] ? sched_clock+0x8/0xb
 [<c02188e3>] ext4_new_inode+0x2cf/0xb1a
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b5451f7b

ext4: Make the extent validity check more paranoid · e84a26ce

Authored by Theodore Ts'o on Apr 22, 2009

Instead of just checking that the extent block number is greater than or
equal to s_first_data_block, make sure it is not pointing into
the block group descriptors, since that is clearly wrong.  This helps
prevent the filesystem from getting very badly corrupted in case an extent
block is corrupted.
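
As a rough illustration of the stricter check, the sketch below rejects any extent that starts at or before the last group descriptor block or runs past the end of the filesystem. The layout assumption (superblock at s_first_data_block, followed directly by `gdt_blocks` descriptor blocks) and the names `struct fs_layout` and `extent_range_ok()` are hypothetical; the kernel's actual test may be written differently.

```c
#include <stdbool.h>
#include <stdint.h>

struct fs_layout {                  /* hypothetical summary of superblock fields */
    uint64_t first_data_block;      /* s_first_data_block */
    uint64_t gdt_blocks;            /* blocks holding group descriptors (assumed to follow the superblock) */
    uint64_t blocks_count;          /* total blocks in the filesystem */
};

static bool extent_range_ok(const struct fs_layout *fs, uint64_t start, uint64_t len)
{
    /* Last block occupied by the superblock and group descriptors. */
    uint64_t last_meta = fs->first_data_block + fs->gdt_blocks;

    if (len == 0 || start <= last_meta)
        return false;
    /* Overflow-safe form of start + len <= blocks_count. */
    if (start >= fs->blocks_count || len > fs->blocks_count - start)
        return false;
    return true;
}
```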
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e84a26ce

15 Apr, 2009 1 commit

ext4: Remove code handling bio_alloc failure with __GFP_WAIT · 226e7dab

Authored by Nikanth Karthikesan on Apr 15, 2009

Remove code handling bio_alloc failure with __GFP_WAIT.
GFP_NOIO implies __GFP_WAIT.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

226e7dab

14 Apr, 2009 1 commit

ext4: really print the find_group_flex fallback warning only once · 6b82f3cb

Authored by Chuck Ebbert on Apr 14, 2009

Missing braces caused the warning to print more than once.
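
One way missing braces can produce this symptom is sketched below. The code is illustrative only and is not the original ext4 source; `emit_warning()` is a hypothetical stand-in for the printk() call.

```c
static int warn_count;              /* counts calls to the stand-in for printk() */

static void emit_warning(void)
{
    warn_count++;
}

/* Buggy shape: without braces, only the assignment is guarded by the if. */
static void warn_buggy(void)
{
    static int once;

    if (!once)
        once = 1;
    emit_warning();                 /* not part of the if: runs on every call */
}

/* Fixed shape: the braces put both statements under the guard. */
static void warn_fixed(void)
{
    static int once;

    if (!once) {
        once = 1;
        emit_warning();
    }
}
```

Calling `warn_buggy()` three times emits three warnings, while three calls to `warn_fixed()` emit one.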
Signed-Off-By: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6b82f3cb

08 Apr, 2009 1 commit

ext4: check block device size on mount · 0f2ddca6

Authored by Thiemo Nagel on Apr 07, 2009

Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0f2ddca6

05 Apr, 2009 1 commit

ext4: Fix off-by-one-error in ext4_valid_extent_idx() · e44543b8

Authored by Thiemo Nagel on Apr 04, 2009

Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e44543b8

08 Apr, 2009 1 commit

ext4: Fix big-endian problem in __ext4_check_blockref() · f73953c0

Authored by Thiemo Nagel on Apr 07, 2009

Commit fe2c8191 introduced a regression on big-endian systems, because
the checks to make sure block references in non-extent inodes are
valid failed to use le32_to_cpu().
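
The class of bug can be shown with a portable userspace sketch: on-disk block references are little-endian, so they have to be decoded before being compared against host-order values. `le32_decode()` below is a hypothetical stand-in for le32_to_cpu(), and `blockref_ok()` is an illustrative check, not the kernel function.

```c
#include <stdint.h>

/* Decode a little-endian 32-bit value from raw on-disk bytes, independent of host byte order. */
static uint32_t le32_decode(const unsigned char *p)
{
    return (uint32_t)p[0] |
           (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 |
           (uint32_t)p[3] << 24;
}

/* A block reference is acceptable if it is unset (0) or inside the filesystem. */
static int blockref_ok(const unsigned char *raw, uint32_t blocks_count)
{
    uint32_t blk = le32_decode(raw);

    return blk == 0 || blk < blocks_count;
}
```

Comparing the raw value without decoding would give the same answer on little-endian hosts and a byte-swapped, usually out-of-range, value on big-endian hosts.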
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

f73953c0

01 Apr, 2009 2 commits

mm: page_mkwrite change prototype to match fault · c2ec175c

Authored by Nick Piggin on Mar 31, 2009

Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags.  There should be no functional change.

This makes it possible to return much more detailed error information to
the VM (and also can provide more information, e.g. virtual_address, to the
driver, which might be important in some special cases).

This is required for a subsequent fix, and will also make it easier to
merge page_mkwrite() with fault() in future.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c2ec175c

New helper - current_umask() · ce3b0f8d

Authored by Al Viro on Mar 29, 2009

Reading current->fs->umask is what most fs_struct users are doing.
Put that into a helper function.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ce3b0f8d

30 Mar, 2009 1 commit

trivial: fix typos/grammar errors in Kconfig texts · 692105b8

Authored by Matt LaPlante on Jan 26, 2009

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

692105b8

28 Mar, 2009 4 commits

ext4: Regularize mount options · 06705bff

Authored by Theodore Ts'o on Mar 28, 2009

Add support for using the mount options "barrier" and "nobarrier", and
"auto_da_alloc" and "noauto_da_alloc", which is more consistent than
"barrier=<0|1>" or "auto_da_alloc=<0|1>". Most other ext3/ext4 mount
options use the foo/nofoo naming convention. We allow the old forms
of these mount options for backwards compatibility.
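
The accepted spellings can be summarized with a small parser sketch. This is an illustration only: `parse_bool_opt()` is a hypothetical helper, and the kernel itself parses mount options with its own token tables rather than code like this.

```c
#include <string.h>

/*
 * Parse one boolean mount option in any of the three accepted forms:
 *   "foo"     -> 1
 *   "nofoo"   -> 0
 *   "foo=0|1" -> 0 or 1 (old form, kept for backwards compatibility)
 * Returns 0 and stores the result in *value, or -1 if opt does not match.
 */
static int parse_bool_opt(const char *opt, const char *name, int *value)
{
    size_t n = strlen(name);

    if (strcmp(opt, name) == 0) {
        *value = 1;
        return 0;
    }
    if (strncmp(opt, "no", 2) == 0 && strcmp(opt + 2, name) == 0) {
        *value = 0;
        return 0;
    }
    if (strncmp(opt, name, n) == 0 && opt[n] == '=' &&
        (opt[n + 1] == '0' || opt[n + 1] == '1') && opt[n + 2] == '\0') {
        *value = opt[n + 1] - '0';
        return 0;
    }
    return -1;
}
```

For example, `parse_bool_opt("nobarrier", "barrier", &v)` and `parse_bool_opt("barrier=0", "barrier", &v)` both set `v` to 0.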
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

06705bff

ext4: fix locking typo in mballoc which could cause soft lockup hangs · e7c9e3e9

Authored by Theodore Ts'o on Mar 27, 2009

Smatch (http://repo.or.cz/w/smatch.git/) complains about the locking in
ext4_mb_add_n_trim() from fs/ext4/mballoc.c

    list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[order],
                                            pa_inode_list) {
            spin_lock(&tmp_pa->pa_lock);
            if (tmp_pa->pa_deleted) {
                    spin_unlock(&pa->pa_lock);
                    continue;
            }

Brown paper bag time...
Reported-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

e7c9e3e9

ext4: fix typo which causes a memory leak on error path · a7b19448

Authored by Dan Carpenter on Mar 27, 2009

This was found by smatch (http://repo.or.cz/w/smatch.git/)
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

a7b19448

ext4: Rename pa_linear to pa_type · cc0fb9ad

Authored by Aneesh Kumar K.V on Mar 27, 2009

Impact: code cleanup

This patch renames pa_linear to pa_type and adds MB_INODE_PA
and MB_GROUP_PA to indicate inode and group prealloc space.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

cc0fb9ad

31 Mar, 2009 1 commit

ext4: add checks of block references for non-extent inodes · fe2c8191

Authored by Thiemo Nagel on Mar 31, 2009

Check block references in the inode and indirect blocks for non-extent
inodes to make sure they are valid, and flag an error if they are
invalid.
Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fe2c8191

26 Mar, 2009 4 commits

ext4: Check for an valid i_mode when reading the inode from disk · 563bdd61

Authored by Theodore Ts'o on Mar 26, 2009

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

563bdd61

ext4: Use lowercase names of quota functions · a269eb18

Authored by Jan Kara on Jan 26, 2009

Use lowercase names of quota functions instead of old uppercase ones.
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Mingming Cao <cmm@us.ibm.com>
CC: linux-ext4@vger.kernel.org

a269eb18

ext4: quota reservation for delayed allocation · 60e58e0f

Authored by Mingming Cao on Jan 22, 2009

Uses quota reservation/claim/release to handle quota properly for delayed
allocation in three steps: 1) quotas are reserved when data is being copied
to the cache while block allocation is deferred; 2) when new blocks are
allocated, reserved quotas are converted to the real allocated quota; 3)
over-booked quotas for metadata blocks are released back.
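
The reserve/claim/release sequence can be modeled with a few counters. This is a simplified accounting sketch, not the kernel quota API: `struct quota_acct` and the three functions are hypothetical names.

```c
struct quota_acct {                 /* hypothetical per-user accounting, in blocks */
    long limit;                     /* hard limit */
    long reserved;                  /* reserved for delayed allocation, not yet on disk */
    long used;                      /* actually allocated */
};

/* Step 1: reserve when data is copied to the cache and allocation is deferred. */
static int quota_reserve(struct quota_acct *q, long n)
{
    if (q->used + q->reserved + n > q->limit)
        return -1;                  /* would exceed quota: fail at write time */
    q->reserved += n;
    return 0;
}

/* Step 2: when blocks are really allocated, convert reservation into usage. */
static void quota_claim(struct quota_acct *q, long n)
{
    q->reserved -= n;
    q->used += n;
}

/* Step 3: hand back whatever was over-reserved (for example, metadata estimates). */
static void quota_release(struct quota_acct *q, long n)
{
    q->reserved -= n;
}
```

The point of reserving up front is that an over-quota write fails when the data is buffered, rather than later at writeback time when the error can no longer be reported to the caller.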
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>

60e58e0f

ext4: Remove unnecessary quota functions · edf72453

Authored by Jan Kara on Jan 12, 2009

ext4_dquot_initialize() and ext4_dquot_drop() are no longer
needed because of modified quota locking.
Signed-off-by: Jan Kara <jack@suse.cz>

edf72453

17 Mar, 2009 2 commits

ext4: fix bb_prealloc_list corruption due to wrong group locking · d33a1976

Authored by Eric Sandeen on Mar 16, 2009

This is for Red Hat bug 490026: EXT4 panic, list corruption in
ext4_mb_new_inode_pa

ext4_lock_group(sb, group) is supposed to protect this list for
each group, and a common code flow to remove a pa is like
this:

    ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL);
    ext4_lock_group(sb, grp);
    list_del(&pa->pa_group_list);
    ext4_unlock_group(sb, grp);

so it's critical that we get the right group number back for
this prealloc context, to lock the right group (the one 
associated with this pa) and prevent concurrent list manipulation.

However, ext4_mb_put_pa() passes in (pa->pa_pstart - 1) with a
comment, "-1 is to protect from crossing allocation group".

This makes sense for the group_pa, where pa_pstart is advanced
by the length which has been used (in ext4_mb_release_context()),
and when the entire length has been used, pa_pstart has been
advanced to the first block of the next group.

However, for inode_pa, pa_pstart is never advanced; it's just
set once to the first block in the group and not moved after
that.  So in this case, if we subtract one in ext4_mb_put_pa(),
we are actually locking the *previous* group, and opening the
race with the other threads which do not subtract off the extra
block.
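
The off-by-one effect is easy to see with the block-to-group arithmetic alone. The sketch below assumes 32768 blocks per group and a first data block of 0; `group_of()` is a hypothetical simplification of what ext4_get_group_no_and_offset() computes.

```c
#include <stdint.h>

/* Map a physical block number to its block group index. */
static uint32_t group_of(uint64_t block, uint64_t first_data_block,
                         uint32_t blocks_per_group)
{
    return (uint32_t)((block - first_data_block) / blocks_per_group);
}
```

For an inode pa whose pa_pstart is block 32768 (the first block of group 1), `group_of(32768, 0, 32768)` is 1 but `group_of(32768 - 1, 0, 32768)` is 0, so subtracting one locks the previous group. For a fully consumed group pa whose pa_pstart has advanced to 65536, the subtraction is what keeps the result in group 1 instead of group 2.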
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d33a1976

ext4: Add auto_da_alloc mount option · afd4672d

Authored by Theodore Ts'o on Mar 16, 2009

Add a mount option which allows the user to disable automatic
allocation of blocks whose allocation was deferred by delayed allocation
when the file was originally truncated or when the file is renamed over an
existing file. This feature is intended to save users from the
effects of naive application writers, but it reduces the effectiveness
of the delayed allocation code. This mount option disables this
safety feature, which may be desirable for production systems where
the risk of unclean shutdowns or unexpected system crashes is low.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

afd4672d

14 Mar, 2009 1 commit

ext4: fix bogus BUG_ONs in in mballoc code · 8d03c7a0

Authored by Eric Sandeen on Mar 14, 2009

Thiemo Nagel reported that:

# dd if=/dev/zero of=image.ext4 bs=1M count=2
# mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
  -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
# mount -o loop image.ext4 mnt/
# dd if=/dev/zero of=mnt/file

oopsed, with a BUG_ON in ext4_mb_normalize_request because
size == EXT4_BLOCKS_PER_GROUP

It appears to me (esp. after talking to Andreas) that the BUG_ON
is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
be allowed, though larger sizes do indicate a problem.

Fix that and another (apparently rare) codepath with a similar check.
Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8d03c7a0

13 Mar, 2009 1 commit

ext4: Print the find_group_flex() warning only once · 2842c3b5

Authored by Theodore Ts'o on Mar 12, 2009

This is a short-term warning, and even printk_ratelimit() can result
in too much noise in system logs. So only print it once as a warning.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2842c3b5

11 Mar, 2009 1 commit

ext4: fix header check in ext4_ext_search_right() for deep extent trees. · 395a87bf

Authored by Eric Sandeen on Mar 10, 2009

The ext4_ext_search_right() function is confusing; it uses a
"depth" variable which is 0 at the root and maximum at the leaves, 
but the on-disk metadata uses a "depth" (actually eh_depth) which
is opposite: maximum at the root, and 0 at the leaves.

The ext4_ext_check_header() function is given a depth and checks
the header against that depth; it expects the on-disk semantics,
but we are giving it the opposite in the while loop in this
function.  We should be giving it the on-disk notion of "depth"
which we can get from (p_depth - depth) - and if you look, the last
(more commonly hit) call to ext4_ext_check_header() does just this.
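
The two depth conventions and the conversion between them can be sketched as follows. `struct ext_header` is a reduced, hypothetical version of the on-disk extent header, and `header_depth_ok()` only models the depth comparison, not the full header check.

```c
struct ext_header {                 /* hypothetical, reduced extent header */
    int eh_depth;                   /* on-disk depth: 0 at the leaves, maximum at the root */
};

/*
 * Convert a walk level (0 at the root, tree_depth at the leaves) into the
 * on-disk depth expected at that level: p_depth - depth in the text above.
 */
static int on_disk_depth(int tree_depth, int walk_level)
{
    return tree_depth - walk_level;
}

/* Model of the depth part of the header check. */
static int header_depth_ok(const struct ext_header *eh, int expected_depth)
{
    return eh->eh_depth == expected_depth;
}
```

In a tree of depth 2, the root header carries eh_depth 2. Checking it against the walk level (0) reports a mismatch, while checking it against `on_disk_depth(2, 0)` succeeds.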

Sending in the wrong depth results in (incorrect) messages
about corruption:

EXT4-fs error (device sdb1): ext4_ext_search_right: bad header
in inode #2621457: unexpected eh_depth - magic f30a, entries 340,
max 340(0), depth 1(2)

http://bugzilla.kernel.org/show_bug.cgi?id=12821

Reported-by: David Dindorp <ddi@dubex.dk>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

395a87bf

05 Mar, 2009 1 commit

ext4: Use struct flex_groups to calculate get_orlov_stats() · 7d39db14

Authored by Theodore Ts'o on Mar 04, 2009

Instead of looping over all of the block groups in a flex group
summing their summary statistics, start tracking used_dirs in struct
flex_groups, and use struct flex_groups instead.  This should save a
bit of CPU for mkdir-heavy workloads.
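
The trade-off is between summing per-group counters on every query and maintaining a per-flex-group counter at update time. The sketch below uses hypothetical arrays and sizes (8 groups, 4 groups per flex group) and is not the kernel's data layout.

```c
#define GROUPS    8                 /* hypothetical number of block groups */
#define FLEX_SIZE 4                 /* hypothetical block groups per flex group */

static unsigned group_used_dirs[GROUPS];
static unsigned flex_used_dirs[GROUPS / FLEX_SIZE];   /* maintained incrementally */

/* Update both the per-group and the per-flex-group counter on mkdir. */
static void note_mkdir(unsigned group)
{
    group_used_dirs[group]++;
    flex_used_dirs[group / FLEX_SIZE]++;
}

/* Old approach: loop over every block group in the flex group. */
static unsigned flex_used_dirs_by_sum(unsigned flex)
{
    unsigned total = 0;

    for (unsigned g = flex * FLEX_SIZE; g < (flex + 1) * FLEX_SIZE; g++)
        total += group_used_dirs[g];
    return total;
}

/* New approach: a single read of the tracked counter. */
static unsigned flex_used_dirs_cached(unsigned flex)
{
    return flex_used_dirs[flex];
}
```

Both functions return the same value as long as every update goes through `note_mkdir()`; the cached form avoids the loop on each allocation decision.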
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7d39db14

openanolis / cloud-kernel synced successfully more than 1 year ago