Commits · 68c9d702bb72f367f3b148963ec6cf5e07ff7f65 · openeuler / raspberrypi-kernel

04 Oct, 2008 2 commits

generic block based fiemap implementation · 68c9d702

Committed by Josef Bacik on Oct 03, 2008

Any block based fs (this patch includes ext3) just has to declare its own
fiemap() function and then call this generic function with its own
get_block_t. This works well for block based filesystems that map
multiple contiguous blocks at one time, but it also works for filesystems that
only map one block at a time; you will just end up with an "extent" for each
block. One gotcha is that this will not play nicely where there is hole+data
after the EOF. This function will assume it has hit the end of the data as soon
as it hits a hole after the EOF, so if there is any data past that it will
not pick that up. AFAIK no block based fs does this anyway, but it's in the
comments of the function just in case.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
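
The merging behaviour this message describes can be sketched in user space. The following is a minimal illustration, not the kernel's generic function: `map_extents`, `get_block_fn`, `demo_get_block` and the `HOLE` sentinel are hypothetical names invented for the example. It shows contiguous blocks coalescing into one extent, and the walk stopping at the first hole at or past EOF.

```c
#include <assert.h>
#include <stddef.h>

#define HOLE (-1L)  /* hypothetical sentinel: logical block has no mapping */

struct extent { long logical; long physical; long length; };

/* Hypothetical stand-in for a filesystem's get_block_t callback:
 * returns the physical block for a logical block, or HOLE. */
typedef long (*get_block_fn)(long logical);

/* Demo mapping: blocks 0-2 and 4-5 are contiguous runs, 3 and 6 are holes. */
static const long demo_map[] = { 100, 101, 102, HOLE, 200, 201, HOLE, 300 };

long demo_get_block(long lb)
{
    if (lb < 0 || lb >= (long)(sizeof demo_map / sizeof demo_map[0]))
        return HOLE;
    return demo_map[lb];
}

/* Walk nblocks logical blocks from start, one block at a time, merging
 * physically contiguous neighbours into a single extent. Returns the
 * number of extents written to out. */
size_t map_extents(get_block_fn get_block, long start, long nblocks,
                   long eof_block, struct extent *out, size_t max_out)
{
    size_t n = 0;

    assert(get_block != NULL && out != NULL);
    for (long lb = start; lb < start + nblocks; lb++) {
        long pb = get_block(lb);

        if (pb == HOLE) {
            if (lb >= eof_block)
                break;      /* hole at or past EOF: assume end of data */
            continue;       /* hole inside the file: skip it */
        }
        if (n > 0 &&
            out[n - 1].logical + out[n - 1].length == lb &&
            out[n - 1].physical + out[n - 1].length == pb) {
            out[n - 1].length++;    /* extends the previous extent */
            continue;
        }
        if (n == max_out)
            break;                  /* caller's array is full */
        out[n++] = (struct extent){ lb, pb, 1 };
    }
    return n;
}
```

With the demo mapping, walking blocks 0 to 7 with EOF at block 6 yields two extents. The data at block 7 is never reported, which is the hole+data-after-EOF gotcha mentioned in the message.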

ocfs2: fiemap support · 00dc417f

Committed by Mark Fasheh on Oct 03, 2008

Plug ocfs2 into ->fiemap. Some portions of ocfs2_get_clusters() had to be
refactored so that the extent cache can be skipped in favor of going
directly to the on-disk records. This makes it easier for us to determine
which extent is the last one in the btree. Also, I'm not sure we want to be
caching fiemap lookups anyway as they're not directly related to data
read/write.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: ocfs2-devel@oss.oracle.com
Cc: linux-fsdevel@vger.kernel.org

09 Oct, 2008 2 commits

vfs: vfs-level fiemap interface · c4b929b8

Committed by Mark Fasheh on Oct 08, 2008

Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap
inode operation.

Userspace can get extent information on a file via fiemap ioctl. As input,
the fiemap ioctl takes a struct fiemap which includes an array of struct
fiemap_extent (fm_extents). Size of the extent array is passed as
fm_extent_count and number of extents returned will be written into
fm_mapped_extents. Offset and length fields on the fiemap structure
(fm_start, fm_length) describe a logical range which will be searched for
extents. All extents returned will at least partially contain this range.
The actual extent offsets and ranges returned will be unmodified from their
offset and range on-disk.

The fiemap ioctl returns '0' on success. On error, -1 is returned and errno
is set. If errno is equal to EBADR, then fm_flags will contain those flags
which were passed in which the kernel did not understand. On all other
errors, the contents of fm_extents is undefined.

As fiemap evolved, there have been many authors of the vfs patch. As far as
I can tell, the list includes:
Kalpak Shah <kalpak.shah@sun.com>
Andreas Dilger <adilger@sun.com>
Eric Sandeen <sandeen@redhat.com>
Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: linux-api@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
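
A userspace caller of the interface described in this message might look roughly like the sketch below. It assumes a Linux system whose headers provide `FS_IOC_FIEMAP` (`<linux/fs.h>`) and `struct fiemap` (`<linux/fiemap.h>`); the per-extent field names (`fe_logical`, `fe_physical`, `fe_length`, `fe_flags`) and `FIEMAP_MAX_OFFSET` come from that header, not from the commit message. `print_extents` and `MAX_EXTENTS` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define MAX_EXTENTS 32  /* illustrative size of the fm_extents array */

/* Print the extents covering the whole file. Returns the number of
 * extents the kernel mapped, or -1 if the file could not be opened,
 * memory ran out, or the ioctl failed. */
int print_extents(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t size = sizeof(struct fiemap) +
                  MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, size);
    if (fm == NULL) {
        close(fd);
        return -1;
    }

    fm->fm_start = 0;                     /* logical range to search ... */
    fm->fm_length = FIEMAP_MAX_OFFSET;    /* ... here, the whole file */
    fm->fm_flags = 0;
    fm->fm_extent_count = MAX_EXTENTS;    /* capacity of fm_extents */

    int ret = -1;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0) {
        for (unsigned int i = 0; i < fm->fm_mapped_extents; i++) {
            const struct fiemap_extent *fe = &fm->fm_extents[i];
            printf("logical=%llu physical=%llu length=%llu flags=0x%x\n",
                   (unsigned long long)fe->fe_logical,
                   (unsigned long long)fe->fe_physical,
                   (unsigned long long)fe->fe_length,
                   (unsigned int)fe->fe_flags);
        }
        ret = (int)fm->fm_mapped_extents;
    }

    free(fm);
    close(fd);
    return ret;
}
```

A file with more than `MAX_EXTENTS` extents would need repeated calls with an advancing `fm_start`; the sketch leaves that out. On filesystems without a ->fiemap operation the ioctl fails and the function returns -1.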

ext4: fix xattr deadlock · 4d20c685

Committed by Kalpak Shah on Oct 08, 2008

ext4_xattr_set_handle() eventually ends up calling
ext4_mark_inode_dirty() which tries to expand the inode by shifting
the EAs.  This leads to the xattr_sem being downed again, resulting
in a deadlock.

This patch makes sure that if ext4_xattr_set_handle() is in the
call-chain, ext4_mark_inode_dirty() will not expand the inode.
Signed-off-by: Kalpak Shah <kalpak.shah@sun.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

07 Oct, 2008 1 commit

jbd2: Fix buffer head leak when writing the commit block · 45a90bfd

Committed by Theodore Ts'o on Oct 06, 2008

Also make sure the buffer heads are marked clean before submitting bh
for writing.  The previous code was marking the buffer head dirty,
which would have forced an unneeded write (and seek) to the journal
for no good reason.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

06 Oct, 2008 1 commit

ext4: Add debugging markers that can be used by systemtap · ede86cc4

Committed by Theodore Ts'o on Oct 05, 2008

These debugging markers are designed to debug problems such as the
random filesystem latency problems reported by Arjan.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Oct, 2008 1 commit

jbd2: abort instead of waiting for nonexistent transaction · 23f8b79e

Committed by Duane Griffin on Oct 08, 2008

The __jbd2_log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal. 
However, if there are no transactions to be processed (e.g.  because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.

Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.

This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Cc: Sami Liedes <sliedes@cc.hut.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

10 Oct, 2008 4 commits

ext4: fix initialization of UNINIT bitmap blocks · c806e68f

Committed by Frederic Bohe on Oct 10, 2008

This fixes a bug which caused on-line resizing of filesystems with a
1k blocksize to fail.  The root cause of this bug was the fact that if
an uninitialized bitmap block gets read in by userspace (which
e2fsprogs does try to avoid, but can happen when the blocksize is less
than the pagesize and an adjacent block is read into memory)
ext4_read_block_bitmap() was erroneously depending on the buffer
uptodate flag to decide whether it needed to initialize the bitmap
block in memory --- i.e., to set the standard set of blocks in use by
a block group (superblock, bitmaps, inode table, etc.).  Essentially,
ext4_read_block_bitmap() assumed it was the only routine that might
try to read a block containing a block bitmap, which is simply not
true.  

To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
must always initialize uninitialized bitmap blocks.  Once a block or
inode is allocated out of that bitmap, it will be marked as
initialized in the block group descriptor, so in general this won't
result in any extra unnecessary work.
Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Remove old legacy block allocator · c2ea3fde

Committed by Theodore Ts'o on Oct 10, 2008

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Use readahead when reading an inode from the inode table · 240799cd

Committed by Theodore Ts'o on Oct 09, 2008

With modern hard drives, reading 64k takes roughly the same time as
reading a 4k block.  So request readahead for adjacent inode table
blocks to reduce the time it takes when iterating over directories
(especially when doing this in htree sort order) in a cold cache case.
With this patch, the time it takes to run "git status" on a kernel
tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches"
is reduced by 21%.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
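
One way to picture "readahead for adjacent inode table blocks" is as an aligned window around the block that holds the wanted inode. The sketch below is an assumption about the general shape, not the patch's code: `inode_readahead_window` and `struct ra_window` are hypothetical, the window size is assumed to be a power of two (16 blocks of 4k would give the 64k mentioned in the message), and the window is clipped to the bounds of the inode table.

```c
#include <assert.h>

/* Half-open block range [start, end) to request readahead for. */
struct ra_window {
    unsigned long start;
    unsigned long end;
};

/* Round `block` down to a multiple of ra_blks, take ra_blks blocks from
 * there, then clip the result to the inode table [table_start, table_end). */
struct ra_window inode_readahead_window(unsigned long block,
                                        unsigned long ra_blks,
                                        unsigned long table_start,
                                        unsigned long table_end)
{
    struct ra_window w;

    /* the mask trick below only works for a power-of-two window */
    assert(ra_blks != 0 && (ra_blks & (ra_blks - 1)) == 0);

    w.start = block & ~(ra_blks - 1);
    w.end = w.start + ra_blks;
    if (w.start < table_start)
        w.start = table_start;
    if (w.end > table_end)
        w.end = table_end;
    return w;
}
```

For an inode table spanning blocks 1000 to 1099, a request for block 1050 with a 16-block window covers blocks 1040 to 1055, while a request for block 1003 is clipped at the start of the table.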

ext4: Improve the documentation for ext4's /proc tunables · 37515fac

Committed by Theodore Ts'o on Oct 09, 2008

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Alex Tomas <bzzz@sun.com>
Cc: Andreas Dilger <adilger@sun.com>

24 Sep, 2008 1 commit

ext4: Combine proc file handling into a single set of functions · 5e8814f2

Committed by Theodore Ts'o on Sep 23, 2008

Previously mballoc created a separate set of functions for each proc
file.  This combines the tunables into a single set of functions which
gets used for all of the per-superblock proc files, saving
approximately 2k of compiled object code.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

23 Sep, 2008 2 commits

ext4: move /proc setup and teardown out of mballoc.c · 9f6200bb

Committed by Theodore Ts'o on Sep 23, 2008

...and into the core setup/teardown code in fs/ext4/super.c so that
other parts of ext4 can define tuning parameters.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Don't use 'struct dentry' for internal lookups · f702ba0f

Committed by Theodore Ts'o on Sep 22, 2008

This is a port of a patch from Linus which fixes a 200+ byte stack
usage problem in ext4_get_parent().

It's more efficient to pass down only the actual parts of the dentry
that matter: the parent inode and the name, instead of allocating a
struct dentry on the stack.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

07 Oct, 2008 1 commit

ext4/jbd2: Avoid WARN() messages when failing to write to the superblock · 914258bf

Committed by Theodore Ts'o on Oct 06, 2008

This fixes some very common warnings reported by kerneloops.org
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

14 Sep, 2008 2 commits

ext4: use percpu data structures for lg_prealloc_list · 730c213c

Committed by Eric Sandeen on Sep 13, 2008

lg_prealloc_list seems to cry out for a per-cpu data structure; on a large
smp system I think this should be better.  I've lightly tested this change
on a 4-cpu system.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Renumber EXT4_IOC_MIGRATE · 8eea80d5

Committed by Theodore Ts'o on Sep 13, 2008

Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with
other ext4 ioctl's.  Since there haven't been any major userspace
users of this ioctl, we can afford to change this now, to avoid
potential problems later.

Also, reorder the ioctl numbers in ext4.h to avoid this sort of
mistake in the future.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Oct, 2008 1 commit

ext4: hook the ext3 migration interface to the EXT4_IOC_SETFLAGS ioctl · 4db46fc2

Committed by Aneesh Kumar K.V on Oct 08, 2008

This patch hooks the ext3 to ext4 migrate interface to
EXT4_IOC_SETFLAGS ioctl. The userspace interface is via chattr +e.  We
only allow setting extent flags.  Clearing extent flag (migrating from
ext4 to ext3) is not supported.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

14 Sep, 2008 1 commit

ext4: elevate write count for migrate ioctl · 2a43a878

Committed by Aneesh Kumar K.V on Sep 13, 2008

The migrate ioctl writes to the filesystem, so we need to elevate the
write count.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

08 Sep, 2008 1 commit

ext4: add missing unlock in ext4_check_descriptors() on error path · 7ee1ec4c

Committed by Li Zefan on Sep 08, 2008

If the group descriptors are corrupted we need to unlock the block
group lock before returning from the function; else we will oops when
freeing a spinlock which is still being held.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

17 Sep, 2008 1 commit

jbd2: clean up how the journal device name is printed · 05496769

Committed by Theodore Ts'o on Sep 16, 2008

Calculate the journal device name once and stash it away in the
journal_s structure.  This avoids needing to call bdevname()
everywhere and reduces stack usage by not needing to allocate an
on-stack buffer.  In addition, we eliminate the '/' that can appear in
device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that
can cause problems when creating proc directory names, and include the
inode number to support ocfs2 which creates multiple journals with
different inode numbers.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
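
The name cleanup described above can be illustrated with a small user-space helper. This is a sketch under assumptions: `make_journal_devname` is a hypothetical function, and both the `'!'` replacement character and the `:` separator before the inode number are choices made for the example, since the message does not say which characters the patch uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a journal name that is safe to use as a single /proc directory
 * component: append the inode number when the journal lives in an inode
 * (ino != 0), then replace every '/' in the result. */
void make_journal_devname(char *out, size_t outlen,
                          const char *bdev, unsigned long ino)
{
    assert(out != NULL && bdev != NULL && outlen > 0);

    if (ino != 0)
        snprintf(out, outlen, "%s:%lu", bdev, ino);   /* ':' is an assumption */
    else
        snprintf(out, outlen, "%s", bdev);

    for (char *p = out; (p = strchr(p, '/')) != NULL; p++)
        *p = '!';                                     /* '!' is an assumption */
}
```

Computing the string once and storing it, as the message describes, means callers never need their own on-stack buffer for the name.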

14 Sep, 2008 1 commit

ext4: fix #11321: create /proc/ext4/*/stats more carefully · 899fc1a4

Committed by Alexey Dobriyan on Sep 14, 2008

ext4 creates a per-superblock directory in /proc/ext4/. The name used as
its basis is taken from bdevname, which, surprise, can contain a slash.

However, proc while allowing to use proc_create("a/b", parent) form of
PDE creation, assumes that parent/a was already created.

bdevname in question is 'cciss/c0d0p9', directory is not created and all
this stuff goes directly into /proc (which is real bug).

Warning comes when _second_ partition is mounted.

http://bugzilla.kernel.org/show_bug.cgi?id=11321
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

08 Sep, 2008 1 commit

Update flex_bg free blocks and free inodes counters when resizing. · c62a11fd

Committed by Frederic Bohe on Sep 08, 2008

This fixes a bug which prevented the newly created inodes after a
resize from being used on filesystems with flex_bg.
Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Oct, 2008 1 commit

ext4: Avoid printk floods in the face of directory corruption · 9d9f1775

Committed by Eric Sandeen on Oct 09, 2008

Note: some people think this represents a security bug, since it
might make the system go away while it is printing a large number of
console messages, especially if a serial console is involved.  Hence,
it has been assigned CVE-2008-3528, but it requires that the attacker
either has physical access to your machine to insert a USB disk with a
corrupted filesystem image (at which point why not just hit the power
button), or is otherwise able to convince the system administrator to
mount an arbitrary filesystem image (at which point why not just
include a setuid shell or world-writable hard disk device file or some
such).  Me, I think they're just being silly. --tytso
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Cc: Eugene Teo <eugeneteo@kernel.sg>

14 Sep, 2008 2 commits

ext4: Properly update i_disksize. · cf17fea6

Committed by Aneesh Kumar K.V on Sep 13, 2008

With delayed allocation we use i_data_sem to update i_disksize. We need
to update i_disksize only if the new size specified is greater than the
current value and we need to make sure we don't race with other
i_disksize update. With delayed allocation we will switch to the
write_begin function for non-delayed allocation if we are low on free
blocks. This means the write_begin function for non-delayed allocation
also needs to use the same locking.

We also need to check and update i_disksize even if the new size is less
than inode.i_size because of delayed allocation.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: truncate block allocated on a failed ext4_write_begin · ae4d5372

Committed by Aneesh Kumar K.V on Sep 13, 2008

For blocksize < pagesize we need to remove blocks that got allocated in
block_write_begin() if we fail with ENOSPC for later blocks.
block_write_begin() internally does this if it allocated pages locally.
This makes sure we don't have blocks outside inode.i_size during ENOSPC.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Sep, 2008 3 commits

ext4: Retry block allocation if we have free blocks left · df22291f

Committed by Aneesh Kumar K.V on Sep 08, 2008

When we truncate files, the meta-data blocks released are not reused
until we commit the truncate transaction.  That means delayed get_block
request will return ENOSPC even if we have free blocks left.  Force a
journal commit and retry block allocation if we get ENOSPC with free
blocks left.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Don't add the inode to journal handle until after the block is allocated · 166348dd

Committed by Aneesh Kumar K.V on Sep 08, 2008

Make sure we don't add the inode to the journal handle until after the
block allocation, so that a journal commit will not include the inode in
case of block allocation failure.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Fix ext4 nomballoc allocator for ENOSPC · 68629f29

Committed by Aneesh Kumar K.V on Sep 08, 2008

We run into ENOSPC errors on nonmballoc ext4, even when there are free blocks
on the filesystem.

The patch includes two changes:

a) Set reservation to NULL if we are trying to allocate near group_target_block
from the goal group and the free block count in the group is less than the
window size. This should give us a better chance to allocate near
group_target_block. This also ensures that if we are not allocating near
group_target_block then we don't turn off reservation. This should enable us
to allocate with reservation from other groups that have a large free block count.

b) we don't need to check the window size if the block reservation is off.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Oct, 2008 2 commits

ext4: Signed arithmetic fix · 5c791616

Committed by Aneesh Kumar K.V on Oct 08, 2008

This patch converts some usage of ext4_fsblk_t to s64.  This is needed
so that some of the sign conversion works as expected in if conditions.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
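
The class of bug this message refers to is easy to reproduce in plain C. The sketch below is not taken from the patch: `fsblk_t` is a stand-in for an unsigned 64-bit block type, and `has_room_unsigned` / `has_room_signed` are hypothetical helpers. It shows why a subtraction that can go negative needs a signed type before it is compared.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t fsblk_t;   /* stand-in for an unsigned 64-bit block count */

/* Buggy form: when reserved > free_blocks the unsigned subtraction wraps
 * around to a huge value, so the check passes when it should fail. */
int has_room_unsigned(fsblk_t free_blocks, fsblk_t reserved, fsblk_t wanted)
{
    return free_blocks - reserved >= wanted;
}

/* Fixed form: do the arithmetic in a signed 64-bit type so a shortfall
 * shows up as a negative number. Assumes the counts fit in int64_t. */
int has_room_signed(fsblk_t free_blocks, fsblk_t reserved, fsblk_t wanted)
{
    int64_t avail = (int64_t)free_blocks - (int64_t)reserved;

    assert(wanted <= (fsblk_t)INT64_MAX);
    return avail >= (int64_t)wanted;
}
```

With 10 free blocks and 20 reserved, the unsigned version reports that there is room for 5 more blocks, while the signed version correctly reports that there is not.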

ext4: Switch to non delalloc mode when we are low on free blocks count. · 79f0be8d

Committed by Aneesh Kumar K.V on Oct 08, 2008

The delayed allocation code allocates blocks during writepages(), which
can not handle block allocation failures.  To deal with this, we switch
away from delayed allocation mode when we are running low on free
blocks.  This also allows us to avoid needing to reserve a large number
of meta-data blocks in case all of the requested blocks are
discontiguous.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

10 Oct, 2008 1 commit

ext4: Add percpu dirty block accounting. · 6bc6e63f

Committed by Aneesh Kumar K.V on Oct 10, 2008

This patch adds dirty block accounting using percpu_counters.  Delayed
allocation block reservation is now done by updating dirty block
counter.  In a later patch we switch to non delalloc mode if the
filesystem free blocks is less than 150% of total filesystem dirty
blocks.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Sep, 2008 1 commit

ext4: Retry block reservation · 030ba6bc

Committed by Aneesh Kumar K.V on Sep 08, 2008

During block reservation if we don't have enough blocks left, retry
block reservation with smaller block counts. This makes sure we try
fallocate and DIO with smaller request size and don't fail early. The
delayed allocation reservation cannot try with smaller block count. So
retry block reservation to handle temporary disk full conditions. Also
print free blocks details if we fail block allocation during writepages.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Oct, 2008 1 commit

ext4: Make sure all the block allocation paths reserve blocks · a30d542a

Committed by Aneesh Kumar K.V on Oct 09, 2008

With delayed allocation we need to make sure blocks are reserved before
we attempt to allocate them. Otherwise we get block allocation failure
(ENOSPC) during writepages which cannot be handled. This would mean
silent data loss (We do a printk stating data will be lost). This patch
updates the DIO and fallocate code path to do block reservation before
block allocation. This is needed to make sure parallel DIO and fallocate
requests don't take blocks out of the delayed reserve space.

When the free blocks count goes below a threshold we switch to a slow path
which looks at other CPUs' accumulated percpu counter values.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

20 Aug, 2008 1 commit

ext4: invalidate pages if delalloc block allocation fails. · c4a0c46e

Committed by Aneesh Kumar K.V on Aug 19, 2008

We are a bit aggressive in invalidating all the pages. But
it is ok because we really don't know why the block allocation
failed and it is better to come out of the writeback path
so that the user can look for more info.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

09 Sep, 2008 3 commits

ext4: Fix whitespace checkpatch warnings/errors · af5bc92d

Committed by Theodore Ts'o on Sep 08, 2008

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Fix long long checkpatch warnings · e5f8eab8

Committed by Theodore Ts'o on Sep 08, 2008

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Add printk priority levels to clean up checkpatch warnings · 4776004f

Committed by Theodore Ts'o on Sep 08, 2008

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

10 Oct, 2008 2 commits

percpu counter: clean up percpu_counter_sum_and_set() · 1f7c14c6

Committed by Mingming Cao on Oct 09, 2008

percpu_counter_sum_and_set() and percpu_counter_sum() are the same except
the former updates the global counter after accounting.  Since we are
taking the fbc->lock to calculate the precise value of the counter in
percpu_counter_sum() anyway, it should simply set fbc->count too, as the
percpu_counter_sum_and_set() does.

This patch merges these two interfaces into one.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

Linux 2.6.27 · 3fa8749e

Committed by Linus Torvalds on Oct 09, 2008