Commits · 5a9ae68a349aa076bc8557ee2fcf865574459282 · openeuler / raspberrypi-kernel

19 Nov, 2010 1 commit

ext4: ext4_fill_super shouldn't return 0 on corruption · 5a9ae68a

Committed by Darrick J. Wong on Nov 19, 2010

At the start of ext4_fill_super, ret is set to -EINVAL, and any failure path
out of that function returns ret.  However, the generic_check_addressable
clause sets ret = 0 (if it passes), which means that a subsequent failure (e.g.
a group checksum error) returns 0 even though the mount should fail.  This
causes vfs_kern_mount in turn to think that the mount succeeded, leading to an
oops.

A simple fix is to avoid using ret for the generic_check_addressable check,
which was last changed in commit 30ca22c7.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
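
The failure mode described in this commit can be reproduced in a few lines of user-space C. This is only a sketch: `check_addressable`, `fill_super_buggy` and `fill_super_fixed` are hypothetical stand-ins rather than the kernel functions, and the group checksum step is reduced to a boolean.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for generic_check_addressable(): 0 on success. */
static int check_addressable(bool addressable)
{
	return addressable ? 0 : -EFBIG;
}

/* Buggy shape: the check's result overwrites the default error in ret. */
int fill_super_buggy(bool addressable, bool checksum_ok)
{
	int ret = -EINVAL;

	ret = check_addressable(addressable);	/* ret becomes 0 on success */
	if (ret)
		goto failed;
	if (!checksum_ok)
		goto failed;			/* returns 0: looks like success */
	return 0;
failed:
	return ret;
}

/* Fixed shape: the check no longer touches ret. */
int fill_super_fixed(bool addressable, bool checksum_ok)
{
	int ret = -EINVAL;

	if (check_addressable(addressable))
		goto failed;
	if (!checksum_ok)
		goto failed;
	return 0;
failed:
	assert(ret < 0);	/* a failure path must never report success */
	return ret;
}
```

In the buggy variant a checksum failure after a passing addressability check returns 0, which is the situation the commit message says misleads vfs_kern_mount.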

18 Nov, 2010 1 commit

ext4: missing unlock in ext4_clear_request_list() · f4c8cc65

Committed by Dan Carpenter on Nov 17, 2010

If the li_request_list was empty then it returned with the lock
held.  Instead of adding a "goto unlock" I just removed that special
case and let it go past the empty list_for_each_safe().
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
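
The pattern behind this fix can be sketched with a toy lock. The types and function names below are hypothetical and only mimic the shape of the problem: an early return for the empty list skips the unlock, while the fixed version lets the loop run zero times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock and request list; hypothetical, not the kernel's types. */
struct request {
	struct request *next;
};

struct request_list {
	bool locked;		/* stands in for a mutex */
	struct request *head;
};

static void list_lock(struct request_list *l)
{
	assert(!l->locked);
	l->locked = true;
}

static void list_unlock(struct request_list *l)
{
	assert(l->locked);
	l->locked = false;
}

/* Buggy shape: the empty-list shortcut returns with the lock held. */
void clear_requests_buggy(struct request_list *l)
{
	list_lock(l);
	if (l->head == NULL)
		return;			/* lock is never released */
	l->head = NULL;
	list_unlock(l);
}

/* Fixed shape: no special case; an empty list simply skips the loop. */
void clear_requests_fixed(struct request_list *l)
{
	list_lock(l);
	while (l->head != NULL)
		l->head = l->head->next;	/* unlink each request */
	list_unlock(l);
}
```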

09 Nov, 2010 3 commits

ext4: Add new ext4 inode tracepoints · 7ff9c073

Committed by Theodore Ts'o on Nov 08, 2010

Add ext4_evict_inode, ext4_drop_inode, ext4_mark_inode_dirty, and
ext4_begin_ordered_truncate()
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: do not try to grab the s_umount semaphore in ext4_quota_off · 87009d86

Committed by Dmitry Monakhov on Nov 08, 2010

It's not needed to sync the filesystem, and it fixes a lock_dep complaint.
Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>

ext4: handle writeback of inodes which are being freed · f7ad6d2e

Committed by Theodore Ts'o on Nov 08, 2010

The following BUG can occur when an inode is being freed while it
still has dirty pages outstanding and it gets deleted (in this case
because it was the target of a rename).  In ordered mode, we need to
make sure the data pages are written just in case we crash before the
rename (or unlink) is committed.  If the inode is being freed then
when we try to igrab the inode, we end up tripping the BUG_ON at
fs/ext4/page-io.c:146.

To solve this problem, we need to keep track of the number of io
callbacks which are pending, and avoid destroying the inode until they
have all been completed.  That way we don't have to bump the inode
count to keep the inode from being destroyed; an approach which
doesn't work because the count could have already been dropped down to
zero before the inode writeback has started (at which point we're not
allowed to bump the count back up to 1, since it's already started
getting freed).

Thanks to Dave Chinner for suggesting this approach, which is also
used by XFS.

  kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146!
  Call Trace:
   [<ffffffff811075b1>] ext4_bio_write_page+0x172/0x307
   [<ffffffff811033a7>] mpage_da_submit_io+0x2f9/0x37b
   [<ffffffff811068d7>] mpage_da_map_and_submit+0x2cc/0x2e2
   [<ffffffff811069b3>] mpage_add_bh_to_extent+0xc6/0xd5
   [<ffffffff81106c66>] write_cache_pages_da+0x2a4/0x3ac
   [<ffffffff81107044>] ext4_da_writepages+0x2d6/0x44d
   [<ffffffff81087910>] do_writepages+0x1c/0x25
   [<ffffffff810810a4>] __filemap_fdatawrite_range+0x4b/0x4d
   [<ffffffff810815f5>] filemap_fdatawrite_range+0xe/0x10
   [<ffffffff81122a2e>] jbd2_journal_begin_ordered_truncate+0x7b/0xa2
   [<ffffffff8110615d>] ext4_evict_inode+0x57/0x24c
   [<ffffffff810c14a3>] evict+0x22/0x92
   [<ffffffff810c1a3d>] iput+0x212/0x249
   [<ffffffff810bdf16>] dentry_iput+0xa1/0xb9
   [<ffffffff810bdf6b>] d_kill+0x3d/0x5d
   [<ffffffff810be613>] dput+0x13a/0x147
   [<ffffffff810b990d>] sys_renameat+0x1b5/0x258
   [<ffffffff81145f71>] ? _atomic_dec_and_lock+0x2d/0x4c
   [<ffffffff810b2950>] ? cp_new_stat+0xde/0xea
   [<ffffffff810b29c1>] ? sys_newlstat+0x2d/0x38
   [<ffffffff810b99c6>] sys_rename+0x16/0x18
   [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b
Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Nick Bowler <nbowler@elliptictech.com>
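
The idea of counting pending I/O callbacks, instead of bumping the inode reference count, can be sketched as follows. `struct toy_inode` and the `toy_*` functions are hypothetical names invented for this illustration; the real change lives in the ext4 page-io code and waits rather than returning a flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical miniature of an inode with an I/O completion counter. */
struct toy_inode {
	atomic_int ioend_pending;	/* I/O callbacks not yet completed */
	bool destroyed;
};

/* Called when writeback submits I/O for the inode. */
void toy_io_submit(struct toy_inode *inode)
{
	atomic_fetch_add(&inode->ioend_pending, 1);
}

/* Called from the I/O completion callback. */
void toy_io_end(struct toy_inode *inode)
{
	int old = atomic_fetch_sub(&inode->ioend_pending, 1);

	assert(old > 0);	/* every completion matches a submission */
}

/*
 * Destroy the inode only when no callbacks are pending.  Returns false if
 * the caller has to wait and retry; a real implementation would sleep on a
 * wait queue instead of returning.
 */
bool toy_try_destroy(struct toy_inode *inode)
{
	if (atomic_load(&inode->ioend_pending) != 0)
		return false;
	inode->destroyed = true;
	return true;
}
```

Because the counter is separate from the reference count, it can still be raised after the reference count has already dropped to zero, which is the case the commit message says the older approach could not handle.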

04 Nov, 2010 1 commit

ext4: initialize the percpu counters before replaying the journal · ce7e010a

Committed by Theodore Ts'o on Nov 03, 2010

We now initialize the percpu counters before replaying the journal,
but after the journal, we recalculate the global counters, to deal
with the possibility of the per-blockgroup counts getting updated by
the journal replay.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

03 Nov, 2010 2 commits

ext4: "ret" may be used uninitialized in ext4_lazyinit_thread() · b2c78cd0

Committed by Theodore Ts'o on Nov 02, 2010

Newer GCC's reported the following build warning:

   fs/ext4/super.c: In function 'ext4_lazyinit_thread':
   fs/ext4/super.c:2702: warning: 'ret' may be used uninitialized in this function

Fix it by removing the need for the ret variable in the first place.
Signed-off-by: "Lukas Czerner" <lczerner@redhat.com>
Reported-by: "Stefan Richter" <stefanr@s5r6.in-berlin.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix lazyinit hang after removing request · f4245bd4

Committed by Lukas Czerner on Nov 02, 2010

When the request has been removed from the list and no other request
has been issued, we will end up with next wakeup scheduled to
MAX_JIFFY_OFFSET which is bad. So check for that.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
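
A minimal sketch of the check this commit describes, under the assumption that the worker picks the earliest scheduled time from its request list. `NO_WAKEUP` merely stands in for MAX_JIFFY_OFFSET, and both function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Sentinel for "nothing scheduled"; stands in for MAX_JIFFY_OFFSET. */
#define NO_WAKEUP ULONG_MAX

/* Earliest scheduled time among n requests, or NO_WAKEUP if n == 0. */
unsigned long next_wakeup(const unsigned long *sched, size_t n)
{
	unsigned long next = NO_WAKEUP;

	for (size_t i = 0; i < n; i++)
		if (sched[i] < next)
			next = sched[i];
	return next;
}

/*
 * Decide whether the worker should keep running.  Without the sentinel
 * check the worker would go to sleep until NO_WAKEUP, i.e. hang.
 */
bool worker_should_continue(const unsigned long *sched, size_t n,
			    unsigned long *wake_at)
{
	unsigned long next = next_wakeup(sched, n);

	if (next == NO_WAKEUP)
		return false;	/* list drained: exit instead of sleeping */
	assert(n > 0);
	*wake_at = next;
	return true;
}
```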

29 Oct, 2010 1 commit

new helper: mount_bdev() · 152a0836

Committed by Al Viro on Jul 25, 2010

... and switch of the obvious get_sb_bdev() users to ->mount()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

28 Oct, 2010 11 commits

ext4: fix unbalanced mutex unlock in error path of ext4_li_request_new · beed5ecb

Committed by Nicolas Kaiser on Oct 27, 2010

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: make various ext4 functions be static · 1f109d5a

Committed by Theodore Ts'o on Oct 27, 2010

These functions have no need to be exported beyond file context.

No functions needed to be moved for this commit; just some function
declarations changed to be static and removed from header files.

(A similar patch was submitted by Eric Sandeen, but I wanted to handle
code movement in separate patches to make sure code changes didn't
accidentally get dropped.)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: rename {exit,init}_ext4_*() to ext4_{exit,init}_*() · 5dabfc78

Committed by Theodore Ts'o on Oct 27, 2010

This is a cleanup to avoid namespace leaks out of fs/ext4
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix kernel oops if the journal superblock has a non-zero j_errno · 7f93cff9

Committed by Theodore Ts'o on Oct 27, 2010

Commit 84061e07 fixed an accounting bug only to introduce the
possibility of a kernel OOPS if the journal has a non-zero j_errno
field indicating that the file system had detected a fs inconsistency.
After the journal replay, if the journal superblock indicates that the
file system has an error, this indication is transferred to the file
system and then ext4_commit_super() is called to write this to the
disk.

But since the percpu counters are now initialized after the journal
replay, the call to ext4_commit_super() will cause a kernel oops since
it needs to use the percpu counters in the ext4 superblock structure.

The fix is to skip setting the ext4 free block and free inode fields
if the percpu counter has not been set.

Thanks to Ken Sumrall for reporting and analyzing the root causes of
this bug.

Addresses-Google-Bug: #3054080
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: add batched_discard into ext4 feature list · 27ee40df

Committed by Lukas Czerner on Oct 27, 2010

Should be applied on the top of "lazy inode table initialization"
and "batched discard support" patch-sets.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Add batched discard support for ext4 · 7360d173

Committed by Lukas Czerner on Oct 27, 2010

Walk through allocation groups and trim all free extents. It can be
invoked through the FITRIM ioctl on the file system. The main idea is to
provide a way to trim the whole file system if needed, since some SSDs
may suffer from performance loss after the whole device was filled (it
does not mean that fs is full!).

It searches for free extents in allocation groups specified by the byte
range start -> start+len. When the free extent is within this range, blocks
are marked as used and then trimmed. Afterwards these blocks are marked
as free in the per-group bitmap.

Since fstrim is a long operation it is good to have an ability to
interrupt it by a signal. This was added by Dmitry Monakhov.
Thanks Dmitry.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use bio layer instead of buffer layer in mpage_da_submit_io · bd2d0210

Committed by Theodore Ts'o on Oct 27, 2010

Call the block I/O layer directly instead of going through the buffer
layer. This should give us much better performance and scalability,
as well as lowering our CPU utilization when doing buffered writeback.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: don't update sb journal_devnum when RO dev · c41303ce

Committed by Maciej Żenczykowski on Oct 27, 2010

An ext4 filesystem on a read-only device, with an external journal
which is at a different device number than recorded in the superblock
will fail to honor the read-only setting of the device and trigger
a superblock update (write).

For example:
  - ext4 on a software raid which is in read-only mode
  - external journal on a read-write device which has changed device num
  - attempt to mount with -o journal_dev=<new_number>
  - hits BUG_ON(mddev->ro == 1) in md.c

Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: add interface to advertise ext4 features in sysfs · 857ac889

Committed by Lukas Czerner on Oct 27, 2010

User-space should have the opportunity to check which features ext4
supports in each particular copy. This adds an easy interface by creating
a new "features" directory in sys/fs/ext4/. In that directory files
advertising feature names can be created.

Add lazy_itable_init to the feature list.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
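
From user space, checking for an advertised feature then amounts to testing whether a file exists. The helper below is a hypothetical sketch, not part of any ext4 tool; it takes the directory as a parameter, and on a kernel carrying this patch that would typically be `/sys/fs/ext4/features`, assuming sysfs is mounted at `/sys`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Return true if a file named `feature` exists in `features_dir`.
 * On a kernel with this patch, features_dir would typically be
 * "/sys/fs/ext4/features" (assuming sysfs is mounted at /sys).
 */
bool ext4_feature_advertised(const char *features_dir, const char *feature)
{
	char path[256];
	int len;

	assert(features_dir != NULL && feature != NULL);
	len = snprintf(path, sizeof(path), "%s/%s", features_dir, feature);
	if (len < 0 || (size_t)len >= sizeof(path))
		return false;		/* path did not fit: treat as absent */
	return access(path, F_OK) == 0;
}
```

A caller could, for example, pass `"lazy_itable_init"` as the feature name, since that is the entry this commit adds to the list.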

ext4: add support for lazy inode table initialization · bfff6873

Committed by Lukas Czerner on Oct 27, 2010

When the lazy_itable_init extended option is passed to mke2fs, it
considerably speeds up filesystem creation because inode tables are
not zeroed out.  The fact that parts of the inode table are
uninitialized is not a problem so long as the block group descriptors,
which contain information regarding how much of the inode table has
been initialized, have not been corrupted.  However, if the block group
checksums are not valid, e2fsck must scan the entire inode table, and
the old, uninitialized data could potentially cause e2fsck to
report false problems.

Hence, it is important for the inode tables to be initialized as soon
as possible.  This commit adds this feature so that mke2fs can safely
use the lazy inode table initialization feature to speed up formatting
file systems.

This is done via a new kernel thread called ext4lazyinit, which is
created on demand and destroyed when it is no longer needed.  There
is only one thread for all ext4 filesystems in the system. When the
first filesystem with the init_itable mount option is mounted, the
ext4lazyinit thread is created, then the filesystem can register its
request in the request list.

This thread then walks through the list of requests picking up
scheduled requests and invoking ext4_init_inode_table(). The next schedule
time for the request is computed by multiplying the time it took to
zero out the last inode table by the wait multiplier, which can be set with
the (init_itable=n) mount option (default is 10).  We are doing
this so we do not take the whole I/O bandwidth. When the thread is no
longer necessary (request list is empty) it frees the appropriate
structures and exits (and can be created later by another
filesystem).

We do not disturb regular inode allocations in any way; the allocator
simply does not care whether the inode table is zeroed or not. But when
zeroing, we have to skip used inodes, obviously. Also we should prevent new
inode allocations from the group while zeroing is under way. For that we
take write alloc_sem lock in ext4_init_inode_table() and read alloc_sem
in the ext4_claim_inode, so when we are unlucky and the allocator hits the
group which is currently being zeroed, it just has to wait.

This can be suppressed using the mount option no_init_itable.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
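
The scheduling rule in this commit message is simple arithmetic, sketched here. The function name, the tick unit and the macro are assumptions made for the illustration; only the default multiplier of 10 comes from the message.

```c
#include <assert.h>

/* Default wait multiplier, as stated in the commit message. */
#define DEFAULT_WAIT_MULT 10UL

/*
 * Compute when a request should next run: the time the last zeroing pass
 * took, scaled by the wait multiplier, added to the current time.  All
 * values are in the same (hypothetical) tick unit.
 */
unsigned long next_schedule(unsigned long now, unsigned long elapsed,
			    unsigned long wait_mult)
{
	assert(wait_mult > 0);
	return now + elapsed * wait_mult;
}
```

With the default multiplier, a pass that took 5 ticks and finished at tick 1000 is followed by the next pass at tick 1050, so the thread spends roughly one eleventh of its time zeroing.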

ext4: fix NULL pointer dereference in print_daily_error_info() · a1c6c569

Committed by Sergey Senozhatsky on Oct 27, 2010

Fix NULL pointer dereference in print_daily_error_info, when
called on unmounted fs (EXT4_SB(sb) returns NULL), by removing error
reporting timer in ext4_put_super.

Google-Bug-Id: 3017663
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

05 Oct, 2010 2 commits

BKL: Remove BKL from ext4 filesystem · f2143c4e

Committed by Jan Blunck on Feb 24, 2010

The BKL is still used in ext4_put_super(), ext4_fill_super() and
ext4_remount(). All three calls are protected against concurrent calls by
the s_umount rw semaphore of struct super_block.

Therefore the BKL is protecting nothing in this case.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

BKL: Explicitly add BKL around get_sb/fill_super · db719222

Committed by Jan Blunck on Aug 15, 2010

This patch is a preparation necessary to remove the BKL from do_new_mount().
It explicitly adds calls to lock_kernel()/unlock_kernel() around
get_sb/fill_super operations for filesystems that still use the BKL.

I've read through all the code formerly covered by the BKL inside
do_kern_mount() and have satisfied myself that it doesn't need the BKL
any more.

do_kern_mount() is already called without the BKL when mounting the rootfs
and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
afs_mntpt_follow_link(). Both latter functions are actually the filesystem's
follow_link inode operation. vfs_kern_mount() is calling the specified
get_sb function and lets the filesystem do its job by calling the given
fill_super function.

Therefore I think it is safe to push down the BKL from the VFS to the
low-level filesystems' get_sb/fill_super operation.

[arnd: do not add the BKL to those file systems that already
       don't use it elsewhere]
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Christoph Hellwig <hch@infradead.org>

10 Sep, 2010 1 commit

ext3/ext4: Factor out disk addressability check · 30ca22c7

Committed by Patrick J. LoPresti on Jul 22, 2010

As part of adding support for OCFS2 to mount huge volumes, we need to
check that the sector_t and page cache of the system are capable of
addressing the entire volume.

An identical check already appears in ext3 and ext4.  This patch moves
the addressability check into its own function in fs/libfs.c and
modifies ext3 and ext4 to invoke it.

[Edited to -EINVAL instead of BUG_ON() for bad blocksize_bits -- Joel]
Signed-off-by: Patrick LoPresti <lopresti@gmail.com>
Cc: linux-ext4@vger.kernel.org
Acked-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
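
To make the addressability check concrete, here is a user-space sketch of the kind of test involved. It is not the kernel's implementation: the `toy_*` names, the 32-bit sector and page index types, and the 4 KiB page size are all assumptions chosen so that the limits are easy to hit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12		/* assumed 4 KiB pages */
typedef uint32_t toy_sector_t;		/* assumed 32-bit sector index */
typedef uint32_t toy_pgoff_t;		/* assumed 32-bit page index */

/*
 * Return 0 if a volume of num_blocks blocks of size 2^blocksize_bits can
 * be addressed in 512-byte sectors and in pages, -EFBIG if it cannot, and
 * -EINVAL for an unsupported block size.
 */
int toy_check_addressable(unsigned blocksize_bits, uint64_t num_blocks)
{
	uint64_t last_block, last_page, max_block;

	if (num_blocks == 0)
		return 0;
	if (blocksize_bits < 9 || blocksize_bits > TOY_PAGE_SHIFT)
		return -EINVAL;

	last_block = num_blocks - 1;
	last_page = last_block >> (TOY_PAGE_SHIFT - blocksize_bits);
	assert(last_page <= last_block);
	/* Largest block whose sectors still fit in toy_sector_t. */
	max_block = (toy_sector_t)~0ULL >> (blocksize_bits - 9);

	if (last_block > max_block || last_page > (toy_pgoff_t)~0ULL)
		return -EFBIG;
	return 0;
}
```

Returning -EINVAL for a bad block size, rather than asserting, mirrors the bracketed editorial note in the commit message.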

10 Aug, 2010 1 commit

convert ext4 to ->evict_inode() · 0930fcc1

Committed by Al Viro on Jun 07, 2010

pretty much brute-force...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

04 Aug, 2010 1 commit

jbd2: Change j_state_lock to be a rwlock_t · a931da6a

Committed by Theodore Ts'o on Aug 03, 2010

Lockstat reports have shown that j_state_lock is a major source of
lock contention, especially on systems with more than 4 CPU cores.  So
change it to be a read/write spinlock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

02 Aug, 2010 3 commits

ext4: Add mount options in superblock · 8b67f04a

Committed by Theodore Ts'o on Aug 01, 2010

Allow mount options to be stored in the superblock. Also add default
mount option bits for nobarrier, block_validity, discard, and nodelalloc.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: force block allocation on quota_off · ca0e05e4

Committed by Dmitry Monakhov on Aug 01, 2010

Perform full sync procedure so that any delayed allocation blocks are
allocated so quota will be consistent.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix freeze deadlock under IO · 437f88cc

Committed by Eric Sandeen on Aug 01, 2010

Commit 6b0310fb caused a regression resulting in deadlocks
when freezing a filesystem which had active IO; the vfs_check_frozen
level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing
through.  Duh.

Changing the test to FREEZE_TRANS should let the normal freeze
syncing get through the fs, but still block any transactions from
starting once the fs is completely frozen.

I tested this by running fsstress in the background while periodically
snapshotting the fs and running fsck on the result.  I ran into
occasional deadlocks, but different ones.  I think this is a
fine fix for the problem at hand, and the other deadlocky things
will need more investigation.
Reported-by: Phillip Susi <psusi@cfl.rr.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

27 Jul, 2010 6 commits

ext4: check to make sure bd_dev is set before dereferencing it · f613dfcb

Committed by Theodore Ts'o on Jul 27, 2010

There are some drivers which may not set bdev->bd_dev.  So make sure
it is non-NULL before dereferencing it.

Google-Bug-Id: 1773557
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Always journal quota file modifications · 62d2b5f2

Committed by Jan Kara on Jul 27, 2010

When journaled quota options are not specified, we do writes
to quota files just in data=ordered mode. This actually causes
warnings from JBD2 about dirty journaled buffer because ext4_getblk
unconditionally treats a block allocated by it as metadata. Since
quota actually is filesystem metadata, the easiest way to get rid
of the warning is to always treat quota writes as metadata...
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Fix potential memory leak in ext4_fill_super · dcc7dae3

Committed by Cyrill Gorcunov on Jul 27, 2010

Under heavy memory pressure we may hit an out-of-memory
situation and as a result kstrdup'ed options will not be
freed. Fix it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Once a day, printk file system error information to dmesg · 66e61a9e

Committed by Theodore Ts'o on Jul 27, 2010

This allows us to grab any file system error messages by scraping
/var/log/messages.  This will make it easy for us to do error analysis
across the very large number of machines as we deploy ext4 across the
fleet.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Save error information to the superblock for analysis · 1c13d5c0

Committed by Theodore Ts'o on Jul 27, 2010

Save number of file system errors, and the time, function name, line
number, block number, and inode number of the first and most recent
errors reported on the file system in the superblock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
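
The bookkeeping described here, a running count plus details of the first and most recent error, can be sketched with a small structure. All struct, field and function names below are hypothetical and do not correspond to the on-disk superblock layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of one error; field names are illustrative only. */
struct toy_error {
	uint64_t time;
	char func[32];
	unsigned line;
	uint64_t block;
	uint64_t ino;
};

/* Hypothetical superblock fields: a count plus first and last errors. */
struct toy_sb_errors {
	uint32_t count;
	struct toy_error first;
	struct toy_error last;
};

void toy_save_error(struct toy_sb_errors *sb, uint64_t time,
		    const char *func, unsigned line,
		    uint64_t block, uint64_t ino)
{
	struct toy_error e = { .time = time, .line = line,
			       .block = block, .ino = ino };

	assert(sb != NULL && func != NULL);
	/* Copy at most 31 bytes; the initializer already zeroed func[]. */
	strncpy(e.func, func, sizeof(e.func) - 1);

	if (sb->count == 0)
		sb->first = e;		/* only the first error is kept here */
	sb->last = e;			/* always overwritten */
	sb->count++;
}
```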

ext4: Pass line numbers to ext4_error() and friends · c398eda0

Committed by Theodore Ts'o on Jul 27, 2010

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

30 Jun, 2010 2 commits

ext4: Pass line number to ext4_journal_abort_handle() · 90c7201b

Committed by Theodore Ts'o on Jun 29, 2010

This allows the error messages to include the line number
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Enhance ext4_grp_locked_error() to take block and function numbers · e29136f8

Committed by Theodore Ts'o on Jun 29, 2010

Also use a macro definition so that __func__ and __LINE__ are implicit.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

29 Jun, 2010 1 commit

ext4: clean up ext4_abort() so __func__ is now implicit · c67d859e

Committed by Theodore Ts'o on Jun 29, 2010

Use a macro definition for ext4_abort() to clean up the .c files a wee
bit.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
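
The macro technique used in this commit and the two before it can be illustrated in isolation. Everything named `toy_*` below is hypothetical; the sketch only shows how wrapping a reporting function in a macro lets `__func__` and `__LINE__` be supplied at the call site without the caller writing them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Last report, kept so the effect of the macro can be inspected. */
struct toy_report {
	const char *func;
	int line;
	char msg[64];
};

struct toy_report last_report;

/* Explicit form: every caller would have to pass __func__ itself. */
void toy_abort_report(const char *func, int line, const char *msg)
{
	assert(func != NULL && msg != NULL);
	last_report.func = func;
	last_report.line = line;
	snprintf(last_report.msg, sizeof(last_report.msg), "%s", msg);
	fprintf(stderr, "error in %s:%d: %s\n", func, line, msg);
}

/* Macro form: __func__ and __LINE__ are filled in at the call site. */
#define toy_abort(msg) toy_abort_report(__func__, __LINE__, (msg))

/* Example caller; it never mentions its own name. */
void toy_commit_super(void)
{
	toy_abort("journal has aborted");
}
```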

17 Jun, 2010 1 commit

fix typos concerning "initiali[zs]e" · 421f91d2

Committed by Uwe Kleine-König on Jun 11, 2010

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

15 Jun, 2010 1 commit

ext4: remove vestiges of nobh support · 206f7ab4

Committed by Christoph Hellwig on Jun 14, 2010

The nobh option was only supported for writeback mode, but given that all
write paths actually create buffer heads it effectively was a no-op already.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

24 May, 2010 1 commit

quota: rename default quotactl methods to dquot_ · 287a8095

Committed by Christoph Hellwig on May 19, 2010

Follow the dquot_* style used elsewhere in dquot.c.

[Jan Kara: Fixed up missing conversion of ext2]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>