Commits · 7845c0497536c566bfef08db1a38ae1ad2c25464 · openeuler / Kernel

28 Oct, 2010 2 commits

ext4: use search_dirblock() in ext4_dx_find_entry() · 7845c049

Authored by Theodore Ts'o on Oct 27, 2010

Use the search_dirblock() in ext4_dx_find_entry().  It makes the code
easier to read, and it takes advantage of common code.  It also saves
100 bytes or so of text space.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Brad Spengler <spender@grsecurity.net>

7845c049

ext4: avoid uninitialized memory references in ext3_htree_next_block() · 8941ec8b

Authored by Theodore Ts'o on Oct 27, 2010

If the first block of an htree directory is missing '.' or '..' but is
otherwise a valid directory, and we do a lookup for '.' or '..', it's
possible to dereference an uninitialized memory pointer in
ext4_htree_next_block().

We avoid this by moving the special case from ext4_dx_find_entry() to
ext4_find_entry(); this also means we can optimize ext4_find_entry()
slightly when NFS looks up "..".

Thanks to Brad Spengler for pointing out a Clang warning that led me to
look more closely at this code.  The warning was harmless, but it was
useful in pointing out code that was too ugly to live.  This warning was
also reported by Roman Borisov.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Brad Spengler <spender@grsecurity.net>

8941ec8b

05 Aug, 2010 1 commit

ext4: re-inline ext4_rec_len_(to|from)_disk functions · 0cfc9255

Authored by Eric Sandeen on Aug 05, 2010

commit 3d0518f4, "ext4: New rec_len encoding for very
large blocksizes" made several changes to this path, but from
a perf perspective, un-inlining ext4_rec_len_from_disk() seems
most significant.  This function is called from ext4_check_dir_entry(),
which on a file-creation workload is called extremely often.

I tested this with bonnie:

# bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32

(this does 200 iterations) and got this for the file creations:

ext4 stock:   Average =  21206.8 files/s
ext4 inlined: Average =  22346.7 files/s  (+5%)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0cfc9255

27 Jul, 2010 1 commit

ext4: Cleanup ext4_check_dir_entry so __func__ is now implicit · 60fd4da3

Authored by Theodore Ts'o on Jul 27, 2010

Also start passing the line number to ext4_check_dir since we're going
to need it in an upcoming patch.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

60fd4da3

15 Jun, 2010 1 commit

ext4: remove initialized but not read variables · 5a0790c2

Authored by Andi Kleen on Jun 14, 2010

No real bugs found, just removed some dead code.

Found by gcc 4.6's new warnings.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5a0790c2

17 May, 2010 4 commits

ext4: Make fsync sync new parent directories in no-journal mode · 14ece102

Authored by Frank Mayhar on May 17, 2010

Add a new ext4 state to tell us when a file has been newly created; use
that state in ext4_sync_file in no-journal mode to tell us when we need
to sync the parent directory as well as the inode and data itself.  This
fixes a problem in which a panic or power failure may lose the entire
file even when using fsync, since the parent directory entry is lost.

Addresses-Google-Bug: #2480057
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

14ece102

ext4: Drop whitespace at end of lines · 60e6679e

Authored by Theodore Ts'o on May 17, 2010

This patch was generated using:

#!/usr/bin/perl -i
while (<>) {
    s/[ 	]+$//;
    print;
}
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

60e6679e

ext4: Use bitops to read/modify i_flags in struct ext4_inode_info · 12e9b892

Authored by Dmitry Monakhov on May 16, 2010

At several places we modify EXT4_I(inode)->i_flags without holding
i_mutex (ext4_do_update_inode, ...). These modifications are racy and
we can lose updates to i_flags. So convert handling of i_flags to use
bitops which are atomic.
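
The idea can be sketched in user space. This is only an illustration, assuming C11 atomics as a stand-in for the kernel's set_bit(), clear_bit() and test_bit(); the demo_* names are hypothetical and are not taken from the ext4 source.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the flags word in struct ext4_inode_info. */
struct demo_inode_info {
	atomic_ulong i_flags;
};

/* Atomically set bit nr; concurrent updates to other bits are not lost. */
static void demo_set_flag(struct demo_inode_info *ei, unsigned int nr)
{
	atomic_fetch_or(&ei->i_flags, 1UL << nr);
}

/* Atomically clear bit nr. */
static void demo_clear_flag(struct demo_inode_info *ei, unsigned int nr)
{
	atomic_fetch_and(&ei->i_flags, ~(1UL << nr));
}

/* Read a single bit from the flags word. */
static bool demo_test_flag(struct demo_inode_info *ei, unsigned int nr)
{
	return (atomic_load(&ei->i_flags) >> nr) & 1UL;
}
```

A plain "flags |= mask" is a separate load, modify and store, so two writers without a common lock can overwrite each other's bit; the atomic read-modify-write above avoids that without taking i_mutex.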

https://bugzilla.kernel.org/show_bug.cgi?id=15792
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12e9b892

ext4: Convert calls of ext4_error() to EXT4_ERROR_INODE() · 24676da4

Authored by Theodore Ts'o on May 16, 2010

EXT4_ERROR_INODE() tends to provide better error information and in a
more consistent format.  Some errors were not even identifying the inode
or directory which was corrupted, which made them not very useful.

Addresses-Google-Bug: #2507977
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

24676da4

05 Mar, 2010 2 commits

dquot: cleanup dquot initialize routine · 871a2931

Authored by Christoph Hellwig on Mar 03, 2010

Get rid of the initialize dquot operation - it is now always called from
the filesystem and if a filesystem really needs its own (which none
currently does) it can just call into its own routine directly.

Rename the now static low-level dquot_initialize helper to __dquot_initialize
and vfs_dq_init to dquot_initialize to have a consistent namespace.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>

871a2931

dquot: move dquot initialization responsibility into the filesystem · 907f4554

Authored by Christoph Hellwig on Mar 03, 2010

Currently various places in the VFS call vfs_dq_init directly.  This means
we tie the quota code into the VFS.  Get rid of that and make the
filesystem responsible for the initialization.   For most metadata operations
this is a straightforward move into the methods, but for truncate and
open it's a bit more complicated.

For truncate we currently only call vfs_dq_init for the sys_truncate case
because open already takes care of it for ftruncate and open(O_TRUNC) - the
new code causes an additional vfs_dq_init for those which is harmless.

For open the initialization is moved from do_filp_open into the open method,
which means it happens slightly earlier now, and only for regular files.
The latter is fine because we don't need to initialize it for operations
on special files, and we already do it as part of the namespace operations
for directories.

Add a dquot_file_open helper that filesystems that support generic quotas
can use to fill in ->open.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>

907f4554

02 Mar, 2010 1 commit

ext4: Handle non empty on-disk orphan link · 6e3617e5

Authored by Dmitry Monakhov on Mar 01, 2010

In case of truncate errors we explicitly remove inode from in-core
orphan list via orphan_del(NULL, inode) without modifying the on-disk list.

But later on, the same inode may be inserted in the orphan list again
which will result in the on-disk linked list getting corrupted. If inode
i_dtime contains a valid value, then skip on-disk list modification.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6e3617e5

17 Feb, 2010 1 commit

ext4: Fix BUG_ON at fs/buffer.c:652 in no journal mode · 73b50c1c

Authored by Curt Wohlgemuth on Feb 16, 2010

Calls to ext4_handle_dirty_metadata should only pass in an inode
pointer for inode-specific metadata, and not for shared metadata
blocks such as inode table blocks, block group descriptors, the
superblock, etc.

The BUG_ON can get tripped when updating a special device (such as a
block device) that is opened (so that i_mapping is set in
fs/block_dev.c) and the file system is mounted in no journal mode.

Addresses-Google-Bug: #2404870
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

73b50c1c

16 Feb, 2010 1 commit

ext4: move __func__ into a macro for ext4_warning, ext4_error · 12062ddd

Authored by Eric Sandeen on Feb 15, 2010

Just a pet peeve of mine; we had a mishmash of calls with either __func__
or "function_name" and the latter tends to get out of sync.

I think it's easier to just hide the __func__ in a macro, and it'll
be consistent from then on.
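
A minimal user-space sketch of the pattern, not the actual ext4 macros: the demo_* names and the output format are hypothetical, and the message is written to a caller-supplied buffer so the result can be inspected.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical low-level helper: takes the caller's name explicitly. */
static int demo_warning_impl(char *buf, size_t len, const char *function,
			     const char *fmt, ...)
{
	va_list args;
	int n;

	n = snprintf(buf, len, "warning (%s): ", function);
	if (n < 0 || (size_t)n >= len)
		return n;
	va_start(args, fmt);
	n += vsnprintf(buf + n, len - n, fmt, args);
	va_end(args);
	return n;
}

/* The macro supplies __func__, so call sites cannot get it out of sync. */
#define demo_warning(buf, len, ...) \
	demo_warning_impl((buf), (len), __func__, __VA_ARGS__)

/* Example call site: the function name is filled in by the compiler. */
static void demo_check_block(char *buf, size_t len, unsigned int block)
{
	demo_warning(buf, len, "bad block %u", block);
}
```

If demo_check_block() is later renamed, the message follows automatically, which is the consistency argument made above.
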
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12062ddd

09 Dec, 2009 1 commit

ext4: quota macros cleanup · 5aca07eb

Authored by Dmitry Monakhov on Dec 08, 2009

Currently all quota block reservation macros contain a hard-coded "2"
aka MAXQUOTAS value. This is no good because in some places it is not
obvious what this digit represents. Let's introduce a
new macro with a self-descriptive name.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Acked-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5aca07eb

23 Nov, 2009 1 commit

ext4: fix potential buffer head leak when add_dirent_to_buf() returns ENOSPC · 2de770a4

Authored by Theodore Ts'o on Nov 23, 2009

Previously add_dirent_to_buf() did not free its passed-in buffer head
in the case of ENOSPC, since in some cases the caller still needed it.
However, this led to potential buffer head leaks since not all callers
dealt with this correctly.  Fix this by simplifying the freeing
convention; now add_dirent_to_buf() *never* frees the passed-in buffer
head, and leaves that to the responsibility of its caller.  This makes
things cleaner and makes it easier to prove that the code is neither
leaking buffer heads nor calling brelse() one time too many.
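
The convention can be modeled with a toy reference count. This is a hedged sketch only: struct demo_bh and the demo_* functions are hypothetical stand-ins, not the kernel's buffer_head API.

```c
#include <errno.h>

/* Hypothetical stand-in for a reference-counted buffer head. */
struct demo_bh {
	int refcount;
	int free_bytes;
};

static void demo_brelse(struct demo_bh *bh)
{
	bh->refcount--;
}

/*
 * Callee under the simplified convention: it never drops the reference,
 * whether it succeeds or returns -ENOSPC.
 */
static int demo_add_dirent_to_buf(struct demo_bh *bh, int needed)
{
	if (bh->free_bytes < needed)
		return -ENOSPC;
	bh->free_bytes -= needed;
	return 0;
}

/* Caller: exactly one release on every path. */
static int demo_add_entry(struct demo_bh *bh, int needed)
{
	int err;

	bh->refcount++;		/* models obtaining the buffer */
	err = demo_add_dirent_to_buf(bh, needed);
	demo_brelse(bh);	/* single release, success or failure */
	return err;
}
```

Because the release happens in one place, the count returns to its starting value on both the success path and the -ENOSPC path.
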
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Curt Wohlgemuth <curtw@google.com>
Cc: stable@kernel.org

2de770a4

09 Nov, 2009 1 commit

ext4: partial revert to fix double brelse WARNING() · 1e424a34

Authored by Theodore Ts'o on Nov 08, 2009

This is a partial revert of commit 6487a9d3 (only the changes made to
fs/ext4/namei.c), since it is causing the following brelse()
double-free warning when running fsstress on a file system with 1k
blocksize and we run into a block allocation failure while converting
a single-block directory to a multi-block hash-tree indexed directory.

WARNING: at fs/buffer.c:1197 __brelse+0x2e/0x33()
Hardware name: 
VFS: brelse: Trying to free free buffer
Modules linked in:
Pid: 2226, comm: jbd2/sdd-8 Not tainted 2.6.32-rc6-00577-g0003f55 #101
Call Trace:
 [<c01587fb>] warn_slowpath_common+0x65/0x95
 [<c0158869>] warn_slowpath_fmt+0x29/0x2c
 [<c021168e>] __brelse+0x2e/0x33
 [<c0288a9f>] jbd2_journal_refile_buffer+0x67/0x6c
 [<c028a9ed>] jbd2_journal_commit_transaction+0x319/0x14d8
 [<c0164d73>] ? try_to_del_timer_sync+0x58/0x60
 [<c0175bcc>] ? sched_clock_cpu+0x12a/0x13e
 [<c017f6b4>] ? trace_hardirqs_off+0xb/0xd
 [<c0175c1f>] ? cpu_clock+0x3f/0x5b
 [<c017f6ec>] ? lock_release_holdtime+0x36/0x137
 [<c0664ad0>] ? _spin_unlock_irqrestore+0x44/0x51
 [<c0180af3>] ? trace_hardirqs_on_caller+0x103/0x124
 [<c0180b1f>] ? trace_hardirqs_on+0xb/0xd
 [<c0164d73>] ? try_to_del_timer_sync+0x58/0x60
 [<c0290d1c>] kjournald2+0x11a/0x310
 [<c017118e>] ? autoremove_wake_function+0x0/0x38
 [<c0290c02>] ? kjournald2+0x0/0x310
 [<c0170ee6>] kthread+0x66/0x6b
 [<c0170e80>] ? kthread+0x0/0x6b
 [<c01251b3>] kernel_thread_helper+0x7/0x10
---[ end trace 5579351b86af61e3 ]---

Commit 6487a9d3 was an attempt to fix some buffer head leaks in an ENOSPC
error path, but in some cases it actually results in an excess brelse(),
as shown above.  Fixing this means moving the responsibility for
releasing the buffer heads from the callee to the caller of
add_dirent_to_buf().

Since that's a relatively complex change, and we're late in the rcX
development cycle, I'm reverting this now, and holding back a more
complete fix until after 2.6.32 ships.  We've lived with this
buffer_head leak on ENOSPC in ext3 and ext4 for a very long time; a
few more months won't kill us.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Curt Wohlgemuth <curtw@google.com>

1e424a34

29 Sep, 2009 1 commit

ext4: Handle nested ext4_journal_start/stop calls without a journal · d3d1faf6

Authored by Curt Wohlgemuth on Sep 29, 2009

This patch fixes a problem with handling nested calls to
ext4_journal_start/ext4_journal_stop, when there is no journal present.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d3d1faf6

11 Sep, 2009 1 commit

ext4: Always set dx_node's fake_dirent explicitly. · 1f7bebb9

Authored by Andreas Schlick on Sep 10, 2009

When ext4_dx_add_entry() has to split an index node, it has to ensure that
name_len of dx_node's fake_dirent is also zero, because otherwise e2fsck
won't recognise it as an intermediate htree node and consider the htree to
be corrupted.
Signed-off-by: Andreas Schlick <schlick@lavabit.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1f7bebb9

09 Sep, 2009 1 commit

ext[234]: move over to 'check_acl' permission model · 1d5ccd1c

Authored by Linus Torvalds on Aug 28, 2009

Don't implement per-filesystem 'extX_permission()' functions that have
to be called for every path component operation, and instead just expose
the actual ACL checking so that the VFS layer can now do it for us.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1d5ccd1c

30 Aug, 2009 1 commit

ext4: Limit number of links that can be created by ext4_link() · b05ab1dc

Authored by Theodore Ts'o on Aug 29, 2009

In ext4_link we need to check using EXT4_LINK_MAX, and not
EXT4_DIR_LINK_MAX(), since ext4_link() is creating hard links of
regular files, and not directories.
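
A rough sketch of the two kinds of check follows. The demo_* names are hypothetical, 65000 is the value of EXT4_LINK_MAX, and the directory rule (indexed directories are exempt from the limit) is an assumption about how EXT4_DIR_LINK_MAX() behaves rather than something stated in this log.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_LINK_MAX 65000	/* value of EXT4_LINK_MAX */

/* Hypothetical, heavily simplified inode. */
struct demo_inode {
	unsigned int i_nlink;
	bool is_dx;		/* hash-tree indexed directory? */
};

/* Hard link to a regular file: plain comparison against the fixed limit. */
static int demo_link_check(const struct demo_inode *inode)
{
	return inode->i_nlink >= DEMO_LINK_MAX ? -EMLINK : 0;
}

/* Directory variant (assumed): the limit applies only to non-indexed dirs. */
static bool demo_dir_link_max(const struct demo_inode *dir)
{
	return !dir->is_dx && dir->i_nlink >= DEMO_LINK_MAX;
}
```

With this split, a regular file at the limit is refused a new hard link, while an indexed directory with the same link count may still gain subdirectories.
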
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b05ab1dc

29 Aug, 2009 1 commit

ext4: Allow rename to create more than EXT4_LINK_MAX subdirectories · 2c94eb86

Authored by Aneesh Kumar K.V on Aug 28, 2009

Use EXT4_DIR_LINK_MAX so that rename() can move a directory into new
parent directory without running into the EXT4_LINK_MAX limit.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2c94eb86

17 Jul, 2009 1 commit

ext4: More buffer head reference leaks · 6487a9d3

Authored by Curt Wohlgemuth on Jul 17, 2009

After the patch I posted last week regarding buffer head ref leaks in
no-journal mode, I looked at all the code that uses buffer heads and
searched for more potential leaks.

The patch below fixes the issues I found; these can occur even when a
journal is present.

The change to inode.c fixes a double release if
ext4_journal_get_create_access() fails.

The changes to namei.c are more complicated.  add_dirent_to_buf() will
release the input buffer head EXCEPT when it returns -ENOSPC.  There are
some callers of this routine that don't always do the brelse() in the event
that -ENOSPC is returned.  Unfortunately, to put this fix into ext4_add_entry()
required capturing the return value of make_indexed_dir() and
add_dirent_to_buf().
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6487a9d3

13 Jun, 2009 2 commits

ext4: teach the inode allocator to use a goal inode number · 11013911

Authored by Andreas Dilger on Jun 13, 2009

Enhance the inode allocator to take a goal inode number as a
parameter; if it is specified, it takes precedence over Orlov or
parent directory inode allocation algorithms.

The extents migration function uses the goal inode number so that the
extent trees allocated by the migration function use the correct flex_bg.
In the future, the goal inode functionality will also be used to
allocate an adjacent inode for the extended attributes.

Also, for testing purposes the goal inode number can be specified via
/sys/fs/{dev}/inode_goal.  This can be useful for testing inode
allocation beyond 2^32 blocks on very large filesystems.
Signed-off-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

11013911

ext4: Use a hash of the topdir directory name for the Orlov parent group · f157a4aa

Authored by Theodore Ts'o on Jun 13, 2009

Instead of using a random number to determine the goal parent group for
the Orlov top directories, use a hash of the directory name.  This
allows for repeatable results when trying to benchmark filesystem
layout algorithms.
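
A small sketch of why a name hash gives repeatable placement. The commit text does not say which hash is used, so FNV-1a is swapped in here purely for illustration; the demo_* names and the ngroups parameter are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, used here only as a simple deterministic stand-in hash. */
static uint32_t demo_name_hash(const char *name, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= (unsigned char)name[i];
		h *= 16777619u;
	}
	return h;
}

/* Map a top-level directory name to a starting (goal) block group. */
static uint32_t demo_goal_group(const char *name, size_t len, uint32_t ngroups)
{
	return demo_name_hash(name, len) % ngroups;
}
```

The same directory name always maps to the same group for a given group count, so two benchmark runs start from the same layout decision, which a random number cannot guarantee.
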
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f157a4aa

09 Jun, 2009 1 commit

ext4: fix dx_map_entry to support 256k directory blocks · 9aee2286

Authored by Toshiyuki Okajima on Jun 08, 2009

The dx_map_entry structure doesn't support block sizes over 64KB with
the current usage of its "offs" member, because "offs" holds the offset
of a copy of the ext4_dir_entry_2 structure as is.  This member is
16 bits wide, but a real offset within a block larger than 64KB (up to
256KB) needs 18 bits.  However, real offsets are always aligned to a
4-byte boundary, so the lower 2 bits are never used.

Therefore, we do the following to fix this limitation:
For "store": 
	we divide the real offset by 4 and then store this result to "offs" 
	member.
For "use":
	we multiply "offs" member by 4 and then use this result 
	as real offset.
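
The "store" and "use" steps above can be sketched as follows. This is an illustration under stated assumptions: struct demo_map_entry and the helper names are hypothetical, and only the 16-bit offset field matters here.

```c
#include <stdint.h>

/* Hypothetical mirror of the map entry with a 16-bit offset field. */
struct demo_map_entry {
	uint32_t hash;
	uint16_t offs;	/* real offset divided by 4 */
	uint16_t size;
};

/* "store": offsets are 4-byte aligned, so the low 2 bits carry nothing. */
static void demo_store_offset(struct demo_map_entry *e, uint32_t real_offset)
{
	e->offs = (uint16_t)(real_offset >> 2);
}

/* "use": multiply by 4 to recover the real offset. */
static uint32_t demo_load_offset(const struct demo_map_entry *e)
{
	return (uint32_t)e->offs << 2;
}
```

The largest aligned offset in a 256KB block, 262140, is stored as 65535, which still fits in 16 bits.
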
Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9aee2286

03 May, 2009 1 commit

ext4: hook fiemap operation for directories · abc8746e

Authored by Aneesh Kumar K.V on May 02, 2009

Add fiemap callback for directories
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

abc8746e

02 May, 2009 1 commit

ext4: Move fs/ext4/namei.h into ext4.h · 596397b7

Authored by Theodore Ts'o on May 01, 2009

The fs/ext4/namei.h header file had only a single function
declaration, and should have never been a standalone file.  Move it
into ext4.h, where it should have been from the beginning.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

596397b7

26 Apr, 2009 1 commit

ext4: Replace lock/unlock_super() with an explicit lock for the orphan list · 3b9d4ed2

Authored by Theodore Ts'o on Apr 25, 2009

Use a separate lock to protect the orphan list, so we can stop
overloading the use of lock_super().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

3b9d4ed2

26 Mar, 2009 1 commit

ext4: Use lowercase names of quota functions · a269eb18

Authored by Jan Kara on Jan 26, 2009

Use lowercase names of quota functions instead of old uppercase ones.
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Mingming Cao <cmm@us.ibm.com>
CC: linux-ext4@vger.kernel.org

a269eb18

17 Mar, 2009 1 commit

ext4: Add auto_da_alloc mount option · afd4672d

Authored by Theodore Ts'o on Mar 16, 2009

Add a mount option which allows the user to disable automatic
allocation of blocks whose allocation was delayed when the
file was originally truncated or when the file is renamed over an
existing file. This feature is intended to save users from the
effects of naive application writers, but it reduces the effectiveness
of the delayed allocation code. This mount option disables this
safety feature, which may be desirable for production systems where
the risk of unclean shutdowns or unexpected system crashes is low.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

afd4672d

24 Feb, 2009 1 commit

ext4: Automatically allocate delay allocated blocks on rename · 8750c6d5

Authored by Theodore Ts'o on Feb 23, 2009

When renaming a file such that a link to another inode is overwritten,
force any delay allocated blocks to be allocated so that if the
filesystem is mounted with data=ordered, the data blocks will be
pushed out to disk along with the journal commit.  Many application
programs expect this, so we do this to avoid zero length files if the
system crashes unexpectedly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8750c6d5

23 Feb, 2009 1 commit

ext4: return -EIO not -ESTALE on directory traversal through deleted inode · e6f009b0

Authored by Bryan Donlan on Feb 22, 2009

ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to
report errors to NFS properly.  However, in ext4_lookup(), this
-ESTALE can be propagated to userspace if the filesystem is corrupted
such that a directory entry references a deleted inode.  This leads to
a misleading error message - "Stale NFS file handle" - and confusion
on the part of the admin.

The bug can be easily reproduced by creating a new filesystem, making
a link to an unused inode using debugfs, then mounting and attempting
to ls -l said link.

This patch thus changes ext4_lookup to return -EIO if it receives
-ESTALE from ext4_iget(), as ext4 does for other filesystem metadata
corruption; and also invokes the appropriate ext*_error functions when
this case is detected.
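
The error translation can be sketched like this. It is a toy model only: demo_iget() and demo_lookup() are hypothetical stand-ins that return negative errno values, and the error logging done by the real patch is reduced to a comment.

```c
#include <errno.h>

/* Hypothetical stand-in for ext4_iget(): negative errno on failure. */
static int demo_iget(int inode_is_deleted)
{
	return inode_is_deleted ? -ESTALE : 0;
}

/*
 * Lookup path: a directory entry pointing at a deleted inode means the
 * directory is corrupted, so report -EIO rather than leaking -ESTALE.
 */
static int demo_lookup(int inode_is_deleted)
{
	int err = demo_iget(inode_is_deleted);

	if (err == -ESTALE)
		return -EIO;	/* the real code also logs a filesystem error */
	return err;
}
```

The iget-style helper keeps returning -ESTALE for its NFS callers; only the lookup path reinterprets it as corruption.
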
Signed-off-by: Bryan Donlan <bdonlan@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e6f009b0

15 Feb, 2009 2 commits

ext4: New rec_len encoding for very large blocksizes · 3d0518f4

Authored by Wei Yongjun on Feb 14, 2009

The rec_len field in the directory entry is 16 bits, so to encode
blocksizes larger than 64k becomes problematic. This patch allows us
to support block sizes up to 256k, by using the low 2 bits to extend
the range of rec_len to 2**18-1 (since valid rec_len sizes must be a
multiple of 4). We use the convention that a rec_len of 0 or 65535
means the filesystem block size, for compatibility with older kernels.
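
A user-space sketch of the encoding described above, offered as an illustration rather than the kernel's exact code: the demo_* names are hypothetical, the little-endian conversion of the on-disk field is left out, and inputs are assumed to be valid (a multiple of 4, no larger than a block size of at most 256k).

```c
#include <stdint.h>

#define DEMO_MAX_REC_LEN 65535u	/* largest value the 16-bit field can hold */

/* Decode the on-disk 16-bit field into a byte length. */
static unsigned int demo_rec_len_from_disk(uint16_t dlen, unsigned int blocksize)
{
	unsigned int len = dlen;

	if (len == DEMO_MAX_REC_LEN || len == 0)
		return blocksize;	/* "whole block" convention */
	/* bits 2..15 are stored in place; the low 2 bits hold bits 16..17 */
	return (len & 65532u) | ((len & 3u) << 16);
}

/* Encode a byte length into the 16-bit on-disk field. */
static uint16_t demo_rec_len_to_disk(unsigned int len, unsigned int blocksize)
{
	if (len < 65536u)
		return (uint16_t)len;
	if (len == blocksize)
		return blocksize == 65536u ? DEMO_MAX_REC_LEN : 0;
	return (uint16_t)((len & 65532u) | ((len >> 16) & 3u));
}
```

For example, a record length of 65540 in a 256k block is stored as 5: the in-place bits contribute 4 and bit 16 moves into the low bits as 1. Decoding 5 gives 65540 again, and lengths below 64k are stored unchanged.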

It's unlikely we'll see VM pages of up to 256k, but at some point we
might find that the Linux VM has been enhanced to support filesystem
block sizes larger than the VM page size, at which point it might be useful
for some applications to allow very large filesystem block sizes.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

3d0518f4

ext4: Use unsigned int for blocksize in dx_make_map() and dx_pack_dirents() · 8bad4597

Authored by Theodore Ts'o on Feb 14, 2009

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8bad4597

17 Jan, 2009 1 commit

ext4: Add sanity check to make_indexed_dir · e6b8bc09

Authored by Theodore Ts'o on Jan 16, 2009

Make sure the rec_len field in the '..' entry is sane, lest we overrun
the directory block and cause a kernel oops on a purposefully
corrupted filesystem.

Thanks to Sami Liedes for reporting this bug.

http://bugzilla.kernel.org/show_bug.cgi?id=12430
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

e6b8bc09

09 Jan, 2009 1 commit

generic swap(): ext4: remove local swap() macro · 97e133b4

Authored by Wu Fengguang on Jan 07, 2009

Use the new generic implementation.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

97e133b4

05 Jan, 2009 1 commit

fs: symlink write_begin allocation context fix · 54566b2c

Authored by Nick Piggin on Jan 04, 2009

With the write_begin/write_end aops, page_symlink was broken because it
could no longer pass a GFP_NOFS type mask into the point where the
allocations happened.  They are done in write_begin, which would always
assume that the filesystem can be entered from reclaim.  This bug could
cause filesystem deadlocks.

The funny thing with having a gfp_t mask there is that it doesn't really
allow the caller to arbitrarily tinker with the context in which it can be
called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
take the page lock.  The only thing any callers care about is __GFP_FS
anyway, so turn that into a single flag.

Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
this flag in their write_begin function.  Change __grab_cache_page to
accept a nofs argument as well, to honour that flag (while we're there,
change the name to grab_cache_page_write_begin which is more instructive
and does away with random leading underscores).

This is really a more flexible way to go in the end anyway -- if a
filesystem happens to want any extra allocations aside from the pagecache
ones in its write_begin function, it may now use GFP_KERNEL (rather than
GFP_NOFS) for common case allocations (eg.  ocfs2_alloc_write_ctxt, for a
random example).

[kosaki.motohiro@jp.fujitsu.com: fix ubifs]
[kosaki.motohiro@jp.fujitsu.com: fix fuse]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org>		[2.6.28.x]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Cleaned up the calling convention: just pass in the AOP flags
  untouched to the grab_cache_page_write_begin() function.  That
  just simplifies everybody, and may even allow future expansion of the
  logic.   - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

54566b2c

01 Jan, 2009 1 commit

nfsd race fixes: ext4 · 6b38e842

Authored by Al Viro on Dec 30, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6b38e842

05 Nov, 2008 1 commit

ext4: Change unsigned long to unsigned int · 498e5f24

Authored by Theodore Ts'o on Nov 05, 2008

Convert the unsigned longs that are most responsible for bloating the
stack usage on 64-bit systems.

Nearly every place in the ext3/4 code which uses "unsigned long" is
probably a bug, since on 32-bit systems a ulong is 32 bits, which means
we are wasting stack space on 64-bit systems.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

498e5f24
