提交 · ccd2506bd43113659aa904d5bea5d1300605e2a6 · OpenHarmony / kernel_linux

26 2月, 2009 1 次提交

ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl · ccd2506b

由 Theodore Ts'o 提交于 2月 26, 2009

Add an ioctl which forces all of the delay allocated blocks to be
allocated.  This also provides a function ext4_alloc_da_blocks() which
will be used by the following commits to force files to be fully
allocated to preserve application-expected ext3 behaviour.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

ccd2506b

24 2月, 2009 1 次提交

ext4: Simplify delalloc code by removing mpage_da_writepages() · f63e6005

由 Theodore Ts'o 提交于 2月 23, 2009

The mpage_da_writepages() function is only used in one place, so
inline it to simplify the call stack and make the code easier to
understand.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f63e6005

23 2月, 2009 2 次提交

ext4: Save stack space by removing fake buffer heads · 8dc207c0

由 Theodore Ts'o 提交于 2月 23, 2009

Struct mpage_da_data and mpage_add_bh_to_extent() use a fake struct
buffer_head which is 104 bytes on an x86_64 system, but only use 24
bytes of the structure.  On systems that use a spinlock for atomic_t,
the stack savings will be even greater.

It turns out that using a fake struct buffer_head doesn't even save
that much code, and it makes the code more confusing since it's not
used as a "real" buffer head.  So just store pass b_size and b_state
in mpage_add_bh_to_extent(), and store b_size, b_state, and b_block_nr
in the mpage_da_data structure.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

8dc207c0

T
ext4: Simplify delalloc implementation by removing mpd.get_block · ed5bde0b
由 Theodore Ts'o 提交于 2月 23, 2009
```
This parameter was always set to ext4_da_get_block_write().
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
```
ed5bde0b

28 3月, 2009 1 次提交

ext4: Validate extent details only when read from the disk · 7a262f7c

由 Aneesh Kumar K.V 提交于 3月 27, 2009

Make sure we validate extent details only when read from the disk.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NThiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

7a262f7c

12 3月, 2009 1 次提交

ext4: Add checks to validate extent entries. · 56b19868

由 Aneesh Kumar K.V 提交于 3月 12, 2009

This patch adds checks to validate the extent entries along with extent
headers, to avoid crashes caused by corrupt filesystems.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

56b19868

23 2月, 2009 1 次提交

ext4: return -EIO not -ESTALE on directory traversal through deleted inode · e6f009b0

由 Bryan Donlan 提交于 2月 22, 2009

ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to
report errors to NFS properly.  However, in ext4_lookup(), this
-ESTALE can be propagated to userspace if the filesystem is corrupted
such that a directory entry references a deleted inode.  This leads to
a misleading error message - "Stale NFS file handle" - and confusion
on the part of the admin.

The bug can be easily reproduced by creating a new filesystem, making
a link to an unused inode using debugfs, then mounting and attempting
to ls -l said link.

This patch thus changes ext4_lookup to return -EIO if it receives
-ESTALE from ext4_iget(), as ext4 does for other filesystem metadata
corruption; and also invokes the appropriate ext*_error functions when
this case is detected.
Signed-off-by: NBryan Donlan <bdonlan@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

e6f009b0

13 3月, 2009 1 次提交

ext4: New inode/block allocation algorithms for flex_bg filesystems · a4912123

由 Theodore Ts'o 提交于 3月 12, 2009

The find_group_flex() inode allocator is now only used if the
filesystem is mounted using the "oldalloc" mount option.  It is
replaced with the original Orlov allocator that has been updated for
flex_bg filesystems (it should behave the same way if flex_bg is
disabled).  The inode allocator now functions by taking into account
each flex_bg group, instead of each block group, when deciding whether
or not it's time to allocate a new directory into a fresh flex_bg.

The block allocator has also been changed so that the first block
group in each flex_bg is preferred for use for storing directory
blocks.  This keeps directory blocks close together, which is good for
speeding up e2fsck since large directories are more likely to look
like this:

debugfs:  stat /home/tytso/Maildir/cur
Inode: 1844562   Type: directory    Mode:  0700   Flags: 0x81000
Generation: 1132745781    Version: 0x00000000:0000ad71
User: 15806   Group: 15806   Size: 1060864
File ACL: 0    Directory ACL: 0
Links: 2   Blockcount: 2072
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x499c0ff4:164961f4 -- Wed Feb 18 08:41:08 2009
 atime: 0x499c0ff4:00000000 -- Wed Feb 18 08:41:08 2009
 mtime: 0x49957f51:00000000 -- Fri Feb 13 09:10:25 2009
crtime: 0x499c0f57:00d51440 -- Wed Feb 18 08:38:31 2009
Size of extra inode fields: 28
BLOCKS:
(0):7348651, (1-258):7348654-7348911
TOTAL: 259
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

a4912123

16 2月, 2009 3 次提交

ext4: tighten restrictions on inode flags · 2dc6b0d4

由 Duane Griffin 提交于 2月 15, 2009

At the moment there are few restrictions on which flags may be set on
which inodes.  Specifically DIRSYNC may only be set on directories and
IMMUTABLE and APPEND may not be set on links.  Tighten that to disallow
TOPDIR being set on non-directories and only NODUMP and NOATIME to be set
on non-regular file, non-directories.

Introduces a flags masking function which masks flags based on mode and
use it during inode creation and when flags are set via the ioctl to
facilitate future consistency.
Signed-off-by: NDuane Griffin <duaneg@dghda.com>
Acked-by: NAndreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

2dc6b0d4

ext4: don't inherit inappropriate inode flags from parent · 8fa43a81

由 Duane Griffin 提交于 2月 15, 2009

At present INDEX and EXTENTS are the only flags that new ext4 inodes do
NOT inherit from their parent.  In addition prevent the flags DIRTY,
ECOMPR, IMAGIC, TOPDIR, HUGE_FILE and EXT_MIGRATE from being inherited. 
List inheritable flags explicitly to prevent future flags from
accidentally being inherited.

This fixes the TOPDIR flag inheritance bug reported at
http://bugzilla.kernel.org/show_bug.cgi?id=9866.
Signed-off-by: NDuane Griffin <duaneg@dghda.com>
Acked-by: NAndreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

8fa43a81

ext4: allocate ->s_blockgroup_lock separately · 705895b6

由 Pekka Enberg 提交于 2月 15, 2009

As spotted by kmemtrace, struct ext4_sb_info is 17664 bytes on 64-bit
which makes it a very bad fit for SLAB allocators.  The culprit of the
wasted memory is ->s_blockgroup_lock which can be as big as 16 KB when
NR_CPUS >= 32.

To fix that, allocate ->s_blockgroup_lock, which fits nicely in a order 2
page in the worst case, separately.  This shinks down struct ext4_sb_info
enough to fit a 2 KB slab cache so now we allocate 16 KB + 2 KB instead of
32 KB saving 14 KB of memory.
Acked-by: NAndreas Dilger <adilger@sun.com>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

705895b6

15 2月, 2009 2 次提交

ext4: New rec_len encoding for very large blocksizes · 3d0518f4

由 Wei Yongjun 提交于 2月 14, 2009

The rec_len field in the directory entry is 16 bits, so to encode
blocksizes larger than 64k becomes problematic. This patch allows us
to supprot block sizes up to 256k, by using the low 2 bits to extend
the range of rec_len to 2**18-1 (since valid rec_len sizes must be a
multiple of 4). We use the convention that a rec_len of 0 or 65535
means the filesystem block size, for compatibility with older kernels.

It's unlikely we'll see VM pages of up to 256k, but at some point we
might find that the Linux VM has been enhanced to support filesystem
block sizes > than the VM page size, at which point it might be useful
for some applications to allow very large filesystem block sizes.
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3d0518f4

T
ext4: Use unsigned int for blocksize in dx_make_map() and dx_pack_dirents() · 8bad4597
由 Theodore Ts'o 提交于 2月 14, 2009
```
Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
```
8bad4597

07 2月, 2009 2 次提交

ext4: remove call to ext4_group_desc() in ext4_group_used_meta_blocks() · e187c658

由 Theodore Ts'o 提交于 2月 06, 2009

The static function ext4_group_used_meta_blocks() only has one caller,
who already has access to the block group's group descriptor. So it's
better to have ext4_init_block_bitmap() pass the group descriptor to
ext4_group_used_meta_blocks(), so it doesn't need to call
ext4_group_desc(). Previously this function did not check if
ext4_group_desc() returned NULL due to an error, potentially causing a
kernel OOPS report. This avoids the issue entirely.
Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

e187c658

ext4: Remove stale block allocator references from ext4.h · 074ca442

由 Mike Snitzer 提交于 2月 06, 2009

Remove some leftovers from when the old block allocator was removed
(c2ea3fde).  ext4_sb_info is now a bit lighter.  Also remove a dangling
read_block_bitmap() prototype.
Signed-off-by: NMike Snitzer <snitzer@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

074ca442

29 3月, 2009 3 次提交

fix setuid sometimes wouldn't · 7c2c7d99

由 Hugh Dickins 提交于 3月 28, 2009

check_unsafe_exec() also notes whether the fs_struct is being
shared by more threads than will get killed by the exec, and if so
sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
But /proc/<pid>/cwd and /proc/<pid>/root lookups make transient
use of get_fs_struct(), which also raises that sharing count.

This might occasionally cause a setuid program not to change euid,
in the same way as happened with files->count (check_unsafe_exec
also looks at sighand->count, but /proc doesn't raise that one).

We'd prefer exec not to unshare fs_struct: so fix this in procfs,
replacing get_fs_struct() by get_fs_path(), which does path_get
while still holding task_lock, instead of raising fs->count.
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
___

 fs/proc/base.c |   50 +++++++++++++++--------------------------------
 1 file changed, 16 insertions(+), 34 deletions(-)
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c2c7d99

fix setuid sometimes doesn't · e426b64c

由 Hugh Dickins 提交于 3月 28, 2009

Joe Malicki reports that setuid sometimes doesn't: very rarely,
a setuid root program does not get root euid; and, by the way,
they have a health check running lsof every few minutes.

Right, check_unsafe_exec() notes whether the files_struct is being
shared by more threads than will get killed by the exec, and if so
sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
But /proc/<pid>/fd and /proc/<pid>/fdinfo lookups make transient
use of get_files_struct(), which also raises that sharing count.

There's a rather simple fix for this: exec's check on files->count
has been redundant ever since 2.6.1 made it unshare_files() (except
while compat_do_execve() omitted to do so) - just remove that check.

[Note to -stable: this patch will not apply before 2.6.29: earlier
releases should just remove the files->count line from unsafe_exec().]
Reported-by: NJoe Malicki <jmalicki@metacarta.com>
Narrowed-down-by: NMichael Itz <mitz@metacarta.com>
Tested-by: NJoe Malicki <jmalicki@metacarta.com>
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e426b64c

compat_do_execve should unshare_files · 53e9309e

由 Hugh Dickins 提交于 3月 28, 2009

2.6.26's commit fd8328be
"sanitize handling of shared descriptor tables in failing execve()"
moved the unshare_files() from flush_old_exec() and several binfmts
to the head of do_execve(); but forgot to make the same change to
compat_do_execve(), leaving a CLONE_FILES files_struct shared across
exec from a 32-bit process on a 64-bit kernel.

It's arguable whether the files_struct really ought to be unshared
across exec; but 2.6.1 made that so to stop the loading binary's fd
leaking into other threads, and a 32-bit process on a 64-bit kernel
ought to behave in the same way as 32 on 32 and 64 on 64.
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

53e9309e

28 3月, 2009 22 次提交

fs: avoid I_NEW inodes · aabb8fdb

由 Nick Piggin 提交于 3月 11, 2009

To be on the safe side, it should be less fragile to exclude I_NEW inodes
from inode list scans by default (unless there is an important reason to
have them).

Normally they will get excluded (eg.  by zero refcount or writecount etc),
however it is a bit fragile for list walkers to know exactly what parts of
the inode state is set up and valid to test when in I_NEW.  So along these
lines, move I_NEW checks upward as well (sometimes taking I_FREEING etc
checks with them too -- this shouldn't be a problem should it?)
Signed-off-by: NNick Piggin <npiggin@suse.de>
Acked-by: NJan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

aabb8fdb

Merge code for single and multiple-instance mounts · 1bd79035

由 Sukadev Bhattiprolu 提交于 3月 07, 2009

new_pts_mount() (including the get_sb_nodev()), shares a lot of code
with init_pts_mount(). The only difference between them is the 'test-super'
function passed into sget().

Move all common code into devpts_get_sb() and remove the new_pts_mount() and
init_pts_mount() functions,

Changelog[v3]:
	[Serge Hallyn]: Remove unnecessary printk()s
Changelog[v2]:
	(Christoph Hellwig): Merge code in 'do_pts_mount()' into devpts_get_sb()
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NSerge Hallyn <serue@us.ibm.com>
Tested-by: NSerge Hallyn <serue@us.ibm.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1bd79035

Remove get_init_pts_sb() · 289f00e2

由 Sukadev Bhattiprolu 提交于 3月 07, 2009

With mknod_ptmx() moved to devpts_get_sb(), init_pts_mount() becomes
a wrapper around get_init_pts_sb(). Remove get_init_pts_sb() and
fold code into init_pts_mount().
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NSerge Hallyn <serue@us.ibm.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

289f00e2

Move common mknod_ptmx() calls into caller · 945cf2c7

由 Sukadev Bhattiprolu 提交于 3月 07, 2009

We create 'ptmx' node in both single-instance and multiple-instance
mounts. So devpts_get_sb() can call mknod_ptmx() once rather than
have both modes calling mknod_ptmx() separately.
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NSerge Hallyn <serue@us.ibm.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

945cf2c7

Parse mount options just once and copy them to super block · 482984f0

由 Sukadev Bhattiprolu 提交于 3月 07, 2009

Since all the mount option parsing is done in devpts, we could do it
just once and pass it around in devpts functions and eventually store
it in the super block.
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

482984f0

Unroll essentials of do_remount_sb() into devpts · fdbf5348

由 Sukadev Bhattiprolu 提交于 3月 07, 2009

On remount, devpts fs only needs to parse the mount options. Users cannot
directly create/dirty files in /dev/pts so the MS_RDONLY flag and
shrinking the dcache does not really apply to devpts.

So effectively on remount, devpts only parses the mount options and updates
these options in its super block. As such, we could replace do_remount_sb()
call with a direct parse_mount_options().

Doing so enables subsequent patches to avoid parsing the mount options twice
and simplify the code.
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NSerge Hallyn <serue@us.ibm.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

fdbf5348

vfs: simple_set_mnt() should return void · a3ec947c

由 Sukadev Bhattiprolu 提交于 3月 04, 2009

simple_set_mnt() is defined as returning 'int' but always returns 0.
Callers assume simple_set_mnt() never fails and don't properly cleanup if
it were to _ever_ fail.  For instance, get_sb_single() and get_sb_nodev()
should:

        up_write(sb->s_unmount);
        deactivate_super(sb);

if simple_set_mnt() fails.

Since simple_set_mnt() never fails, would be cleaner if it did not
return anything.

[akpm@linux-foundation.org: fix build]
Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: NSerge Hallyn <serue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a3ec947c

fs: move bdev code out of buffer.c · 585d3bc0

由 Nick Piggin 提交于 2月 25, 2009

Move some block device related code out from buffer.c and put it in
block_dev.c. I'm trying to move non-buffer_head code out of buffer.c
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

585d3bc0

A
constify dentry_operations: rest · 3ba13d17
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
3ba13d17
A
constify dentry_operations: configfs · 296c2d86
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
296c2d86
A
constify dentry_operations: sysfs · ee1ec329
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
ee1ec329
A
constify dentry_operations: JFS · ad28b4ef
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
ad28b4ef
A
constify dentry_operations: OCFS2 · d8fba0ff
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
d8fba0ff
A
constify dentry_operations: GFS2 · 92cecbbf
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
92cecbbf
A
constify dentry_operations: FAT · ce6cdc47
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
ce6cdc47
A
constify dentry_operations: FUSE · 4269590a
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
4269590a
A
constify dentry_operations: procfs · d72f71eb
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
d72f71eb
A
constify dentry_operations: ecryptfs · 5a3fd05a
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
5a3fd05a
A
constify dentry_operations: CIFS · 4fd03e84
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
4fd03e84
A
constify dentry_operations: AFS · 79be57cc
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
79be57cc
A
constify dentry_operations: autofs, autofs4 · 08f11513
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
08f11513
A
constify dentry_operations: 9p · a488257c
由 Al Viro 提交于 2月 20, 2009
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
a488257c

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多