Commits · f6089ff87d309a8ddb7b0d4dd92a570f1b0f689b · openeuler / raspberrypi-kernel

14 Oct, 2010 5 commits

Committed by Christoph Hellwig on Oct 14, 2010

HFS+ implements hard links using indirect catalog entries that refer to a
hidden directory. The link target is cached in the dev field of the
HFS+-specific inode, which is also used for the device number of device
files and, internally, for passing the nlink value of the indirect node
from hfsplus_cat_write_inode to a helper function. If we happen to write
out the indirect node while hfsplus_link is creating the catalog entry,
we get a link whose linkid is the current nlink value. This can easily be
reproduced by a large enough loop of local git-clone operations.

Stop abusing the dev field in the HFS+ inode for short term storage by
refactoring the way the permission structure in the catalog entry is
set up, and rename the dev field to linkid to avoid any confusion.

While we're at it also prevent creating hard links to special files, as
the HFS+ dev and linkid share the same space in the on-disk structure.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

f6089ff8

hfsplus: validate btree flags · 13571a69

Committed by Christoph Hellwig on Oct 14, 2010

Signed-off-by: Christoph Hellwig <hch@tuxera.com>

13571a69

hfsplus: handle more on-disk corruptions without oopsing · 9250f925

Committed by Eric Sandeen on Oct 14, 2010

hfs seems prone to bad things when it encounters on disk corruption.  Many
values are read from disk, and used as lengths to memcpy, as an example.
This patch fixes up several of these problematic cases.

o sanity check the on-disk maximum key lengths on mount
  (these are set to a defined value at mkfs time and shouldn't differ)
o check on-disk node keylens against the maximum key length for each tree
o fix hfs_btree_open so that going out via free_tree: doesn't wind
  up in hfs_releasepage, which wants to follow the very pointer
  we were trying to set up:
	HFS_SB(sb)->cat_tree = hfs_btree_open()
    .
  failure gets to hfs_releasepage and tries to follow HFS_SB(sb)->cat_tree

Tested with the fsfuzzer; it survives more than it used to.

[hch: port of commit cf059462 from hfs]
[hch: added the fixes from 5581d018ed3493d226e7a4d645d9c8a5af6c36b]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

9250f925
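
The key-length checks described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `demo_tree`, `demo_read_key`, and the two-byte length prefix layout are hypothetical names and simplifications chosen for illustration. The point is only that a length read from disk is compared against the per-tree maximum and the buffer sizes before it is passed to memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-tree limit; not the kernel's struct. */
struct demo_tree {
	uint16_t max_key_len;
};

/*
 * Copy a key out of an untrusted node buffer.
 * Returns 0 on success, -1 if the on-disk length is implausible.
 */
static int demo_read_key(const struct demo_tree *tree,
			 const uint8_t *node, size_t node_size,
			 uint8_t *key_out, size_t key_out_size)
{
	uint16_t keylen;

	if (node_size < 2)
		return -1;

	/* HFS+ stores multi-byte fields big-endian on disk. */
	keylen = (uint16_t)((node[0] << 8) | node[1]);

	/* Reject lengths above the limit fixed at mkfs time. */
	if (keylen > tree->max_key_len)
		return -1;

	/* Reject lengths that would overrun either buffer. */
	if ((size_t)keylen > node_size - 2 || (size_t)keylen > key_out_size)
		return -1;

	memcpy(key_out, node + 2, keylen);
	return 0;
}
```

A corrupted length such as 0xffff is rejected before any copy takes place, instead of being trusted as a memcpy size.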

hfsplus: hfs_bnode_find() can fail, resulting in hfs_bnode_split() breakage · b6b41424

Committed by Al Viro on Oct 14, 2010

oops and fs corruption; the latter can happen even on valid fs in case of oom.

[hch: port of commit 3d10a15d from hfs]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

b6b41424

hfsplus: fix oops on mount with corrupted btree extent records · ee527162

Committed by Jeff Mahoney on Oct 14, 2010

A particular fsfuzzer run caused an hfs file system to crash on mount. This
is due to a corrupted MDB extent record causing a miscalculation of
HFSPLUS_I(inode)->first_blocks for the extent tree. If the extent records
are zeroed out, then it won't trigger the first_blocks special case and
instead falls through to the extent code, which we're in the middle
of initializing.

This patch catches the 0 size extent records, reports the corruption,
and fails the mount.

[hch: port of commit 47f365eb from hfs]
Reported-by: Ramon de Carvalho Valle <rcvalle@linux.vnet.ibm.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

ee527162

01 Oct, 2010 20 commits

hfsplus: fix rename over directories · 40de9a7c

Committed by Christoph Hellwig on Oct 01, 2010

When renaming over a directory we need to use hfsplus_rmdir instead of
hfsplus_unlink to evict the victim.  This makes sure we properly error out
on non-empty directories as required by POSIX (BZ #16571), and it also makes
sure we do the right thing in case i_nlink will ever be set correctly for
directories on hfsplus.
Reported-by: Vlado Plaga <rechner@vlado-do.de>
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

40de9a7c

hfsplus: convert tree_lock to mutex · 467c3d9c

Committed by Thomas Gleixner on Oct 01, 2010

tree_lock is used as a mutex, so make it a mutex.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

467c3d9c

hfsplus: add missing extent locking in hfsplus_write_inode · 7fcc99f4

Committed by Christoph Hellwig on Oct 01, 2010

Most of the extent handling code already does proper SMP locking, but
hfsplus_write_inode was calling into hfsplus_ext_write_extent without
taking the extents_lock.  Fix this by splitting hfsplus_ext_write_extent
into an internal helper that expects the lock, and a public interface
that first acquires it.

Also add a few locking asserts and document the locking rules in
hfsplus_fs.h.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

7fcc99f4
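
The split described in this commit, an internal helper that expects the lock plus a public wrapper that takes it, is a common pattern. The sketch below shows it in user space under stated assumptions: `demo_mutex` is a single-threaded stand-in that only tracks whether the lock is held, and all `demo_*` names are hypothetical rather than the hfsplus functions themselves.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded stand-in for a mutex; it only tracks ownership. */
struct demo_mutex {
	bool held;
};

static void demo_lock(struct demo_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void demo_unlock(struct demo_mutex *m)
{
	assert(m->held);
	m->held = false;
}

struct demo_inode {
	struct demo_mutex extents_lock;
	bool extent_dirty;
	int writes;
};

/* Internal helper: the caller must already hold extents_lock. */
static void demo_ext_write_extent_locked(struct demo_inode *ino)
{
	assert(ino->extents_lock.held);	/* locking assert */

	if (ino->extent_dirty) {
		ino->writes++;
		ino->extent_dirty = false;
	}
}

/* Public interface: acquires the lock, then calls the helper. */
static void demo_ext_write_extent(struct demo_inode *ino)
{
	demo_lock(&ino->extents_lock);
	demo_ext_write_extent_locked(ino);
	demo_unlock(&ino->extents_lock);
}
```

Callers that already hold the lock use the `_locked` variant; everyone else goes through the public function, so no path reaches the extent code unlocked.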

hfsplus: protect readdir against removals from open_dir_list · 89755dca

Committed by Christoph Hellwig on Oct 01, 2010

We already have i_mutex for readdir and the namespace operations that add
entries to open_dir_list, the only thing that was missing was the removal
in hfsplus_dir_release.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

89755dca

hfsplus: use atomic bitops for the superblock flags · 84adede3

Committed by Christoph Hellwig on Oct 01, 2010

The flags in the HFS+-specific superblock do get modified during runtime;
use atomic bitops to make the modifications SMP safe.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

84adede3

hfsplus: add per-superblock lock for volume header updates · 7ac9fb9c

Committed by Christoph Hellwig on Oct 01, 2010

Lock updates to the mutable fields in the volume header, and document the
locking in the hfsplus_sb_info structure.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

7ac9fb9c

hfsplus: remove the rsrc_inodes list · 58a818f5

Committed by Christoph Hellwig on Oct 01, 2010

We never walk the list - the only reason for it is to make the resource fork
inodes appear hashed to the writeback code.  Borrow a trick from JFS to do
that without needing a list head.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

58a818f5

hfsplus: do not cache and write next_alloc · 66e5db05

Committed by Christoph Hellwig on Oct 01, 2010

We never look at it, nor change the next_alloc field in the superblock. So
don't bother caching it or writing it out in hfsplus_sync_fs.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

66e5db05

hfsplus: fix error handling in hfsplus_symlink · f17c89bf

Committed by Christoph Hellwig on Oct 01, 2010

We need to free the inode again on a hfsplus_create_cat failure.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

f17c89bf

hfsplus: merge mknod/mkdir/creat · 30d3abbe

Committed by Christoph Hellwig on Oct 01, 2010

Make hfsplus_mkdir and hfsplus_create call hfsplus_mknod instead of
duplicating the code.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

30d3abbe

hfsplus: clean up hfsplus_write_inode · b5080f77

Committed by Christoph Hellwig on Oct 01, 2010

Add a new hfsplus_system_write_inode for writing the special system inodes
and streamline the fastpath write_inode code.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

b5080f77

hfsplus: clean up hfsplus_iget · fc4fff82

Committed by Christoph Hellwig on Oct 01, 2010

Add a new hfsplus_system_read_inode for reading the special system inodes
and streamline the fastpath iget code.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

fc4fff82

hfsplus: fix HFSPLUS_I calling convention · 6af502de

Committed by Christoph Hellwig on Oct 01, 2010

HFSPLUS_I doesn't return a pointer to the hfsplus-specific inode
information like all other FOO_I macros, but dereferences the pointer in a way
that makes it look like a direct struct dereference. This only works as long
as the HFSPLUS_I macro is used directly and prevents us from keeping a local
hfsplus_inode_info pointer. Fix the calling convention and introduce a local
hip variable in all functions that use it constantly.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

6af502de
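
The difference between the two calling conventions can be sketched in user space as follows. This is an illustration only: the `demo_*` and `DEMO_*` names are hypothetical, and the structures are reduced to a single field. The old-style macro expands to the structure itself, so it can only be used inline, while the new-style accessor returns a pointer that can be kept in a local variable.

```c
#include <stddef.h>

/* Reduced stand-ins for the VFS inode and the fs-specific inode info. */
struct demo_vfs_inode {
	unsigned long i_ino;
};

struct demo_inode_info {
	unsigned int linkid;
	struct demo_vfs_inode vfs_inode;	/* embedded generic inode */
};

/* Recover the containing structure from a pointer to its member. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Old style: expands to the struct, used as DEMO_OLD_I(inode).linkid */
#define DEMO_OLD_I(inode) \
	(*demo_container_of(inode, struct demo_inode_info, vfs_inode))

/* New style: returns a pointer, like other FOO_I helpers. */
static inline struct demo_inode_info *DEMO_I(struct demo_vfs_inode *inode)
{
	return demo_container_of(inode, struct demo_inode_info, vfs_inode);
}

/* With the pointer convention, a local "hip" variable becomes natural. */
static unsigned int demo_get_linkid(struct demo_vfs_inode *inode)
{
	struct demo_inode_info *hip = DEMO_I(inode);

	return hip->linkid;
}
```

Functions that touch several fields can look up `hip` once and reuse it, rather than repeating the macro at every access.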

hfsplus: fix HFSPLUS_SB calling convention · dd73a01a

Committed by Christoph Hellwig on Oct 01, 2010

HFSPLUS_SB doesn't return a pointer to the hfsplus-specific superblock
information like all other FOO_SB macros, but dereferences the pointer in a way
that makes it look like a direct struct dereference. This only works as long
as the HFSPLUS_SB macro is used directly and prevents us from keeping a local
hfsplus_sb_info pointer. Fix the calling convention and introduce a local
sbi variable in all functions that use it constantly.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

dd73a01a

hfsplus: remove BKL from hfsplus_put_super · e753a621

Committed by Christoph Hellwig on Oct 01, 2010

Except for ->put_super the BKL is now gone from HFS, which means it's
superfluous there too as ->put_super is serialized by the VFS.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

e753a621

hfsplus: use alloc_mutex in hfsplus_sync_fs · a9fdbf8c

Committed by Christoph Hellwig on Oct 01, 2010

Use alloc_mutex to protect hfsplus_sync_fs against itself and concurrent
allocations, which allows us to get rid of lock_super in hfsplus.

Note that most fields in the superblock still aren't protected against
concurrent allocations; that will follow later.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

a9fdbf8c

hfsplus: introduce alloc_mutex · 40bf48af

Committed by Christoph Hellwig on Oct 01, 2010

Use a new per-sb alloc_mutex instead of abusing i_mutex of the alloc_file
to protect block allocations. This gets rid of lockdep nesting warnings
and prepares for extending the scope of alloc_mutex.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

40bf48af

hfsplus: protect setflags using i_mutex · 6333816a

Committed by Christoph Hellwig on Oct 01, 2010

Use i_mutex for protecting against concurrent setflags ioctls like in
other filesystems and get rid of the BKL in hfsplus_ioctl.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

6333816a

hfsplus: split hfsplus_ioctl · 94744567

Committed by Christoph Hellwig on Oct 01, 2010

Give each ioctl command a function of its own.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

94744567

hfsplus: fix BKL leak in hfsplus_ioctl · 249e6353

Committed by Christoph Hellwig on Oct 01, 2010

Currently the HFSPLUS_IOC_EXT2_GETFLAGS case never unlocks the BKL, which
can lead to easily reproduced lockups when doing multiple GETFLAGS ioctls.

Fix this by only taking the BKL for the HFSPLUS_IOC_EXT2_SETFLAGS case
as neither HFSPLUS_IOC_EXT2_GETFLAGS nor the default error case needs it.
Signed-off-by: Christoph Hellwig <hch@tuxera.com>

249e6353
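
The shape of this fix can be shown with a small user-space model. It is a sketch under assumptions: `demo_lock_depth` stands in for the BKL, and the command numbers and function names are hypothetical. Only the write path takes the lock, and it releases the lock before returning; the read path and the error path never touch it, so nothing can leak.

```c
#include <errno.h>

enum {
	DEMO_GETFLAGS = 1,
	DEMO_SETFLAGS = 2,
};

/* Counter standing in for a global lock; nonzero means "held". */
static int demo_lock_depth;
static unsigned int demo_flags;

static int demo_ioctl(unsigned int cmd, unsigned int *arg)
{
	switch (cmd) {
	case DEMO_GETFLAGS:
		/* Read-only: no lock taken, so none can be leaked. */
		*arg = demo_flags;
		return 0;
	case DEMO_SETFLAGS:
		/* Lock and unlock are paired within this one case. */
		demo_lock_depth++;
		demo_flags = *arg;
		demo_lock_depth--;
		return 0;
	default:
		/* Unknown command: fail without ever locking. */
		return -ENOTTY;
	}
}
```

Keeping the lock and unlock inside a single case makes the pairing visible at a glance, which is harder when one lock call at the top of the function has to be matched on every exit path.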

24 Sep, 2010 5 commits

o2dlm: force free mles during dlm exit · 5dad6c39

Committed by Srinivas Eeda on Sep 21, 2010

While umounting, a block mle doesn't get freed if dlm is shut down after
the master request is received but before assert master. This results in an
unclean shutdown of the dlm domain.

This patch frees all mles that lie around after other nodes were notified about
exiting the dlm and marking dlm state as leaving. Only block mles are expected
to be around, so we log ERROR for other mles but still free them.
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

5dad6c39

ocfs2: Sync inode flags with ext2. · 0000b862

Committed by Tao Ma on Sep 19, 2010

We sync our inode flags with ext2 and define them by hex
values. But in commit 36695673 (4 years ago), all
these values were moved to include/linux/fs.h. So we'd
better use them the way ext2 does: sync our inode
flags with ext2 by using FS_*.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

0000b862

ocfs2: Move 'wanted' into parens of ocfs2_resmap_resv_bits. · 4a452de4

Committed by Tao Ma on Sep 19, 2010

The first time I read the function ocfs2_resmap_resv_bits, I wondered
what 'wanted' was used for and puzzled over the comments.
Then I found it is only used if the reservation is empty. ;)

So we'd better move it into the parens so that the code is more
readable. What's more, ocfs2_resmap_resv_bits is called so frequently
that we should save some CPU cycles.
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

4a452de4

ocfs2: Use cpu_to_le16 for e_leaf_clusters in ocfs2_bg_discontig_add_extent. · 47dea423

Committed by Tao Ma on Sep 13, 2010

e_leaf_clusters is a le16, so use cpu_to_le16 instead
of cpu_to_le32.

What's more, we change 'clusters' to unsigned int to
signify that the size of 'clusters' isn't important here.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

47dea423
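
Why the width of the conversion matters can be seen by modelling a big-endian CPU, where converting to little-endian means swapping bytes. The sketch below is an illustration with hypothetical `demo_*` helpers, not the kernel's cpu_to_le16 or cpu_to_le32: swapping a small value as 32 bits moves its significant byte to the top, so truncating the result into a 16-bit field loses it.

```c
#include <stdint.h>

/* Byte swaps, i.e. what "to little-endian" does on a big-endian CPU. */
static uint16_t demo_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

static uint32_t demo_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Wrong width: 32-bit swap, then truncation into the 16-bit field. */
static uint16_t demo_store_wrong(uint32_t clusters)
{
	return (uint16_t)demo_swab32(clusters);
}

/* Right width: 16-bit swap matches the 16-bit on-disk field. */
static uint16_t demo_store_right(uint32_t clusters)
{
	return demo_swab16((uint16_t)clusters);
}
```

On a little-endian CPU both conversions are no-ops and the mistake goes unnoticed, which is why this class of bug tends to surface only on big-endian machines or through static checkers.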

ocfs2: update ctime when changing the file's permission by setfacl · 12828061

Committed by Tao Ma on Sep 13, 2010

In commit 30e2bab2, ext3 fixed it. So change it accordingly in ocfs2.

Steps to reproduce:
# touch aaa
# stat -c %Z aaa
1283760364
# setfacl -m  'u::x,g::x,o::x' aaa
# stat -c %Z aaa
1283760364
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

12828061

23 Sep, 2010 4 commits

/proc/pid/smaps: fix dirty pages accounting · 1c2499ae

Committed by KOSAKI Motohiro on Sep 22, 2010

Currently, /proc/<pid>/smaps has wrong dirty pages accounting.
Shared_Dirty and Private_Dirty count only pte-dirty pages and ignore the
PG_dirty page flag.  This differs from the documentation and is also
inconsistent with the Referenced field, which checks both pte and page
flags.

This patch fixes it.

Test program:

 large-array.c
 ---------------------------------------------------
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>

 char array[1*1024*1024*1024L];

 int main(void)
 {
         memset(array, 1, sizeof(array));
         pause();

         return 0;
 }
 ---------------------------------------------------

Test case:
 1. run ./large-array
 2. cat /proc/`pidof large-array`/smaps
 3. swapoff -a
 4. cat /proc/`pidof large-array`/smaps again

Test result:
 <before patch>

00601000-40601000 rw-p 00000000 00:00 0
Size:            1048576 kB
Rss:             1048576 kB
Pss:             1048576 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:    218992 kB   <-- showed pages as clean incorrectly
Private_Dirty:    829584 kB
Referenced:       388364 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB

 <after patch>

00601000-40601000 rw-p 00000000 00:00 0
Size:            1048576 kB
Rss:             1048576 kB
Pss:             1048576 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:   1048576 kB  <-- fixed
Referenced:       388480 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1c2499ae
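
The accounting rule this commit describes can be reduced to a few lines. The following is a simplified user-space model with hypothetical names (`demo_smaps_stats`, `demo_account_private`), not the kernel's smaps code: a page counts as dirty when either the pte dirty bit or the page-level dirty flag is set.

```c
#include <stdbool.h>

/* Reduced model of the per-mapping counters, in kB. */
struct demo_smaps_stats {
	unsigned long private_clean;
	unsigned long private_dirty;
};

/* Account one private page, checking both sources of dirtiness. */
static void demo_account_private(struct demo_smaps_stats *st,
				 bool pte_dirty, bool page_flag_dirty,
				 unsigned long size_kb)
{
	if (pte_dirty || page_flag_dirty)
		st->private_dirty += size_kb;
	else
		st->private_clean += size_kb;
}
```

In the test case above, pages whose pte dirty bit is clear but whose page flag is set are the ones that were reported as Private_Clean before the patch.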

aio: do not return ERESTARTSYS as a result of AIO · a0c42bac

Committed by Jan Kara on Sep 22, 2010

OCFS2 can return ERESTARTSYS from its write function when the process is
signalled while waiting for a cluster lock (and the filesystem is mounted
with intr mount option).  Generally, it seems reasonable to allow
filesystems to return this error code from its IO functions.  As we must
not leak ERESTARTSYS (and similar error codes) to userspace as a result of
an AIO operation, we have to properly convert it to EINTR inside AIO code
(restarting the syscall isn't really an option because other AIO could
have been already submitted by the same io_submit syscall).
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a0c42bac
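
The conversion described here amounts to mapping the kernel-internal restart codes to EINTR before the result is reported to userspace. The sketch below is a user-space illustration: the `DEMO_*` constants mirror the values in the kernel's include/linux/errno.h, and `demo_aio_fixup_result` is a hypothetical helper name, not the function the patch touches.

```c
#include <errno.h>

/* Kernel-internal restart codes; these must never reach userspace. */
#define DEMO_ERESTARTSYS		512
#define DEMO_ERESTARTNOINTR		513
#define DEMO_ERESTARTNOHAND		514
#define DEMO_ERESTART_RESTARTBLOCK	516

/* Map restart codes to -EINTR; pass every other result through. */
static long demo_aio_fixup_result(long ret)
{
	switch (ret) {
	case -DEMO_ERESTARTSYS:
	case -DEMO_ERESTARTNOINTR:
	case -DEMO_ERESTARTNOHAND:
	case -DEMO_ERESTART_RESTARTBLOCK:
		return -EINTR;
	default:
		return ret;
	}
}
```

Successful byte counts and ordinary errors such as -EIO are returned unchanged; only the restart family is rewritten.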

/proc/vmcore: fix seeking · c227e690

Committed by Arnd Bergmann on Sep 22, 2010

Commit 73296bc6 ("procfs: Use generic_file_llseek in /proc/vmcore")
broke seeking on /proc/vmcore.  This changes it back to use default_llseek
in order to restore the original behaviour.

The problem with generic_file_llseek is that it only allows seeks up to
inode->i_sb->s_maxbytes, which is zero on procfs and some other virtual
file systems.  We should merge generic_file_llseek and default_llseek some
day and clean this up in a proper way, but for 2.6.35/36, reverting vmcore
is the safer solution.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c227e690
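
The effect of a zero s_maxbytes on seeking can be modelled in a few lines. This is a deliberately simplified sketch with hypothetical function names, covering only absolute seeks; it is not the kernel's generic_file_llseek or default_llseek. The only point is that an upper bound of zero rejects every nonzero offset.

```c
#include <errno.h>

/* Model of a seek that enforces the filesystem's maximum file size. */
static long long demo_llseek_limited(long long offset, long long maxbytes)
{
	if (offset < 0 || offset > maxbytes)
		return -EINVAL;
	return offset;
}

/* Model of a seek that only rejects negative offsets. */
static long long demo_llseek_unlimited(long long offset)
{
	if (offset < 0)
		return -EINVAL;
	return offset;
}
```

With `maxbytes` equal to 0, as the commit says is the case on procfs, `demo_llseek_limited(4096, 0)` fails while the unlimited variant accepts the same offset.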

Prevent freeing uninitialized pointer in compat_do_readv_writev · 767b68e9

Committed by Dan Rosenberg on Sep 22, 2010

In 32-bit compatibility mode, the error handling for
compat_do_readv_writev() may free an uninitialized pointer, potentially
leading to all sorts of ugly memory corruption.  This is reliably
triggerable by unprivileged users by invoking the readv()/writev()
syscalls with an invalid iovec pointer.  The below patch fixes this to
emulate the non-compat version.

Introduced by commit b8373363 ("compat: factor out
compat_rw_copy_check_uvector from compat_do_readv_writev")
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: stable@kernel.org (2.6.35)
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

767b68e9
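
The class of bug fixed here, an error path that frees a pointer before it has been assigned, is usually avoided by initialising the pointer to the on-stack buffer at declaration. The sketch below shows that pattern in user space; `demo_do_readv`, `DEMO_FAST_SEGS`, and the int-array stand-in for an iovec are hypothetical and are not taken from the kernel source.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_FAST_SEGS 8

/* Returns the number of entries copied, or a negative errno value. */
static int demo_do_readv(const int *user_vec, size_t n)
{
	int stack_vec[DEMO_FAST_SEGS];
	int *vec = stack_vec;	/* valid before any jump to "out" */
	int ret;

	if (user_vec == NULL) {
		ret = -EFAULT;
		goto out;	/* vec == stack_vec, so nothing is freed */
	}

	if (n > DEMO_FAST_SEGS) {
		vec = malloc(n * sizeof(*vec));
		if (vec == NULL) {
			ret = -ENOMEM;
			goto out;	/* free(NULL) below is a no-op */
		}
	}

	memcpy(vec, user_vec, n * sizeof(*vec));
	ret = (int)n;
out:
	/* Only heap allocations are released. */
	if (vec != stack_vec)
		free(vec);
	return ret;
}
```

Because `vec` always holds either the stack buffer, a heap allocation, or NULL, the cleanup code at `out` is safe from every exit path, including the early failure on an invalid input pointer.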

22 Sep, 2010 2 commits

bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends · 692ebd17

Committed by Jan Kara on Sep 21, 2010

Inodes of devices such as /dev/zero can get dirty for example via
utime(2) syscall or due to atime update. Backing device of such inodes
(zero_bdi, etc.) is however unable to handle dirty inodes and thus
__mark_inode_dirty complains.  In fact, inode should be rather dirtied
against backing device of the filesystem holding it. This is generally a
good rule except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
in these pseudofilesystems are referenced from ordinary filesystem
inodes and carry mapping with real data of the device. Thus for these
inodes we have to use inode->i_mapping->backing_dev_info as we did so
far. We distinguish these filesystems by checking whether sb->s_bdi
points to a non-trivial backing device or not.

Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
There's a device inode A described by a path "/dev/sdb" on this
filesystem. This inode will be dirtied against backing device "8:0"
after this patch. bdev filesystem contains block device inode B coupled
with our inode A. When someone modifies a page of /dev/sdb, it's B that
gets dirtied and the dirtying happens against the backing device "8:16".
Thus both inodes get filed to a correct bdi list.

Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

692ebd17

char: Mark /dev/zero and /dev/kmem as not capable of writeback · 371d217e

Committed by Jan Kara on Sep 21, 2010

These devices don't do any writeback, but their device inodes can still get
dirty, so mark the bdi appropriately so that the bdi code does the right thing
and files the inodes on the lists of the bdi carrying the device inodes.

Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

371d217e

20 Sep, 2010 1 commit

Coda: mount hangs because of missed REQ_WRITE rename · 112d421d

Committed by Jan Harkes on Sep 17, 2010

Coda's REQ_* defines were renamed to avoid clashes with the block layer
(commit 4aeefdc6: "coda: fixup clash with block layer REQ_*
defines").

However one was missed and response messages are no longer matched with
requests and waiting threads are no longer woken up.  This patch fixes
this.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
[ Also fixed up whitespace while at it  -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

112d421d

18 Sep, 2010 3 commits

ocfs2/net: fix uninitialized ret in o2net_send_message_vec() · 50aff040

Committed by Wu Fengguang on Aug 21, 2010

mmotm/fs/ocfs2/cluster/tcp.c: In function ‘o2net_send_message_vec’:
mmotm/fs/ocfs2/cluster/tcp.c:980:6: warning: ‘ret’ may be used uninitialized in this function

It seems a real bug introduced by commit 9af0b38f (ocfs2/net:
Use wait_event() in o2net_send_message_vec()).

cc: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

50aff040

ceph: select CRYPTO · be4f104d

Committed by Sage Weil on Sep 17, 2010

We select CRYPTO_AES, but not CRYPTO.
Signed-off-by: Sage Weil <sage@newdream.net>

be4f104d

ceph: check mapping to determine if FILE_CACHE cap is used · a43fb731

Committed by Sage Weil on Sep 17, 2010

See if the i_data mapping has any pages to determine if the FILE_CACHE
capability is currently in use, instead of assuming it is any time the
rdcache_gen value is set (i.e., issued -> used).

This allows the MDS RECALL_STATE process to work for inodes that have cached
pages.
Signed-off-by: Sage Weil <sage@newdream.net>

a43fb731