Commits · cb289d6244a37cf932c571d6deb0daa8030f931b · openeuler / raspberrypi-kernel

25 Jan, 2010 1 commit

eventfd - allow atomic read and waitqueue remove · cb289d62

Authored by Davide Libenzi on Jan 13, 2010

KVM needs a way to atomically remove itself from the eventfd ->poll()
wait queue head, in order to correctly handle its IRQfd deassign
operation.

This patch introduces such an API, plus a way to read an eventfd from its
context.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Avi Kivity <avi@redhat.com>

21 Jan, 2010 1 commit

compat_ioctl: Supress "unknown cmd" message on serial /dev/console · 3f001711

Authored by Atsushi Nemoto on Jan 10, 2010

After the commit fb07a5f8 ("compat_ioctl: remove all VT ioctl
handling"), I got this error message on 64-bit mips kernel with 32-bit
busybox userland:

ioctl32(init:1): Unknown cmd fd(0) cmd(00005600){t:'V';sz:0} arg(7fd76480) on /dev/console

The cmd 5600 is VT_OPENQRY.  busybox's init issues this ioctl to find
out whether the console is a vt-console or a serial console.  If the
console is a serial console, VT ioctls are not handled by the serial driver.

A quick search also turned up some programs that use VT_GETMODE to check
whether a vt-console is available.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
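The log line in the commit above prints the command as cmd(00005600) with {t:'V';sz:0}. A minimal sketch of how those fields can be read out of the number follows. The struct and function names are hypothetical, and the 14-bit width of the size field is an assumption taken from the common generic ioctl layout; some architectures use a different width, which makes no difference for this particular value.

```c
/* Fields of an ioctl command number as shown in the log line:
 * low byte = command number, next byte = type character,
 * following bits = argument size (14 bits assumed here). */
struct ioctl_fields {
	unsigned int nr;
	char type;
	unsigned int size;
};

static struct ioctl_fields decode_ioctl(unsigned int cmd)
{
	struct ioctl_fields f;

	f.nr = cmd & 0xff;
	f.type = (char)((cmd >> 8) & 0xff);
	f.size = (cmd >> 16) & 0x3fff;
	return f;
}
```

Decoding 0x5600 this way gives type 'V', number 0 and size 0, which matches the fields printed in the message.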

20 Jan, 2010 10 commits

ecryptfs: use after free · ece550f5

Authored by Dan Carpenter on Jan 19, 2010

The "full_alg_name" variable is used on a couple of error paths, so we
shouldn't free it until the end.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
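The commit above does not include the code, so the following is only a userspace sketch of the general pattern it describes: a heap string that error paths still read is released at a single exit label. build_alg_name() and init_cipher() are hypothetical names, not the eCryptfs functions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a "mode(cipher)" name on the heap. */
static char *build_alg_name(const char *cipher, const char *mode)
{
	size_t len = strlen(mode) + strlen(cipher) + 3; /* "(", ")" and NUL */
	char *name = malloc(len);

	if (name)
		snprintf(name, len, "%s(%s)", mode, cipher);
	return name;
}

/* Returns 0 on success, -1 on failure.  The name is still referenced on
 * the error path, so it is freed only at the common exit label. */
static int init_cipher(const char *cipher, const char *mode, int fail,
		       char *errbuf, size_t errlen)
{
	int rc = 0;
	char *full_alg_name = build_alg_name(cipher, mode);

	if (!full_alg_name)
		return -1;
	if (fail) {
		/* The error message needs the string, so it must still be live. */
		snprintf(errbuf, errlen, "cannot set up %s", full_alg_name);
		rc = -1;
		goto out;
	}
out:
	free(full_alg_name);
	return rc;
}
```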

ecryptfs: Eliminate useless code · 4aa25bcb

Authored by Julia Lawall on Jan 16, 2010

The variable lower_dentry is initialized twice to the same (side effect-free)
expression.  Drop one initialization.

A simplified version of the semantic match that finds this problem is:
(http://coccinelle.lip6.fr/)

// <smpl>
@forall@
idexpression *x;
identifier f!=ERR_PTR;
@@

x = f(...)
... when != x
(
x = f(...,<+...x...+>,...)
|
* x = f(...)
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

ecryptfs: fix interpose/interpolate typos in comments · fe0fc013

Authored by Erez Zadok on Jan 04, 2010

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Acked-by: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

ecryptfs: pass matching flags to interpose as defined and used there · 3469b573

Authored by Erez Zadok on Dec 06, 2009

ecryptfs_interpose checks if one of the flags passed is
ECRYPTFS_INTERPOSE_FLAG_D_ADD, defined as 0x00000001 in ecryptfs_kernel.h.
But the only user of ecryptfs_interpose that passes a non-zero flag to it has
hard-coded the value as "1". This could spell trouble if any of these values
changes in the future.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

ecryptfs: remove unnecessary d_drop calls in ecryptfs_link · c44a66d6

Authored by Erez Zadok on Dec 06, 2009

Unnecessary because it would unhash perfectly valid dentries, causing them
to have to be re-looked up the next time they're needed, which presumably is
right after.
Signed-off-by: Aseem Rastogi <arastogi@cs.sunysb.edu>
Signed-off-by: Shrikar archak <shrikar84@gmail.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Saumitra Bhanage <sbhanage@cs.sunysb.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

ecryptfs: don't ignore return value from lock_rename · 0d132f73

Authored by Erez Zadok on Dec 05, 2009

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

ecryptfs: initialize private persistent file before dereferencing pointer · e27759d7

Authored by Erez Zadok on Dec 03, 2009

ecryptfs_open dereferences a pointer to the private lower file (the one
stored in the ecryptfs inode), without checking if the pointer is NULL.
Right afterward, it initializes that pointer if it is NULL.  Swap the order
of the statements to initialize first.  Bug discovered by Duckjin Kang.
Signed-off-by: Duckjin Kang <fromdj2k@gmail.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
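The ordering fix described above can be illustrated with a small userspace sketch. The structures and function names below are hypothetical stand-ins, not the eCryptfs ones; the point is only that the lazily created object is set up before anything reads through the pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the inode-private data and its lower file. */
struct lower_file {
	int flags;
};

struct inode_info {
	struct lower_file *lower_file; /* NULL until first use */
};

static int init_lower_file(struct inode_info *info)
{
	info->lower_file = calloc(1, sizeof(*info->lower_file));
	return info->lower_file ? 0 : -1;
}

/* Returns 0 on success and stores the lower file's flags in *flags_out. */
static int open_inode(struct inode_info *info, int *flags_out)
{
	/* Initialise first ... */
	if (!info->lower_file && init_lower_file(info) != 0)
		return -1;
	/* ... and only then dereference. */
	*flags_out = info->lower_file->flags;
	return 0;
}
```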

eCryptfs: Remove mmap from directory operations · 38e3eaee

Authored by Tyler Hicks on Nov 03, 2009

Adrian reported that mkfontscale didn't work inside of eCryptfs mounts.
Strace revealed the following:

open("./", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
open("./fonts.scale", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
getdents(3, /* 80 entries */, 32768) = 2304
open("./.", O_RDONLY) = 5
fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
fstat64(5, {st_mode=S_IFDIR|0755, st_size=16384, ...}) = 0
mmap2(NULL, 16384, PROT_READ, MAP_PRIVATE, 5, 0) = 0xb7fcf000
close(5) = 0
--- SIGBUS (Bus error) @ 0 (0) ---
+++ killed by SIGBUS +++

The mmap2() on a directory was successful, resulting in a SIGBUS
signal later.  This patch removes mmap() from the list of possible
ecryptfs_dir_fops so that mmap() isn't possible on eCryptfs directory
files.

https://bugs.launchpad.net/ecryptfs/+bug/400443
Reported-by: Adrian C. <anrxc@sysphere.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
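The strace above boils down to opening a directory read-only and calling mmap() on the descriptor. A minimal userspace sketch of that probe follows; try_mmap_dir() is a hypothetical helper name. On most Linux filesystems I would expect the mapping to be refused (typically with ENODEV), but the sketch only reports whatever errno it gets.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if the mapping succeeded, otherwise the errno reported by
 * open() or mmap(). */
static int try_mmap_dir(const char *path, size_t len)
{
	int fd = open(path, O_RDONLY);
	void *p;
	int err = 0;

	if (fd < 0)
		return errno;
	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		err = errno;
	else
		munmap(p, len);
	close(fd);
	return err;
}
```

Calling try_mmap_dir(".", 16384) mirrors the mmap2() call in the trace; a non-zero result means the directory could not be mapped, which is the behaviour the patch aims for on eCryptfs.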

eCryptfs: Add getattr function · f8f484d1

Authored by Tyler Hicks on Nov 04, 2009

The i_blocks field of an eCryptfs inode cannot be trusted, but
generic_fillattr() uses it to instantiate the blocks field of a stat()
syscall when a filesystem doesn't implement its own getattr(). Users
have noticed that the output of du is incorrect on newly created files.

This patch creates ecryptfs_getattr() which calls into the lower
filesystem's getattr() so that eCryptfs can use its kstat.blocks value
after calling generic_fillattr(). It is important to note that the
block count includes the eCryptfs metadata stored in the beginning of
the lower file plus any padding used to fill an extent before
encryption.

https://bugs.launchpad.net/ecryptfs/+bug/390833
Reported-by: Dominic Sacré <dominic.sacre@gmx.de>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

eCryptfs: Use notify_change for truncating lower inodes · 5f3ef64f

Authored by Tyler Hicks on Oct 14, 2009

When truncating inodes in the lower filesystem, eCryptfs directly
invoked vmtruncate(). As Christoph Hellwig pointed out, vmtruncate() is
a filesystem helper function, but filesystems may need to do more than
just a call to vmtruncate().

This patch moves the lower inode truncation out of ecryptfs_truncate()
and renames the function to truncate_upper().  truncate_upper() updates
an iattr for the lower inode to indicate if the lower inode needs to be
truncated upon return.  ecryptfs_setattr() then calls notify_change(),
using the updated iattr for the lower inode, to complete the truncation.

For eCryptfs functions needing to truncate, ecryptfs_truncate() is
reintroduced as a simple way to truncate the upper inode to a specified
size and then truncate the lower inode accordingly.

https://bugs.launchpad.net/bugs/451368
Reported-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dustin Kirkland <kirkland@canonical.com>
Cc: ecryptfs-devel@lists.launchpad.net
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

19 Jan, 2010 1 commit

fs/bio.c: fix shadows sparse warning · f06f135d

Authored by Thiago Farina on Jan 19, 2010

fs/bio.c:81:33: warning: symbol 'bslab' shadows an earlier one
fs/bio.c:74:25: originally declared here
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

18 Jan, 2010 7 commits

Btrfs: fix possible panic on unmount · 11dfe35a

Authored by Josef Bacik on Nov 13, 2009

We can race with the unmount of an fs and the stopping of a kthread where we
will free the block group before we're done using it. The reason for this is
that we do not hold a reference on the block group while it's caching, since
the allocator drops its reference once it exits or moves on to the next block
group. This patch fixes the problem by taking a reference to the block group
before we start caching and dropping it when we're done to make sure all
accesses to the block group are safe. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
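The idea in the commit above, holding a reference for the duration of the caching work, can be sketched in userspace as follows. This is a single-threaded illustration with hypothetical names (struct block_group here is a stand-in, not the Btrfs structure), and it uses a plain integer where real code would need an atomic counter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted block group. */
struct block_group {
	int refs;
	int cached;
};

static struct block_group *bg_alloc(void)
{
	struct block_group *bg = calloc(1, sizeof(*bg));

	if (bg)
		bg->refs = 1; /* the creator holds the first reference */
	return bg;
}

static void bg_get(struct block_group *bg)
{
	bg->refs++;
}

/* Drops one reference; frees the object and returns 1 if it was the last. */
static int bg_put(struct block_group *bg)
{
	assert(bg->refs > 0);
	if (--bg->refs == 0) {
		free(bg);
		return 1;
	}
	return 0;
}

/* The caching worker pins the block group before it starts ... */
static void start_caching(struct block_group *bg)
{
	bg_get(bg);
}

/* ... and releases it only once it has finished touching it. */
static int finish_caching(struct block_group *bg)
{
	bg->cached = 1;
	return bg_put(bg);
}
```

With this arrangement the other holder can drop its reference while caching is still in progress, and the object stays alive until finish_caching() runs.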

Btrfs: deal with NULL acl sent to btrfs_set_acl · a9cc71a6

Authored by Chris Mason on Jan 17, 2010

It is legal for btrfs_set_acl to be sent a NULL acl.  This
makes sure we don't dereference it.  A similar patch was sent by
Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix regression in orphan cleanup · 6c090a11

Authored by Josef Bacik on Jan 15, 2010

Currently orphan cleanup only ever gets triggered if we cross subvolumes during
a lookup, which means that if we just mount a plain jane fs that has orphans in
it, they will never get cleaned up.  This results in panics like this one:

http://www.kerneloops.org/oops.php?number=1109085

where adding an orphan entry results in -EEXIST being returned and we panic.  In
order to fix this, we check to see on lookup if our root has had the orphan
cleanup done, and if not go ahead and do it.  This is easily reproducible by
running this test case:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	char data[4096];
	char newdata[4096];
	int fd1, fd2;

	memset(data, 'a', 4096);
	memset(newdata, 'b', 4096);

	while (1) {
		int i;

		fd1 = creat("file1", 0666);
		if (fd1 < 0)
			break;

		for (i = 0; i < 512; i++)
			write(fd1, data, 4096);

		fsync(fd1);
		close(fd1);

		fd2 = creat("file2", 0666);
		if (fd2 < 0)
			break;

		ftruncate(fd2, 4096 * 512);

		for (i = 0; i < 512; i++)
			write(fd2, newdata, 4096);
		close(fd2);

		i = rename("file2", "file1");
		unlink("file1");
	}

	return 0;
}

and then pulling the power on the box, and then trying to run that test again
when the box comes back up.  I've tested this locally and it fixes the problem.
Thanks to Tomas Carnecky for helping me track this down initially.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix race in btrfs_mark_extent_written · 6c7d54ac

Authored by Yan, Zheng on Jan 15, 2010

Fix a bug reported by Johannes Hirte. The cause of the bug is that
btrfs_del_items is called after btrfs_duplicate_item, and
btrfs_del_items triggers a tree balance. The fix is to check for that
case and call btrfs_search_slot when needed.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs, fix memory leaks in error paths · 2423fdfb

Authored by Jiri Slaby on Jan 06, 2010

Stanse found 2 memory leaks in relocate_block_group and
__btrfs_map_block. cluster and multi are not freed/assigned on all
paths. Fix that.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: align offsets for btrfs_ordered_update_i_size · a038fab0

Authored by Yan, Zheng on Dec 28, 2009

Some callers of btrfs_ordered_update_i_size can now pass in
a NULL for the ordered extent to update against.  This makes
sure we properly align the offset they pass in when deciding
how much to bump the on disk i_size.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: fix missing last-entry in readdir(3) · 406266ab

Authored by Jan Engelhardt on Dec 09, 2009

parent 49313cdac7b34c9f7ecbb1780cfc648b1c082cd7 (v2.6.32-1-g49313cd)
commit ff48c08e1c05c67e8348ab6f8a24de8034e0e34d
Author: Jan Engelhardt <jengelh@medozas.de>
Date:   Wed Dec 9 22:57:36 2009 +0100

Btrfs: fix missing last-entry in readdir(3)

When one does a 32-bit readdir(3), the last entry of a directory is
missing. This is however not due to passing a large value to filldir,
but it seems to have to do with glibc doing telldir or something
quirky. In any case, this patch fixes it in practice.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

17 Jan, 2010 7 commits

nommu: fix shared mmap after truncate shrinkage problems · 7e660872

Authored by David Howells on Jan 15, 2010

Fix a problem in NOMMU mmap with ramfs whereby a shared mmap can happen
over the end of a truncation.  The problem is that
ramfs_nommu_check_mappings() checks the reduced file size against the
VMA tree, but not the vm_region tree.

The following sequence of events can cause the problem:

	fd = open("/tmp/x", O_RDWR|O_TRUNC|O_CREAT, 0600);
	ftruncate(fd, 32 * 1024);
	a = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	munmap(a, 32 * 1024);
	ftruncate(fd, 16 * 1024);
	c = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

Mapping 'a' creates a vm_region covering 32KB of the file.  Mapping 'b'
sees that the vm_region from 'a' is covering the region it wants and so
shares it, pinning it in memory.

Mapping 'a' then goes away and the file is truncated to the end of VMA
'b'.  However, the region allocated by 'a' is still in effect, and has
_not_ been reduced.

Mapping 'c' is then created, and because there's a vm_region covering the
desired region, get_unmapped_area() is _not_ called to repeat the check,
and the mapping is granted, even though the pages from the latter half of
the mapping have been discarded.

However:

	d = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

Mapping 'd' should work, and should end up sharing the region allocated by
'a'.

To deal with this, we shrink the vm_region struct during the truncation,
lest do_mmap_pgoff() take it as licence to share the full region
automatically without calling the get_unmapped_area() file op again.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nommu: fix race between ramfs truncation and shared mmap · 81759b5b

Authored by David Howells on Jan 15, 2010

Fix the race between the truncation of a ramfs file and an attempt to make
a shared mmap of a region of that file.

The problem is that do_mmap_pgoff() calls f_op->get_unmapped_area() to
verify that the file region is made of contiguous pages and to find its
base address - but there isn't any locking to guarantee this region until
vma_prio_tree_insert() is called by add_vma_to_mm().

Note that moving the functionality into f_op->mmap() doesn't help as that
is also called before vma_prio_tree_insert().

Instead make ramfs_nommu_check_mappings() grab nommu_region_sem whilst it
does its checks.  This means that this function will wait whilst mmaps
take place.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

do_add_mount() should sanitize mnt_flags · 27d55f1f

Authored by Al Viro on Jan 16, 2010

MNT_WRITE_HOLD shouldn't leak into new vfsmount and neither
should MNT_SHARED (the latter will be set properly, along with
the rest of shared-subtree data structures)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

CIFS shouldn't make mountpoints shrinkable · 7e1295d9

Authored by Al Viro on Jan 16, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

mnt_flags fixes in do_remount() · 7b43a79f

Authored by Al Viro on Jan 16, 2010

* need vfsmount_lock over modifying it
* need to preserve MNT_SHARED/MNT_UNBINDABLE
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
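The two mnt_flags commits above (do_add_mount() and do_remount()) both come down to masking flag bits: strip bits that must not leak into a new mount, and carry propagation bits over on remount. The sketch below shows only that bit arithmetic. The DEMO_MNT_* values and both function names are hypothetical and are not the kernel's definitions, and the locking mentioned in the commit is left out entirely.

```c
/* Hypothetical flag values for illustration; they are not the kernel's. */
#define DEMO_MNT_NOSUID      0x01u
#define DEMO_MNT_WRITE_HOLD  0x10u
#define DEMO_MNT_SHARED      0x20u
#define DEMO_MNT_UNBINDABLE  0x40u

/* New mount: internal and propagation state must not leak in from the caller. */
static unsigned int sanitize_new_mount_flags(unsigned int flags)
{
	return flags & ~(DEMO_MNT_WRITE_HOLD | DEMO_MNT_SHARED);
}

/* Remount: take the requested flags but keep the existing propagation bits. */
static unsigned int remount_flags(unsigned int old, unsigned int requested)
{
	unsigned int keep = DEMO_MNT_SHARED | DEMO_MNT_UNBINDABLE;

	return (requested & ~keep) | (old & keep);
}
```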

attach_recursive_mnt() needs to hold vfsmount_lock over set_mnt_shared() · df1a1ad2

Authored by Al Viro on Jan 16, 2010

race in mnt_flags update
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

may_umount() needs namespace_sem · 8ad08d8a

Authored by Al Viro on Jan 16, 2010

otherwise it races with clone_mnt() changing mnt_share/mnt_slaves
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

16 Jan, 2010 9 commits

inotify: only warn once for inotify problems · 976ae32b

Authored by Eric Paris on Jan 15, 2010

inotify will WARN() if it finds that the idr and the fsnotify internals
somehow got out of sync.  It was only supposed to do this once but due
to this stupid bug it would warn every single time a problem was
detected.
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
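The commit above does not show the code. As a rough userspace illustration of what "warn only once" means, a latch flag can be set before the first report so that later calls stay silent. warn_once() is a hypothetical helper written for this sketch, not the kernel's warning macro.

```c
#include <stdbool.h>
#include <stdio.h>

/* Prints the message the first time it is called and stays silent afterwards.
 * Returns 1 if a warning was printed, 0 otherwise. */
static int warn_once(const char *msg)
{
	static bool warned; /* zero-initialised, persists across calls */

	if (warned)
		return 0;
	warned = true; /* latch the flag before reporting */
	fprintf(stderr, "WARNING: %s\n", msg);
	return 1;
}
```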

inotify: do not reuse watch descriptors · 9e572cc9

Authored by Eric Paris on Jan 15, 2010

Since commit 7e790dd5 ("inotify: fix error paths in
inotify_update_watch") inotify changed the manner in which
it gave watch descriptors back to userspace.  Previous to this commit
inotify acted like the following:

  inotify_add_watch(X, Y, Z) = 1
  inotify_rm_watch(X, 1);
  inotify_add_watch(X, Y, Z) = 2

but after this patch inotify would return watch descriptors like so:

  inotify_add_watch(X, Y, Z) = 1
  inotify_rm_watch(X, 1);
  inotify_add_watch(X, Y, Z) = 1

which I saw as equivalent to opening an fd where

  open(file) = 1;
  close(1);
  open(file) = 1;

seemed perfectly reasonable.  The issue is that quite a bit of userspace
apparently relies on the behavior in which watch descriptors will not be
quickly reused.  KDE relies on it, I know some selinux packages rely on
it, and I have heard complaints from other random sources such as debian
bug 558981.

Although the man page implies what we do is ok, we broke userspace so
this patch almost reverts us to the old behavior.  It is still slightly
racy and I have patches that would fix that, but they are rather large
and this will fix it for all real world cases.  The race is as follows:

 - task1 creates a watch and blocks in idr_new_watch() before it updates
   the hint.
 - task2 creates a watch and updates the hint.
 - task1 updates the hint with its older wd
 - task removes the watch created by task2
 - task adds a new watch and will reuse the wd originally given to task2

it requires moving some locking around the hint (last_wd) but this should
solve it for the real world and be -stable safe.

As a side effect this patch papers over a bug in the lib/idr code which
is causing a large number of WARNs to pop on people's systems and many
reports in kerneloops.org.  I'm working on the root cause of that idr
bug separately but this should make inotify immune to that issue.
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
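The "hint" behaviour described above, where a new descriptor is searched for above the last one handed out so that a just-released number is not reused immediately, can be sketched with a toy allocator. This is an assumption-laden illustration: it uses a fixed array instead of the idr, has no locking and no wraparound, and struct wd_table, wd_add() and wd_remove() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WD 64

struct wd_table {
	bool used[MAX_WD + 1]; /* index 0 unused; descriptors start at 1 */
	int last_wd;           /* hint: most recently handed-out descriptor */
};

/* Returns the lowest free descriptor above the hint, or -1 if none is left. */
static int wd_add(struct wd_table *t)
{
	for (int wd = t->last_wd + 1; wd <= MAX_WD; wd++) {
		if (!t->used[wd]) {
			t->used[wd] = true;
			t->last_wd = wd;
			return wd;
		}
	}
	return -1;
}

static void wd_remove(struct wd_table *t, int wd)
{
	assert(wd >= 1 && wd <= MAX_WD);
	t->used[wd] = false;
}
```

Starting from an empty table, add, remove, add yields 1 and then 2, matching the "old behavior" sequence quoted in the commit message.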

xfs: xfs_swap_extents needs to handle dynamic fork offsets · e09f9860

Authored by Dave Chinner on Jan 14, 2010

When swapping extents, we can corrupt inodes by swapping data forks
that are in incompatible formats.  This is caused by the two inodes
having different fork offsets due to the presence of an attribute
fork on an attr2 filesystem.  xfs_fsr tries to be smart about
setting the fork offset, but the trick it plays only works on attr1
(old fixed format attribute fork) filesystems.

Changing the way xfs_fsr sets up the attribute fork will prevent
this situation from ever occurring, so in the kernel code we can get
by with a preventative fix - check that the data fork in the
defragmented inode is in a format valid for the inode it is being
swapped into.  This will lead to files that will silently and
potentially repeatedly fail defragmentation, so issue a warning to
the log when this particular failure occurs to let us know that
xfs_fsr needs updating/fixing.

To help identify how to improve xfs_fsr to avoid this issue, add
trace points for the inodes being swapped so that we can determine
why the swap was rejected and to confirm that the code is making the
right decisions and modifications when swapping forks.

A further complication is that even when the swap is allowed to proceed,
if the fork offset differs between the two inodes then the value
for the maximum number of extents the data fork can hold can be
wrong. Make sure these are also set correctly after the swap occurs.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: fix missing error check in xfs_rtfree_range · 3daeb42c

Authored by Dave Chinner on Jan 14, 2010

When xfs_rtfind_forw() returns an error, the block is returned
uninitialised.  xfs_rtfree_range() is not checking the error return,
so could be using an uninitialised block number for modifying bitmap
summary info.

The problem was found by gcc when compiling the *userspace* libxfs
code - it is a copy of the kernel code with the exact same bug.
gcc gives an uninitialised variable warning on the userspace code
but not on the kernel code. You gotta love the consistency (Mmmm,
slightly chewy today!).
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
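The bug class described above is an output parameter that is only written on success, combined with a caller that ignores the error return. A small userspace sketch of the corrected calling pattern follows; find_forward() and free_range() are hypothetical stand-ins and do not reproduce the XFS functions.

```c
/* Hypothetical lookup: on failure it returns a negative error and leaves
 * *end untouched, which is how an uninitialised value can leak to callers. */
static int find_forward(int start, int limit, int *end)
{
	if (start < 0 || start > limit)
		return -1;
	*end = limit;
	return 0;
}

/* Returns 0 and stores the number of freed blocks, or a negative error. */
static int free_range(int start, int limit, int *freed)
{
	int end; /* only valid if find_forward() succeeds */
	int error = find_forward(start, limit, &end);

	if (error)
		return error; /* never read 'end' on the failure path */
	*freed = end - start + 1;
	return 0;
}
```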

xfs: fix stale inode flush avoidance · 4b6a4688

Authored by Dave Chinner on Jan 11, 2010

When reclaiming stale inodes, we need to guarantee that inodes are
unpinned before returning with a "clean" status. If we don't we can
reclaim inodes that are pinned, leading to use after free in the
transaction subsystem as transactions complete.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove inode iolock held check during allocation · 126976c7

Authored by Dave Chinner on Jan 10, 2010

lockdep complains about the lock not being initialised as we do an
ASSERT based check that the lock is not held before we initialise it
to catch inodes freed with the lock held.

lockdep does this check for us in the lock initialisation code, so
remove the ASSERT to stop the lockdep warning.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: reclaim all inodes by background tree walks · 57817c68

Authored by Dave Chinner on Jan 10, 2010

We cannot do direct inode reclaim without taking the flush lock to
ensure that we do not reclaim an inode under IO. We check the inode
is clean before doing direct reclaim, but this is not good enough
because the inode flush code marks the inode clean once it has
copied the in-core dirty state to the backing buffer.

It is the flush lock that determines whether the inode is still
under IO, even though it is marked clean, and the inode is still
required at IO completion so we can't reclaim it even though it is
clean in core. Hence the requirement that we need to take the flush
lock even on clean inodes because this guarantees that the inode
writeback IO has completed and it is safe to reclaim the inode.

With delayed write inode flushing, we could end up waiting a long
time on the flush lock even for a clean inode. The background
reclaim already handles this efficiently, so avoid all the problems
by killing the direct reclaim path altogether.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Avoid inodes in reclaim when flushing from inode cache · 018027be

Authored by Dave Chinner on Jan 10, 2010

The reclaim code will handle flushing of dirty inodes before reclaim
occurs, so avoid them when determining whether an inode is a
candidate for flushing to disk when walking the radix trees.  This
is based on a test patch from Christoph Hellwig.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: reclaim inodes under a write lock · c8e20be0

Authored by Dave Chinner on Jan 10, 2010

Make the inode tree reclaim walk exclusive to avoid races with
concurrent sync walkers and lookups. This is a version of a patch
posted by Christoph Hellwig that avoids all the code duplication.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

14 Jan, 2010 4 commits

Fix configfs leak · 9b6e3102

Authored by Al Viro on Jan 13, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix the -ESTALE handling in do_filp_open() · 9850c056

Authored by Al Viro on Jan 13, 2010

Instead of playing sick games with path saving, cleanups, just retry
the entire thing once with LOOKUP_REVAL added.  Post-.34 we'll convert
all -ESTALE handling in there to that style, rather than playing with
many retry loops deep in the call chain.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
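The "retry the entire thing once" approach described above can be sketched outside the kernel as a wrapper that reruns the whole operation a single time with a revalidation flag when it sees -ESTALE. Everything named here is hypothetical: REVAL_FLAG stands in for LOOKUP_REVAL, and open_with_reval_retry() and fake_open() exist only for the illustration.

```c
#include <errno.h>

#define REVAL_FLAG 0x1u /* hypothetical stand-in for LOOKUP_REVAL */

typedef int (*open_fn)(const char *path, unsigned int flags, void *ctx);

/* Runs the whole operation once; if it reports -ESTALE, runs it exactly
 * once more with the revalidation flag added. */
static int open_with_reval_retry(open_fn do_open, const char *path,
				 unsigned int flags, void *ctx)
{
	int ret = do_open(path, flags, ctx);

	if (ret == -ESTALE && !(flags & REVAL_FLAG))
		ret = do_open(path, flags | REVAL_FLAG, ctx);
	return ret;
}

/* Fake backend for demonstration: stale unless revalidation is requested.
 * ctx points at an int that counts how often the backend was called. */
static int fake_open(const char *path, unsigned int flags, void *ctx)
{
	int *calls = ctx;

	(void)path;
	(*calls)++;
	return (flags & REVAL_FLAG) ? 3 : -ESTALE;
}
```

Keeping the retry at the top level means the inner code paths only need to report -ESTALE, rather than each carrying its own retry loop.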

ecryptfs: Fix refcnt leak on ecryptfs_follow_link() error path · 806892e9

Authored by OGAWA Hirofumi on Jan 12, 2010

If the ->follow_link handler returns an error, it should decrement the
nd->path refcnt. But ecryptfs_follow_link() doesn't decrement it.

This patch fixes it by using the usual nd_set_link() style error handling,
instead of playing with nd->path.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix ACC_MODE() for real · 6d125529

Authored by Al Viro on Dec 24, 2009

commit 5300990c had stepped on a rather
nasty mess: definitions of ACC_MODE used to be different.  Fixed the
resulting breakage, converting them to variant that takes O_... value;
all callers have that and it actually simplifies life (see tomoyo part
of changes).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
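A variant of ACC_MODE that takes an O_... value can be written as a lookup in a four-byte string indexed by the access-mode bits. The sketch below is an illustration of that idea and should not be read as the exact definition introduced by the commit. It assumes the usual Linux values O_RDONLY = 0, O_WRONLY = 1, O_RDWR = 2, and the DEMO_MAY_* names are local to this example.

```c
#include <fcntl.h>

/* Permission bits in the style of the kernel's MAY_* constants. */
#define DEMO_MAY_WRITE 2
#define DEMO_MAY_READ  4

/* Maps the access-mode bits of an open(2) flag word to permission bits:
 * O_RDONLY -> 4, O_WRONLY -> 2, O_RDWR -> 6, and the unused value 3 -> 6.
 * Other bits in the flag word, such as O_CREAT, are masked off. */
#define ACC_MODE(x) ("\004\002\006\006"[(x) & O_ACCMODE])
```

For example, ACC_MODE(O_WRONLY | O_CREAT) evaluates to DEMO_MAY_WRITE, because only the low access-mode bits select the table entry.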